AI agent planning is the process by which an agent takes a user request, breaks it into steps, selects which tools to call, and executes those steps in sequence. Most enterprise teams have learned that planning quality does not come from the model alone. It comes from the context the model is planning against. A state-of-the-art agent given ambiguous context will build a confident, wrong plan. The same agent given precise organizational context will build a tight, accurate one.
| What it covers | Most discussed paradigms | Key dependency | Observed inefficiency | Measured improvement | When humans should review |
|---|---|---|---|---|---|
| Decomposition, tool selection, execution monitoring | ReAct, Chain-of-Thought, Tree-of-Thought | Context quality | Unnecessary tool calls from uncertain agents | 39% fewer tool calls (Snowflake, 2026) | High-stakes decisions above governance threshold |
This page explains how planning paradigms work, why context poverty produces over-calling and hallucination, and what the evidence says about governed context as the fix.
## How do chain-of-thought, tree-of-thought, and reactive planning work?
Planning paradigms differ in how the agent structures its reasoning before acting. The choice of paradigm shapes how the agent handles ambiguity, but none of them compensate for missing context.
| Paradigm | How it works | Strengths | Context dependence |
|---|---|---|---|
| Reactive (ReAct) | The agent interleaves reasoning steps with tool calls, observing results after each action and adjusting the next step accordingly | Fast for well-defined tasks; adapts to real-time feedback from tool responses | High — each reasoning step inherits whatever context is in the prompt; bad context produces bad reasoning at every step |
| Chain-of-Thought | The agent generates an explicit reasoning chain before taking any action, decomposing the problem into a sequence of logical steps | Reduces errors on multi-step problems; makes reasoning auditable | High — the chain is only as good as the definitions and constraints the agent is working with; ambiguous terms produce plausible but wrong chains |
| Tree-of-Thought | The agent explores multiple reasoning branches simultaneously, evaluates each branch, and selects the most promising path | Better performance on problems with high uncertainty or multiple valid approaches | Very high — branching multiplies the context dependency; each branch can diverge confidently in the wrong direction if foundational definitions are unclear |
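The reactive (ReAct) row above can be sketched in a few lines. This is a minimal illustration, not any framework's API: `policy` is a hypothetical stand-in for the LLM and `tools` is a plain dict of callables, both invented for this sketch.

```python
# Minimal ReAct-style loop: interleave reasoning with tool calls,
# feeding each observation back into the agent's context.
def react_loop(policy, tools, question, max_steps=5):
    context = [f"Question: {question}"]
    for _ in range(max_steps):
        step = policy(context)                 # thought plus either action or answer
        context.append(f"Thought: {step['thought']}")
        if "answer" in step:                   # enough context: stop calling tools
            return step["answer"], context
        tool_name, args = step["action"]
        observation = tools[tool_name](args)   # act, then observe
        context.append(f"Observation: {observation}")
    return None, context                       # budget exhausted without an answer

# A scripted policy standing in for the LLM: one lookup, then answer.
def scripted_policy(context):
    if not any(line.startswith("Observation:") for line in context):
        return {"thought": "I need the certified Q4 revenue metric.",
                "action": ("query_metric", "revenue_q4")}
    return {"thought": "The certified metric answers the question.",
            "answer": "Q4 revenue was $12.4M."}

tools = {"query_metric": lambda name: f"{name} = $12.4M (certified)"}
answer, trace = react_loop(scripted_policy, tools, "What was Q4 revenue?")
```

Note how every reasoning step reads from `context`: this is exactly why the paradigm's quality is bounded by the quality of what that context contains.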
## Why is over-calling a symptom of context poverty?
When an agent does not know which table is authoritative, it queries several. When it cannot tell which definition of “pipeline” applies to the CFO’s question, it hedges by calling multiple tools. The result is a plan that executes, but does unnecessary work at every step.
Snowflake Engineering’s 2026 research on the agent context layer documented the downstream effect of this directly. Adding an organizational ontology — structured definitions of business entities and their relationships — to the same underlying model produced a 39% reduction in tool calls and a 20% improvement in answer accuracy. The model did not change. The context did.
Over-calling also compounds the hallucination risk. Each tool call produces a response that goes back into the agent’s context window. More tool calls mean more intermediate results to reconcile. When those results conflict — because the agent queried overlapping sources — the agent must choose between them without a reliable signal for which to trust. The result is a confident synthesis of contradictory data. Better context means fewer calls, fewer conflicts, and fewer opportunities for the agent to construct a plausible but wrong answer.
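The over-calling pattern is easy to see in miniature. In this sketch, the table names and the `ontology` mapping are invented for illustration: an agent with no authoritative mapping plans a query against every candidate table, while an agent given an ontology plans exactly one call.

```python
# Illustrative only: candidate tables an agent might discover for "revenue".
CANDIDATE_TABLES = {
    "revenue": ["finance.revenue_v2", "sales.revenue_raw", "legacy.revenue_2019"],
}

def plan_queries(metric, ontology=None):
    """Return the list of tables the agent will query for a metric."""
    if ontology and metric in ontology:
        return [ontology[metric]]   # certified source: one call, nothing to reconcile
    return CANDIDATE_TABLES[metric] # ambiguity: hedge by querying everything

poor = plan_queries("revenue")                                       # three calls
governed = plan_queries("revenue",
                        ontology={"revenue": "finance.revenue_v2"})  # one call
```

The context-poor plan produces three overlapping results the agent must reconcile; the governed plan produces one, which is the mechanism behind the tool-call reduction described above.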
## Why planning and context are complementary, not substitutes
A common assumption in early agentic deployments is that a more capable planning paradigm can compensate for weak context. The evidence does not support this. A more sophisticated planning paradigm applied to ambiguous context produces more sophisticated wrong plans.
> “Better reasoning without context still fails — it just fails with more steps and more confidence.”
The relationship between planning quality and context quality is additive, not compensatory. The 2×2 matrix below illustrates the four possible combinations:
| Scenario | Planning quality | Context quality | Real-world outcome |
|---|---|---|---|
| Strong model, strong context | High | High | Tight plans, accurate answers, minimal tool calls — production-ready |
| Strong model, weak context | High | Low | Sophisticated wrong answers; agent reasons confidently to incorrect conclusions |
| Weak model, strong context | Low | High | Simple plans that work because the agent has reliable ground truth to work from |
| Weak model, weak context | Low | Low | Frequent failures, high retry rates, low trust — pilots that never reach production |
The practical implication is that enterprises that upgrade the model without addressing context quality will move from the bottom-left quadrant to the top-left quadrant. They will get more sophisticated failures. The path to the top-right quadrant requires both. But for most production deployments, context quality is the binding constraint — it is what determines whether the plan the agent builds maps to the actual state of the business.
## When should high-stakes planning surface for human review?
For routine analytical queries — summarize this report, pull last quarter’s numbers, classify this support ticket — autonomous execution is appropriate. The stakes are low, the reversibility is high, and the cost of an occasional error is manageable.
The threshold shifts when the plan’s execution produces an irreversible or high-consequence outcome. Deal exceptions, contract creation, policy modifications, changes to data access permissions, budget reallocations above a defined threshold — these are decisions where some business judgment lives outside the systems the agent can query. The agent may have access to every relevant metric and still lack the organizational context needed to weigh them correctly.
A practical governance pattern is to surface the decision trace, not just the conclusion, for human review on any plan above the threshold. The reviewer sees what the agent checked, what it concluded at each step, and what action it proposed. That visibility makes the review fast and the approval meaningful. It also creates an audit trail that satisfies the question every regulator now asks: how do you know the AI’s decision was based on current, approved information?
In multi-agent environments, the stakes compound. One agent’s planning output becomes another agent’s context input. A wrong plan executed autonomously by the first agent can cascade as a wrong premise for every downstream agent reading from the same shared layer. Surfacing the plan for human review before execution is the point in the pipeline where governance has the highest leverage.
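One way to implement the pattern described above is a simple risk gate that routes any plan above a threshold to a reviewer, together with its full decision trace. The dataclasses, field names, and the 0.7 threshold are assumptions for illustration, not a specific Atlan construct.

```python
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    checked: str      # what the agent looked at
    concluded: str    # what it decided at this step

@dataclass
class Plan:
    action: str
    risk: float                               # 0.0 reversible .. 1.0 irreversible
    trace: list = field(default_factory=list)

REVIEW_THRESHOLD = 0.7                        # illustrative governance threshold

def execute(plan, approve):
    """Run low-risk plans autonomously; surface the decision trace,
    not just the conclusion, for anything above the threshold."""
    if plan.risk < REVIEW_THRESHOLD:
        return f"executed: {plan.action}"
    if approve(plan.trace, plan.action):      # reviewer sees each checked/concluded pair
        return f"approved and executed: {plan.action}"
    return f"blocked: {plan.action}"

routine = Plan("summarize last quarter's report", risk=0.1)
risky = Plan("modify data access policy", risk=0.9,
             trace=[PlanStep("current policy owners", "change affects 3 teams")])
```

The routine plan runs straight through without consulting the reviewer; the high-risk plan blocks or proceeds on the reviewer's explicit decision, and the trace it carries becomes the audit record.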
## How Atlan approaches planning through governed context
The planning problem is a context problem. Atlan’s approach is to solve it at the infrastructure layer — making context available, accurate, and governed before any agent starts to plan.
The Context Engineering Studio is where this happens. It combines several capabilities that directly address the planning failures described above:
- Enterprise Data Graph: A structured representation of every data asset, its relationships, its business definitions, and its ownership. When an agent plans against the Enterprise Data Graph, it knows which table is authoritative for revenue, which definition of “pipeline” applies in which context, and which team owns the answer. Ambiguity — the root cause of over-calling — is resolved before the plan starts.
- Context agents: Specialized agents that run in advance of query-answering agents, retrieving the specific organizational context a downstream agent needs for a given request. The query-answering agent receives a pre-populated context package rather than a raw schema.
- Atlan MCP server: Exposes governed context directly to agent frameworks via the Model Context Protocol, making Atlan’s Enterprise Data Graph available as a structured context source without custom integration work.
- AI governance workflows: Approval routing, decision traces, and provenance tracking built into the platform, so that plans above the governance threshold surface for human review without requiring custom tooling.
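The “context package” idea in the list above can be sketched as a pre-resolution step. Everything here is hypothetical — the graph contents, field names, and the `build_context_package` helper are invented — but it shows the shape of handing a downstream agent resolved context instead of raw schemas.

```python
# Illustrative fragment of an organizational context graph.
GRAPH = {
    "pipeline": {
        "definition": "open opportunities weighted by stage",
        "authoritative_table": "sales.pipeline_certified",
        "owner": "revenue-ops",
    },
}

def build_context_package(request_terms, graph=GRAPH):
    """Resolve each business term before the query-answering agent plans.
    Unresolved terms are surfaced explicitly, so the downstream agent
    can ask rather than guess."""
    resolved, unresolved = {}, []
    for term in request_terms:
        if term in graph:
            resolved[term] = graph[term]
        else:
            unresolved.append(term)
    return {"resolved": resolved, "unresolved": unresolved}

package = build_context_package(["pipeline", "churn"])
```

Keeping unresolved terms visible, rather than silently dropping them, is what lets an agent ask a clarifying question instead of hedging with extra tool calls.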
### The outcome
Tighter plans. Fewer unnecessary tool calls. Agents that know what they do not know and ask rather than hallucinate. And a shared context layer that compounds in accuracy as each agent’s certified learnings propagate to every downstream agent reading from the same layer.
## How enterprises ground planning in governed context
### Workday
> “We built a revenue analysis agent and it couldn’t answer one question. We started to realize we were missing this translation layer. All of the work that we did to get to a shared language amongst people at Workday can be leveraged by AI via Atlan’s MCP server.” — Joe DosSantos, VP Enterprise Data & Analytics, Workday
Workday’s experience captures the core planning failure precisely. The agent had access to the right data. It lacked the translation layer — the organizational ontology — that would have told it how to interpret that data in Workday’s specific business context. Adding Atlan’s governed context layer via the MCP server gave the agent the shared language that Workday’s human teams had spent years building. The planning problem resolved because the context problem resolved.
### Mastercard
> “When you’re working with AI, you need contextual data to interpret transactional data at the speed of transaction (within milliseconds). So we have moved from privacy by design to data by design to now context by design. We needed a tool that could scale with us. We chose Atlan, a platform that’s configurable, intuitive, and able to scale with our 100M+ data assets.” — Andrew Reiskind, Chief Data Officer, Mastercard
Mastercard’s framing — context by design — is a useful summary of the shift required. Planning that works at transaction speed, at Mastercard’s scale, requires context that is already structured, governed, and available before the agent asks. You cannot resolve ambiguity at inference time when inference must complete in milliseconds. The context must be right before the plan starts.
## Why enterprise planning starts with context, not reasoning
The teams shipping trusted agents in 2026 are not the teams with the most sophisticated planning paradigms. They are the teams that solved the context problem first. Chain-of-thought, ReAct, and tree-of-thought are all capable paradigms. None of them produce reliable plans when the foundational definitions are wrong, ambiguous, or missing.
Planning without AI agent context is planning to the wrong destination. The agent may execute every step correctly and still deliver an answer that contradicts what the business actually intended. Governance is not an obstacle to planning capability — it is the infrastructure that makes planning capability trustworthy. The enterprises that understand this distinction are the ones that have moved from pilots to production.
For teams evaluating where to invest, the research is consistent: context quality is the binding constraint. A tighter context layer produces tighter plans. Tighter plans produce fewer errors, fewer unnecessary tool calls, and answers that the business can act on with confidence.
## FAQs about AI agent planning
### 1. Why do agents over-call tools? Is that a planning failure?
Over-calling is almost never a pure planning failure. It is context poverty in disguise. The agent does not know which table is authoritative, so it queries multiple tables. It does not know which definition applies, so it hedges by checking several sources. Better context means the agent arrives at each step already knowing what it needs, and calls only the tools that will move the plan forward. Snowflake’s research documented a 39% reduction in tool calls simply by adding an organizational ontology to the same model. The planning paradigm did not change. The context did.
### 2. Can a better model fix bad planning caused by weak context?
No. A smarter model will produce more sophisticated wrong answers. If the agent does not know which definition of “pipeline” the CFO uses, a better model will reason more confidently to the wrong conclusion. Context quality is a prerequisite for planning quality, not a nice-to-have. The only way to fix a context problem is to fix the context. Upgrading the model without addressing the context layer moves you from simple wrong answers to elaborate wrong answers — which are harder to detect and more dangerous to act on.
### 3. When should planning surface to humans for review?
For routine queries, autonomous execution is fine. The cost of an occasional error is low and the reversibility is high. The threshold shifts when the plan’s execution produces an irreversible or high-consequence outcome. Deal exceptions, contract creation, policy modifications, data access changes, budget reallocations above a defined threshold — these are decisions where some business judgment lives outside the systems the agent can query. Surface the plan before execution, show the reviewer the decision trace (what the agent checked, what it concluded at each step), and get explicit approval. That visibility makes the review fast and creates the audit trail that governance requires.
### 4. How much does organizational context improve planning efficiency?
Research from Snowflake shows approximately 39% reduction in unnecessary tool calls when agents receive semantic views and organizational context, alongside a 20% improvement in answer accuracy. The same model, different context, substantially different outcomes. The efficiency gain compounds across a multi-agent system: every additional agent benefits from the context layer earlier agents helped build, and every certified correction propagates to all downstream agents without requiring any per-agent work.
### 5. Is planning different from reasoning, or are they the same thing?
Reasoning is how the agent thinks. Planning is how it decides what to do. An agent can reason brilliantly about a problem it misunderstands. Chain-of-thought, ReAct, and tree-of-thought are reasoning paradigms — they shape how the agent generates and evaluates steps. Planning is the output: the sequence of actions the agent commits to based on its reasoning. You need both reasoning capability and context quality for reliable plans. Strong reasoning over weak context produces a well-structured wrong plan. Weak reasoning over strong context produces a simple right plan. The goal is strong reasoning over strong context — but for most enterprise deployments, context is the binding constraint.