The mechanics of function calling are the easy part. Knowing which function to call, with which parameters, against which data source, and according to whose definition — that is where production agents succeed or fail.
Gravitee’s State of AI Agent Security 2026 report found that 88% of organizations reported confirmed or suspected AI agent security incidents in the past year, while only 14.4% send agents to production with full security or IT approval. Most of that exposure comes from agents calling the right function with the wrong parameters, against the wrong data source, with access they were never supposed to have.
Tool use in agentic AI gives agents the ability to act on external systems rather than generate text. What most guides skip is the context the agent draws on before the function call is constructed. This article covers both.
How does tool use work in AI agents?
Tool use, or function calling, is the mechanism by which an AI agent takes action on external systems rather than generating text responses. Instead of answering a question from its training or context window, the agent calls a function (a database query, an API endpoint, a code interpreter, or a browser action) and incorporates the result into its reasoning.
The mechanics work in three stages.
How are tools defined for an agent?
Tools are described to the model as JSON schemas: the function name, its purpose, each input parameter with its type and constraints, and which parameters are required. The model reads this schema and uses it to decide when the tool is relevant and what arguments to construct for each call.
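A minimal example of what such a schema looks like, in the JSON-schema style most function-calling APIs use. Every name here (`query_revenue`, `business_unit`, the pattern for `quarter`) is invented for illustration, not taken from any specific platform:

```python
import json

# Hypothetical tool definition: name, purpose, typed parameters,
# and which parameters are required.
tool_definition = {
    "name": "query_revenue",
    "description": "Query quarterly revenue for a single business unit.",
    "parameters": {
        "type": "object",
        "properties": {
            "business_unit": {
                "type": "string",
                "description": "Canonical business unit name.",
            },
            "quarter": {
                "type": "string",
                "pattern": "^[0-9]{4}-Q[1-4]$",
                "description": "Fiscal quarter, e.g. '2026-Q1'.",
            },
        },
        "required": ["business_unit", "quarter"],
    },
}

# The model receives this serialized schema alongside the user query
# and uses it to decide when the tool applies and how to fill arguments.
schema_text = json.dumps(tool_definition, indent=2)
```

Note that the schema constrains what a *valid* call looks like; it says nothing about which business unit names or revenue definitions are authoritative, which is the gap the rest of this article is about.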
What is the decide-call-incorporate loop?
When an agent receives a query, it reasons about whether a tool call is needed. If the query requires information outside its context window (current database values, live API responses, or code execution results), it constructs a tool call with the appropriate arguments and pauses its reasoning. The tool executes and returns a result. The agent incorporates that result into its context and continues reasoning, calling additional tools if needed before producing a final output.
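A minimal sketch of that loop, assuming a hypothetical `model` callable that returns either a tool call request or a final answer. Real agent SDKs wrap this same shape in their own message and response types:

```python
def run_agent(model, tools, query, max_turns=5):
    """Decide-call-incorporate loop: model decides, tool executes,
    result is appended to context, and the model reasons again."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_turns):
        response = model(messages)              # decide
        if response.get("tool_call") is None:
            return response["content"]          # final answer
        call = response["tool_call"]
        result = tools[call["name"]](**call["args"])   # call
        messages.append({"role": "tool",
                         "name": call["name"],
                         "content": result})    # incorporate
    return None  # turn budget exhausted without a final answer
```

The `max_turns` budget is the usual guard against an agent that keeps calling tools without converging on an answer.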
How do parallel and sequential tool calls differ?
Agents call tools sequentially, where each call informs the next, or in parallel, where multiple independent calls execute simultaneously and their results are synthesized together. For a query like “compare revenue across three business units,” a well-designed agent calls three data retrieval functions simultaneously, reducing latency and intermediate reasoning errors.
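The parallel case can be sketched with `asyncio.gather`. Here `fetch_revenue`, the unit names, and the revenue figures are stand-ins for a real data retrieval tool:

```python
import asyncio

async def fetch_revenue(unit: str) -> tuple[str, float]:
    # Stand-in for a real data retrieval call; the sleep represents
    # network or query latency.
    await asyncio.sleep(0.01)
    return unit, {"NA": 1.2, "EMEA": 0.9, "APAC": 0.7}[unit]

async def compare_units(units: list[str]) -> dict[str, float]:
    # Independent calls execute concurrently, so total latency is
    # roughly one call rather than three.
    results = await asyncio.gather(*(fetch_revenue(u) for u in units))
    return dict(results)

revenue = asyncio.run(compare_units(["NA", "EMEA", "APAC"]))
```

Sequential calling is the same loop with an `await` per call; parallel dispatch only applies when no call depends on another's result.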
Each tool type an agent can call carries its own context requirements. Getting those requirements wrong produces incorrect results regardless of how well the tool itself is built.
| Tool type | What it does | Context requirement | Example |
|---|---|---|---|
| Data retrieval | Queries databases and data warehouses | Canonical table and column definitions, metric lineage | Snowflake query function |
| API connector | Calls external services and endpoints | Endpoint authorization, parameter definitions, rate limits | CRM opportunity pull |
| Code executor | Runs code and returns output | Scope constraints, authorized libraries, sandboxing rules | Python data transformation |
| Browser/web | Navigates and extracts from web interfaces | Access policies, session handling, scope limits | Web scraping agent |
| Agent delegator | Hands tasks to specialist agents via A2A | Shared context layer, delegation rules, result schema | Multi-agent orchestration |
Why does context determine call quality?
The mechanics of tool use are well-documented across most implementation guides. What gets far less coverage is how context quality determines whether those mechanics produce correct outputs.
Consider a tool definition for a Snowflake query function. The schema specifies a table name, a list of columns, a filter condition, and a time range. The model can construct a valid call from that definition. From the tool definition alone, the model has no basis for knowing which table name is authoritative for revenue, which column represents what this specific business calls “ARR,” or whether the filter condition should scope to contracts or recognized transactions.
Those decisions come from context. An agent with access to a governed business glossary that maps “ARR” to a specific table, schema, and column definition will construct a correct call. An agent reasoning from training data and surface-level query patterns will construct a plausible call that returns a number — and that number may not match what the finance team considers correct. Without that translation layer, agents produce AI agent hallucinations that look authoritative because the tool call succeeded.
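A sketch of that translation layer, with invented glossary entries, table names, and filter logic. The point is that the query parameters come from the governed glossary, not from the model’s guess:

```python
# Hypothetical governed glossary: business term -> canonical source.
GLOSSARY = {
    "ARR": {
        "table": "finance.revenue_recognized",
        "column": "annual_recurring_revenue",
        "filter": "contract_status = 'active'",
    },
}

def build_query(metric: str, time_range: str) -> str:
    """Resolve the metric through the glossary before constructing SQL.
    A KeyError here is deliberate: an ungoverned term should fail
    loudly rather than fall back to a plausible guess."""
    entry = GLOSSARY[metric]
    return (
        f"SELECT SUM({entry['column']}) FROM {entry['table']} "
        f"WHERE {entry['filter']} AND period = '{time_range}'"
    )
```

Two agents that resolve “ARR” through the same glossary construct the same call; two agents guessing from training data may not.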
Snowflake Engineering published a controlled study in March 2026 finding that adding an organizational ontology to an agent receiving semantic views improved answer accuracy by 20% and reduced tool calls by roughly 39%. The tool infrastructure was unchanged. The context layer changed.
The principle generalizes: every tool call is only as good as the context that shapes the parameters passed to it. An agent calling a CRM API without knowing which opportunity stage definitions the sales team uses will pull pipeline data that is structured correctly and interpreted wrongly.
Function calling gives agents hands. The context layer gives them judgment to know which tool to call, when, and with parameters that reflect actual business definitions.
MCP and A2A: two standards reshaping enterprise tool use
Two open standards have emerged to address the infrastructure challenges that enterprise tool use creates, and they are frequently conflated despite solving different problems.
What is the Model Context Protocol (MCP)?
MCP handles agent-to-context connections. Launched by Anthropic in November 2024, subsequently adopted by OpenAI and Google DeepMind, and donated to the Linux Foundation’s Agentic AI Foundation in December 2025, MCP defines how agents connect to external knowledge sources and pull that context into their reasoning before and during tool calls. MCP server downloads grew from roughly 100,000 in November 2024 to over 8 million by April 2025, and more than 5,800 servers and 300 clients are now available across the ecosystem.
The protocol standardizes what would otherwise require custom integration code for every context source. Without a standard protocol, every agent needs its own connector to every source, so the integration count grows with the product of agents and sources. With MCP, each agent and each source implements the protocol once, and integration effort scales linearly.
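The scaling argument in miniature, with illustrative counts:

```python
# Point-to-point: every agent needs a custom connector to every
# context source (agents x sources connectors in total).
def custom_integrations(agents: int, sources: int) -> int:
    return agents * sources

# Shared protocol: each agent and each source implements the
# protocol once (agents + sources implementations in total).
def protocol_integrations(agents: int, sources: int) -> int:
    return agents + sources
```

At 10 agents and 10 context sources, that is 100 custom connectors versus 20 protocol implementations, and the gap widens as either side grows.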
What is the Agent-to-Agent Protocol (A2A)?
A2A (Agent-to-Agent Protocol) handles agent-to-agent coordination. Developed by Google, A2A defines how a master agent discovers specialist agents, delegates tasks to them, and receives structured results. It governs how agents hand off work, communicate intermediate results, and signal completion across a shared task.
How do MCP and A2A fit together?
MCP and A2A operate at different layers. MCP governs what the agent knows; A2A governs how agents coordinate. An enterprise architecture can use both in parallel.
Atlan exposes its enterprise context layer via MCP, making governed business context (metric definitions, data lineage, access policies) available to any agent connecting via the protocol, including those in A2A-coordinated multi-agent systems. When an agent queries Atlan’s context layer before constructing a Snowflake query, it ensures parameters reflect the organization’s actual data definitions. Atlan’s guide to in-context vs. external memory for AI agents covers the architectural tradeoffs in more depth.
Tool use at scale: the enterprise consistency problem
Individual tool use is tractable. The harder problem emerges when an enterprise deploys multiple agents, each calling tools with different assumptions about which data is authoritative.
Agent A (sales operations) queries Snowflake using the CRM’s definition of “qualified opportunity.” Agent B (finance) queries the same environment using finance’s definition with different stage criteria. Both produce valid results. Neither matches the other, and neither team knows why the pipeline numbers differ.
Ungoverned agentic tool use recreates the data silo problem at the function call layer, one agent at a time, as Atlan’s research on multi-agent memory silos documents. Ungoverned context multiplied by agent autonomy equals increased risk exposure.
The architectural fix is a shared context layer that governs which tools should be called for which data assets, and with which parameters. When both agents consult the same governed glossary, “qualified opportunity” resolves to a single canonical answer and tool calls differ only in filter criteria. See event-driven architecture for AI agents and the agent context layer guide for synchronization patterns.
How should enterprises control tool-layer access?
Function calling creates a new surface for data access, and it often bypasses the access controls built into the underlying systems.
A data warehouse may have row-level security and role-based access controls configured for human users. When an AI agent queries that warehouse through a tool call using a service account, it may have broader access than any individual human. Research from Obsidian Security found that agents are granted 10x more access than they actually require, and that over 800 risky agents exist in the average enterprise environment.
The governance gap is significant: 88% of organizations reported AI agent security incidents in the last year, while only 14.4% send agents to production with full security or IT approval, per Gravitee’s 2026 report. Tool invocations are trusted by default in most deployments, with no audit trail tracking agent activity.
Governing tool access requires access control built into the context layer alongside the data layer, as covered in Atlan’s guide to AI agent memory governance. The context layer enforces which agents can query which data assets before a tool call reaches the underlying system, and maintains a complete decision trace.
The AI Control Plane sits alongside the context layer: the context layer establishes what agents should know; the control plane enforces what they are authorized to do, keeping access policies separate from context architecture.
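A minimal sketch of such a tool-invocation gate, assuming machine-readable policies that map each agent to its allowed (tool, asset) pairs. All names here are invented for illustration:

```python
from datetime import datetime, timezone

# Hypothetical policy table: agent -> set of allowed (tool, asset) pairs.
POLICIES = {
    "sales_ops_agent": {("snowflake_query", "crm.opportunities")},
}
AUDIT_LOG = []  # every invocation attempt is recorded, allowed or denied

def invoke(agent: str, tool: str, asset: str, call):
    """Enforce policy at the context layer, before the tool call
    reaches the underlying system, and record a decision trace."""
    allowed = (tool, asset) in POLICIES.get(agent, set())
    AUDIT_LOG.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        "asset": asset,
        "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{agent} may not call {tool} on {asset}")
    return call()
```

The key design choice is that denials are logged as well as successes: the audit trail covers which tools were called, by which agent, against which asset, and whether policy allowed it.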
How Atlan approaches tool-layer context and access
The challenge
Enterprises wire agents to tool APIs and discover only after production incidents that agents were calling the wrong tables with the wrong parameters. The tool schema tells the agent what is possible; it does not tell the agent what is correct or permitted.
The approach
Context Engineering Studio bootstraps a governed context layer from SQL history, BI dashboards, lineage, and business glossaries. The Atlan MCP server exposes that governed context to any agent speaking MCP, so a LangChain agent, a CrewAI crew, and an OpenAI Agents SDK deployment all consult the same canonical definitions. Access policies enforce what each agent is authorized to access, with a complete audit trail; context agents continuously evaluate quality.
The outcome
Tool calls reflect business definitions rather than training-data approximations. Two agents asking the same business question construct identical calls. Access is governed at the context layer rather than the service account.
How enterprises govern tool use at scale
Mastercard
Mastercard treats context as a first-class requirement alongside privacy and data design. The context layer ensures agents interpret transactional data correctly at financial-services speed.
“When you’re working with AI, you need contextual data to interpret transactional data at the speed of transaction (within milliseconds). So we have moved from privacy by design to data by design to now context by design.” — Andrew Reiskind, Chief Data Officer, Mastercard
DigiKey
DigiKey treats the context layer as operating infrastructure, exposing metadata through MCP so every agent uses the same governed interface.
“Atlan is much more than a catalog of catalogs. It’s more of a context operating system… Atlan enabled us to easily activate metadata for everything from discovery in the marketplace to AI governance to data quality to an MCP server delivering context to AI models.” — Sridher Arumugham, Chief Data & Analytics Officer, DigiKey
Why every tool call is a context decision first
The mechanics of function calling in AI agents are well-understood. The schemas work. The loops work. The parallel execution works. What most guides leave unsaid is that none of this machinery knows your business.
The choice of which tool to call, with which parameters, against which data source, and under which access rules is a context decision before it is a mechanical one. Teams that treat tool use as a pure engineering problem ship agents that call the right function against the wrong data, return a clean result, and quietly mislead the people relying on it.
The teams that ship production-grade agents treat context as the layer the tool call depends on. They govern the definitions, the access, and the audit trail as infrastructure. The function call becomes the easy part.
FAQs about AI agent tool calls
1. Why do two agents calling the same tool sometimes return different results?
Because the tool definition only specifies mechanics: the function name, parameters, and types. It does not specify which table is canonical for a given metric or which version of a business term is authoritative. Two agents calling the same Snowflake query function will return different results if they draw on different metric definitions. Shared, governed context infrastructure is what ensures both agents construct identical calls when asking the same business question.
2. What is the difference between a tool call that fails technically and one that fails organizationally?
A technical failure is visible: the function throws an error or times out. An organizational failure is invisible: the call executes successfully and returns a result that does not match what the business considers correct. Organizational failures are harder to catch precisely because they look like successes, and are more dangerous in autonomous deployments where no human reviews the output before it reaches production.
3. How does MCP change the way enterprises connect agents to governed data sources?
Before MCP, connecting an agent to an external governed context source required custom integration code for every source the agent needed to access. MCP standardizes that connection: any agent that speaks the protocol can connect to any MCP-compatible context source without custom work, changing the scaling equation from quadratic to linear integration complexity.
4. What does it mean for a tool call to be context-aware versus context-blind?
A context-blind tool call is technically valid but organizationally wrong. The agent constructs the call from the tool schema alone without knowing which table is canonical for the metric it is querying or whether the data source is under a quality hold. A context-aware tool call draws on a governed context layer first, consulting the business glossary and confirming the authoritative source before constructing parameters.
5. How do you control which tools an AI agent can access?
Through access control built into the context layer, enforced before a tool call reaches the underlying system. Obsidian Security finds that agents are typically granted 10x more access than they require. The architectural requirement is machine-readable access policies enforced at the tool invocation layer, with a complete audit trail covering which tools were called, by which agent, with which parameters, and at what point in the reasoning chain.
6. Are MCP and A2A competing standards?
No. They operate at different layers and are complementary. MCP governs how an agent connects to external knowledge and context. A2A governs how agents coordinate with each other. A multi-agent system can use both: A2A for inter-agent coordination while each agent uses MCP to connect to the shared context layer.
Sources
1. State of AI Agent Security 2026, Gravitee
2. Agent Context Layer Study, Snowflake
3. AI Agent Risk Report, Obsidian Security
4. MCP Statistics, MCP Evals
5. MCP Enterprise Adoption Guide, Deepak Gupta