AI Agents for Financial Services: MRM & Context Architecture for 2026 & Beyond

Emily Winks

Data Governance Expert

Updated: 05/29/2026

Published: 05/29/2026

15 min read

Key takeaways

SR 26-2 formally excludes AI agents from MRM scope, yet model risk committees already apply its principles.
Decision traces are the compliance artifact second-line functions and regulators require for AI.
Context portability is a governance obligation: regulators may require decision reconstruction years later.
Financial AI agents earn autonomy through governance layers, not capability alone.

What are AI agents for financial services?

AI agents for financial services are autonomous systems that perceive data from trading platforms, risk engines, core banking infrastructure, and CRM systems, reason over it, and take action without step-by-step human direction. Unlike general-purpose enterprise agents, they operate within overlapping regulatory frameworks, including SR 26-2 (which superseded SR 11-7), SS1/23, MiFID II, and the EU AI Act. That makes governance an architectural prerequisite, not an optional layer or afterthought.

Requirements for financial services AI agents:

Model risk management integration: AI agents that inform financial decisions increasingly require inventory, validation, and monitoring.
Financial ontology: Canonical, certified definitions for accounts, products, portfolios, and risk measures with source-to-agent lineage.
Context portability: Organizational context in open, standard formats consumable by any agent framework without vendor lock-in.
Policy enforcement at context delivery: Role, use-case, and data-sensitivity controls enforced before any context reaches an agent.
Decision traces: A full, queryable audit trail of agent reasoning linked to governed data products and policies.
Human-in-the-loop checkpoints: The most critical financial decisions must require human review before execution.

How are AI agents being used for financial services? An overview.

Gartner reports that 57% of finance teams are already implementing or planning to implement agentic AI in finance functions. The global market for AI agents in financial services is expected to grow from USD 2.1 billion in 2024 to USD 80.9 billion by 2034.

AI agents are being used to improve customer service, speed up KYC and claims processing, and automate fraud detection and response, among other use cases.

The use cases already in production span the full breadth of financial services operations:

Customer service: Agents handle account inquiries, dispute resolution, and service requests across digital channels, without human escalation for routine cases.
KYC and onboarding: Agents classify documents, extract and validate KYC data points, remediate gaps, and present data in a consistent format for compliance review.
Fraud detection and response: Agents monitor transactions in real time, flag anomalous patterns, and trigger investigation workflows, compressing detection-to-response cycles from hours to minutes.
Regulatory reporting: Agents pull data from source systems, apply certified calculation logic, and generate report submissions on schedule, with a traceable audit trail for each figure.
Claims processing: Agents assess claims against policy terms, gather supporting documentation, and route cases for human review based on risk classification.
AML screening: Agents cross-reference transaction data against sanctions lists and typology libraries, escalating cases that exceed defined risk thresholds.

The productivity gains are already measurable in production. Accenture describes a European bank deployment:

“One of the earliest uses was to classify documents, ingest and extract KYC data points, validate and remediate missing data, then present the data in a consistent format for the KYC agent to validate. This reduced ingestion time by 99% and costs by 94%.”

— Accenture on how AI agents are improving financial services workflows

Why is financial services AI’s most complex deployment environment?

The deployment constraints in financial services are severe because AI agents must comply with several regulatory frameworks at once:

Prudential regulation: The Federal Reserve’s SR 26-2 is the current MRM framework governing quantitative models. Generative AI and agentic AI are explicitly excluded from its formal scope for now, but banks are still expected to apply appropriate governance. The UK PRA’s SS1/23 carries parallel expectations for UK-supervised institutions.
Market conduct regulation: ESMA’s February 2026 supervisory briefing on algorithmic trading under MiFID II requires firms to demonstrate governance, testing, and explainability for AI used in trading systems.
AI-specific regulation: The EU AI Act classifies credit scoring, insurance underwriting, and creditworthiness assessment as high-risk AI applications, requiring conformity assessment and technical documentation before deployment, and ongoing monitoring afterwards.
Basel framework: Capital adequacy calculations must follow documented, validated methodologies, a requirement that extends to any model or agent that contributes to those calculations.

Satisfying any of these frameworks requires the same foundational capability: agents that produce consistent, traceable outputs tied to authoritative definitions.

The financial ontology problem: Why wrong definitions produce wrong regulatory outputs

Every major financial institution has a definition problem at scale. “Revenue,” “customer,” “exposure,” “active account,” “limit,” and “NPL” each have multiple competing definitions encoded in different systems, built by different teams, across different lines of business.

Key financial terms require canonical governance before agents can use them reliably:

Exposure: Often defined differently for credit risk, market risk, and regulatory capital purposes within the same institution.
NPL (non-performing loan): Classifications vary across jurisdictions and internal risk frameworks.
ARR (annual recurring revenue): Definition varies by line of business and may conflict between finance and sales systems.
Customer: May be counted at the relationship level, the account level, or the legal entity level depending on the context.
Limit: Credit limits, trading limits, and regulatory limits are distinct constructs that often share field names across systems.

A retail banking agent and a capital markets agent may both query “exposure” and receive different answers, because the term resolves differently in each underlying system. This is an ontology problem.

The solution is a canonical financial ontology and semantic layer, with certified definitions, lineage tracing, and policy governance, that all agents query consistently. Without it, agents produce answers that cannot be validated against any authoritative source, and no model risk committee will approve them for production.
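To make the idea concrete, here is a minimal sketch of a canonical lookup in which a term resolves only when the caller states the context it means, and only when the definition is certified. The class names, context labels, source systems, and definition wording are all hypothetical placeholders, not taken from any real ontology or product API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertifiedDefinition:
    term: str            # business term, e.g. "exposure"
    context: str         # governing context, e.g. "credit_risk"
    definition: str      # certified wording (illustrative here)
    source_system: str   # system of record, kept for lineage
    certified: bool      # True once a data owner has signed off


class FinancialOntology:
    """In-memory stand-in for a governed semantic layer (hypothetical)."""

    def __init__(self) -> None:
        self._definitions: dict[tuple[str, str], CertifiedDefinition] = {}

    def register(self, d: CertifiedDefinition) -> None:
        self._definitions[(d.term.lower(), d.context)] = d

    def resolve(self, term: str, context: str) -> CertifiedDefinition:
        """Return the certified definition or fail loudly; never guess."""
        d = self._definitions.get((term.lower(), context))
        if d is None:
            raise LookupError(f"no definition for {term!r} in context {context!r}")
        if not d.certified:
            raise PermissionError(f"{term!r} in {context!r} is not certified")
        return d


ontology = FinancialOntology()
# Placeholder wording: real definitions would come from the institution's data owners.
ontology.register(CertifiedDefinition(
    "exposure", "credit_risk", "placeholder: credit risk wording", "loan_book", True))
ontology.register(CertifiedDefinition(
    "exposure", "market_risk", "placeholder: market risk wording", "trading_platform", True))

credit = ontology.resolve("Exposure", "credit_risk")
market = ontology.resolve("exposure", "market_risk")
```

The design choice worth noting is that an unknown or uncertified term raises an error instead of falling back to whichever system answers first, so a retail banking agent and a capital markets agent cannot silently receive different meanings for the same word.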

Model risk management and AI agents: Where SR 26-2 leaves a gap institutions must close

In April 2026, the Federal Reserve, OCC, and FDIC jointly issued SR 26-2, the Revised Guidance on Model Risk Management, superseding the long-standing SR 11-7 from 2011. SR 26-2 updates the MRM framework to reflect fifteen years of supervisory experience and industry feedback, including the rapid evolution of AI.

There is one detail institutions deploying AI agents need to understand before assuming they are covered: SR 26-2 explicitly excludes generative AI and agentic AI from its formal scope.

The guidance states directly: “Generative AI and agentic AI models are novel and rapidly evolving. As such, they are not within the scope of this guidance.”

However, the guidance does not leave institutions without direction. It adds: “Nonetheless, a banking organization’s risk management and governance practices should guide the determination of appropriate governance and controls for any tools, processes, or systems not covered in this document.”

Why the MRM principles still apply to AI agents in practice

The regulatory gap is real, documented, and unlikely to be permanent. In practice, model risk committees are already applying MRM principles to AI agents. The four categories of expectation that govern in-scope models remain the de facto standard:

Model inventory: A comprehensive register of any system with material influence on financial decisions, including agents, even where SR 26-2 requirements do not yet formally apply.
Validation: Benchmarking against documented methodologies before deployment and periodically thereafter.
Ongoing performance monitoring: Continuous assessment of agent outputs for degradation, data drift, and distributional changes in inputs.
Explainability documentation: Traceable reasoning for second-line functions and regulators, also independently required under MiFID II and the EU AI Act.

Institutions building MRM-grade governance for their AI agents now will have documented inventory, validation records, decision traces, and monitoring reports in place when formal requirements arrive.
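As one illustration of the ongoing monitoring expectation, the sketch below computes a population stability index (PSI) over pre-binned input proportions. PSI is a commonly used drift statistic that the article does not itself name, and the 0.2 alert threshold and the sample distributions are assumptions chosen for the example; an institution would set its own.

```python
import math


def population_stability_index(expected: list[float], actual: list[float],
                               eps: float = 1e-6) -> float:
    """PSI over pre-binned proportions: sum((a - e) * ln(a / e))."""
    if len(expected) != len(actual):
        raise ValueError("both distributions need the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # floor empty bins so the logarithm is defined
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


baseline = [0.25, 0.25, 0.25, 0.25]   # input mix at validation time (made up)
current = [0.10, 0.20, 0.30, 0.40]    # input mix seen in production (made up)

psi = population_stability_index(baseline, current)
needs_review = psi > 0.2  # assumed threshold, to be set by the institution's policy
```

With these sample numbers the index comes out at roughly 0.23, which crosses the assumed threshold and would flag the agent's inputs for review.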

Decision traces and context portability: The two governance requirements most teams miss

The ontology problem and the MRM gap both resolve into the same operational question: when a regulator asks what an agent did and why, can you answer? Decision traces are how you answer.

Decision traces and compliance

In a post-trade audit, a compliance officer needs to reconstruct exactly what data an agent used to flag a transaction, which rules it applied, and what threshold triggered the alert. Without a complete, queryable record of agent reasoning linked to governed data assets, that reconstruction is impossible.

Decision traces are the primary evidence required by:

Second-line risk and compliance functions conducting model validation reviews
Internal audit teams assessing AI system controls
Regulators requesting explanation of automated decisions in supervisory examinations
Courts or arbitration panels in post-dispute proceedings

A complete decision trace links input data, the canonical definitions applied, the policies in effect, the reasoning chain, and the output.
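The five linked elements can be sketched as a single record. This is only an illustration of the shape of the data: the field names, the agent, the transaction, and the policy identifier are hypothetical, and a production trace would carry far more detail.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionTrace:
    agent_id: str
    decided_at: str      # ISO 8601 timestamp in UTC
    inputs: dict         # data the agent read
    definitions: dict    # term -> version of the canonical definition applied
    policies: list       # identifiers of the policies in effect
    reasoning: list      # ordered reasoning steps
    output: dict         # what the agent decided or did

    def to_json(self) -> str:
        """Serialize to plain JSON so the record can be read outside the agent platform."""
        return json.dumps(asdict(self), sort_keys=True)


# All values below are invented for the example.
trace = DecisionTrace(
    agent_id="aml-screening-agent",
    decided_at=datetime(2026, 5, 29, 14, 30, tzinfo=timezone.utc).isoformat(),
    inputs={"transaction_id": "TX-1001", "amount": 12500.0},
    definitions={"exposure": "v3"},
    policies=["AML-THRESHOLD-07"],
    reasoning=["amount exceeds review threshold", "escalate to analyst"],
    output={"action": "escalate"},
)

record = trace.to_json()
restored = json.loads(record)
```

Writing the trace as plain JSON rather than as an object tied to one framework is a deliberate choice here, since risk, audit, and compliance teams need to query it without access to the agent itself.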

Context portability and regulatory risk

A canonical ontology and complete decision traces are only useful if the context behind them is portable.

When organizational context—including definitions, policies, ontology, and decision history—is locked into a single LLM vendor or agent platform, the institution faces three distinct risks:

Vendor dependency risk: If the provider changes its API or exits the market, the institution loses access to the context that governed its decisions.
Regulatory reconstruction risk: Regulators may require demonstration of what context was available to an agent at a specific point in time, potentially years after the fact. A proprietary context store cannot reliably support this.
Auditability risk: Internal and external auditors need to evaluate the context that governed agent decisions independently of the agent platform. Proprietary stores make that independent evaluation difficult.

The right architecture stores context in open formats with time-travel capability, so any authorized party can retrieve the exact context that governed an agent decision at any historical moment.
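A minimal sketch of the time-travel idea, assuming context is published as dated snapshots in an append-only history. The class name, the snapshot key, and the threshold values are hypothetical; a real store would persist snapshots in an open file format instead of holding them in memory.

```python
import bisect
import copy
from datetime import datetime, timezone


class VersionedContextStore:
    """Append-only history of context snapshots with point-in-time lookup."""

    def __init__(self) -> None:
        self._effective_from: list[datetime] = []
        self._snapshots: list[dict] = []

    def publish(self, effective_from: datetime, snapshot: dict) -> None:
        if self._effective_from and effective_from <= self._effective_from[-1]:
            raise ValueError("versions must be published in chronological order")
        self._effective_from.append(effective_from)
        # Deep copy so later edits by the caller cannot rewrite history.
        self._snapshots.append(copy.deepcopy(snapshot))

    def as_of(self, moment: datetime) -> dict:
        """Return the snapshot that was in effect at `moment`."""
        idx = bisect.bisect_right(self._effective_from, moment) - 1
        if idx < 0:
            raise LookupError("no context had been published at that time")
        return copy.deepcopy(self._snapshots[idx])


store = VersionedContextStore()
# Invented values: a policy threshold that changes at the start of 2026.
store.publish(datetime(2025, 1, 1, tzinfo=timezone.utc), {"alert_threshold": 10000})
store.publish(datetime(2026, 1, 1, tzinfo=timezone.utc), {"alert_threshold": 8000})

# What context governed a decision made in mid-2025?
historical = store.as_of(datetime(2025, 6, 30, tzinfo=timezone.utc))
```

Because old snapshots are never overwritten, a question asked years later resolves to the version that was in force on the decision date, not to the current one.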

The autonomy ladder: How governance enables agents to do more in financial services

The return on governance infrastructure in financial services is measured directly in agent autonomy. Every layer an institution puts in place—from decision traces to context portability—expands the scope of what the model risk committee will approve.

A BCG study found that agentic AI in financial analysis reduced risk events by 60% in pilot environments. The key qualifier is “pilot environments.” Moving that performance to production requires earning each stage.

Stage 1: Assisted analysis. Every agent output reviewed by a human before action. Requires canonical definitions, freshness signals, and basic decision logging.
Stage 2: Supervised automation. Routine tasks with periodic rather than output-by-output review. Adds model inventory registration and output monitoring.
Stage 3: Conditional autonomy. The agent acts within defined parameters and escalates outside them. Requires decision traces and second-line sign-off before deployment.
Stage 4: Expanded autonomy. Broader case coverage with ongoing monitoring and a controlled process for updating decision logic when policies change.

Each stage requires a corresponding governance layer. Skip one, and the model risk committee will block deployment.
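The ladder can be read as a simple gating rule: a stage is reachable only if its own requirements and those of every earlier stage are in place. The sketch below encodes the four stages using control names paraphrased from the list above; the identifiers are invented labels, not a standard taxonomy.

```python
# Control labels are hypothetical shorthand for the requirements listed per stage.
STAGE_REQUIREMENTS = [
    (1, "Assisted analysis", {"canonical_definitions", "freshness_signals", "decision_logging"}),
    (2, "Supervised automation", {"model_inventory", "output_monitoring"}),
    (3, "Conditional autonomy", {"decision_traces", "second_line_signoff"}),
    (4, "Expanded autonomy", {"ongoing_monitoring", "controlled_change_process"}),
]


def highest_approved_stage(controls: set[str]) -> int:
    """Return the highest stage whose requirements, and all earlier ones, are met."""
    approved = 0
    for stage, _name, required in STAGE_REQUIREMENTS:
        if not required <= controls:
            break  # a skipped layer blocks every later stage
        approved = stage
    return approved


in_place = {
    "canonical_definitions", "freshness_signals", "decision_logging",
    "decision_traces", "second_line_signoff",   # stage 3 controls, but stage 2 is missing
}
stage = highest_approved_stage(in_place)
```

In this example the institution has built stage 3 controls without the stage 2 inventory and monitoring, so the function stops at stage 1, mirroring the point that a skipped layer blocks deployment.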

What a governed financial AI architecture looks like: 5 foundational layers

A production-grade architecture for AI agents in financial services has five layers. Each resolves one or more of the governance requirements described above.

Layer 1: Financial ontology and semantic layer

Every key term must have a canonical, certified definition with lineage tracing from source system to agent-facing view before any agent queries financial data. This layer is the prerequisite for consistent, auditable, and MRM-compliant outputs.

Layer 2: AI asset registry and model inventory

Every AI agent must be registered in a governed inventory that links each agent to the data it consumes, the policies it operates under, the definitions it applies, and its validation and monitoring status. This mirrors the model inventory expectations of SR 26-2 and SS1/23, and gives model risk and audit functions a single point of truth for every AI system in production.
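A rough sketch of what one inventory entry might hold, and how a registry could report why an agent is not ready for production. The field names, the one-year revalidation window, and the sample agent are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AgentRecord:
    agent_id: str
    data_products: list[str]               # governed data the agent consumes
    policies: list[str]                    # policies it operates under
    definitions: list[str]                 # canonical terms it applies
    last_validated: Optional[date] = None  # None means never validated
    monitored: bool = False


def production_blockers(record: AgentRecord, today: date,
                        max_validation_age_days: int = 365) -> list[str]:
    """List the reasons this agent should not run in production yet."""
    blockers = []
    if record.last_validated is None:
        blockers.append("never validated")
    elif (today - record.last_validated).days > max_validation_age_days:
        blockers.append("validation out of date")
    if not record.monitored:
        blockers.append("no monitoring in place")
    if not record.data_products:
        blockers.append("no data lineage recorded")
    return blockers


# Invented example entry.
agent = AgentRecord(
    agent_id="kyc-agent",
    data_products=["kyc_documents"],
    policies=["KYC-POLICY-1"],
    definitions=["customer"],
    last_validated=date(2025, 1, 1),
    monitored=True,
)
blockers = production_blockers(agent, today=date(2026, 5, 29))
```

Here the agent is monitored and has lineage recorded, but its last validation is more than a year old, so the only blocker reported is the stale validation.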

Layer 3: Policy enforcement at the context delivery layer

Access controls, regulatory constraints, and firm-specific policies must be enforced at the layer that delivers context to agents, not reimplemented inside each individual agent. A centralized MCP-compatible context endpoint evaluates role, use case, and regulatory constraint before any context is delivered.
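The check described here can be sketched as one authorization function that runs before any context is returned. This is not the MCP protocol or any vendor's API: the roles, assets, sensitivity tiers, and use cases are hypothetical, and the sketch assumes a deny-by-default stance.

```python
from dataclasses import dataclass

# Hypothetical sensitivity tiers, lowest to highest.
SENSITIVITY_RANK = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}


@dataclass(frozen=True)
class ContextRequest:
    agent_role: str
    use_case: str
    asset: str


@dataclass(frozen=True)
class AssetPolicy:
    sensitivity: str
    allowed_use_cases: frozenset


# Invented roles and assets for the example.
ROLE_CLEARANCE = {"kyc_agent": "confidential", "support_agent": "internal"}
ASSET_POLICIES = {
    "customer_pii": AssetPolicy("confidential", frozenset({"kyc_onboarding"})),
    "product_catalog": AssetPolicy("public", frozenset({"kyc_onboarding", "customer_service"})),
}


def authorize(request: ContextRequest) -> tuple[bool, str]:
    """Decide before delivery; deny by default when anything is unknown."""
    policy = ASSET_POLICIES.get(request.asset)
    clearance = ROLE_CLEARANCE.get(request.agent_role)
    if policy is None or clearance is None:
        return False, "unknown asset or role"
    if SENSITIVITY_RANK[clearance] < SENSITIVITY_RANK[policy.sensitivity]:
        return False, "role clearance below asset sensitivity"
    if request.use_case not in policy.allowed_use_cases:
        return False, "use case not permitted for this asset"
    return True, "allowed"


allowed = authorize(ContextRequest("kyc_agent", "kyc_onboarding", "customer_pii"))
denied = authorize(ContextRequest("support_agent", "customer_service", "customer_pii"))
```

Keeping this logic in one place means a policy change is made once at the delivery layer, not patched into every agent that might request the asset.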

Layer 4: Decision traces and audit infrastructure

Every agent action must be linked to the data products, definitions, policies, and reasoning steps that produced it. These traces must support point-in-time reconstruction for regulatory examinations that may occur years after the fact.

Layer 5: Context repos and versioning

Governed, versioned context bundles per line of business allow agents to consume certified context that is reused across use cases, versioned for audit, and updated through a controlled change process. When a policy changes, the version history preserves what was in effect at every prior point in time.

How Atlan supports financial services AI agents in production

Atlan operates as the governed context layer and AI control plane for financial services AI, connecting data systems to AI agents through a single, policy-enforced infrastructure.

Financial ontology and semantic layer: Canonical, certified definitions for financial entities, with lineage from source systems to agent-facing views. Every agent queries the same certified definition, regardless of the connecting system.
Context Engineering Studio: The workspace where financial institutions can build, test, and validate the context layer before agents reach production.
Context Repos: Packaged, versioned context bundles per line of business, maintained and distributed independently, with a full audit history for every version.
Context Agents: AI agents that automatically generate and enrich the financial context layer as data assets and regulatory definitions change.
AI asset registry: Atlan’s AI Governance module maintains a governed inventory of every model, agent, prompt, and workflow, linked to the data it consumes and the benchmarks it has been validated against.
Decision traces: A full audit trail of agent reasoning, queryable by risk, audit, and compliance teams without requiring access to the agent infrastructure itself.
MCP server and policy enforcement: Atlan’s MCP server is the governed context endpoint for financial AI agents. Before any context reaches an agent, it checks what the asset means in the financial ontology, whether it meets the freshness threshold, and which policies apply.

Real stories from financial services customers building governed AI context

CME Group: context at speed across 18 million assets

"With Atlan, we cataloged over 18 million data assets and 1,300+ glossary terms in our first year, so teams can trust and reuse context across the exchange."

— Kiran Panja, Managing Director, CME Group

Mastercard: context by design for AI at scale

"AI initiatives require more context than ever. Atlan's metadata lakehouse is configurable, intuitive, and able to scale to hundreds of millions of assets. As we're doing this, we're making life easier for data scientists and speeding up innovation."

— Andrew Reiskind, Chief Data Officer, Mastercard

Moving forward with AI agents for financial services

The path to production-grade AI agents in financial services is an architectural one. Build the governance infrastructure first: the financial ontology, the model inventory, the decision traces, and portable context. Then build the agents.

Identify workflows where compliance requirements are clearest and ROI is most measurable: regulatory reporting, post-trade reconciliation, AML screening. Use those deployments to establish the governance baseline—a registered model inventory, validated benchmarks, and a complete decision trace record.

Next, use that governance baseline to earn expanded autonomy for AI agents in more consequential financial services use cases: credit decisioning, risk stratification, and customer advisory.

FAQs about AI agents for financial services

What is an AI agent in financial services?

An AI agent in financial services is an autonomous or semi-autonomous software system that perceives data, reasons over it, and takes action across multi-step workflows. Unlike chatbots, financial AI agents execute complex workflows spanning multiple systems, adapt their behavior based on intermediate results, and interact with external tools to complete tasks end-to-end.

What is the financial ontology problem for AI agents?

The financial ontology problem refers to the fact that major financial institutions have multiple competing definitions for the same business terms, encoded in different systems across different lines of business. Terms like “exposure,” “NPL,” “customer,” and “limit” may resolve differently depending on which system an agent queries. An agent that retrieves whichever definition responds first will produce inconsistent outputs, making regulatory reports unreliable and audit reconstruction impossible.

What are decision traces and why do financial AI agents need them?

A decision trace is a complete, queryable record of the reasoning an AI agent applied to produce a specific output, including the data it queried, the definitions it applied, the policies in effect, and the sequence of steps from input to output. Financial AI agents need decision traces because second-line risk functions, internal audit teams, and regulators must reconstruct agent reasoning for supervisory review, model validation, and post-incident investigation.

How does the EU AI Act affect AI agents in financial services?

The EU AI Act classifies several financial services AI applications as high-risk, including AI used for creditworthiness assessment, credit scoring, and insurance underwriting. High-risk AI systems under the Act require conformity assessment, technical documentation, data governance practices, human oversight measures, and registration in the EU AI database before deployment.

What is context portability and why does it matter for financial AI governance?

Context portability refers to the ability of an organization to maintain its AI decision context—including definitions, policies, ontology, and decision history—in open, standard formats consumable by any agent framework, rather than locked into a proprietary vendor platform. For financial institutions, context portability is a governance requirement. Regulators may require reconstruction of historical agent context years after the fact, and vendor-locked context stores make that reconstruction operationally difficult.

Sources

[1] Agentic AI Will Transform Finance: Here's What CFOs Should Do Now. Gartner, 2026.
[2] Agentic AI for Financial Services Market. Market.us, 2026.
[3] Agentic AI and the future of work in financial services. Accenture Banking, Accenture, 2026.
[4] Anthropic Releases New AI Agents for Financial Services Firms. The Wall Street Journal, 2026.
[5] FRB Supervisory Letter SR 26-2 on Model Risk Management. Federal Reserve, 2026-04-17.
[6] SS1/23: Model risk management principles for banks. PRA, Bank of England, 2023.
[7] ESMA issues a supervisory briefing on algorithmic trading. ESMA, 2026.

Atlan is the Context Layer for AI — a Leader in the Gartner Magic Quadrant for D&A Governance (2026) and the Forrester Wave for Data Governance (Q3 2025). Atlan unifies your data, business knowledge, and the meaning behind your terms into one Enterprise Data Graph that gives every team and every AI agent the trusted context they need. Trusted by Mastercard, Workday, General Motors, CME Group, HubSpot, FOX, Virgin Media O2, Elastic, and 400+ enterprises representing $10T+ in market cap.
