LiteLLM, Portkey, and AWS Bedrock Gateway all solve the same traffic problem: routing, rate limiting, and API key management for LLM deployments. Each serves a very different operational profile. LiteLLM is an MIT-licensed open-source proxy supporting 100+ providers; Portkey is a managed SaaS platform with built-in semantic caching and observability; Bedrock Gateway is an AWS-native solution that keeps all traffic inside the AWS network. In March 2026, a supply-chain attack on LiteLLM versions 1.82.7 and 1.82.8 added a security dimension to the evaluation. The right choice depends on your deployment model, DevOps maturity, and whether you’re AWS-standardized.
| Dimension | LiteLLM | Portkey | AWS Bedrock Gateway |
|---|---|---|---|
| License | MIT (open-source) | MIT (gateway); proprietary (SaaS features) | Proprietary (AWS) |
| Deployment | Self-hosted or LiteLLM-managed cloud | Self-hosted, managed SaaS, or VPC (Enterprise) | AWS-native (API Gateway + Lambda) |
| Provider support | 100+ providers | 250+ LLMs, 40+ providers | AWS Bedrock catalog only |
| Semantic caching | Not in open-source tier | Yes (30-50% cost reduction) | No |
| Pricing | Free (self-host); $250/mo-$30K/yr Enterprise | Free tier; $49/mo Production; Enterprise custom | No gateway fee; pay-per-Bedrock-call |
| Best for | Maximum flexibility, self-hosted control | Fastest path to production observability | AWS-native, compliance-first workloads |
| Open source | Yes (full proxy) | Gateway only (since March 2026) | No |
LiteLLM vs Portkey vs Bedrock Gateway: What’s the Difference?
Permalink to “LiteLLM vs Portkey vs Bedrock Gateway: What’s the Difference?”All three belong to a product category called an LLM gateway: a proxy layer that sits between your applications and LLM providers, handling routing, rate limiting, API key management, observability, and caching. The category exists because calling LLM APIs directly at scale creates operational problems: key sprawl, vendor lock-in risk, no unified cost visibility, and no fallback when a provider has an outage.
The three tools diverge at deployment philosophy. LiteLLM is an open-source proxy built for teams that want maximum provider optionality and prefer to own their infrastructure. Portkey is a managed platform built for teams that want production-ready observability and caching without the DevOps overhead. AWS Bedrock Gateway is AWS’s answer for teams whose compliance and security posture requires all AI traffic to stay within the AWS network.
LiteLLM launched in 2023 as the early open-source unifier; Portkey emerged as the managed SaaS layer atop the same routing abstraction; Bedrock Gateway is AWS’s 2024-2025 answer for AWS-native enterprise teams. Two significant events reshaped the landscape in early 2026: AWS added OpenAI models (including Codex) to the Bedrock catalog in April, narrowing Bedrock Gateway’s provider gap; and Portkey open-sourced its gateway layer under MIT in March. Most existing comparisons substitute OpenRouter for Bedrock Gateway. This page covers the three tools most relevant to enterprise engineering teams choosing a production gateway in 2026.
If you’re evaluating these tools as part of a broader LLMOps strategy, the gateway decision is one layer of a larger operational picture.
What Is LiteLLM?
Permalink to “What Is LiteLLM?”LiteLLM is an open-source Python proxy (MIT license, ~40K GitHub stars) that provides a unified OpenAI-compatible API across 100+ LLM providers, including Anthropic, Azure OpenAI, Vertex AI, AWS Bedrock, Cohere, Hugging Face, SageMaker, vLLM, NVIDIA NIM, and Ollama. It handles load balancing, fallback chains, spend tracking, and per-team rate limiting. The practical appeal: switch providers by changing a config value, not rewriting application code.
Deployment and pricing. The open-source proxy is free; you pay only for the infrastructure you run it on. Enterprise tiers add the features regulated industries require: SSO (Okta, Azure AD, Google Workspace, OIDC/SAML), RBAC at org/team/user levels, and audit logs written to your own S3/GCS/Azure Blob in Parquet format with HIPAA-compliant six-year and financial seven-year retention. Enterprise Basic runs $250/month; Enterprise Premium is $30,000/year. The core team has a Rust performance optimization underway to address latency concerns raised in developer communities.
Routing and reliability. LiteLLM’s operational value is in its routing primitives: load balancing (least-busy, weighted round-robin), fallback chains, context-window-aware model escalation (auto-escalate to a larger model when a prompt exceeds context limits), retry with exponential backoff, and rate-limit-aware routing. Observability integrations include Langfuse, LangSmith, Helicone, Prometheus, and OpenTelemetry. For multi-provider routing at scale, LiteLLM is the most comprehensive open-source option in the category.
Security consideration: March 2026 supply-chain incident. In March 2026, a malicious .pth file was injected into LiteLLM versions 1.82.7 and 1.82.8 on PyPI. The payload included a credential harvester, a Kubernetes lateral movement toolkit, and a persistent systemd backdoor. PyPI removed the affected packages within approximately 40 minutes, and the community response was swift. Teams running pinned versions or air-gapped installs were unaffected. The incident surfaced supply-chain risk as a real operational consideration for enterprise deployments. Teams using LiteLLM in production should maintain version-pinning and a scheduled upgrade discipline as standard practice.
Core capabilities
Permalink to “Core capabilities”- 100+ provider integrations via unified OpenAI-compatible API
- Load balancing: least-busy, weighted round-robin, cost-optimized routing
- Fallback chains and context-window-aware model escalation
- Per-team spend tracking and rate limiting; token usage dashboards
- Redis-based exact-match caching (semantic caching requires additional setup)
- Enterprise: SSO, RBAC, Parquet audit logs to own cloud storage
What Is Portkey?
Permalink to “What Is Portkey?”Portkey is a managed LLM gateway platform supporting 250+ LLMs across 40+ providers, with a built-in observability dashboard, semantic caching, and guardrails covering PII redaction, jailbreak detection, and toxicity filtering. Its gateway was open-sourced under MIT in March 2026, expanding its appeal among self-hosted-first developer teams. Portkey raised a $15M Series A in February 2026.
Deployment and pricing. Portkey offers a free tier (10,000 logs per month, self-hosted gateway), a Production tier at $49/month (100,000 logs per month, plus $9 per additional 100K), and Enterprise on custom pricing with 10M+ logs per month, VPC hosting, dedicated support, and custom BAAs. SOC 2 Type 2, GDPR, and HIPAA certifications are available at the Enterprise tier. The managed SaaS path means no gateway infrastructure to maintain, which is a meaningful advantage for teams without dedicated DevOps capacity for proxy operations.
Observability and semantic caching. Portkey’s native dashboard delivers full request logs, latency metrics, cost breakdowns, and error tracking per model, provider, team, and application. Prompt versioning and a prompt playground are built in. The standout differentiator is semantic caching: vector-similarity matching that detects similar (but not identical) prompts and returns cached responses rather than calling the model. Independent benchmarks and operator reports cite 30-50% cost reduction on repetitive query workloads, notably RAG pipelines and support agents where many prompts are paraphrased reformulations of the same underlying question. This makes Portkey a strong option for teams looking to reduce LLM cost at the gateway layer without application-level changes.
Guardrails and enterprise trust signals. Built-in guardrail hooks (PII redaction, jailbreak detection, toxicity filtering) are available from the Production tier, requiring no separate vendor or custom middleware. Granular budget and rate limits apply per model, provider, team, or application. The MIT open-sourcing of the gateway layer addressed a recurring enterprise procurement concern: teams can now self-host the gateway core as an escape hatch if commercial terms change.
Core capabilities
Permalink to “Core capabilities”- 250+ LLMs across 40+ providers including OpenAI, Anthropic, Mistral, Gemini, Cohere, and Bedrock
- Semantic caching (vector-similarity): reported 30-50% cost reduction on repetitive workloads
- Native observability dashboard: request logs, cost tracking, latency, error rates
- Guardrails: PII redaction, jailbreak detection, toxicity filtering (Production tier and above)
- Code-free routing config: fallback chains and load balancing manageable via dashboard
- SOC 2 Type 2, GDPR, HIPAA, custom BAAs on Enterprise tier
What Is AWS Bedrock Gateway?
Permalink to “What Is AWS Bedrock Gateway?”AWS Bedrock Gateway is an AWS-native LLM gateway built with Amazon API Gateway, AWS Lambda, and CloudFormation. It routes requests to the Amazon Bedrock model catalog, keeps all traffic within the AWS network boundary, and inherits AWS compliance certifications (SOC 2, HIPAA, GDPR, FedRAMP) without requiring a new vendor relationship. It is not a standalone SaaS product; it is a reference architecture pattern (aws-samples/sample-ai-gateway-for-amazon-bedrock) combined with AWS managed services.
Architecture and deployment. API Gateway handles request routing and authentication; Lambda processes and forwards requests to Bedrock; CloudFormation manages deployment. It can run as a VPC-private endpoint (zero traffic leaves the AWS network) or a regional public endpoint. There is no gateway platform fee; you pay Bedrock’s on-demand model pricing plus Lambda and API Gateway compute costs, which are minimal at low-to-mid request volumes. The architecture is positioned for organizations where all AI workloads must remain inside AWS: financial services, healthcare, defense, and regulated industries generally.
Provider catalog expansion (April 2026 update). Bedrock Gateway’s historical limitation was provider lock-in. In April 2026, AWS added OpenAI models, including Codex and managed agents, to the Bedrock catalog. The catalog now includes Anthropic Claude, Meta Llama, Mistral, Amazon Titan, Stability AI, Cohere, AI21 Labs, and OpenAI models. This narrows the provider-flexibility gap significantly for AWS-standardized teams. Developer discussions note the gateway “only implements Chat Completions,” making it less feature-complete than full-featured third-party gateways in routing and observability depth.
Observability and compliance. CloudWatch delivers usage, invocation, performance, and error-rate metrics. CloudTrail provides admin-level API event auditing. There is no dedicated LLM observability dashboard; prompt-level and token-level visibility requires additional instrumentation (typically Langfuse or OpenTelemetry). Authentication is IAM-native (SigV4 signing) with Lambda Authorizer support for JWT validation and AWS WAF integration for request filtering. The compliance posture is the key advantage: AWS Enterprise Support, full AWS SLAs, and no new vendor audit scope.
Core capabilities
Permalink to “Core capabilities”- AWS-native routing to full Bedrock model catalog (including OpenAI models, added April 2026)
- IAM-native auth (SigV4) plus Lambda Authorizer for JWT; AWS WAF integration
- CloudWatch metrics and CloudTrail admin audit; inherits AWS compliance posture
- No gateway platform fee: Lambda and API Gateway compute costs only
- VPC-private deployment option: zero traffic leaves the AWS network
- Inherits SOC 2, HIPAA, GDPR, and FedRAMP certifications with no new vendor relationship
Head-to-Head Comparison
Permalink to “Head-to-Head Comparison”No gateway leads across every dimension. LiteLLM leads on provider breadth and self-hosted flexibility. Portkey leads on managed observability and semantic caching. Bedrock Gateway leads on AWS integration depth and compliance inheritance. The table below maps 10 operational dimensions.
| Dimension | LiteLLM | Portkey | AWS Bedrock Gateway |
|---|---|---|---|
| Provider support | 100+ providers (OpenAI, Anthropic, Azure, Vertex AI, Bedrock, Cohere, Hugging Face, vLLM, NVIDIA NIM, Ollama) | 250+ LLMs, 40+ providers (incl. Bedrock, Mistral, Gemini) | Bedrock catalog only (incl. OpenAI models from April 2026) |
| Deployment model | Open-source self-hosted (MIT); Enterprise: self-hosted or managed cloud | Self-hosted (MIT gateway); Managed SaaS; VPC (Enterprise) | AWS API Gateway + Lambda + CloudFormation; VPC-private option |
| Routing | Load balancing (least-busy, weighted round-robin), fallback chains, context-window escalation, retry with backoff | Fallback and load balancing via dashboard; conditional routing by cost, latency, or model capability | Basic request forwarding via Lambda; API Gateway load balancing; no native multi-provider fallback |
| Observability | Langfuse, LangSmith, Helicone, Prometheus, OpenTelemetry; per-team spend tracking; Parquet audit logs (Enterprise) | Native dashboard: request logs, cost, latency, error rates; Datadog + Langfuse; 30-day retention (Production) | CloudWatch metrics + CloudTrail admin audit; no LLM-native dashboard; requires additional instrumentation |
| Caching | Redis exact-match; semantic caching requires additional setup | Purpose-built semantic caching (vector-similarity); reported 30-50% cost reduction | API Gateway TTL-based cache; no semantic caching |
| Security / compliance | Enterprise: SSO (Okta, Azure AD, OIDC/SAML), JWT, RBAC, Parquet audit logs; HIPAA/financial retention options | RBAC, service account keys; SOC 2 T2, GDPR, HIPAA, custom BAAs (Enterprise); SSO on Enterprise | IAM/SigV4, Lambda Authorizer, AWS WAF, CloudTrail; inherits AWS SOC 2, HIPAA, GDPR, FedRAMP; no LLM-specific audit layer |
| Pricing | Free (self-host); $250/mo Basic; $30K/yr Premium | Free (10K logs/mo); $49/mo Production; Enterprise custom | No gateway fee; Lambda + API Gateway compute only |
| Enterprise readiness | Strong for DevOps-heavy teams; ~40K GitHub stars; supply-chain incident (March 2026) requires patching discipline | SLAs + dedicated onboarding (Enterprise); $15M Series A (Feb 2026); ~2K GitHub stars (gateway repo) | Full AWS Enterprise Support; no LLMOps-native features; best for AWS-standardized orgs |
| Open-source status | Yes (MIT license; full proxy code) | Gateway only (MIT, March 2026); managed platform features proprietary | No (proprietary); reference implementation open |
| Governance coverage | Infrastructure logs only (S3/Parquet): no semantic understanding of payload content | Traffic-layer audit logs: records that a call was made, not whether context was valid | CloudTrail API-event records: no prompt-level or context governance |
A practical illustration. Consider a fintech platform team running three concurrent AI products: a support agent, a contract review tool, and an internal SQL assistant. Each touches different data classifications. LiteLLM gives them one proxy across all three, with team-level spend tracking and RBAC, but requires an internal team to manage upgrades and patching discipline (especially after the March 2026 supply-chain incident). Portkey gives them a production dashboard and semantic caching for the support agent’s repetitive queries, with guardrail hooks for PII redaction, and zero gateway infrastructure to maintain. If all three workloads are already in AWS, Bedrock Gateway is the path of least resistance: no new vendor, no new audit scope, same IAM policies they already govern. The right answer depends on which constraint is binding for your team.
Map Your LLMOps Context Stack
Enterprise AI teams that govern the context layer (not just the gateway) reduce LLM spend and compliance gaps at the same time. This guide walks through the four-layer architecture from data lineage to model routing.
Get the Stack GuideHow to Choose Between LiteLLM, Portkey, and Bedrock Gateway
Permalink to “How to Choose Between LiteLLM, Portkey, and Bedrock Gateway”The decision reduces to three constraints: how much AWS standardization you’ve committed to, how much DevOps capacity you have for gateway infrastructure, and what your observability requirements look like from day one. None of the three options are mutually exclusive; combining LiteLLM and Portkey is a documented production pattern.
Choose LiteLLM if: your team has dedicated DevOps capacity; you need to route across five or more providers; data cannot leave your network perimeter; you’re cost-sensitive at high volume (five million or more requests per month) and cannot absorb a per-call SaaS markup; or you need audit logs written to your own S3 or GCS bucket. The March 2026 supply-chain incident is a real operational signal: teams choosing LiteLLM should pin versions, test upgrades on a schedule, and treat gateway patching as a recurring operational task, not a set-and-forget configuration.
Choose Portkey if: your team is mid-size (20-100 engineers) without dedicated gateway DevOps; you need a production-ready dashboard on day one; semantic caching is worth a 30-50% cost reduction on your support agent or RAG workloads; you need built-in guardrails (PII redaction, jailbreak detection) without building them in-house; or SOC 2 Type 2 and HIPAA compliance need to come pre-certified on a timeline that doesn’t allow for a lengthy LiteLLM Enterprise procurement process.
Choose Bedrock Gateway if: your organization is AWS-standardized; all data must stay within the AWS network boundary because of regulated-industry requirements (financial services, healthcare, defense); you already use CloudTrail, CloudWatch, and IAM as your operational baseline; you don’t need multi-cloud model routing today (or the April 2026 OpenAI-on-Bedrock addition covers your model requirements); or there is no organizational appetite for a new vendor relationship or expanded audit scope.
When to combine. LiteLLM plus Portkey is a documented pattern: run LiteLLM as the routing proxy and send traces and logs to Portkey for the observability layer. This gives LiteLLM’s 100+ provider breadth alongside Portkey’s dashboard and semantic caching. Bedrock Gateway plus LiteLLM is less common but viable for teams that route some workloads within AWS and others externally, using LiteLLM as the external-provider routing layer.
For teams managing multiple providers as part of a broader operational strategy, managing multiple LLM providers at scale covers the architectural patterns in more depth.
What LLM Gateways Don’t Govern: The Payload Problem
Permalink to “What LLM Gateways Don’t Govern: The Payload Problem”LiteLLM, Portkey, and AWS Bedrock Gateway all solve the traffic layer: routing requests, enforcing rate limits, logging API calls, managing keys, and caching similar prompts. None of them govern what goes through the pipe. The payload itself is opaque to all three. They can tell you that a request was made, at what cost, from which application, at what latency. They cannot tell you whether the context in that request was valid.
Five questions no LLM gateway can answer:
- Data lineage: Where did the data in this prompt come from? Is the source table still current, or was it last refreshed three days ago?
- Business definition consistency: Does “active customer” in this prompt mean the same thing it means in your data warehouse and your analytics layer?
- PII completeness: Were all PII fields actually masked before this context window was assembled, not just checked at the gateway boundary?
- Access policy enforcement at context level: Did the agent have permission to include this customer segment’s data in the prompt? The gateway received the request; it did not decide whether the context should have been built that way.
- Cross-system lineage: Can you trace an LLM output back to the source tables and transformations that produced the input context?
Gateways sit between applications and models. The context layer for enterprise AI sits between data systems and applications, upstream of the gateway. These are different problems at different architectural layers. A perfectly configured LiteLLM deployment can still send stale, ungoverned, or policy-violating context to a model and produce plausible-sounding but incorrect outputs.
This is the gap Atlan addresses. The data graph connects data lineage, business definitions, access policies, and quality signals to the prompt assembly process, before the request reaches any LLM gateway. Teams that govern the context layer reduce both compliance risk and LLM spend more reliably than teams that optimize the gateway layer alone. For the AI context stack architecture, see Atlan’s enterprise guide. The active data governance platform is what sits above these gateways.
See the Context Layer in Action
Atlan's data graph governs what goes into your LLM prompts: lineage, definitions, and access policies, before the request reaches LiteLLM, Portkey, or Bedrock Gateway.
Book a DemoReal Stories from Real Customers: Governing the Layer Above the Gateway
Permalink to “Real Stories from Real Customers: Governing the Layer Above the Gateway”"We're excited to build the future of AI governance with Atlan. All of the work that we did to get to a shared language at Workday can be leveraged by AI via Atlan's MCP server...as part of Atlan's AI Labs, we're co-building the semantic layer that AI needs with new constructs, like context products."
Joe DosSantos, VP of Enterprise Data & Analytics, Workday
"Atlan is much more than a catalog of catalogs. It's more of a context operating system...Atlan enabled us to easily activate metadata for everything from discovery in the marketplace to AI governance to data quality to an MCP server delivering context to AI models."
Sridher Arumugham, Chief Data & Analytics Officer, DigiKey
FAQs About LiteLLM vs Portkey vs AWS Bedrock Gateway
Permalink to “FAQs About LiteLLM vs Portkey vs AWS Bedrock Gateway”1. What is the main difference between LiteLLM and Portkey?
LiteLLM is an open-source self-hosted proxy optimized for provider flexibility and high-volume, DevOps-driven deployments. Portkey is a managed SaaS platform optimized for production observability, semantic caching, and minimal infrastructure overhead. LiteLLM gives you more control; Portkey gives you faster time-to-production. Both have open-source gateways under MIT license. Portkey’s managed platform features (observability dashboard, semantic caching, and guardrails) require a subscription.
2. What is AWS Bedrock Gateway and how does it work?
AWS Bedrock Gateway is a reference architecture built with Amazon API Gateway, AWS Lambda, and CloudFormation that routes LLM requests to the Amazon Bedrock model catalog. It keeps all traffic within the AWS network, uses IAM (SigV4) for authentication, and inherits AWS compliance certifications without requiring a new vendor relationship. It does not offer semantic caching or a dedicated LLM observability dashboard. In April 2026, AWS added OpenAI models to the Bedrock catalog, narrowing the provider-flexibility limitation for AWS-standardized teams.
3. Is LiteLLM free to use in production?
The open-source LiteLLM proxy is free under MIT license; you pay only for the infrastructure you run it on. Enterprise features (SSO, RBAC, Parquet audit logs, SLA support) require an Enterprise license at $250/month (Basic) or $30,000/year (Premium). Note that the March 2026 supply-chain attack affected only PyPI packages at versions 1.82.7 and 1.82.8. Teams using pinned versions or building from source were not affected.
4. Does Portkey support self-hosted deployment?
Yes. Portkey open-sourced its gateway layer under MIT in March 2026. You can self-host the gateway. The managed SaaS features (observability dashboard, semantic caching, guardrails, and 30-day log retention) require a Portkey subscription at the Production or Enterprise tier.
5. Which LLM gateway is best for enterprise compliance: SOC 2, HIPAA, FedRAMP?
For FedRAMP, Bedrock Gateway is the only option of the three (inheriting AWS’s FedRAMP certification). For SOC 2 Type 2 and HIPAA, both Portkey Enterprise and LiteLLM Enterprise support these requirements: Portkey via its own certifications, LiteLLM via bring-your-own-storage for audit logs. Bedrock Gateway inherits all AWS compliance certifications with no new vendor audit scope.
6. What is semantic caching in LLM gateways?
Semantic caching uses vector-similarity matching to detect prompts that are similar but not identical to cached prompts, and returns a cached response instead of calling the model. Unlike exact-match caching, it handles paraphrased or lightly reformulated queries. Portkey offers purpose-built semantic caching with reported 30-50% cost reduction on repetitive workloads. LiteLLM’s open-source tier uses Redis exact-match caching; semantic caching requires additional setup.
7. Can you use LiteLLM and Portkey together?
Yes. Running LiteLLM as the routing proxy with Portkey handling observability is a documented production pattern. LiteLLM handles provider routing and load balancing; Portkey’s dashboard receives the traces and logs. This combination gives teams LiteLLM’s 100+ provider coverage alongside Portkey’s observability dashboard and semantic caching without maintaining a separate observability stack.
8. What are the limitations of LLM gateways for enterprise governance?
LLM gateways handle the traffic layer: routing, rate limiting, key management, and API-level logging. They do not govern what goes into the prompt payload. Data lineage, business definition consistency, PII masking completeness, access policy enforcement at context level, and cross-system output traceability are all context-layer problems that sit upstream of the gateway, between data systems and the applications that assemble prompts.
Sources
Permalink to “Sources”- LiteLLM official documentation
- LiteLLM Enterprise features
- Portkey pricing page
- AWS Bedrock AgentCore Gateway documentation
- AWS Architecture Blog: Building an AI gateway to Amazon Bedrock
- Kong AI Gateway benchmark (Portkey and LiteLLM)
- TrueFoundry: Portkey vs LiteLLM
- AWS What’s New: OpenAI models on Bedrock (April 2026)
- LiteLLM PyPI supply-chain incident (Hacker News)
- Atlan: AI Gateway and LLM Gateway
