5 AI Gateways Developers Use to Run Claude Code with Non-Anthropic Models
TLDR: Claude Code defaults to Anthropic models, but teams often need to route coding workloads to other providers for cost control, regional compliance, or fallback coverage. This comparison covers five gateway options developers use for that setup: Bifrost, LiteLLM, OpenRouter, Cloudflare AI Gateway, and AWS Bedrock. For production teams that want self-hosted control, MCP support, and stronger governance, Bifrost is the strongest fit. The rest of the article explains where each option fits and what tradeoffs to expect.
Claude Code ships with Anthropic's own models as the default backend, but most engineering teams eventually want the option to send requests to models from other providers. The reasons are usually practical: tighter cost control, lower latency on routine calls, or compliance requirements that rule out a specific hosting region. An AI gateway solves this by sitting between Claude Code and any underlying model provider, acting as a unified proxy that handles routing, failover, and observability.
This article walks through five AI gateways that developers currently use to run Claude Code against non-Anthropic models, with a breakdown of core features and the setup each one fits best.
What an AI Gateway Does for Claude Code
An AI gateway is an infrastructure layer that intercepts API calls from an AI tool or agent and dispatches them to one or more LLM providers. In the Claude Code context, a gateway lets you redirect traffic away from Anthropic's hosted models toward alternatives on AWS Bedrock, Google Vertex AI, Azure OpenAI, or hosts serving open-weight models, all without touching Claude Code's core behavior.
Teams typically reach for a gateway when they need to:
- Send routine coding tasks to cheaper or faster models
- Use region-locked or compliance-approved model deployments
- Add fallback logic for when a primary provider returns errors or times out
- Unify usage tracking and cost reporting across several providers
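At its core, every option below implements the same dispatch-and-fallback loop. Here is a toy sketch of that loop in Python; `call_provider` stands in for a real HTTP call, and real gateways layer auth, streaming, budgets, and observability on top:

```python
# Toy sketch of the core gateway loop: try the primary provider, then walk
# the fallback list. call_provider is a stand-in for a real HTTP request.

def call_provider(provider: str, prompt: str) -> str:
    if provider == "anthropic":            # simulate a primary outage
        raise TimeoutError(f"{provider} timed out")
    return f"[{provider}] response to: {prompt}"

def dispatch(prompt: str, providers: list[str]) -> str:
    for provider in providers:             # primary first, then fallbacks in order
        try:
            return call_provider(provider, prompt)
        except TimeoutError:
            continue                       # provider failed; try the next one
    raise RuntimeError("all providers failed")

print(dispatch("Refactor this function.", ["anthropic", "bedrock", "vertex"]))
```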
How We Tested These Gateways for Claude Code
Before the comparison, it helps to know how each gateway behaves once Claude Code is actually pointed at it. Our team ran Claude Code against each of the five options below using an OpenAI-compatible endpoint configuration and a standard ANTHROPIC_BASE_URL override, with traffic routed to both Anthropic and non-Anthropic backends. Setup paths were not identical.
With Bifrost, the flow was: start the self-hosted gateway, register provider credentials, and set Claude Code's base URL environment variable to the gateway. Claude Code began routing through Bifrost without any change to how we invoked it. Fallback testing involved forcibly killing one provider connection mid-session; the gateway re-dispatched the in-flight request to the fallback within a few hundred milliseconds, and Claude Code continued without surfacing an error to the user.
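In script form, that override amounts to a couple of lines. A minimal sketch, assuming Claude Code is installed as the `claude` CLI and the gateway listens on a local port (the address is illustrative):

```python
import os
import subprocess

# Point Claude Code at the gateway instead of Anthropic's hosted API.
# The address below is illustrative; use whatever your gateway exposes.
env = os.environ.copy()
env["ANTHROPIC_BASE_URL"] = "http://localhost:8080"  # assumed gateway address

# Launch Claude Code unchanged; every request now flows through the gateway.
subprocess.run(["claude"], env=env)
```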
With managed services, the integration was faster to stand up but offered less control over failure handling and logging granularity. With LiteLLM, setup required a few more Python-side configuration passes to get budget tracking and provider weights aligned with what our team needed.
The comparisons below reflect this hands-on setup work rather than feature-sheet inspection alone.
1. Bifrost
Platform Overview
Bifrost is a high-performance, open-source AI gateway built by Maxim AI. It provides a single OpenAI-compatible API endpoint that routes to 15+ providers including AWS Bedrock, Google Vertex AI, Azure OpenAI, Mistral, Cohere, Groq, and Ollama. Bifrost is designed as a drop-in replacement for native provider SDKs, making it straightforward to integrate with Claude Code without any additional code changes.
Bifrost can be self-hosted with zero configuration overhead and is purpose-built for teams that want enterprise-grade infrastructure without vendor lock-in.
For Claude Code specifically, Bifrost is useful because teams can keep the local developer workflow unchanged while moving model routing into the gateway layer. That makes it easier to test Anthropic, Bedrock, Vertex, or other providers behind the same interface while keeping fallback and usage controls centralized.
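From the client side, a request through Bifrost looks like any OpenAI-compatible call. A minimal sketch; the base URL, virtual key, and model name are placeholders for whatever your deployment registers, not canonical values:

```python
from openai import OpenAI

# Hypothetical call through a self-hosted Bifrost instance.
client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed local gateway address
    api_key="bifrost-virtual-key",        # gateway-issued key, not a provider key
)

response = client.chat.completions.create(
    model="claude-3-5-sonnet",            # resolved to a concrete provider by the gateway
    messages=[{"role": "user", "content": "Summarize this diff."}],
)
print(response.choices[0].message.content)
```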
Features
- Unified provider interface: One OpenAI-compatible API for all supported providers, including Anthropic, AWS Bedrock, and Google Vertex
- Automatic fallbacks: Failover between providers and models with zero downtime
- Semantic caching: Reduces redundant LLM calls by caching responses based on semantic similarity, cutting both cost and latency (a generic sketch of the idea follows this list)
- MCP gateway support: Native support for the Model Context Protocol, allowing Claude Code to use external tools like web search, databases, and file systems across any connected provider
- Budget management and governance: Hierarchical cost controls with virtual keys, team-level budgets, and usage tracking
- Custom plugins: Extensible middleware for analytics, monitoring, and request transformation
- Observability: Native Prometheus metrics, distributed tracing, and structured logging out of the box
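Semantic caching differs from exact-match caching in that near-duplicate prompts can reuse a prior response. The sketch below illustrates the general technique, not Bifrost's internals; `embed` is a deliberately toy stand-in for a real embedding model:

```python
import math

def embed(text: str) -> list[float]:
    vec = [0.0] * 26                       # character histogram, demo only
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

cache: list[tuple[list[float], str]] = []  # (embedding, cached response) pairs

def cached_call(prompt: str, threshold: float = 0.95) -> str:
    q = embed(prompt)
    for vec, response in cache:
        if cosine(q, vec) >= threshold:
            return response                # close enough: skip the LLM call
    response = f"LLM answer to: {prompt}"  # stand-in for a real provider call
    cache.append((q, response))
    return response

print(cached_call("rename this variable"))
print(cached_call("rename this variable!"))  # near-duplicate hits the cache
```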
Best For
Bifrost is the right call for engineering teams that want a self-hosted, zero-config gateway with strong enterprise controls. It works especially well for organizations running Claude Code in regulated environments, for teams consolidating multi-provider access with cost governance, and for developers who need MCP gateway support alongside standard LLM routing. Book a Bifrost demo to see how it maps to your infrastructure.
2. LiteLLM
Platform Overview
LiteLLM is an open-source proxy server and Python SDK that provides a unified interface to 100+ LLM providers. It is widely used for local development and self-hosted deployments.
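A minimal example of the SDK path, which selects the provider from the model-name prefix. The model ID is illustrative, and the call assumes ANTHROPIC_API_KEY is set in the environment:

```python
from litellm import completion

# LiteLLM routes by model-name prefix; the model ID is illustrative.
response = completion(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Explain this stack trace."}],
)
print(response.choices[0].message.content)
```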
Features
- OpenAI-compatible proxy with support for 100+ models
- Load balancing and fallback routing across providers
- Budget tracking and rate limiting per API key
- Logging integrations with tools like Langfuse and Helicone
3. OpenRouter
Platform Overview
OpenRouter is a managed API aggregation service that provides a single API key for accessing models from dozens of providers, including Anthropic, OpenAI, Mistral, and Meta.
Features
- Single API endpoint for 100+ models across multiple providers
- Automatic routing to the lowest-cost or lowest-latency model
- Pay-per-token pricing with no subscription required
- Model fallback configuration via request parameters (see the sketch below)
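A sketch of that fallback pattern through OpenRouter's OpenAI-compatible endpoint; the exact parameter shape may change, so treat the `models` list as illustrative rather than canonical:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",                 # primary choice
    extra_body={"models": ["mistralai/mistral-large"]},  # assumed fallback parameter
    messages=[{"role": "user", "content": "Write a unit test for parse()."}],
)
print(response.choices[0].message.content)
```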
4. Cloudflare AI Gateway
Platform Overview
Cloudflare AI Gateway is a managed gateway service built into Cloudflare's global network that adds observability and caching to LLM API calls.
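Routing through it usually means swapping the provider base URL for a gateway URL. A minimal sketch; ACCOUNT_ID and GATEWAY_ID are placeholders for values from your own Cloudflare dashboard:

```python
from openai import OpenAI

# The gateway proxies to the provider segment at the end of the URL;
# your upstream provider key still authenticates the call.
client = OpenAI(
    base_url="https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_ID/openai",
    api_key="sk-...",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Lint this config."}],
)
print(response.choices[0].message.content)
```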
Features
- Caching and rate limiting for LLM requests at the edge
- Real-time request logs and usage analytics in the Cloudflare dashboard
- Support for providers including OpenAI, Anthropic, AWS Bedrock, and Hugging Face
- No self-hosting required; routes through Cloudflare's network by default
5. AWS Bedrock
Platform Overview
AWS Bedrock is a fully managed service from Amazon Web Services that provides access to foundation models from Anthropic, Meta, Mistral, Cohere, and others through a unified AWS API.
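Access goes through the standard AWS SDK rather than an OpenAI-compatible endpoint. A minimal boto3 sketch using the Converse API; the region and model ID are illustrative and depend on what your account has enabled:

```python
import boto3

# Bedrock's Converse API via the standard AWS SDK.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",  # illustrative model ID
    messages=[{"role": "user", "content": [{"text": "Review this function."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```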
Features
- Access to Claude, Llama, Mistral, Titan, and other models through a single AWS service
- Native integration with AWS IAM for access control and VPC for private networking
- Model evaluation, fine-tuning, and guardrails built into the service
- Supports enterprise compliance requirements including HIPAA, SOC 2, and GDPR
What Matters When Claude Code Is Not Tied to Anthropic
When teams move Claude Code onto non-Anthropic models, the problem is usually not model access alone. The harder part is keeping routing, fallback behavior, and cost controls predictable once multiple providers are in the mix.
For production use, the gateway needs to do more than proxy requests. It needs to keep Claude Code usable when a provider slows down, expose clear usage and latency signals, and make provider swaps possible without reworking the developer workflow around a new SDK.
That is why the strongest production setups usually favor gateways that combine OpenAI-compatible routing, fallback handling, governance controls, and observable request paths in one place instead of spreading those responsibilities across separate tools.
Choosing the Right Gateway for Claude Code
The right gateway depends primarily on where your team sits on the self-hosted versus managed spectrum, and how much governance and observability your production environment requires.
| Gateway | Deployment | MCP Support | Best Fit |
|---|---|---|---|
| Bifrost | Self-hosted | Yes | Enterprise, regulated teams |
| LiteLLM | Self-hosted | Limited | Developer-led, Python-first teams |
| OpenRouter | Managed | No | Prototyping, lightweight production |
| Cloudflare AI Gateway | Managed | No | Cloudflare-native apps |
| AWS Bedrock | Managed (AWS) | No | AWS-native enterprise workloads |
For teams that want a production-grade, self-hosted gateway with native MCP support, automatic fallbacks, and enterprise cost controls, Bifrost covers the full stack without additional tooling layered on top. The Bifrost buyer's guide covers the decision criteria in more depth.
Conclusion
Running Claude Code with non-Anthropic models is a practical need for engineering teams optimizing for cost, compliance, or provider flexibility. Each gateway covered here addresses a different part of that problem, from OpenRouter's simplicity to Bifrost's enterprise infrastructure depth.
If your team wants Claude Code to route across multiple providers without losing control over fallback, governance, or observability, book a demo or get started with Maxim AI today.