Best AI Gateway to Use Claude Code with Gemini Models
Bifrost lets you route Claude Code requests through Google Gemini models with zero code changes, adding just 11 microseconds of overhead per request.
Claude Code is one of the most capable terminal-based agentic coding tools available today. It handles code generation, file editing, debugging, and terminal operations directly from the command line. The limitation: it only works with Anthropic's Claude models by default. For teams that want to run Claude Code against Google's Gemini models for cost optimization, latency improvements, or model benchmarking, an AI gateway is the most reliable path forward.
Bifrost, an open-source AI gateway built in Go, solves this by acting as a protocol translation layer between Claude Code and Gemini. It intercepts Anthropic API requests, translates them to Gemini's API format, routes the request to Google's endpoint, and returns the response in the format Claude Code expects. The entire process requires changing a single environment variable.
Why Route Claude Code Through Gemini?
Teams adopt a multi-provider strategy for Claude Code for several practical reasons:
- Cost management: Gemini models, particularly Gemini 2.5 Flash, offer competitive pricing for high-volume coding tasks. Routing routine operations through Gemini while reserving Claude for complex reasoning can reduce overall API spend significantly.
- Latency optimization: Depending on geographic region and workload type, Gemini endpoints may offer lower response times for specific use cases.
- Model benchmarking: Running identical prompts through both Claude and Gemini helps engineering teams make data-driven decisions about which model performs best for their codebase and task types.
- Provider redundancy: Relying on a single provider creates a single point of failure. If Anthropic's API experiences downtime or rate limiting, Claude Code sessions halt entirely without a fallback mechanism.
How Bifrost Connects Claude Code to Gemini
Bifrost unifies access to 20+ LLM providers through a single API. For the Claude Code to Gemini workflow specifically, it works as follows:
- Claude Code sends an Anthropic Messages API request to Bifrost instead of Anthropic's servers.
- Bifrost translates the request format to match Gemini's API specification.
- The translated request is routed to Google's Gemini endpoint.
- Gemini's response is translated back to Anthropic's format and returned to Claude Code.
Claude Code does not know the difference. It operates as if it is communicating with Anthropic's API. Bifrost's drop-in replacement architecture handles all the protocol translation transparently.
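The core of that translation is a field mapping between the two JSON schemas. The sketch below shows the user-message portion of that mapping using jq. It illustrates the shape of the translation only; Bifrost's actual implementation also handles system prompts, tool calls, and streaming:

```shell
# Map an Anthropic Messages body to a Gemini generateContent body.
# Anthropic uses messages/content and the role "assistant";
# Gemini uses contents/parts and the role "model".
echo '{"model":"gemini-2.5-flash","messages":[{"role":"user","content":"hi"}]}' |
jq '{contents: [ .messages[] | {role: (if .role=="assistant" then "model" else "user" end), parts: [{text: .content}]} ]}'
# → emits a Gemini-style {"contents": [...]} request body
```

The reverse direction works the same way: Gemini's response candidates are mapped back into an Anthropic-style message before being returned to Claude Code.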
The setup requires two steps: start the Bifrost gateway and point Claude Code at it.
```shell
# Start Bifrost
npx -y @maximhq/bifrost

# Launch Claude Code through Bifrost
ANTHROPIC_BASE_URL=http://localhost:8080/anthropic claude
```
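With the gateway running, the translation path can also be checked outside Claude Code. This sketch assumes Bifrost serves the Anthropic-compatible Messages endpoint under the `/anthropic` prefix used above and that a Gemini API key is already configured in Bifrost; the `gemini/gemini-2.5-flash` identifier is illustrative:

```shell
# Send an Anthropic-format request; Bifrost translates it to Gemini and back.
# The endpoint path and model identifier are assumptions -- match them to
# your Bifrost configuration.
curl -s http://localhost:8080/anthropic/v1/messages \
  -H "content-type: application/json" \
  -d '{
    "model": "gemini/gemini-2.5-flash",
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Reply with OK"}]
  }'
```

A well-formed Anthropic-style JSON response here confirms the gateway is translating correctly before pointing Claude Code at it.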
Bifrost supports both Google Gemini (direct API) and Google Vertex AI as separate provider options. Teams already running Gemini through GCP can route Vertex AI traffic through Bifrost with the same governance and observability benefits.
Per-Tier Model Overrides
Claude Code uses three internal model tiers: Sonnet (default), Opus (complex tasks), and Haiku (lightweight). With Bifrost, each tier can be overridden independently to use any model from any provider. A team could configure Gemini 2.5 Pro for Opus-level reasoning, Gemini 2.5 Flash for the default Sonnet tier, and a Groq-hosted model for fast Haiku tasks.
One important constraint: non-Anthropic models must support tool calling. Claude Code depends on it for file operations, code editing, and terminal commands; models without tool calling will not function correctly.
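On the Claude Code side, tier overrides are plain environment variables. The sketch below assumes your Claude Code version honors the per-tier `ANTHROPIC_DEFAULT_*_MODEL` variables (verify against your installed version) and uses illustrative `provider/model` identifiers; match them to whatever your Bifrost configuration exposes:

```shell
# Point Claude Code at Bifrost, then override each tier independently.
# Variable names and model identifiers are assumptions -- check your
# Claude Code version and Bifrost provider settings.
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
export ANTHROPIC_DEFAULT_OPUS_MODEL="gemini/gemini-2.5-pro"
export ANTHROPIC_DEFAULT_SONNET_MODEL="gemini/gemini-2.5-flash"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="groq/llama-3.3-70b-versatile"
claude
```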
Beyond Translation: Governance, Observability, and Failover
Protocol translation is table stakes. What separates Bifrost from lightweight proxy solutions is the infrastructure layer it adds on top.
Governance with virtual keys: Bifrost's virtual keys let teams assign each developer or team a unique credential with configurable spend limits, rate caps, and model access permissions. One developer might have access to Gemini 2.5 Pro and Claude Sonnet, while another's key is restricted to Gemini Flash. Budget controls operate at the virtual key, team, and organization level, preventing runaway costs across a team of engineers using Claude Code concurrently.
Automatic failover: Bifrost's automatic fallback chains let teams define provider sequences. If Gemini's API returns an error or hits a rate limit, Bifrost can automatically reroute the request to Claude, Mistral, or any other configured provider with zero downtime.
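Conceptually, a fallback chain is an ordered list of alternates tried when the primary model errors or hits a rate limit. The field names below are hypothetical, not Bifrost's exact configuration schema; consult the Bifrost documentation for the real keys:

```json
{
  "fallbacks": {
    "gemini/gemini-2.5-flash": [
      "anthropic/claude-sonnet-4",
      "mistral/mistral-large-latest"
    ]
  }
}
```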
Built-in observability: Every request flowing through Bifrost generates native Prometheus metrics and OpenTelemetry traces. Engineering leads gain visibility into model usage patterns, error rates, token consumption, and cost per developer across the entire team.
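Because the metrics are native Prometheus, they can be scraped or spot-checked with standard tooling. The `/metrics` path follows Prometheus convention and is an assumption here, as are the metric names; confirm both in the Bifrost docs:

```shell
# Spot-check gateway metrics (path and metric names are assumptions).
curl -s http://localhost:8080/metrics | grep -i bifrost | head
```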
Semantic caching: For repeated or semantically similar queries, Bifrost's semantic caching reduces both cost and latency by returning cached responses instead of making redundant API calls.
Bifrost CLI: Interactive Agent Launcher
For teams that want to skip manual environment variable configuration, Bifrost CLI provides an interactive setup experience. It walks through gateway configuration, agent selection, and model choice in a single flow:
- Select your coding agent (Claude Code, Codex CLI, Gemini CLI, or others).
- Browse available models from all configured providers.
- Press Enter. Bifrost configures all environment variables, API keys, and provider paths automatically.
This removes a common friction point when onboarding new team members or switching between model configurations during development sessions.
Enterprise Considerations
For organizations with strict compliance requirements, Bifrost's enterprise tier adds in-VPC deployments; vault support for secure key management (HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, Azure Key Vault); audit logs supporting SOC 2, GDPR, and HIPAA compliance; and RBAC with identity provider integration through Okta and Entra.
These controls are particularly important for regulated industries where LLM requests must traverse approved infrastructure and all model interactions need immutable audit trails.
Start Using Claude Code with Gemini Today
Routing Claude Code through Gemini models with Bifrost takes minutes and requires no changes to Claude Code itself. Teams gain model flexibility, cost governance, automatic failover, and full observability across every coding session.
Bifrost is open source and available on GitHub. To see how Bifrost fits into your AI infrastructure at scale, book a demo with the Bifrost team.