Route Claude Code Through Google Gemini Models Using Bifrost
Claude Code is a terminal-based coding agent that communicates exclusively with Anthropic's API by default. For engineering teams running high-volume coding tasks, this creates a hard dependency on a single provider: Anthropic's pricing, rate limits, and availability. Bifrost, the open-source AI gateway built in Go by Maxim AI, breaks that dependency by acting as a protocol translation layer between Claude Code and any provider it's configured for, including Google Gemini. This post walks through how the translation works, how to configure Gemini as a provider, and how to pin Claude Code's model tiers to specific Gemini models.
Why Route Claude Code Through Gemini
Engineering teams adopt a multi-provider strategy for Claude Code for reasons that are specific to their cost and infrastructure profile:
- Cost management: Gemini models, particularly Gemini 2.5 Flash, offer competitive pricing for high-volume coding tasks. Routing routine file operations and completions through Gemini while reserving Claude Sonnet for complex multi-step reasoning can reduce overall token spend without changing developer workflows.
- Latency optimization: Depending on geographic region and workload type, Gemini endpoints may offer lower response times for specific tasks. Teams running infrastructure in Google Cloud regions benefit from reduced round-trip latency.
- Model benchmarking: Running identical coding prompts through both Claude and Gemini gives engineering teams data to make informed decisions about which model performs best for their codebase and task types.
- Provider redundancy: Relying on a single provider creates a single point of failure. If Anthropic's API experiences rate limiting during a high-traffic deploy window, Claude Code stops working. Bifrost's automatic fallback handles the switch transparently.
How Bifrost Handles the Protocol Translation
Google Gemini's API structure differs substantially from Anthropic's. Bifrost performs the full conversion at request time, with no changes required in Claude Code:
- Role remapping: Anthropic's "assistant" role is remapped to Gemini's "model" role. System messages are converted to Gemini's
systemInstructionfield format. - Parameter renaming: Anthropic's
max_completion_tokensbecomes Gemini'smaxOutputTokens;stopbecomesstopSequences. - Tool call handling: Function call structures are restructured to match Gemini's
functionDeclarationsformat, with tool choice values mapped from Anthropic'sauto/none/requiredto Gemini'sAUTO/NONE/ANY. - Response normalization: Gemini's
finishReasonvalues (STOP,MAX_TOKENS,SAFETY) are mapped back to Anthropic'sstop,length, andcontent_filtervalues before the response reaches Claude Code.
Bifrost handles this translation at 11 microseconds of added overhead at 5,000 RPS, which means the translation cost is negligible compared to Gemini API latency.
The supported operations for Gemini through Bifrost include chat completions (streaming and non-streaming), the Responses API, embeddings, image generation (both standard Gemini and Imagen formats), transcription, and speech. For Claude Code's core use cases, chat completions and streaming are what matter.
Step 1: Start Bifrost
The fastest way to run Bifrost locally is with NPX:
npx -y @maximhq/bifrost
Or with Docker, mounting a local data directory for persistent configuration:
docker run -p 8080:8080 -v $(pwd)/data:/app/data maximhq/bifrost
Both commands start the gateway at http://localhost:8080. The web UI is available at the same address and provides a visual interface for provider configuration, routing rules, and observability.
For production deployments, see the Kubernetes deployment guide.
Step 2: Configure Gemini as a Provider
Add Gemini to Bifrost using the web UI, the configuration API, or a config.json file.
Using the Web UI
Navigate to Models > Model Providers, find Google Gemini under Configured Providers (or click Add New Provider), then click Add Key. Paste your Gemini API key directly or reference it as an environment variable using the env. prefix (e.g., env.GEMINI_API_KEY). Set Allowed Models to All Models to permit any Gemini model, or specify an explicit model allowlist.
Using config.json
{
"providers": {
"gemini": {
"keys": [
{
"name": "gemini-key-1",
"value": "env.GEMINI_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
}
}
}
Using the API
curl --location '<http://localhost:8080/api/providers>' \
--header 'Content-Type: application/json' \
--data '{
"provider": "gemini",
"keys": [
{
"name": "gemini-key-1",
"value": "env.GEMINI_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
}'
Bifrost automatically handles Gemini's authentication: standard endpoints use the x-goog-api-key header, while Imagen and custom endpoints use query-parameter authentication. The provider configuration handles both methods without any additional setup.
Step 3: Create a Virtual Key
Virtual keys are the primary governance mechanism in Bifrost. Each developer or team receives a virtual key that encodes their access policy: which providers and models they can reach, their budget ceiling, and their rate limit. The actual Gemini API key is stored in the gateway and never distributed to individual users.
Create a virtual key in the Bifrost web UI under Virtual Keys. Configure it to restrict access to the gemini provider and set a monthly spending cap appropriate for your team's usage.
Step 4: Configure Claude Code to Route Through Bifrost
Claude Code's model routing is controlled via settings.json, located at ~/.claude/settings.json on macOS/Linux. Two configuration approaches work for Gemini.
Option A: Provider-Specific Model Pinning
Pin Claude Code's Haiku and Sonnet tiers directly to Gemini models:
"env": {
"ANTHROPIC_BASE_URL": "<http://localhost:8080/anthropic>",
"ANTHROPIC_AUTH_TOKEN": "your-virtual-key",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "gemini/gemini-2.5-flash",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "gemini/gemini-2.5-pro"
}
ANTHROPIC_AUTH_TOKEN is the recommended authentication method. Claude Code sends this value in the Authorization: Bearer header, and Bifrost uses it as the virtual key for routing and governance. No Anthropic account login is required.
Option B: Dynamic Aliasing via Routing Rules
For teams that want to change the Gemini model target without touching developer machines, use routing rules in Bifrost to map alias names to provider/model targets at request time:
- In the Bifrost dashboard, create a routing rule for
sonnet-modelwith auser-agentcondition matchingclaude-cli, mapped togemini/gemini-2.5-pro. - Create a second rule for
haiku-modelmapped togemini/gemini-2.5-flash. - Update
settings.json:
"env": {
"ANTHROPIC_BASE_URL": "<http://localhost:8080/anthropic>",
"ANTHROPIC_AUTH_TOKEN": "your-virtual-key",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "haiku-model",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "sonnet-model"
}
The alias approach allows model targets to be updated centrally in the gateway without redeploying or reconfiguring individual developer environments.
Verify the Configuration
After updating settings.json, start Claude Code and confirm the active model:
claude --model gemini/gemini-2.5-pro
Or switch mid-session:
/model gemini/gemini-2.5-flash
Run /model without arguments to confirm the current active model. Claude Code will show the Gemini model name if routing is working correctly.
Option: Use Bifrost CLI Instead of Editing settings.json
Steps 3 and 4 above require manually editing settings.json and setting environment variables. If you prefer not to touch those files, the Bifrost CLI is an interactive terminal tool that handles the Claude Code configuration automatically, once the Bifrost gateway is already running.
With the gateway running (from Step 1), launch the CLI in a second terminal:
npx -y @maximhq/bifrost-cli
The CLI walks through a four-step interactive setup:
- Base URL — enter your Bifrost gateway address (default:
http://localhost:8080) - Virtual key — enter your virtual key if authentication is enabled, or press Enter to skip
- Harness — select Claude Code from the agent list; the CLI shows installation status and will install it via npm if missing
- Model — the CLI fetches available models from your running gateway and presents a searchable list; type
gemini/gemini-2.5-proorgemini/gemini-2.5-flashto filter and select
After confirming the summary screen, the CLI launches Claude Code with all required environment variables set automatically (ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, Gemini model identifiers) and auto-attaches Bifrost as an MCP server inside Claude Code, so configured MCP tools are available without additional setup. No settings.json edits required.
The CLI stores its configuration in ~/.bifrost/config.json and keeps virtual keys in your OS keyring. On subsequent runs it returns to the summary screen with your last selections intact. Press Enter to re-launch immediately, or press m to pick a different Gemini model first.
For teams running multiple concurrent sessions, the CLI provides a tabbed terminal UI. Press Ctrl+B to open the tab bar, n to start a new session, and number keys to jump between tabs.
The CLI and the manual settings.json approach produce identical routing behavior through Bifrost. The CLI is the faster path for individual developers; the manual approach is better suited for shared team configurations or GitOps-managed environments.
Gemini-Specific Capabilities Available Through Bifrost
Beyond standard chat completions, Bifrost exposes several Gemini capabilities through the same API surface:
- Reasoning/Thinking: Gemini's extended thinking mode maps to Bifrost's
reasoningparameter. Settingreasoning.effortto"high"maps to Gemini'sthinkingConfig.thinkingLevel = HIGH, withreasoning.max_tokenscontrolling the thinking budget. - Multimodal inputs: Image inputs in URL or base64 format, video content with fps and offset metadata, and PDF files via file references all pass through Bifrost's content conversion layer.
- Embeddings: Bifrost routes embedding requests to Gemini's
embedContentendpoint, withdimensionsmapping tooutputDimensionality. - Image generation: Both standard Gemini image generation and Imagen model requests are supported. Bifrost automatically selects the correct endpoint (
/generateContentvs./predict) based on the model name.
For Claude Code's primary use cases, chat completions and tool calling are the critical path. Bifrost maps Claude Code's tool schemas to Gemini's functionDeclarations format and handles the response conversion back to Anthropic format transparently.
Governance and Cost Controls Across Gemini Traffic
Running Claude Code against Gemini through Bifrost enables cost and access controls that aren't available when calling Gemini directly:
- Per-developer budget limits: Set daily or monthly spending caps on each virtual key. When a developer hits their budget ceiling, Bifrost rejects requests with a policy error instead of accumulating additional spend.
- Model access restrictions: Restrict a virtual key to specific models. A junior engineer's key might permit only
gemini/gemini-2.5-flash, while a senior engineer's key addsgemini/gemini-2.5-pro. - Rate limiting: Apply request-per-minute or token-per-day limits at the virtual key or provider level to distribute capacity across teams.
- Automatic fallback: Configure Bifrost to fall back from Gemini to Anthropic (or another provider) when Gemini returns 5xx errors or rate limit responses. The fallback configuration requires no changes in Claude Code.
The full governance feature set covers budget hierarchies at the customer, team, virtual key, and provider configuration levels. For enterprise teams with SSO requirements, Bifrost Enterprise adds OIDC integration and RBAC on top of the same governance stack.
For teams evaluating AI gateways for their Claude Code infrastructure, the LLM Gateway Buyer's Guide covers the full capability matrix across routing, governance, observability, and deployment options.
Production Considerations
A few configuration details are worth addressing before rolling out Gemini routing in production:
Tool calling compatibility: Claude Code relies heavily on tool use for file edits, bash commands, and code operations. Gemini 2.5 Pro and Flash both support function calling, but not all Claude-specific server-side tools (web_search, computer_use) are available on non-Claude models. Validate your specific workflow against Gemini before switching tiers in production.
Streaming finish reason timing: Gemini only includes finish_reason in the final stream chunk. Bifrost normalizes this to match Anthropic's streaming format, so Claude Code receives stop at the correct position, but it is worth noting if you are debugging stream behavior.
Consecutive tool responses: Gemini merges consecutive tool response messages into a single user message. Bifrost handles this conversion automatically, but it changes the message count in observability logs compared to direct Anthropic calls.
For deeper observability into Gemini traffic routing through Bifrost, the observability features expose per-request latency, token usage by provider, and error rates in the built-in dashboard and via OpenTelemetry and Prometheus.
Summary
Routing Claude Code through Google Gemini models requires three configuration steps: starting Bifrost, adding the Gemini provider with an API key, and updating settings.json to point Claude Code at the Bifrost endpoint with Gemini model identifiers. Bifrost handles the full Anthropic-to-Gemini protocol translation, including role remapping, parameter renaming, tool schema conversion, and response normalization.
Virtual keys add budget limits, model access restrictions, and rate limits on top of the routing, giving platform teams control over Gemini spend without changing developer workflows. The same gateway handles automatic fallback to Anthropic or any other configured provider when Gemini is unavailable.
Bifrost is open-source with full documentation and benchmarks available for teams evaluating gateway options; the LLM Gateway Buyer's Guide covers the capability comparison in detail. To explore enterprise features, including adaptive load balancing, SSO integration, and in-VPC deployment for production Claude Code workflows, book a demo with the Bifrost team.