Use Any LLM Provider with the OpenAI SDK: Bifrost's Universal Integration
Access OpenAI, Anthropic, Google, AWS Bedrock, and more through a single, familiar interface
The OpenAI SDK has become the de facto standard for LLM application development. Developers have invested countless hours building applications around its API format, creating a significant migration barrier when exploring alternative providers. Bifrost eliminates this barrier by providing complete OpenAI API compatibility while enabling access to multiple LLM providers through protocol adaptation and intelligent routing.
Drop-In Replacement: Zero Code Changes Required
Bifrost acts as a transparent middleware layer between your application and LLM providers. The integration requires only a single configuration change: pointing your OpenAI SDK client's base URL to Bifrost's endpoint. Your existing OpenAI SDK code continues to work unchanged while gaining access to governance, load balancing, semantic caching, and multi-provider routing.
The setup is remarkably simple. In Python, you configure the OpenAI client with Bifrost's base URL and a placeholder API key (actual keys are managed by Bifrost's configuration). The same pattern applies to JavaScript, requiring only base URL modification while preserving all existing request logic.
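A minimal Python sketch of that change, assuming a Bifrost instance listening locally on port 8080:

```python
from openai import OpenAI

# Point the standard OpenAI client at Bifrost instead of api.openai.com.
# The api_key is a placeholder; real provider keys are managed in Bifrost's configuration.
client = OpenAI(
    base_url="http://localhost:8080/openai",  # assumed local Bifrost endpoint
    api_key="dummy-key",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello through Bifrost!"}],
)
print(response.choices[0].message.content)
```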
This approach means your application code remains provider-agnostic. Business logic, error handling, and integration patterns stay intact while you gain flexibility to switch providers, implement fallback strategies, or distribute load across multiple LLM services.
Multi-Provider Access Through Provider Prefixing
While Bifrost maintains OpenAI SDK compatibility, it extends functionality by supporting multiple providers through a simple model naming convention. Provider prefixes enable routing requests to different LLM services using the same client instance.
Supported provider prefixes include:
- OpenAI models (default) – Use standard model names like gpt-4o-mini without a prefix
- Anthropic models – Prefix with anthropic/, like anthropic/claude-3-sonnet-20240229
- Google Vertex AI – Prefix with vertex/, like vertex/gemini-pro
- Azure OpenAI – Prefix with azure/, like azure/gpt-4o
- Local Ollama – Prefix with ollama/, like ollama/llama3.1:8b
This design pattern allows applications to dynamically select providers based on use case, cost optimization, or availability requirements. You can route simple queries to cost-effective models while directing complex reasoning tasks to more capable (and expensive) providers, all through the same SDK client.
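For instance, reusing the client configured above, one instance can split traffic purely by model prefix (the prompts here are illustrative):

```python
# Routine query – no prefix, so Bifrost routes it to OpenAI by default.
faq_reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)

# Complex reasoning task – the anthropic/ prefix routes it to Anthropic.
debug_reply = client.chat.completions.create(
    model="anthropic/claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "Explain this intermittent deadlock."}],
)
```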
Direct Key Bypass for Development and Testing
Bifrost's key management system provides centralized API key storage and rotation, but development scenarios often require direct key usage. The Direct Key Bypass feature enables passing provider API keys directly in requests, bypassing Bifrost's load balancing while maintaining protocol compatibility.
This capability requires enabling the "Allow Direct API keys" option in Bifrost configuration. Once enabled, you can pass any provider's API key through standard authentication headers (Authorization or x-api-key). Bifrost recognizes these headers and routes requests directly to the specified provider using the provided key.
The direct key pattern supports per-request provider switching. A single application can authenticate with OpenAI keys for GPT models, Anthropic keys for Claude models, and Google keys for Gemini models, switching dynamically based on request needs. This flexibility proves invaluable during development, testing, and gradual migration scenarios.
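As a sketch, assuming the bypass option is enabled: an Anthropic key supplied as the client's API key travels in the standard Authorization header, which Bifrost passes straight through to Anthropic.

```python
from openai import OpenAI

# With "Allow Direct API keys" enabled, Bifrost forwards the key from the
# Authorization header to the provider instead of using its managed keys.
anthropic_client = OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="sk-ant-your-own-key",  # your own Anthropic key, used directly
)

reply = anthropic_client.chat.completions.create(
    model="anthropic/claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "Ping"}],
)
```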
Azure OpenAI support: For Azure deployments, use the AzureOpenAI client from the SDK and add the x-bf-azure-endpoint header specifying your Azure resource endpoint. This enables Azure OpenAI access while maintaining Bifrost's governance and observability features.
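One plausible way to wire this up in Python; the endpoint values are placeholders, and the exact path Bifrost expects from the AzureOpenAI client may differ per deployment:

```python
from openai import AzureOpenAI

# Sketch: AzureOpenAI client pointed at Bifrost, with the real Azure resource
# endpoint supplied via the x-bf-azure-endpoint header described above.
azure_client = AzureOpenAI(
    azure_endpoint="http://localhost:8080/openai",  # Bifrost, not Azure directly (assumed path)
    api_key="dummy-key",
    api_version="2024-02-01",  # example API version
    default_headers={
        "x-bf-azure-endpoint": "https://my-resource.openai.azure.com",
    },
)
```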
Cross-Provider Files API: Unified File Management
File handling traditionally requires provider-specific implementations. Bifrost's Files API integration normalizes file operations across OpenAI, AWS Bedrock, and Google Gemini (Anthropic uses inline requests instead of file uploads). The provider is specified through extra_body parameters for POST requests or extra_query parameters for GET requests.
Provider storage models:
- OpenAI and Gemini – Use native file storage requiring no additional configuration
- AWS Bedrock – Requires S3 storage configuration including bucket, region, and prefix
- Anthropic – Does not support file-based operations; uses inline batch requests instead
The Files API supports standard operations: upload files for batch processing, list uploaded files with metadata, retrieve individual file information, delete files when no longer needed, and download file content for result processing.
File uploads use JSONL (JSON Lines) format, with each line representing a single request. Bifrost handles format conversion automatically – you provide requests in OpenAI-style format, and Bifrost translates to provider-specific schemas (like Bedrock's format) internally.
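Each JSONL line follows the OpenAI batch request shape, for example:

```jsonl
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Classify this ticket: login fails"}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Classify this ticket: invoice missing"}]}}
```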
For Bedrock specifically, file operations require S3 configuration passed in the extra_body parameter during upload. This includes the S3 bucket name, AWS region, and optional prefix for organizing files. OpenAI and Gemini handle storage natively, requiring only the provider specification.
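A hedged sketch of the upload and list calls; the provider and S3 field names inside extra_body and extra_query are illustrative, so confirm the exact keys against Bifrost's configuration reference:

```python
# Upload a JSONL input file for Bedrock, passing S3 details via extra_body.
bedrock_file = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
    extra_body={
        "provider": "bedrock",           # hypothetical field names –
        "s3_bucket": "my-batch-bucket",  # check Bifrost's docs for exact keys
        "s3_region": "us-east-1",
        "s3_prefix": "bifrost-batches/",
    },
)

# GET-style operations take the provider via extra_query instead.
files = client.files.list(extra_query={"provider": "bedrock"})
```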
Cross-Provider Batch API: Asynchronous Processing at Scale
Batch processing enables cost-effective, high-throughput LLM operations by processing multiple requests asynchronously. Bifrost's Batch API provides unified access to batch capabilities across OpenAI, AWS Bedrock, Anthropic, and Google Gemini through the familiar OpenAI SDK interface.
Provider-specific batch approaches:
- OpenAI and Gemini – File-based batching using uploaded JSONL files
- AWS Bedrock – File-based batching with S3 input/output configuration
- Anthropic – Inline request batching without file upload (Beta)
Batch operations follow a consistent workflow: create batch jobs referencing uploaded files (or inline requests for Anthropic), poll batch status using the retrieve endpoint, monitor request counts showing total, completed, and failed requests, and cancel long-running batches if needed.
OpenAI batch workflow: Upload a JSONL file containing requests in OpenAI format, create a batch job referencing the file ID, poll for completion checking status and request counts, then download results when the batch completes.
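In Python, that flow looks roughly like this (polling interval and file names are arbitrary):

```python
import time

# 1. Upload the JSONL input file.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")

# 2. Create the batch job referencing the uploaded file.
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll until the batch reaches a terminal state.
while batch.status not in ("completed", "failed", "expired", "cancelled"):
    time.sleep(30)
    batch = client.batches.retrieve(batch.id)
    print(batch.request_counts)  # total / completed / failed

# 4. Download results once complete.
if batch.status == "completed" and batch.output_file_id:
    results_jsonl = client.files.content(batch.output_file_id).text
```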
Bedrock batch workflow: Upload JSONL to S3 via Bifrost (providing S3 configuration), create batch specifying model and output S3 URI, poll for completion as Bedrock processes requests asynchronously, and retrieve results from the configured S3 output location.
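A hypothetical sketch of the Bedrock variant; the extra_body keys for the output S3 location are assumptions, not confirmed field names:

```python
# Create a Bedrock batch job; the input file was uploaded with S3 config as shown earlier.
bedrock_batch = client.batches.create(
    input_file_id=bedrock_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={
        "provider": "bedrock",                             # hypothetical
        "output_s3_uri": "s3://my-batch-bucket/results/",  # hypothetical
    },
)
```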
Anthropic inline workflow: Create batch with inline requests array (no file upload), each request containing custom ID and model parameters, poll for completion as Anthropic processes the batch, and retrieve individual request results from the completed batch.
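A hypothetical sketch of the inline pattern; the requests field passed through extra_body is an assumed name and shape:

```python
# Anthropic batches are created from inline requests rather than an uploaded file.
anthropic_batch = client.batches.create(
    input_file_id="",  # no uploaded file – inline requests instead (assumed convention)
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={
        "provider": "anthropic",  # hypothetical field names
        "requests": [
            {
                "custom_id": "req-1",
                "body": {
                    "model": "anthropic/claude-3-sonnet-20240229",
                    "messages": [{"role": "user", "content": "Summarize ticket #1"}],
                },
            },
        ],
    },
)
```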
Gemini batch workflow: Upload JSONL file using Gemini's native storage, create batch job with model specification, poll for completion monitoring request counts, and download results when processing completes.
The unified interface means applications can implement batch processing once and switch providers based on cost, performance, or feature requirements without rewriting batch handling logic.
Governance Features Through Custom Headers
Bifrost's governance system controls access, enforces usage limits, and tracks consumption across teams or projects. These capabilities integrate seamlessly with the OpenAI SDK through custom header support.
Virtual Keys provide per-team or per-project access control with independent rate limits, budget controls, and usage tracking. Pass virtual keys through the x-bf-vk header in OpenAI SDK requests using the default_headers parameter during client initialization.
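For example, a client scoped to one team's virtual key (the key value is illustrative):

```python
from openai import OpenAI

# Every request from this client carries the team's virtual key, so Bifrost
# can enforce its rate limits and budget and attribute usage correctly.
team_client = OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy-key",
    default_headers={"x-bf-vk": "vk-support-team"},
)
```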
This design enables centralized policy enforcement while maintaining application simplicity. Development teams access staging models with generous limits, production teams access performance-optimized models with strict budgets, and customer-facing applications enforce per-user rate limits – all through the same codebase with different virtual key configurations.
Supported Features and Compatibility
The OpenAI integration supports all features available in both the OpenAI SDK and Bifrost core functionality. If the OpenAI SDK supports a capability and Bifrost implements it, the integration works seamlessly.
Core capabilities include:
- Chat completions (streaming and non-streaming)
- Multi-turn conversations with message history
- Function calling and tool use
- Vision capabilities for image analysis
- Files API for batch processing
- Batch API for asynchronous operations
- Embeddings generation
- Response format controls (JSON mode, structured outputs)
The protocol adaptation layer handles request transformation, ensuring provider-specific requirements are met while maintaining OpenAI SDK compatibility. Error responses are normalized to OpenAI's error format, allowing existing error handling logic to work across all providers.
Real-World Use Case: Cost Optimization Through Provider Routing
Consider a customer support application processing thousands of daily inquiries. Simple FAQ questions represent 70% of volume but don't require expensive frontier models. Complex technical issues need advanced reasoning but occur less frequently.
Before Bifrost: The application used GPT-4 for all requests, resulting in high costs despite most queries being simple. Switching to cheaper models for basic queries required code changes, separate SDK clients, and complex routing logic.
With Bifrost: The application uses a single OpenAI SDK client pointed at Bifrost. Simple queries route to gpt-4o-mini (fast and cost-effective), complex queries route to anthropic/claude-3-opus (advanced reasoning), and fallback logic automatically retries failed requests on alternative providers.
Result: 60% cost reduction through intelligent routing, improved reliability through automatic fallbacks, simplified codebase with single SDK client, and centralized observability across all providers.
The application code changed minimally – only base URL configuration and model name selection. Business logic, error handling, and integration patterns remained intact while gaining significant operational benefits.
Open Source Foundation with Enterprise Extensions
Bifrost's OpenAI SDK integration is available in the open source version, enabling immediate adoption without licensing costs. The integration includes all protocol adaptation, multi-provider routing, files API support, batch API support, and direct key bypass capabilities.
Open source features:
- Complete OpenAI API compatibility
- Multi-provider access (OpenAI, Anthropic, Google, Azure, Bedrock, Ollama)
- Files API for OpenAI, Bedrock, and Gemini
- Batch API across all supported providers
- Direct key bypass for development
- Custom header support for governance integration
- Protocol adaptation and error normalization
Enterprise additions:
- Advanced governance with fine-grained access control
- Guardrails for content filtering and policy enforcement
- Enterprise observability with detailed analytics dashboards
- Clustering for high availability deployments
- Adaptive load balancing across providers
- Audit logs for compliance requirements
- Vault integration for secure secret management
- In-VPC deployments for data sovereignty
Getting Started: Three Steps to Multi-Provider Access
Step 1: Deploy Bifrost – Run the open source Gateway version via Docker or Kubernetes, or embed the Go SDK in your application.
Step 2: Configure Providers – Add API keys for OpenAI, Anthropic, Google, or other providers through Bifrost's configuration system or environment variables.
Step 3: Update Base URL – Change your OpenAI SDK client's base URL from https://api.openai.com/v1 to http://your-bifrost-instance:8080/openai.
Your application now has multi-provider access, governance capabilities, semantic caching, and enhanced observability – all through the OpenAI SDK interface you already know.
Conclusion: Future-Proof Your LLM Applications
The LLM landscape evolves rapidly. New providers emerge, pricing models change, and capabilities shift. Applications tightly coupled to single providers face technical debt and migration challenges as requirements evolve.
Bifrost's OpenAI SDK integration provides insurance against vendor lock-in while maintaining development velocity. You write code once using familiar OpenAI SDK patterns while gaining flexibility to switch providers, implement sophisticated routing strategies, and optimize costs based on actual usage patterns.
The combination of protocol compatibility, multi-provider routing, unified batch processing, and governance integration makes Bifrost an essential tool for production LLM applications. Whether you're building internal tools, customer-facing applications, or AI-powered platforms, Bifrost enables you to focus on business logic while it handles the complexities of multi-provider orchestration.
Try Bifrost Enterprise free for 14 days or explore the open source version at github.com/maximhq/bifrost to start building provider-agnostic LLM applications today.