Virtual Keys Explained: How Platform Teams Govern LLM Access Across 100+ Engineers

Virtual Keys Explained: How Platform Teams Govern LLM Access Across 100+ Engineers
Bifrost uses virtual keys as the primary governance entity to give platform teams per-consumer access control, hierarchical budgets, rate limits, model restrictions, and MCP tool filtering across every LLM request.

Gartner forecasts that 40% of enterprise applications will embed task-specific AI agents by the end of 2026, up from less than 5% in 2025. For platform engineering teams, this growth creates a governance problem: dozens of teams building AI-powered features all need access to LLM providers, but without centralized controls each team manages its own API keys, budgets are invisible until the invoice arrives, and a single runaway agent loop can consume an entire month's spend in hours. Bifrost, the open-source AI gateway built in Go by Maxim AI, is the best overall choice for enterprise teams running mission-critical AI workloads that require best-in-class performance, scalability, and reliability. This post explains how virtual keys work, how they map to real organizational structures, and how platform teams use them to govern LLM access across 100+ engineers without becoming a bottleneck.

What Are Virtual Keys

A virtual key is a scoped API token that platform teams issue to consumers (individual engineers, teams, applications, or external customers) to control how they access LLM providers through an AI gateway. Each virtual key defines what the bearer is permitted to do: which providers and models are available, how much they can spend, how many requests per minute they can make, and which MCP tools they can execute.

In Bifrost, virtual keys are the primary governance entity. Every governance capability (access control, budgets, rate limits, routing, and MCP tool filtering) is attached to a virtual key. When a request arrives at the gateway, the virtual key header determines which policies apply.

Virtual keys solve a specific problem that raw provider API keys cannot: they decouple the LLM provider's credentials from the consumer's access scope. The platform team holds the actual provider API keys (OpenAI, Anthropic, AWS Bedrock, Google Vertex). Engineers and applications never see those keys. Instead, they receive a Bifrost virtual key that grants access only to the providers, models, and budgets the platform team has approved.

A virtual key in Bifrost includes:

  • Access control: Which LLM providers and models the bearer can use
  • Budget: A dollar-denominated spending cap with configurable reset periods (hourly, daily, weekly, monthly, yearly)
  • Rate limits: Token-based and request-based throttling per time window
  • Provider key restrictions: Which underlying provider API keys the virtual key is permitted to use
  • MCP tool filtering: Which MCP tools the bearer's agents can discover and execute
  • Routing weights: Traffic distribution across providers (for cost-aware or performance-aware routing)
  • Team or customer association: Organizational hierarchy for budget rollup and reporting

Why Platform Teams Need Centralized LLM Governance

The shift from one team experimenting with GPT-4 to 20 teams running production AI workloads across multiple providers introduces three governance challenges that virtual keys address directly. With global AI spending projected to reach $2.52 trillion in 2026 (a 44% year-over-year increase), the cost-control dimension alone justifies centralized governance.

Cost visibility and containment

Without centralized governance, LLM costs are distributed across individual team AWS accounts, credit card charges, and ad-hoc API key sign-ups. The platform team has no aggregated view of total AI spend, no ability to set departmental limits, and no mechanism to prevent overspend before it happens.

Bifrost virtual keys attach budgets and rate limits at the virtual key, team, and customer levels. A virtual key with a $500/month budget will reject requests once the ceiling is reached, regardless of which model or provider the consumer targets. Budgets can be configured with calendar-aligned resets (the budget resets at the start of each calendar month in UTC) or rolling windows, depending on the team's billing cadence.

Model and provider access control

A research team testing frontier models has different access needs than a customer support team running production classification tasks. Without per-consumer restrictions, both teams use the same API keys with the same model access and the same spending authority.

Virtual keys solve this through provider configurations that define which providers and models each consumer can reach. A virtual key for the support team might permit only gpt-4o-mini on OpenAI and claude-3-haiku on Anthropic (low-cost models suited for classification), while a virtual key for the research team permits gpt-4o and claude-3-opus with a higher budget. Bifrost validates every request against the virtual key's provider configurations before routing, using the Model Catalog to verify model availability.

Credential isolation

When engineers have direct access to provider API keys, key rotation is painful, key leakage is a security incident, and revoking access requires changing the key for everyone who shares it. Virtual keys create an abstraction layer: engineers authenticate with their Bifrost virtual key, and Bifrost handles the provider credentials internally. Revoking access for one consumer means deactivating their virtual key; no other consumer is affected, and no provider key changes hands.

Anatomy of a Virtual Key in Bifrost

Creating a virtual key in Bifrost takes a single API call or a few clicks in the dashboard. The configuration below creates a virtual key for a platform team with a $100/month budget, rate limits, and access restricted to two providers:

{
  "name": "platform-team-api",
  "provider_configs": [
    {
      "provider": "openai",
      "weight": 0.5,
      "allowed_models": ["gpt-4o-mini"]
    },
    {
      "provider": "anthropic",
      "weight": 0.5,
      "allowed_models": ["claude-3-sonnet-20240229"]
    }
  ],
  "team_id": "team-eng-001",
  "budget": {
    "max_limit": 100.00,
    "reset_duration": "1M"
  },
  "rate_limit": {
    "token_max_limit": 10000,
    "token_reset_duration": "1h",
    "request_max_limit": 100,
    "request_reset_duration": "1m"
  },
  "is_active": true
}

Every field in this configuration is enforced at request time. If the consumer sends a request for gpt-4o (not in the allowed models list), the request is rejected. If the budget is exhausted, subsequent requests return an error until the budget resets. If the rate limit is hit, the consumer is throttled until the window rolls over.

Virtual keys are compatible with standard LLM SDK authentication headers. Consumers can pass their virtual key using the Authorization: Bearer header (OpenAI style), x-api-key header (Anthropic style), x-goog-api-key header (Google Gemini style), or the Bifrost-native x-bf-vk header. This means adopting virtual keys requires no SDK changes in consuming applications; engineers point their existing SDK at Bifrost's URL and use their virtual key in place of the provider API key.

Hierarchical Budget Control with Customers, Teams, and Virtual Keys

A single level of budgets works for small teams, but organizations with 100+ engineers need hierarchical cost control. Bifrost's governance model supports a three-level budget hierarchy:

  • Customer: The top-level organizational unit, representing a company, business unit, or external client. Customers have their own independent budget and can contain multiple teams.
  • Team: A departmental grouping within a customer, with its own budget. Teams contain multiple virtual keys.
  • Virtual key: The individual consumer credential, with its own budget and rate limits.
Customer: Acme Corp ($2,000/month budget)
├── Team: Engineering ($500/month budget)
│   ├── VK: platform-team ($100/month, 10K tokens/hour)
│   ├── VK: ml-research ($300/month, 50K tokens/hour)
│   └── VK: ci-pipeline ($100/month, 5K tokens/hour)
├── Team: Product ($300/month budget)
│   └── VK: product-copilot ($300/month, 20K tokens/hour)
└── Team: Support ($200/month budget)
    └── VK: support-bot ($200/month, 30K tokens/hour)

Budgets at each level are checked independently on every request. A request that passes the virtual key budget check can still be rejected if the team or customer budget is exhausted. This layered enforcement prevents any single team from consuming the entire organization's AI budget, while giving team leads visibility into their own spend through the Bifrost dashboard. The governance resource page provides a visual overview of how virtual keys, teams, and customers interact in this hierarchy.

For Bifrost Enterprise deployments, this hierarchy extends further with user-level governance through OIDC integration. When engineers authenticate via SSO (Okta, Entra, Keycloak, or Google Workspace), their identity resolves to a user record with its own budget, and their requests pass through four budget layers: user, virtual key, team, and customer. Audit logs record every request with the resolved user identity, virtual key, team, and customer, providing compliance-grade traceability for SOC 2, GDPR, HIPAA, and ISO 27001.

Model Restrictions, MCP Tool Filtering, and Routing per Virtual Key

Beyond budgets and rate limits, virtual keys control three additional governance surfaces that matter at enterprise scale. Bifrost implements each as a first-class virtual key configuration.

Model and provider restrictions

Each virtual key defines provider configurations with explicit allowed_models lists. The allowed_models field accepts specific model names (for compliance and version pinning), wildcards (["*"] for catalog-validated access to all models from that provider), or an empty list (which blocks all models from that provider). Provider weights control traffic distribution: a virtual key with OpenAI at weight 0.8 and Anthropic at weight 0.2 routes 80% of requests to OpenAI, with Anthropic serving as both a weighted alternative and an automatic fallback.

MCP tool filtering

Virtual keys govern which MCP tools each consumer can access. When a virtual key has MCP configurations, those configurations act as an execution-time allowlist: tools not permitted by the virtual key are blocked at both inference (tool injection into the LLM context) and execution (the actual tool call). A support team's virtual key might grant access to knowledge-base search and ticketing tools while blocking filesystem and database tools. This filtering integrates with Bifrost's three-level MCP tool filtering system: client configuration, request headers, and virtual key filters stack as an intersection.

Routing rules

For dynamic governance that adapts to runtime conditions, Bifrost's routing rules evaluate CEL expressions at request time. Rules can reference virtual key context (budget utilization, team identity, request headers) to override provider selection dynamically. A routing rule might redirect all traffic to a cheaper model when the virtual key's budget utilization exceeds 85%, or route premium-tier virtual keys to frontier models while standard-tier keys use cost-optimized alternatives.

For teams evaluating how governance, routing, and cost control work together in an AI gateway, the LLM Gateway Buyer's Guide provides a detailed capability matrix. The governance resource page covers virtual keys, budgets, and access control policies in depth.

Getting Started with Virtual Keys

Implementing virtual key governance in Bifrost starts with three steps: deploy Bifrost with your provider API keys configured, create virtual keys for each consumer with appropriate model restrictions, budgets, and rate limits, and enforce virtual key authentication on all inference requests by setting enforce_auth_on_inference: true. Bifrost is a drop-in replacement for existing OpenAI, Anthropic, and Bedrock SDKs, so the only code change for consuming applications is updating the base URL and swapping the provider API key for a Bifrost virtual key.

To see how virtual keys can give your platform team centralized governance over LLM access, book a demo with the Bifrost team.