Stopping Shadow AI: How Bifrost Centralizes Every LLM Call into One Auditable Control Plane

Stopping Shadow AI: How Bifrost Centralizes Every LLM Call into One Auditable Control Plane
Shadow AI exposes organizations to data loss, compliance failures, and ungoverned LLM spend. Bifrost closes the gap by routing every LLM call through one enforced, auditable control plane.

The 2026 Verizon Data Breach Investigations Report found that 45% of corporate employees are now regular AI users on company devices, up from 15% the prior year. That tripling happened in twelve months. Most of it was not sanctioned, governed, or observable. Bifrost, the enterprise-grade open-source AI gateway built in Go by Maxim AI, is designed specifically to close this gap: every LLM call from every application, internal tool, agent, and IDE assistant routes through a single control plane that enforces access, budgets, and audit trails before any request reaches a provider.

What Shadow AI Is and Why Policies Alone Cannot Stop It

Shadow AI is the use of AI tools, models, and services by employees and development teams without IT or security oversight. It ranges from individual employees pasting internal documents into consumer chatbots, to engineering teams calling OpenAI or Anthropic directly from production code with keys stored in environment variables, to agents executing tool calls with no governance layer between them and the underlying APIs.

The governance gap is structural, not behavioral. According to Delinea's 2025 AI in Identity Security Report, 44% of organizations struggle with business units deploying AI solutions without involving IT or security teams. Banning AI tools does not work: the same DBIR data shows that 60% of malicious-insider breaches were motivated by convenience, with employees routing around restrictions to complete their work. The shadow AI version is a developer calling an unapproved model endpoint because approved tooling has not been provisioned.

Policy-layer controls, acceptable-use guidelines and access reviews, do not reach the request level. They cannot verify which model received a specific prompt, whether credentials were logged by a third-party provider, or whether a prompt injection attempt occurred. Closing shadow AI exposure requires enforcement at the infrastructure layer, where every LLM call passes through a governed ingress point before reaching any provider. The Bifrost governance overview explains how virtual keys, budgets, and routing rules implement that enforcement layer.

Why the Gateway Is the Right Enforcement Point

An enterprise AI gateway sits between applications and LLM providers. Every request enters the gateway, which authenticates the caller, checks budgets and rate limits, applies content policies, routes to the appropriate provider, and writes a complete audit record. No request reaches a provider without passing every configured check.

This architecture has one critical precondition: the gateway must be the only path to LLM providers. Provider API keys are held exclusively by the gateway. Applications, agents, and IDE assistants receive virtual keys, gateway-issued credentials that carry their own scoped permissions, and they never see raw provider credentials. When an application is decommissioned, its virtual key is revoked in one operation. When a team needs access to a new model, an administrator updates the virtual key allowlist. No credential rotation propagates across services.

This model also contains the developer-tooling surface. Engineers running Claude Code, Cursor, or other coding agents route through the same gateway as production workloads. The same budget caps, model allowlists, and audit logs that govern production services apply to every IDE session. There is no separate governance surface for developer tooling.

How Bifrost Eliminates the Shadow AI Surface

Virtual Keys: Replacing Raw Provider Credentials

The foundation of shadow AI prevention is eliminating the ability for applications to call LLM providers directly. Bifrost's virtual key system replaces raw provider API keys with gateway-issued credentials. Each virtual key defines:

  • Which providers and models the credential can access
  • A budget cap with a configurable reset window (daily, weekly, monthly)
  • Request and token rate limits
  • An organizational attachment (team or customer) for hierarchical cost attribution

Applications authenticate using standard API key headers. Bifrost resolves the virtual key to the correct provider, model, and underlying credential. If a key is compromised or a service is retired, access is revoked at the gateway in a single operation with no downstream credential churn.

The hierarchical budget system adds a second containment layer: a customer-level monthly cap, team-level department budget, and virtual key-level per-service cap all apply simultaneously. Any single exhausted budget blocks the request at that tier, preventing runaway spend at every organizational level.

Immutable Audit Logs: Answering the Questions Regulators Ask

Shadow AI creates audit gaps. When applications call providers directly, there is no organization-level record of which models were called, what data was sent, whether a prompt injection attempt occurred, or which credential authorized the request. A GDPR inquiry, HIPAA audit, or SOC 2 review cannot be satisfied by fragmented per-application logs.

Bifrost's audit logs capture every security-relevant event with cryptographic signing that prevents post-hoc modification. Every log entry records the authenticated principal, virtual key, provider, model, budget state at request time, rate limit state, and the policy decision made. The event categories covered include:

  • Authentication events: logins, session creation, MFA verification, failed attempts, account lockouts
  • Authorization events: model access attempts, provider access checks, virtual key usage, budget limit checks, rate limit violations, permission denials
  • Configuration changes: virtual key creation and modification, team and customer updates, budget adjustments, guardrail configuration changes
  • Security events: prompt injection attempts, jailbreak attempts, unusual access patterns, API key abuse, guardrail violations

Logs export to Splunk, Datadog, Elastic, and webhook endpoints for integration with existing SIEM infrastructure. Retention duration is configurable, and the immutability guarantee, enforced through HMAC signing, satisfies the tamper-evident evidence requirements of SOC 2, GDPR, HIPAA, and ISO 27001.

RBAC and Identity Provider Integration

Shadow AI grows in environments where provisioning sanctioned AI access is slow or bureaucratic. If the approved path requires a weeks-long access request, developers route around it. Bifrost's role-based access control reduces that friction by integrating with existing identity providers.

RBAC in Bifrost Enterprise ships with Admin, Developer, and Viewer system roles, plus unlimited custom roles for QA teams, security auditors, compliance officers, and any other organizational function. Permissions are defined at the resource level: who can create virtual keys, modify budgets, change routing rules, or query audit logs.

OpenID Connect integration with Okta, Zitadel, Keycloak, and Microsoft Entra means role assignments derive directly from the organization's IdP groups. When a developer joins a team in Okta, they receive gateway access automatically, with the permissions appropriate to their role, without a separate provisioning step. When they leave, deprovisioning revokes access in both systems.

This matters for shadow AI specifically because the friction of getting sanctioned access is a primary driver of ungoverned workarounds. An identity-integrated gateway with fast, role-appropriate provisioning removes the incentive to go around the system.

Content Guardrails: Enforcing Policy at the Request Level

The IBM 2025 Cost of a Data Breach Report found that unauthorized AI usage adds $670,000 to average breach costs. A significant portion of that exposure comes from sensitive data sent to external model providers: source code, internal documents, customer records, and credentials that employees paste into prompts without awareness of the risk.

Bifrost's guardrails enforce content policies at the request level, scanning both prompt inputs and model outputs before they transit the gateway. Organizations configure validation rules using Common Expression Language (CEL) expressions, and attach provider profiles from AWS Bedrock Guardrails, Azure Content Safety, Google Model Armor, GraySwan Cygnal, and Patronus AI. Native in-process regex detection and secrets detection run without any external service call, making them suitable for air-gapped deployments.

When a guardrail fires, the gateway returns a block response with violation detail and writes a security event to the audit log. The application receives a structured error; the security team sees a timestamped, signed record of what was blocked and why.

Observability: Visibility Across the Full AI Stack

Shadow AI is partly a visibility problem. Platform teams cannot govern what they cannot see. Once traffic routes through Bifrost, every request becomes observable: provider, model, token counts, latency, virtual key identity, cost, and response status are captured per request without any instrumentation in application code.

Bifrost's built-in observability exposes native Prometheus metrics and OpenTelemetry traces. These feed directly into Grafana, Datadog, New Relic, or any OTLP-compatible collector. The Datadog connector surfaces per-key spend, rate limit utilization, and model usage in LLM Observability dashboards. For teams requiring export to data lakes, log exports push telemetry to S3, GCS, BigQuery, and other targets on a configurable schedule.

This visibility also powers cost attribution. When every call carries a virtual key identity, cost reports are accurate to the team, customer, or service level, without manual tagging or post-hoc reconciliation.

Deployment: Making the Gateway the Only Path

Governance only works if the gateway is the single ingress to LLM providers. The implementation sequence is:

  1. Register provider credentials in the gateway. Provider API keys are stored in Bifrost, optionally backed by HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, or Azure Key Vault through data access control. Applications never hold provider credentials directly.
  2. Issue virtual keys per consumer. Each application, team, or agent receives a scoped virtual key. Existing SDK integrations require only a base URL change to route through Bifrost, making migration a drop-in operation for teams using the OpenAI, Anthropic, Bedrock, or Google GenAI SDKs.
  3. Enable audit logging and observability. Export logs to the existing SIEM. Point Prometheus and OpenTelemetry collectors at the gateway.
  4. Configure guardrails for required content policies. PII redaction, secrets detection, and prompt injection defense can be applied to specific virtual keys, routes, or models.
  5. Revoke direct provider access. Once traffic has migrated to the gateway, rotate and retire any provider keys held by application services.

For regulated environments and teams with data residency requirements, Bifrost Enterprise supports in-VPC deployment where the gateway, audit logs, and all request data remain within the organization's private network perimeter. The EU AI Act, now in active enforcement, requires organizations to demonstrate documented AI usage controls. Ungoverned request flows make that demonstration impossible; a centralized gateway with immutable logs makes it straightforward.

What Changes After Centralization

Once every LLM call routes through Bifrost, the governance picture changes across four dimensions:

  • Audit completeness: Every request carries an immutable record. Regulatory inquiries have a complete answer: which model, which caller, what policy applied, what the outcome was.
  • Cost attribution: Token spend maps to organizational entities automatically. No manual tagging, no reconciliation against provider invoices.
  • Policy enforcement: Changes to model access, budget caps, or content policies apply to all consumers at once. There is no per-service rollout.
  • Incident response: A compromised key or anomalous usage pattern appears in the audit log immediately. Revocation is a single API call.

For a detailed view of how the governance model maps to specific compliance frameworks, the Bifrost governance resource page covers virtual keys, RBAC, audit logs, and SSO integration alongside the compliance frameworks each control addresses.

Getting Started

Bifrost is available as an open-source Docker image and binary that can be deployed in minutes. The full governance feature set, including virtual keys, hierarchical budgets, and route-level controls, is available in the open-source tier. Enterprise features, including RBAC, SSO integration, immutable audit logs, and in-VPC deployment, are available through Bifrost Enterprise.

To see how Bifrost can close the shadow AI exposure in your organization and walk through a governance configuration for your specific architecture, book a demo with the Bifrost team.