Test and Version Prompts in an Interactive Playground with Bifrost
Bifrost's interactive prompt playground lets teams build, test, and version prompts inside the AI gateway, then ship committed versions to production with a header.
Prompts are the control layer for every LLM application. They shape tone, guardrails, tool selection, and reasoning behavior, yet most teams still manage them as hardcoded strings scattered across application code. An interactive prompt playground fixes this by giving engineers, product managers, and QA teams a shared workspace to write, test, and version prompts before shipping them. Bifrost brings this workflow directly into the AI gateway, so the same prompt you iterate on in the UI is the one your application calls in production, no separate tool, no parallel SDK, no extra network hop.
This post walks through how Bifrost's prompt repository and playground work, how sessions and versions keep iteration safe, and how committed versions attach to live inference requests through simple HTTP headers.
What is an Interactive Prompt Playground?
An interactive prompt playground is a workspace where developers compose prompt messages, run them against live LLM providers, inspect outputs, tweak parameters, and save versions without redeploying application code. It functions as a REPL for natural-language instructions: write a prompt, run it, inspect the completion, adjust, repeat. A production-grade playground adds version control, multi-provider testing, and a path from draft to deployed prompt.
Bifrost's playground sits inside the gateway itself. That placement matters because every prompt you test runs through the same routing, governance, observability, and key management that handle your production traffic. You are not testing in a sandbox that behaves differently from production; you are testing in production infrastructure with a UI on top.
How the Bifrost Prompt Repository is Structured
The Bifrost prompt repository is organized around four concepts that mirror how engineering teams actually work:
- Folders: Logical groupings of prompts, typically by product area, feature, or use case. Each folder has a name and optional description, and prompts can live inside folders or at the root level.
- Prompts: The main unit in the repository. A prompt is a container that holds the full lifecycle of a single prompt template, from early drafts to production-ready releases.
- Sessions: Editable working copies where you experiment. You can modify messages, switch providers, adjust parameters, and run the prompt repeatedly without touching committed versions.
- Versions: Immutable snapshots of a prompt. Once committed, a version cannot be edited. Each version stores the full message history, provider and model configuration, model parameters, and a commit message.
Versions are numbered sequentially (v1, v2, v3, and so on), and you can restore any previous version from the dropdown next to the Commit Version button. This structure is the baseline every prompt versioning workflow should meet: immutable history, clear commit trails, and one-click rollback.
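To make the version model concrete, here is a minimal sketch of what a committed snapshot could hold, based on the fields listed above. The class and field names are hypothetical illustrations, not Bifrost's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical shape of a committed version snapshot -- illustrative only,
# not Bifrost's real data model. frozen=True mirrors the immutability rule:
# once committed, a version cannot be edited.
@dataclass(frozen=True)
class PromptVersion:
    number: int            # sequential: 1, 2, 3, ...
    messages: tuple        # full message history, frozen as a tuple
    provider: str          # e.g. "openai"
    model: str             # e.g. "gpt-4o" (example model name)
    params: dict = field(default_factory=dict)  # temperature, max tokens, ...
    commit_message: str = ""

v3 = PromptVersion(
    number=3,
    messages=({"role": "system", "content": "You are a support agent."},),
    provider="openai",
    model="gpt-4o",
    params={"temperature": 0.2},
    commit_message="Tighten tone guidance",
)
```

Restoring v2 after committing v7 would simply create a new snapshot from v2's frozen fields; the history itself never mutates.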
Workspace Layout and Getting Started
The playground uses a three-panel layout that keeps authoring, testing, and configuration visible at the same time:
- Sidebar (left): Browse prompts, manage folders, and organize items with drag-and-drop.
- Playground (center): Build and test your prompt messages.
- Settings (right): Configure provider, model, API key, variables, and model parameters.
A typical first run looks like this:
- Create a folder (optional) to group related prompts by team or feature.
- Create a new prompt and assign it to a folder.
- Add messages to the playground: system messages for instructions, user messages for input, assistant messages for few-shot examples.
- Configure the provider, model, and parameters in the settings panel.
- Click Run (or press Cmd/Ctrl + S) to execute the prompt. Use the + Add button to append a message to history without executing.
- Save the session to preserve your work, then commit a version when you are satisfied.
If a session has unsaved changes, a red asterisk appears next to the prompt name. Saved sessions can be renamed and restored from the dropdown next to the Save button, which keeps different experimental branches accessible without polluting your version history.
Multi-Provider Testing Inside the Gateway
One of the hardest parts of prompt engineering is comparing behavior across models. The same system prompt that works well on one provider can produce noticeably different outputs on another. The Bifrost playground lets you switch between providers and models directly in the settings panel, routing each run through Bifrost's unified OpenAI-compatible interface.
Because the playground runs on top of Bifrost's 20+ supported providers, a single prompt can be tested against OpenAI, Anthropic, AWS Bedrock, Google Vertex AI, Azure OpenAI, Groq, Mistral, Cohere, and more without switching tools or re-keying credentials. You can also select which API key to use:
- Auto: Uses the first available key for the selected provider.
- Specific key: Select a particular key for this run.
- Virtual key: Uses a governance-managed key with its own budgets, rate limits, and access controls.
Routing playground traffic through virtual keys means every experiment is still covered by budgets, quotas, and audit logs. Prompt experimentation stops being a governance blind spot and starts behaving like any other controlled engineering activity.
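Because every provider sits behind the same OpenAI-compatible interface, switching providers in a test amounts to changing one field in an otherwise identical payload. A minimal sketch, with illustrative model strings:

```python
# Sketch: the same prompt tested against two providers through one
# OpenAI-compatible endpoint. The "provider/model" strings are
# illustrative examples, not an exhaustive or authoritative list.

def build_request(model: str, user_input: str) -> dict:
    """Build an OpenAI-compatible Chat Completions payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise support agent."},
            {"role": "user", "content": user_input},
        ],
    }

question = "How do I rotate an API key?"
openai_req = build_request("openai/gpt-4o", question)
claude_req = build_request("anthropic/claude-sonnet-4", question)
# Only the model field differs; messages and structure are identical,
# which is what makes side-by-side provider comparison cheap.
```
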
Message Types and Multimodal Support
The playground supports the message roles and artifact types that agent workflows actually require:
- System messages for behavior and instructions.
- User messages for input.
- Assistant messages for model responses or few-shot examples.
- Tool calls for function calls made by the model.
- Tool results for mock or real responses from called tools.
This is what makes the playground useful beyond single-turn chat. Teams building agents can simulate a full tool-use loop, inspect how the model decides which tool to invoke, and catch cases where the reasoning chain breaks down. For models that support multimodal input, user messages can also carry attachments like images and PDFs, enabled automatically when the selected model supports them.
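A simulated tool-use turn built from those message roles might look like the following. The field names follow the OpenAI-compatible Chat Completions format; the tool name and payloads are made up for illustration.

```python
import json

# Illustrative message history for one tool-use loop: instructions, a user
# request, the model's tool call, and a mocked tool result. The function
# name and arguments are hypothetical.
history = [
    {"role": "system", "content": "You can look up order status via tools."},
    {"role": "user", "content": "Where is order 1042?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {
                "name": "get_order_status",
                "arguments": json.dumps({"order_id": 1042}),
            },
        }],
    },
    {   # mocked tool result, so the next run exercises the final answer
        "role": "tool",
        "tool_call_id": "call_1",
        "content": json.dumps({"status": "shipped", "eta": "2 days"}),
    },
]
```

Running the prompt with this history lets you inspect how the model folds the tool result into its final response, without standing up the real tool.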
Version Control for Production-Ready Prompts
Prompts in production need the same rigor as application code. Established prompt versioning practice rests on three foundations: immutability, commit messages, and traceable rollback. Bifrost's version model matches this directly.
When you commit a version, the following are frozen into an immutable snapshot:
- The selected message history (system, user, assistant, tool calls, tool results).
- The provider and model configuration.
- Model parameters including temperature, max tokens, streaming flag, and any other settings.
- A commit message describing the change.
When the current session diverges from the last saved version, an Unpublished Changes badge appears. This removes any ambiguity about what is shipping. If someone opens a prompt tomorrow and sees v7, they know v7 is exactly what it was on the day it was committed, regardless of how much iteration happened on sessions afterward.
Using Committed Prompt Versions in Production
A playground only earns its keep when the prompts it produces run unchanged in production. Bifrost closes this loop through the Prompts plugin, which attaches committed versions to live inference requests without any client-side prompt management code.
Two HTTP headers control the behavior:
- bf-prompt-id: UUID of the prompt in the repository. Required to enable injection.
- bf-prompt-version: Integer version number (for example, 3 for v3). Optional. If omitted, the latest committed version is used.
The plugin resolves the requested prompt and version, merges the stored model parameters into the request (request values win on conflicts), and prepends the version's message history to the incoming messages (Chat Completions) or input (Responses API). Your application still sends the dynamic user turn; the template comes from the repository.
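The merge semantics described above can be sketched in a few lines. This is an illustration of the documented behavior (request parameters win on conflict, the version's messages are prepended), not Bifrost's source code, and the field names are simplified.

```python
# Hypothetical sketch of prompt injection: stored parameters merge with
# request values winning, and the committed message history is prepended
# to the incoming messages. Simplified field names, illustrative only.

def inject_prompt(version: dict, request: dict) -> dict:
    merged_params = {**version["params"], **request.get("params", {})}  # request wins
    messages = version["messages"] + request["messages"]  # template first
    return {**request, "params": merged_params, "messages": messages}

version = {
    "params": {"temperature": 0.2, "max_tokens": 512},
    "messages": [{"role": "system", "content": "Answer in two sentences."}],
}
request = {
    "params": {"temperature": 0.7},  # overrides the stored 0.2
    "messages": [{"role": "user", "content": "Tell me about Bifrost Gateway?"}],
}

final = inject_prompt(version, request)
```

The application only ever sends the dynamic user turn; the system template arrives from the repository at request time.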
A Chat Completions request looks like this:
```shell
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "bf-prompt-id: YOUR-PROMPT-UUID" \
  -H "bf-prompt-version: 3" \
  -H "x-bf-vk: sk-bf-your-virtual-key" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [
      { "role": "user", "content": "Tell me about Bifrost Gateway?" }
    ]
  }'
```
Because the plugin keeps an in-memory cache that reloads when prompts are created, updated, or deleted through the gateway APIs, new commits become visible to production without a process restart. This decouples prompt releases from application deploys, which is the outcome every mature prompt management setup is aiming for.
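The reload behavior follows a familiar read-through cache pattern, sketched below. This is a hypothetical illustration of the described semantics, not Bifrost's implementation.

```python
# Sketch of a read-through cache that is evicted when a prompt is created,
# updated, or deleted via the gateway APIs, so the next lookup sees the new
# commit without a process restart. Illustrative only.

class PromptCache:
    def __init__(self, store):
        self._store = store   # source of truth (the gateway's config store)
        self._cache = {}

    def get(self, prompt_id: str) -> dict:
        if prompt_id not in self._cache:
            self._cache[prompt_id] = self._store[prompt_id]  # lazy load
        return self._cache[prompt_id]

    def on_prompt_changed(self, prompt_id: str) -> None:
        # hook fired by create/update/delete through the gateway APIs
        self._cache.pop(prompt_id, None)

store = {"p1": {"latest_version": 3}}
cache = PromptCache(store)
assert cache.get("p1")["latest_version"] == 3

store["p1"] = {"latest_version": 4}   # a new commit lands in the store
cache.on_prompt_changed("p1")         # the API hook evicts the stale entry
assert cache.get("p1")["latest_version"] == 4  # visible without a restart
```
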
Why a Gateway-Native Playground Matters
Most LLM teams end up stitching together three or four tools: one for prompt authoring, one for evaluation, one for routing, and one for observability. Every boundary between those tools is a place where a prompt tested in staging ends up differing from the prompt that actually runs in production. A gateway-native playground removes those boundaries:
- Identical execution path: Playground runs and production runs travel through the same routing, fallbacks, caching, and guardrails. There is no "but it worked in the playground" class of bugs.
- Shared governance: Virtual keys, budgets, rate limits, and audit logs apply to experimentation the same way they apply to production traffic.
- One source of truth: Committed versions live in the same config store that powers inference. A production request references the exact artifact you committed.
- No extra SDK: Clients use standard OpenAI-compatible APIs with two optional headers. There is no prompt-fetching library to pin, update, or monitor.
Teams that need deeper evaluation, scenario simulation, and production quality monitoring can pair the Bifrost playground with Maxim AI's evaluation stack, but the core loop of writing, testing, versioning, and serving prompts lives inside Bifrost itself.
Start Using the Bifrost Prompt Playground
The interactive prompt playground turns prompt engineering into a disciplined, collaborative workflow: folders for organization, sessions for safe iteration, versions for immutable releases, and HTTP headers for production attachment. Because it ships as part of the Bifrost AI gateway, you get this alongside multi-provider routing, governance, caching, and observability, with no second platform to operate.
To see how Bifrost can centralize prompt management alongside your AI gateway, explore the Bifrost resources hub or book a demo with the Bifrost team.