openai-realtime-agents

TypeScript ★ 6.9k updated 5mo ago

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

This repository is a demonstration app from OpenAI showing two patterns for building voice-based AI agents with its Realtime API. The Realtime API lets a program hold a low-latency, streaming spoken conversation with an AI model, rather than sending one text message at a time and waiting for each response.

The first pattern, called Chat-Supervisor, uses two models working together. A faster, cheaper model handles the conversational side, greeting the user and collecting information through voice. When a question requires a tool call or more careful reasoning, the fast model quietly consults a smarter text model and then delivers that answer back to the user. The result is a voice agent that responds quickly in conversation while still getting high-quality answers on harder questions.
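The division of labor in Chat-Supervisor can be sketched in a few lines. This is a minimal illustration, not the repository's code: every name here (Turn, needsSupervisor, supervisorAnswer, chatAgentReply) is hypothetical, both "models" are plain synchronous stub functions rather than real API calls, and the keyword check stands in for a decision that the fast model would make itself.

```typescript
// Hypothetical sketch of the Chat-Supervisor pattern with stubbed models.
type Turn = { role: "user" | "assistant"; text: string };

// Stub for the smarter text model. A real version would be a model call
// that can also invoke tools; here it only echoes the last user turn.
function supervisorAnswer(history: Turn[]): string {
  const last = history[history.length - 1];
  return `Supervisor answer to: "${last.text}"`;
}

// Stub for the fast model's decision: handle small talk directly,
// defer everything else to the supervisor.
function needsSupervisor(userText: string): boolean {
  const smallTalk = /^(hi|hello|hey|thanks|thank you|bye)\b/i;
  return !smallTalk.test(userText.trim());
}

// One conversational turn: record the user input, decide whether to
// escalate, and record whichever reply is spoken back to the user.
function chatAgentReply(
  history: Turn[],
  userText: string
): { reply: string; escalated: boolean } {
  history.push({ role: "user", text: userText });
  const escalated = needsSupervisor(userText);
  const reply = escalated
    ? supervisorAnswer(history)
    : "Sure thing. What can I help you with?";
  history.push({ role: "assistant", text: reply });
  return { reply, escalated };
}
```

Calling chatAgentReply(history, "Hello there") stays with the fast path, while a question such as "What is my current balance?" is routed through the supervisor stub. The user only ever hears the chat agent, which is what keeps the conversation feeling responsive.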

The second pattern is called Sequential Handoff. Here, multiple specialized agents each have their own instructions and tools. When the user's request falls outside what the current agent handles, the agent passes the conversation to a more appropriate one. This mirrors how a customer service phone system routes callers to different departments, but the routing is decided by the model rather than by a rigid menu.
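A handoff graph like this can be pictured as a table of agents plus a routing step. The sketch below is illustrative only: the agent names (greeter, billing, returns), their instructions, and the functions pickTarget and route are all assumptions, and the keyword matching is a stand-in for the routing decision that the model makes in the demo.

```typescript
// Hypothetical sketch of Sequential Handoff with keyword-based routing.
type AgentName = "greeter" | "billing" | "returns";

interface Agent {
  name: AgentName;
  instructions: string;
  handoffs: AgentName[]; // agents this one is allowed to transfer to
}

const agents: Record<AgentName, Agent> = {
  greeter: {
    name: "greeter",
    instructions: "Welcome the caller and find out what they need.",
    handoffs: ["billing", "returns"],
  },
  billing: {
    name: "billing",
    instructions: "Answer questions about invoices and charges.",
    handoffs: ["greeter"],
  },
  returns: {
    name: "returns",
    instructions: "Handle returns and refunds.",
    handoffs: ["greeter"],
  },
};

// Stand-in for the model's judgment about which agent fits the request.
function pickTarget(userText: string): AgentName | null {
  const text = userText.toLowerCase();
  if (/\b(invoice|charge|bill)\b/.test(text)) return "billing";
  if (/\b(return|refund)\b/.test(text)) return "returns";
  return null;
}

// Return the agent that should handle the next turn. A transfer happens
// only when the current agent lists the target among its handoffs.
function route(current: AgentName, userText: string): AgentName {
  const target = pickTarget(userText);
  if (
    target !== null &&
    target !== current &&
    agents[current].handoffs.indexOf(target) !== -1
  ) {
    return target;
  }
  return current;
}
```

With this table, route("greeter", "I want a refund") yields "returns", while route("billing", "I want to return this") stays on "billing" because that agent may only hand back to the greeter. Restricting each agent's handoffs list is one way to keep the routing predictable even when the decision itself is left to a model.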

The project is a web application built with Next.js, a common framework for building websites with TypeScript. To run it, you install dependencies, add an OpenAI API key, and start a local server. Opening the app in a browser shows the demo interface, where a dropdown switches between the two agent configurations.

The README includes diagrams explaining how each pattern works, guidance on adapting the code for your own agent, and notes on trading off cost versus response quality.