
OpenAI has introduced GPT-Realtime, its most powerful voice AI model to date. The model, unveiled alongside the full release of the company’s updated Realtime API, promises faster, cheaper, and more natural conversations than ever before.
The launch marks a major step forward in voice-driven artificial intelligence. OpenAI says the model is ready for large-scale production after months of beta testing.
Cutting Down on Latency
The Realtime API first appeared in beta in October 2024. It allowed developers to build voice assistants capable of responding quickly and fluidly, similar to ChatGPT’s advanced voice mode.
Before its release, building a voice assistant meant chaining several steps: transcription, language model processing, and text-to-speech conversion. Each handoff added delay between question and answer.
The Realtime API simplified the process by letting the AI process audio directly. Developers embraced the tool, with thousands creating applications ranging from personal assistants to education tools.
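The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the structure described above, not OpenAI's implementation: every function here (transcribe, generate_reply, synthesize, speech_to_speech) is a hypothetical stand-in that tags its input instead of calling a real service.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for real services: each one only tags its input
# so the order of the hops is visible in the result.
def transcribe(audio: str) -> str:
    return f"text({audio})"

def generate_reply(text: str) -> str:
    return f"reply({text})"

def synthesize(text: str) -> str:
    return f"audio({text})"

def cascaded_assistant(audio: str) -> Tuple[str, int]:
    """Older approach: three separate models run one after another."""
    stages: List[Callable[[str], str]] = [transcribe, generate_reply, synthesize]
    result = audio
    for stage in stages:
        result = stage(result)  # each hop waits for the previous one to finish
    return result, len(stages)

def speech_to_speech(audio: str) -> str:
    # Hypothetical single model that takes audio in and returns audio out.
    return f"audio(reply({audio}))"

def direct_assistant(audio: str) -> Tuple[str, int]:
    """Speech-to-speech approach: one model, one hop."""
    return speech_to_speech(audio), 1

cascaded_output, cascaded_hops = cascaded_assistant("question")
direct_output, direct_hops = direct_assistant("question")
print(cascaded_hops, direct_hops)
```

The cascaded version cannot start a stage until the previous one has finished, which is where the delay the article mentions comes from. The direct version has a single hop.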
GPT-Realtime: A New Standard in Voice AI
The updated API’s centerpiece is GPT-Realtime, a speech-to-speech model designed for complex, real-world tasks. OpenAI says it can follow nuanced instructions more reliably, generate expressive speech, and even switch languages mid-sentence.
The model also introduces two fresh voice options, Cedar and Marin, expanding the range of tones available. According to the company, GPT-Realtime adapts more smoothly to context, recognizing laughter, handling images, and adjusting its tone for different situations.
Experts in education, customer support, and personal assistance collaborated with OpenAI to align the model with real-world needs. The result is an AI tool capable of sounding less like a machine and more like a conversational partner.
MCP Support: A Universal Connector for AI
A notable upgrade is support for MCP, the Model Context Protocol. MCP standardizes how AI models connect with external data sources. OpenAI compares it to a “USB port for AI.”
With MCP, developers no longer need to create custom integrations to link AI systems to business data. This flexibility is expected to be particularly useful in industries like e-commerce, travel, and customer service, where quick access to accurate information is crucial.
Lower Costs for Developers
The new Realtime API also comes with a significant price cut. The earlier version cost $40 per one million audio input tokens and $80 for one million output tokens.
The updated model cuts these prices to $32 and $64 respectively, a 20% reduction that makes advanced voice AI more affordable for businesses and independent developers.
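To see what the change means for a bill, here is a rough Python calculation using the per-million-token prices quoted above. The workload figures (2 million input tokens and 500,000 output tokens) are made up for illustration, and the sketch ignores any other charges, such as text tokens, that a real invoice might include.

```python
from decimal import Decimal

# USD per one million audio tokens, as quoted in the article.
OLD_PRICES = {"input": Decimal("40"), "output": Decimal("80")}
NEW_PRICES = {"input": Decimal("32"), "output": Decimal("64")}
MILLION = Decimal(1_000_000)

def audio_cost(prices: dict, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the audio token cost in USD for a given workload."""
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / MILLION

# Hypothetical monthly workload, chosen only for illustration.
input_tokens = 2_000_000
output_tokens = 500_000

old_cost = audio_cost(OLD_PRICES, input_tokens, output_tokens)
new_cost = audio_cost(NEW_PRICES, input_tokens, output_tokens)
saving_pct = (old_cost - new_cost) / old_cost * 100

print(f"old: ${old_cost}, new: ${new_cost}, saving: {saving_pct:.0f}%")
```

Because input and output prices both fall by the same proportion, the percentage saving is the same whatever the mix of input and output tokens.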
Early Reactions From Industry Leaders
Several companies secured early access to GPT-Realtime. Zillow, the real estate platform, tested the model and praised its improvements.
Josh Weisberg, head of Zillow AI, noted that GPT-Realtime displayed stronger reasoning and more natural speech than earlier models. He said it could handle multistep tasks, such as narrowing property searches by lifestyle preferences or explaining financing options.
Weisberg added that the technology may soon make home-buying conversations feel “as natural as talking with a friend.”
A New Era for Voice AI
With GPT-Realtime, OpenAI is positioning itself at the forefront of voice-based artificial intelligence. The combination of speed, expressiveness, and affordability could push AI assistants further into everyday life — from customer support desks to classrooms and personal devices.

