Building Voice Agents with Gemini Live API and Agora’s Conversational AI

Mason from Agora walks through how to drop Gemini 3.1 Flash Live into Agora’s real-time voice and video infrastructure. The result is speech-to-speech conversation with multilingual switching, sub-second latency, and tool calls wired to actual hardware.

What’s covered: cloning the Agora agent quick start, configuring the App ID and certificate in the Agora console, enabling conversational AI, swapping the default chained pipeline (STT, LLM, TTS) for Gemini Live with a single SDK method, and pointing the WebSocket at Google’s server. Plus two live demos: a Reachy Mini robot triggering 70+ emotes through tool calls, each mapped to physical motors, and a food ordering agent (Foodgora) handling cart updates and recommendations in real time.
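
For a rough picture of the pipeline swap before watching, here is a minimal Python sketch. Every name in it (AgentConfig, ChainedPipeline, RealtimePipeline, use_realtime_model) is hypothetical and does not mirror the Agora SDK or Google's API. The model ID, WebSocket URL, and key are placeholders. It only illustrates the idea of replacing three chained services with one speech-to-speech model reached over a WebSocket.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class ChainedPipeline:
    """Hypothetical default setup: three separate services chained together."""
    stt: str
    llm: str
    tts: str


@dataclass(frozen=True)
class RealtimePipeline:
    """Hypothetical single speech-to-speech model reached over a WebSocket."""
    model: str
    websocket_url: str
    api_key: str


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical agent settings; not the real Agora schema."""
    app_id: str
    channel: str
    pipeline: Union[ChainedPipeline, RealtimePipeline]


def use_realtime_model(config, model, websocket_url, api_key):
    """Return a copy of config whose pipeline is one realtime model.

    The original config is left untouched, so the chained setup can
    still be used as a fallback.
    """
    if not websocket_url.startswith("wss://"):
        raise ValueError("expected a secure WebSocket URL (wss://)")
    realtime = RealtimePipeline(model, websocket_url, api_key)
    return replace(config, pipeline=realtime)


if __name__ == "__main__":
    chained = AgentConfig(
        app_id="<your-agora-app-id>",
        channel="demo",
        pipeline=ChainedPipeline(stt="<stt>", llm="<llm>", tts="<tts>"),
    )
    live = use_realtime_model(
        chained,
        model="<gemini-live-model-id>",               # placeholder
        websocket_url="wss://example.invalid/live",   # placeholder, not Google's endpoint
        api_key="<your-gemini-api-key>",
    )
    print(type(chained.pipeline).__name__, "->", type(live.pipeline).__name__)
```

The real quick start does this through Agora's own SDK method, so treat the sketch as a mental model and take the actual call and endpoint from the video and the linked resources.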

Grab your Gemini API key at Google AI Studio and your Agora credentials at agora.io to get started.

Resources:
Gemini Live API overview → https://goo.gle/4tFoFeK
GitHub examples → https://goo.gle/4uj3HCw

What are you building with the Gemini Live API? Drop it in the comments.

Subscribe to Google for Developers → https://goo.gle/developers

Speaker: Mason
Products Mentioned: Google AI, Gemini