Build real-time multimodal agents with Gemini and Pipecat

Chad Bailey from the Pipecat team walks through what’s possible with the new Gemini 3 multimodal real-time model: flight search, lodging lookup, Google Search grounding, trip report generation, and a language tutor agent, all in a single voice conversation.

Note: The public model string is gemini-3.1-flash-live. The string used in the video belongs to the Early Access Partner program and has since been turned down, so it no longer works.

What’s covered: Scaffolding a bot with the Pipecat CLI, configuring Gemini 3 with minimal thinking for lower latency, writing system prompts that hold up across long conversations, defining and registering tool calls, enabling Google Search grounding, saving trip reports to disk, and running multiple agents in a single bot file with Pipecat Agents.
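To make "defining and registering tool calls" and "saving trip reports to disk" a little more concrete, here is a minimal plain-Python sketch. It deliberately does not use the Pipecat or Gemini APIs, whose exact calls are shown in the video and the linked docs. The names `register_tool`, `dispatch`, `save_trip_report`, the `reports` folder, and the registry layout are all hypothetical, chosen only for illustration; the schema dict follows the general name/description/parameters shape used for function declarations.

```python
from pathlib import Path
from typing import Any, Callable

# Hypothetical in-memory registry: tool name -> (schema, handler).
TOOLS: dict[str, tuple[dict[str, Any], Callable[..., dict[str, Any]]]] = {}


def register_tool(schema: dict[str, Any], handler: Callable[..., dict[str, Any]]) -> None:
    """Store a tool's schema and handler under the schema's name."""
    TOOLS[schema["name"]] = (schema, handler)


def save_trip_report(destination: str, summary: str, out_dir: str = "reports") -> dict[str, Any]:
    """Write a small Markdown trip report and return where it was saved."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # Build a simple file name from the destination, e.g. "New York" -> "new-york.md".
    slug = "-".join(destination.lower().split())
    path = folder / f"{slug}.md"
    path.write_text(f"# Trip report: {destination}\n\n{summary}\n", encoding="utf-8")
    return {"status": "saved", "path": str(path)}


# Illustrative schema describing the tool to the model (assumed shape).
SAVE_TRIP_REPORT_SCHEMA = {
    "name": "save_trip_report",
    "description": "Save a short trip report to disk.",
    "parameters": {
        "type": "object",
        "properties": {
            "destination": {"type": "string"},
            "summary": {"type": "string"},
        },
        "required": ["destination", "summary"],
    },
}

register_tool(SAVE_TRIP_REPORT_SCHEMA, save_trip_report)


def dispatch(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Route a tool call from the model to its handler, checking required arguments."""
    if name not in TOOLS:
        return {"status": "error", "message": f"unknown tool: {name}"}
    schema, handler = TOOLS[name]
    missing = [key for key in schema["parameters"]["required"] if key not in arguments]
    if missing:
        return {"status": "error", "message": f"missing arguments: {missing}"}
    return handler(**arguments)
```

In a real bot, the framework would receive the model's tool call and invoke the handler; `dispatch` stands in for that step here. Returning a small status dict instead of raising gives the model something it can read back to the user when a call fails.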

What are you building with the Gemini Live API? Drop it in the comments.

Resources:
Gemini Live API overview → https://goo.gle/47vg4Tc
Get started at pipecat.ai → https://goo.gle/4ch4LAx
Pipecat examples → https://goo.gle/4uYe93z

Subscribe to Google for Developers → https://goo.gle/developers

Speaker: Chad Bailey from the Pipecat team
Products Mentioned: Google AI, Gemini
