Move AI workflows from test to production on Microsoft Foundry

Power use-case-specific enterprise AI systems with high-performance inference from Fireworks AI, integrated with Microsoft Foundry. In this live demo, see how teams move from test to production by running inference directly on Foundry. Walk through an end-to-end workflow that shows how unified infrastructure improves latency, reduces cost, and simplifies deployment for real enterprise AI use cases.

Seating for this session is first-come, first-served. Add it to your schedule to plan your day and arrive early to secure a spot.

𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀:
* Vignesh Sridhar

𝗦𝗲𝘀𝘀𝗶𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻:
This is one of many sessions from the Microsoft Build 2026 event. View even more sessions on-demand and learn about Microsoft Build at https://build.microsoft.com

DEMSP383 | English (US) | Agents & apps

Demo | (200) Intermediate

#MSBuild

Chapters:
00:00:00 – Introduction and session overview by Vignesh from Fireworks AI
00:00:45 – Scale and capabilities: supporting 30 trillion tokens per day and 180,000 requests per second
00:01:16 – Explanation of the Fireworks serving stack and workload-aware optimization
00:04:22 – Selecting and deploying a model for testing (Kimi K2.6 example)
00:06:36 – Setting up a single-tenant deployment and performance validation
00:08:19 – Choosing models based on latency, quality, and token usage; saving as an agent
00:09:45 – Selecting datasets, mapping evaluation fields, and configuring the judge model
00:10:33 – Selecting key evaluation metrics (Relevance, Groundedness, Coherence)
00:13:27 – Session wrap-up, Q&A invitation, and closing remarks