Inference at Scale: How DeepL Built an AI Infrastructure for Real-Time Language AI

DeepL, a global AI product and research company focused on building secure, intelligent solutions to complex business problems, has achieved unprecedented speed and accuracy in AI translation: in collaboration with NVIDIA and EcoDataCenter, it now delivers context-dependent translations in milliseconds.

The company’s AI infrastructure, named “Arion”—powered by an NVIDIA DGX SuperPOD with DGX GB200 systems—enables millions of daily users across dozens of languages to receive accurate translations with near-zero latency.

This video highlights the collaboration between DeepL, NVIDIA, and EcoDataCenter to provide highly accurate, fluent, and context-aware AI language translations that feel natural to native speakers.

In addition, DeepL is leveraging NVIDIA TensorRT-LLM and NVFP4 inference on NVIDIA GB200 NVL72 systems while training Mixture of Experts (MoE) models, advancing its model architecture to improve efficiency during both training and inference and setting new performance benchmarks in AI.

Read more at NVIDIA.
