Developing for Indic languages | Gemma and Navarasa (Extended Edition)


While many early large language models were trained predominantly on English-language data, the field is rapidly evolving. Newer models are increasingly trained on multilingual datasets, and there is a growing focus on developing models for the world's languages. However, challenges remain in ensuring equitable representation and performance across diverse languages, particularly those with less available data and fewer computational resources.

Gemma, Google's family of open models, is designed to address these challenges by enabling the development of projects in languages beyond English. Its tokenizer and large token vocabulary make it particularly well-suited for handling diverse languages. Watch how developers in India used Gemma to create Navarasa, a fine-tuned Gemma model for Indic languages.

Subscribe to Google for Developers → https://goo.gle/developers

#GoogleIO #GoogleIO2024
