PaliGemma – Making Gemma 2 see by adding a vision encoder

Explore how PaliGemma adds a SigLIP vision encoder to Gemma 2. The model is pre-trained on captioning, question answering, object detection, and even segmentation. Varying the image resolution and model size lets you scale compute by a factor of 155. If you have data available for your task, fine-tuning PaliGemma can deliver strong performance, especially on text-related tasks.
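To give a rough sense of why image resolution affects compute, the minimal sketch below counts the image tokens a patch-based vision encoder hands to the language model. The patch size of 14 pixels and the square resolutions 224, 448, and 896 are assumptions for illustration and are not stated in this post. The helper `num_image_tokens` is hypothetical, and the sketch does not attempt to reproduce the factor of 155 quoted above.

```python
# Sketch only: counts non-overlapping patches in a square image.
# PATCH_SIZE and the resolutions below are assumptions, not values from the post.
PATCH_SIZE = 14  # assumed patch edge length in pixels


def num_image_tokens(resolution: int, patch_size: int = PATCH_SIZE) -> int:
    """Return the number of patch tokens for a square image of the given edge length."""
    if resolution <= 0 or patch_size <= 0:
        raise ValueError("resolution and patch_size must be positive")
    if resolution % patch_size != 0:
        raise ValueError("resolution must be a multiple of patch_size")
    patches_per_side = resolution // patch_size
    return patches_per_side * patches_per_side


if __name__ == "__main__":
    for res in (224, 448, 896):  # assumed resolutions
        print(f"{res}x{res} px -> {num_image_tokens(res)} image tokens")
```

Under these assumptions, doubling the edge length quadruples the number of image tokens, which is one reason higher resolutions cost more compute.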

Subscribe to Google for Developers → https://goo.gle/developers

#Gemma #GemmaDeveloperDay

Speaker: Andreas Steiner
Products Mentioned: Gemma
