In this video we learn how to use Google’s EmbeddingGemma (300M) to build fast, on-device RAG with roughly 200 MB of memory and support for 100+ languages, then walk through a RAG example.
LINK:
https://developers.googleblog.com/en/introducing-embeddinggemma/
https://huggingface.co/blog/embeddinggemma
https://arxiv.org/pdf/2205.13147
https://huggingface.co/blog/matryoshka
https://github.com/google-gemini/gemma-cookbook/blob/main/Gemma/%5BGemma_3%5DRAG_with_EmbeddingGemma.ipynb
https://ai.google.dev/gemma/docs/embeddinggemma/fine-tuning-embeddinggemma-with-sentence-transformers
https://ai.google.dev/gemma/docs/embeddinggemma/model_card
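The Matryoshka Representation Learning links above explain why EmbeddingGemma's vectors can be truncated to smaller dimensions for cheaper on-device retrieval. A minimal sketch of that idea, using random NumPy vectors as stand-ins for real model outputs (the `SentenceTransformer` call mentioned in the comment is an assumption, not tested here):

```python
import numpy as np

# Stand-ins for real embeddings. In practice you would get these from something like
# SentenceTransformer("google/embeddinggemma-300m").encode(texts)  # assumed API
rng = np.random.default_rng(0)
full_dim = 768
doc_embs = rng.normal(size=(4, full_dim))
# Make the query a lightly perturbed copy of document 2.
query_emb = doc_embs[2] + 0.01 * rng.normal(size=full_dim)

def truncate_and_normalize(embs, dim):
    """Matryoshka-style truncation: keep the first `dim` coordinates,
    then re-normalize to unit length so cosine similarity stays valid."""
    small = embs[..., :dim]
    return small / np.linalg.norm(small, axis=-1, keepdims=True)

def retrieve(query, docs, dim):
    """Return the index of the most similar document at the given dimension."""
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(docs, dim)
    return int(np.argmax(d @ q))

# Retrieval still finds document 2 after shrinking 768 -> 128 dims,
# which is the point: smaller index, faster search, similar quality.
print(retrieve(query_emb, doc_embs, 768))  # -> 2
print(retrieve(query_emb, doc_embs, 128))  # -> 2
```

Truncating to the leading dimensions only works well because MRL-trained models pack the most important information into the front of the vector; with an ordinary embedding model you would lose more quality.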
Website: https://engineerprompt.ai/
RAG Beyond Basics Course:
https://prompt-s-site.thinkific.com/courses/rag
Let’s Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: engineerprompt@gmail.com
Become Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Sign up for the Newsletter (localGPT):
https://tally.so/r/3y9bb0
TIMESTAMPS:
00:00 EmbeddingGemma
02:15 Comparison with Other Embedding Models
02:41 Google’s Interesting position
03:21 Dense Embeddings Are Killing Retrieval
06:13 RAG with EmbeddingGemma
09:37 Fine-Tuning and Training the Model
#AI #promptengineering