This course is a comprehensive guide to understanding and implementing DeepSeek V3, a Mixture-of-Experts large language model. @vukrosic shares step-by-step coding instructions and theoretical insights.
Paper: https://arxiv.org/pdf/2412.19437
Code (by DeepSeek): https://github.com/deepseek-ai/DeepSeek-V3/tree/main/inference
Note: the DeepSeek code is written for inference. A few small changes are made to the Transformer class at the end of the video to support training, so you need to make them manually, or screenshot the video and ask an AI to apply the changes to this code.
Try interactive AI courses we love, right in your browser: https://scrimba.com/freeCodeCamp-AI (Made possible by a grant from our friends at Scrimba)
Contents
(0:00:00) Intro
(0:01:40) Attention Mechanism
(0:13:34) Query, Key, Value
(0:34:11) KV Cache
(0:39:06) Multi-head Latent Attention (MLA)
(0:58:53) Coding MLA
(1:28:41) RoPE
(1:55:44) Coding KV Cache
(2:00:25) MLA forward
(2:28:24) MoE, Gate
(2:49:25) Gate code
(3:09:10) MoE code
(3:28:36) Transformer Blocks
Thanks to our Champion and Sponsor supporters:
Drake Milly
Ulises Moralez
Goddard Tan
David MG
Matthew Springman
Claudio
Oscar R.
jedi-or-sith
Nattira Maneerat
Justin Hual
—
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://freecodecamp.org/news
#programming #freecodecamp #learn #learncode #learncoding