This is the stack that gets me over 4000 tokens per second locally.
Download Docker Desktop here: https://dockr.ly/4mOdGMO to get up and running with Docker Model Runner quickly.
Gear Links
Thunderbolt 5 external SSD: https://amzn.to/3XqetZO
Favorite 15″ display with magnet: https://amzn.to/3zD1DhQ
Great 40Gbps T4 enclosure: https://amzn.to/3JNwBGW
My nvme ssd: https://amzn.to/3YLEySo
My gear: https://www.amazon.com/shop/alexziskind
Related Videos
Skip M3 Ultra & RTX 5090 for LLMs | NEW 96GB KING – https://youtu.be/bAao58hXo9w
Smallest RTX Pro 6000 rig | OVERKILL – https://youtu.be/JbnBt_Aytd0
Cheap mini runs a 70B LLM – https://youtu.be/xyKEQjUzfAk
RAM torture test on Mac – https://youtu.be/l3zIwPgan7M
FREE Local LLMs on Apple Silicon | FAST! – https://youtu.be/bp2eev21Qfo
REALITY vs Apple's Memory Claims | vs RTX4090m – https://youtu.be/fdvzQAWXU7A
Set up Conda – https://youtu.be/2Acht_5_HTo
INSANE Machine Learning on Neural Engine – https://youtu.be/Y2FOUg_jo7k
-Julia Turk’s FP4 video: https://www.youtube.com/watch?v=-cRedoYETzQ
-NVIDIA post on quantization: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/
Developer productivity Playlist – https://www.youtube.com/playlist?list=PLPwbI_iIX3aQCRdFGM7j4TY_7STfv2aXX
AI for Coding Playlist – https://www.youtube.com/playlist?list=PLPwbI_iIX3aSlUmRtYPfbQHt4n0YaX0qw
- - - - - - - - -
SUBSCRIBE TO MY YOUTUBE CHANNEL
Click here to subscribe: https://www.youtube.com/@AZisk?sub_confirmation=1
- - - - - - - - -
Join this channel to get access to perks:
https://www.youtube.com/channel/UCajiMK_CY9icRhLepS8_3ug/join
- - - - - - - - -
ALEX ON X: https://twitter.com/digitalix
#rtxpro6000 #llm #macbook
Alex Ziskind