The moment we stopped understanding AI [AlexNet]
http://youtube.com/watch?v=UZDiGooFs54
Thanks to KiwiCo for sponsoring today's video! Go to https://www.kiwico.com/welchlabs and use code WELCHLABS for 50% off your first month of monthly lines and/or for 20% off your first Panda Crate.

Activation Atlas Posters!
https://www.welchlabs.com/resources/5...
https://www.welchlabs.com/resources/a...
https://www.welchlabs.com/resources/l...
https://www.welchlabs.com/resources/a...

Special thanks to the Patrons:
Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti

Welch Labs
Ad-free videos and exclusive perks: / welchlabs
Watch on TikTok: / welchlabs
Learn More or Contact: https://www.welchlabs.com/
Instagram: / welchlabs
X: / welchlabs

References

AlexNet Paper
https://proceedings.neurips.cc/paper_...

Original Activation Atlas Article (explore here, great interactive Atlas!): https://distill.pub/2019/activation-a...
Carter, et al., Activation Atlas, Distill, 2019.

Feature Visualization Article: https://distill.pub/2017/feature-visu...
Olah, et al., Feature Visualization, Distill, 2017.

Great LLM Explainability work: https://transformer-circuits.pub/2024...
Templeton, et al., Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet, Transformer Circuits Thread, 2024.

The "Deep Visualization Toolbox" video by Jason Yosinski inspired many visuals:
Deep Visualization Toolbox

Great LLM/GPT Intro paper
https://arxiv.org/pdf/2304.10557

3B1B's GPT Videos are excellent, as always:
Attention in transformers, step-by-st...
Transformers (how LLMs work) explaine...

Andrej Karpathy's walkthrough is amazing:
Let's build GPT: from scratch, in cod...

Goodfellow’s Deep Learning Book
https://www.deeplearningbook.org/

OpenAI’s 10,000 V100 GPU cluster (1+ exaflop): https://news.microsoft.com/source/fea...

GPT-3 size, etc.: Language Models are Few-Shot Learners, Brown et al., 2020.

Unique token count for ChatGPT: https://cookbook.openai.com/examples/...

GPT-4 training size, etc. (speculative):
https://patmcguinness.substack.com/p/...
https://www.semianalysis.com/p/gpt-4-...

Historical Neural Network Videos
Convolutional Network Demo from 1989
Perceptron Research from the 50's 6...

Errata
1:40 should be: word fragment is appended to the end of the original input. Thanks to Chris A for finding this one.