Giant Inscrutable Matrices: Mechanistic Interpretability

Video: http://youtube.com/watch?v=j5W3-Mg2384

We’ll now get into a brief intro to the inner-outer alignment dichotomy.

• The basic paradigm of Deep Learning, and of Machine Learning in general, makes things quite difficult because of how the models are built.
• Their creation resembles evolution by natural selection, the process by which generations of biological organisms change. At a basic level, Machine Learning works by selecting essentially randomly generated minds according to their behaviour, a process that takes place myriads of times during training.
• We won’t go into technical detail about how things like Reinforcement Learning or gradient descent work; we’ll keep it simple and try to convey the core idea of how modern AI is grown:
• The model receives an input, generates an output based on its current configuration, and gets thumbs-up or thumbs-down feedback. If it gets the answer wrong, the numbers inside its artificial neurons are nudged slightly in random directions, in the hope that the next trial will go better. This repeats again and again, trillions of times, until algorithms that produce consistently correct results have grown. (A toy sketch of this loop follows below.)
• We don’t really build the model directly; the way the mind of the AI grows is almost like a mystical process, and all the influence we exert is based on observing behaviour at the output. All the action takes place on the outside!
• Its inner workings, its inner world, the actual algorithms it grows inside, are a complete black box. And they are bizarre and inhuman.
• Recently, researchers trained a tiny building block of modern AI to do modular addition, then spent weeks reverse-engineering it to figure out what it was actually doing: one of the only times in history anyone has understood an algorithm grown inside a transformer model.
• And this is the algorithm it had grown, just to add two numbers! (The second sketch below gives the flavour of it.)
• Understanding modern AI models is a major unsolved scientific problem, and the corresponding field of research has been named mechanistic interpretability.
• Crucially, the implication of all this is that all we have to work with are observations of the AI's behaviour during training. Such observations are often misleading (as we'll demonstrate in a moment), they lead to wrong conclusions, and with future General AIs they could well turn into outright deception.
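To make the training loop above concrete, here is a toy caricature in Python. Everything in it is illustrative rather than taken from the video: the "model" is just two numbers, the task is learning to add its inputs, and the update rule is the "nudge randomly, keep what scores better" scheme described above.

```python
import random

# A toy caricature of the "grow, don't build" loop described above.
# Nothing here is a real training setup: the "model" is just two numbers,
# and the task is to learn to add its inputs. We never design the weights;
# we only score the outputs and keep whatever nudge scores better.

def model(weights, a, b):
    # Combine the inputs using whatever weights the model currently has.
    return weights[0] * a + weights[1] * b

def penalty(weights, samples):
    # Aggregated "thumbs down": squared error at the output. Lower is better.
    return sum((model(weights, a, b) - (a + b)) ** 2 for a, b in samples)

samples = [(random.randint(0, 9), random.randint(0, 9)) for _ in range(100)]
weights = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]

for _ in range(10_000):
    # Nudge the weights slightly in a random direction...
    candidate = [w + random.gauss(0.0, 0.01) for w in weights]
    # ...and keep the nudge only if the behaviour at the output improved.
    if penalty(candidate, samples) < penalty(weights, samples):
        weights = candidate

print(weights)  # drifts toward [1.0, 1.0] without anyone designing it
```

Real training differs mainly in that gradient descent computes the direction of each nudge instead of guessing it, but the picture is the same: all feedback acts on the outside, at the output.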
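As for the modular-addition result: the reverse-engineered network (assuming the video refers to the grokking work by Nanda et al. on a one-layer transformer) turned out not to add digits at all. It encoded each number as rotations around circles and composed the rotations with trigonometric identities. Below is a rough sketch of that recovered algorithm with all the transformer machinery stripped away; note the real network used only a handful of key frequencies, not all of them as here.

```python
import numpy as np

p = 113                    # modulus used in the reverse-engineering work
a, b = 47, 92              # inputs; the correct answer is (a + b) % p == 26

# Encode each input as a point on a circle, at several frequencies w.
w = 2 * np.pi * np.arange(1, p // 2 + 1) / p

# Compose the two rotations: cos(w(a+b)) and sin(w(a+b)) via the angle-sum
# identities. This is what the attention and MLP layers were found to compute.
cos_ab = np.cos(w * a) * np.cos(w * b) - np.sin(w * a) * np.sin(w * b)
sin_ab = np.sin(w * a) * np.cos(w * b) + np.cos(w * a) * np.sin(w * b)

# Score every candidate answer c: the logit sums cos(w(a+b-c)) over all
# frequencies, which peaks exactly when c == (a + b) mod p.
c = np.arange(p)
logits = (cos_ab[:, None] * np.cos(np.outer(w, c))
          + sin_ab[:, None] * np.sin(np.outer(w, c))).sum(axis=0)

print(int(c[np.argmax(logits)]), (a + b) % p)  # 26 26
```

The point is not this particular trick but that nobody put it there: the training loop stumbled into a Fourier-flavoured algorithm that no human would have written for addition.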

Watch the full length here: • Lethal Intelligence Guide [Part 1] - ...
• Learn all about AI x-risk at https://lethalintelligence.ai/ (join the newsletter)
• Follow https://x.com/lethal_ai
• Check luminaries and notables clips at /@lethal-intelligence-clips
• Go to PauseAI at https://pauseai.info/ for the best path to action!