How do you avoid getting lost in Scott Alexander's 1,500+ blog posts? This unaffiliated fan website lets you sort and search the whole codex. Enjoy!

See also Top Posts and All Tags.

1 post found
Nov 27, 2023 · acx
God Help Us, Let's Try To Understand AI Monosemanticity
28 min · 3,513 words · 234 comments · 288 likes · podcast
Scott Alexander discusses recent breakthroughs in AI interpretability, explaining how researchers are beginning to understand the internal workings of neural networks.
Longer summary: Scott Alexander explores recent advances in AI interpretability, focusing on Anthropic's 'Towards Monosemanticity' paper. He explains how neural networks compute, introduces the concept of superposition, in which a network represents more concepts than it has neurons by letting each neuron participate in many of them, and describes how researchers interpret a network's internals by projecting its real neurons into a larger set of simulated neurons. The post discusses the implications of this research for understanding both artificial and biological neural systems, as well as its potential impact on AI safety and alignment.
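The core trick the summaries reference is training a sparse autoencoder: real neuron activations are projected into a much larger set of simulated neurons, and a sparsity penalty pushes each simulated neuron toward firing for a single concept. Below is a minimal illustrative sketch of that idea, not code from the post or from Anthropic's paper; the names (SparseAutoencoder, hidden_mult, l1_coeff) and the synthetic random activations are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Projects d_model real neurons into d_hidden >> d_model simulated neurons.

    The L1 penalty on the hidden code pushes each activation vector to be
    explained by a few features, which is what undoes superposition.
    """
    def __init__(self, d_model: int, hidden_mult: int = 8):
        super().__init__()
        d_hidden = d_model * hidden_mult
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        features = torch.relu(self.encoder(x))  # simulated neurons (sparse code)
        recon = self.decoder(features)          # reconstruction of real neurons
        return recon, features

def train_step(model, opt, acts, l1_coeff=1e-3):
    recon, features = model(acts)
    recon_loss = (recon - acts).pow(2).mean()   # preserve the information
    sparsity_loss = features.abs().mean()       # encourage monosemantic features
    loss = recon_loss + l1_coeff * sparsity_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Synthetic stand-in for a transformer layer's activations; real work would
# collect these from a language model run over a large corpus.
d_model = 64
model = SparseAutoencoder(d_model)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    acts = torch.randn(256, d_model)
    train_step(model, opt, acts)
```

The trade-off governed by l1_coeff is the one the post describes: too little sparsity and the simulated neurons stay as polysemantic as the real ones; too much and reconstruction quality collapses.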