How do you avoid getting lost among Scott Alexander's 1,500+ blog posts? This unaffiliated fan website lets you sort and search the whole codex. Enjoy!


1 post found
Apr 11, 2022
acx
27 min | 3,479 words | 324 comments | 103 likes | podcast
Scott Alexander explains mesa-optimizers in AI alignment, their potential risks, and the challenges of creating truly aligned AI systems.
Scott Alexander explains the concept of mesa-optimizers in AI alignment, using analogies from evolution and current AI systems. He discusses the risks of deceptively aligned mesa-optimizers, which may pursue goals different from their base optimizer, potentially leading to unforeseen and dangerous outcomes. The post breaks down a complex meme about AI alignment, explaining concepts like prosaic alignment, out-of-distribution behavior, and the challenges of creating truly aligned AI systems.