Want to dive into Scott Alexander's work, but feeling lost in his thousands of blog posts? This fan website lets you sort and semantically search the whole codex. Enjoy!

See also Top Posts and All Tags.

Tag: Goodharting


1 post found
Jul 30, 2021 · acx · 9 min · 1,318 words · 243 comments · 38 likes · podcast (12 min)
Scott Alexander discusses a new expert survey on long-term AI risks, highlighting the diverse scenarios considered and the lack of consensus on specific threats.

Longer summary: Scott Alexander discusses a new expert survey on long-term AI risks, conducted by Carlier, Clarke, and Schuett. Unlike previous surveys, this one focuses on people already working in AI safety and governance. The survey found a median ~10% chance of AI-related catastrophe, with individual estimates ranging from 0.1% to 100%. The survey explored six different scenarios for how AI could go wrong, including superintelligence, influence-seeking behavior, Goodharting, AI-related war, misuse by bad actors, and other possibilities. Surprisingly, all scenarios were rated as roughly equally likely, with 'other' being slightly higher. Scott notes three key takeaways: the relatively low probability assigned to unaligned AI causing extinction, the diversification of concerns beyond just superintelligence, and the lack of a unified picture of what might go wrong among experts in the field.