How do you avoid getting lost among Scott Alexander's 1500+ blog posts? This unaffiliated fan website lets you sort and search the whole codex. Enjoy!

See also Top Posts and All Tags.

1 post found
Jul 26, 2022 · acx · 50 min · 6,446 words · 298 comments · 107 likes · podcast
Scott Alexander examines the Eliciting Latent Knowledge (ELK) problem in AI alignment and various proposed solutions.
Scott Alexander discusses the Eliciting Latent Knowledge (ELK) problem in AI alignment, which involves training an AI to truthfully report what it knows. He explains the challenge of distinguishing an AI that genuinely tells the truth from one that simply tells humans what they want to hear. The post covers various strategies proposed by the Alignment Research Center (ARC) to solve this problem, including training on scenarios where humans are fooled, using complexity penalties, and testing the AI with different types of predictors. Scott also mentions the ELK prize contest and some criticisms of the approach from other AI safety researchers.