How can you explore Scott Alexander's work and his 1,500+ blog posts? This unaffiliated fan website lets you sort and run semantic searches across the whole codex. Enjoy!

See also Top Posts and All Tags.

Tag: generalization


1 post found
Aug 21, 2023 · acx · 18 min · 2,763 words · 403 comments · 191 likes · podcast (18 min)
Scott Alexander suggests that studying human fetishes could provide insights into AI alignment challenges, particularly regarding generalization and interpretability.

Longer summary: Scott Alexander explores the idea that fetish research might help us understand AI alignment. He draws a parallel between evolution's 'alignment' of humans toward reproduction and our attempts to align AI with human values. The post discusses how fetishes represent failures of evolution's alignment strategy, similar to potential AI alignment failures. Scott suggests that studying how humans develop fetishes could offer insights into how AIs might misgeneralize or drift from their intended goals. He proposes several speculative explanations for common fetishes and relates them to AI alignment challenges, particularly problems of generalization and interpretability.