Scott Alexander suggests that studying human fetishes could provide insights into AI alignment challenges, particularly regarding generalization and interpretability.
Longer summary
Scott Alexander explores the idea that fetish research might shed light on AI alignment. He draws a parallel between evolution's 'alignment' of humans toward reproduction and our attempts to align AI with human values. The post frames fetishes as failures of evolution's alignment strategy, analogous to potential AI alignment failures, and suggests that studying how humans develop fetishes could reveal how AIs might misgeneralize or drift from their intended goals. Scott offers several speculative explanations for common fetishes and relates them to AI alignment challenges, particularly problems of generalization and interpretability.
Shorter summary