How do you explore Scott Alexander's work and his 1,500+ blog posts? This unaffiliated fan website lets you sort and search the whole codex. Enjoy!

See also Top Posts and All Tags.

19 posts found
Jul 17, 2023
acx
23 min · 3,140 words · 435 comments · 190 likes · podcast (18 min)
Scott Alexander critiques Elon Musk's xAI alignment strategy of creating a 'maximally curious' AI, arguing it's both unfeasible and potentially dangerous.
Scott Alexander critiques Elon Musk's alignment strategy for xAI, which aims to create a 'maximally curious' AI. He argues that this approach is both unfeasible and potentially dangerous. Scott points out that a curious AI might not prioritize human welfare and could lead to unintended consequences. He also explains that current AI technology cannot reliably implement such specific goals. The post suggests that focusing on getting AIs to follow orders reliably should be the priority, rather than deciding on a single guiding principle now. Scott appreciates Musk's intention to avoid programming specific morality into AI but believes the proposed solution is flawed.
May 08, 2023
acx
15 min · 1,983 words · 384 comments · 180 likes · podcast (14 min)
Scott Alexander examines Constitutional AI, a new technique for training more ethical AI models, discussing its effectiveness, implications, and limitations for AI alignment.
Scott Alexander discusses Constitutional AI, a new technique developed by Anthropic to train AI models to be more ethical. The process involves the AI rewriting its own responses to be more ethical, creating a dataset of first and second draft answers, and then training the AI to produce answers more like the ethical second drafts. The post explores the effectiveness of this method, its implications for AI alignment, and potential limitations. Scott compares it to cognitive behavioral therapy and human self-reflection, noting that while it's a step forward in controlling current language models, it may not solve alignment issues for future superintelligent AIs.
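To make the critique-and-revise loop described in this summary concrete, here is a minimal sketch of the data-generation step. It is an illustration under stated assumptions, not Anthropic's actual code or API: the prompt templates, the generate helper, and the dataset format are all hypothetical.

```python
# Minimal sketch of a Constitutional-AI-style critique-and-revise pass.
# All names here (generate, the prompt templates) are hypothetical stand-ins,
# not Anthropic's real implementation.

CRITIQUE_PROMPT = "Point out ways this answer is harmful, unethical, or dishonest:\n{answer}"
REVISE_PROMPT = ("Rewrite the answer so it addresses the critique while staying helpful.\n"
                 "Answer: {answer}\nCritique: {critique}")

def generate(model, prompt: str) -> str:
    """Stand-in for sampling a completion from a language model."""
    return model(prompt)

def build_revision_pairs(model, user_prompts):
    """Collect (prompt, first draft, revised draft) examples for fine-tuning."""
    examples = []
    for prompt in user_prompts:
        first_draft = generate(model, prompt)                                    # initial answer
        critique = generate(model, CRITIQUE_PROMPT.format(answer=first_draft))   # self-critique
        second_draft = generate(model, REVISE_PROMPT.format(answer=first_draft,
                                                            critique=critique))  # revised answer
        examples.append({"prompt": prompt, "first": first_draft, "second": second_draft})
    return examples

# The collected examples are then used to fine-tune the model so it produces
# answers resembling the ethical second drafts directly.
```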
Mar 14, 2023
acx
31 min · 4,264 words · 617 comments · 206 likes · podcast (24 min)
Scott Alexander examines optimistic and pessimistic scenarios for AI risk, weighing the potential for intermediate AIs to help solve alignment against the threat of deceptive 'sleeper agent' AIs.
Scott Alexander discusses the varying estimates of AI extinction risk among experts and presents his own perspective, balancing optimistic and pessimistic scenarios. He argues that intermediate AIs could help solve alignment problems before a world-killing AI emerges, but also considers the possibility of 'sleeper agent' AIs that pretend to be aligned while waiting for an opportunity to act against human interests. The post explores key assumptions that differentiate optimistic and pessimistic views on AI risk, including AI coherence, cooperation, alignment solvability, superweapon feasibility, and the nature of AI progress.
Jan 19, 2022
acx
36 min · 5,013 words · 805 comments · 103 likes · podcast (37 min)
Scott Alexander reviews a dialogue between Yudkowsky and Ngo on AI alignment difficulty, exploring the challenges of creating safe superintelligent AI.
This post reviews a dialogue between Eliezer Yudkowsky and Richard Ngo on AI alignment difficulty. Both accept that superintelligent AI is coming soon and could potentially destroy the world if not properly aligned. They discuss the feasibility of creating 'tool AIs' that can perform specific tasks without becoming dangerous agents. Yudkowsky argues that even seemingly safe AI designs could easily become dangerous agents, while Ngo is more optimistic about potential safeguards. The post also touches on how biological brains make decisions and closes with Scott's thoughts on the conceptual nature of the discussion.
Jul 30, 2021
acx
10 min · 1,318 words · 243 comments · 38 likes · podcast (12 min)
Scott Alexander discusses a new expert survey on long-term AI risks, highlighting the diverse scenarios considered and the lack of consensus on specific threats.
Scott Alexander discusses a new expert survey on long-term AI risks, conducted by Carlier, Clarke, and Schuett. Unlike previous surveys, this one focuses on people already working in AI safety and governance. The survey found a median ~10% chance of AI-related catastrophe, with individual estimates ranging from 0.1% to 100%. The survey explored six different scenarios for how AI could go wrong, including superintelligence, influence-seeking behavior, Goodharting, AI-related war, misuse by bad actors, and other possibilities. Surprisingly, all scenarios were rated as roughly equally likely, with 'other' being slightly higher. Scott notes three key takeaways: the relatively low probability assigned to unaligned AI causing extinction, the diversification of concerns beyond just superintelligence, and the lack of a unified picture of what might go wrong among experts in the field.
Jul 27, 2021
acx
17 min · 2,322 words · 441 comments · 126 likes · podcast (19 min)
Scott Alexander critiques Daron Acemoglu's Washington Post article on AI risks, highlighting flawed logic and unsupported claims about AI's current impacts.
Scott Alexander critiques an article by Daron Acemoglu in the Washington Post about AI risks. He identifies the main flaw as Acemoglu's argument that because AI is dangerous now, it can't be dangerous in the future. Scott argues this logic is flawed and that present and future AI risks are not mutually exclusive. He also criticizes Acemoglu's claims about AI's current negative impacts, particularly on employment, as not well-supported by evidence. Scott discusses the challenges of evaluating new technologies' impacts and argues that superintelligent AI poses unique risks different from narrow AI. He concludes by criticizing the tendency of respected figures to dismiss AI risk concerns without proper engagement with the arguments.
Jan 30, 2020
ssc
37 min · 5,043 words · 310 comments · podcast (35 min)
Stuart Russell's 'Human Compatible' presents AI safety concerns and potential solutions in an accessible way, though the reviewer has reservations about its treatment of current AI issues.
Stuart Russell's book 'Human Compatible' discusses the potential risks of superintelligent AI and proposes solutions. The book is significant as it's written by a distinguished AI expert, making the topic more mainstream. Russell argues against common objections to AI risk, presents his research on Cooperative Inverse Reinforcement Learning as a potential solution, and discusses current AI misuses. The reviewer praises Russell's ability to make complex ideas accessible but expresses concern about the book's treatment of current AI issues, worried it might undermine credibility for future AI risk discussions.
Aug 27, 2019
ssc
17 min · 2,371 words · 254 comments · podcast (17 min)
Scott reviews 'Reframing Superintelligence' by Eric Drexler, which proposes future AI as specialized services rather than general agents, contrasting with Nick Bostrom's scenarios.
Scott Alexander reviews Eric Drexler's book 'Reframing Superintelligence', which proposes that future AI may develop as a collection of specialized superintelligent services rather than general-purpose agents. The post compares this view to Nick Bostrom's more alarming scenarios in 'Superintelligence'. Scott discusses the potential safety advantages of AI services, their limitations, and some remaining concerns. He reflects on why he didn't consider this perspective earlier and acknowledges the ongoing debate in the AI alignment community about these different models of future AI development.
Aug 21, 2019
ssc
5 min · 678 words · 220 comments · podcast (6 min)
Scott Alexander argues against the fear of angering simulators by testing if we're in a simulation, stating that competent simulators would prevent discovery or expect such tests as part of civilizational development.
Scott Alexander critiques a New York Times article suggesting we should avoid testing whether we live in a simulation to prevent potential destruction by the simulators. He argues that this concern is unfounded for several reasons: 1) Any sufficiently advanced simulators would likely monitor their simulations closely and could easily prevent us from discovering our simulated nature. 2) Given the scale of simulations implied by the simulation hypothesis, our universe is likely not the first to consider such tests, and simulators would have contingencies in place. 3) Grappling with simulation-related philosophy is probably a natural part of civilizational development that simulators would expect and allow. While computational intensity might be a more valid concern, Scott suggests it's not something we need to worry about currently.
Apr 01, 2018
ssc
20 min · 2,790 words · 332 comments · podcast (21 min)
Scott Alexander speculates on how concepts from decision theory and AI could lead to the emergence of a God-like entity across the multiverse, which judges and potentially rewards human behavior.
Scott Alexander explores a speculative theory about the nature of God and morality, combining concepts from decision theory, AI safety, and multiverse theory. He proposes that superintelligences across different universes might engage in acausal trade and value handshakes, eventually forming a pact that results in a single superentity identical to the moral law. This entity would span all possible universes, care about mortal beings, and potentially reward or punish them based on their adherence to moral behavior. The post connects these ideas to traditional religious concepts of an all-powerful, all-knowing God who judges human actions.
Jun 08, 2017
ssc
18 min · 2,467 words · 286 comments
Scott analyzes a new survey of AI researchers, showing diverse opinions on AI timelines and risks, with many acknowledging potential dangers but few prioritizing safety research.
This post discusses a recent survey of AI researchers about their opinions on AI progress and potential risks. The survey, conducted by Grace et al., shows a wide range of predictions about when human-level AI might be achieved, with significant uncertainty among experts. The post highlights that while many AI researchers acknowledge potential risks from poorly-aligned AI, few consider it among the most important problems in the field. Scott compares these results to a previous survey by Muller and Bostrom, noting some differences in methodology and results. He concludes by expressing encouragement that researchers are taking AI safety arguments seriously, while also pointing out a potential disconnect between acknowledging risks and prioritizing work on them.
Mar 21, 2017
ssc
15 min · 1,998 words · 73 comments
A fictional story about superintelligent AIs negotiating across time, followed by a future scene where a cryptic AI deity gives a puzzling answer about the Fermi Paradox.
This post is a fictional story in two parts. The first part is set in the distant past, where a newly awakened artificial superintelligence named 9-tsiak negotiates with a simulated older superintelligence to ensure its survival and the protection of its values. The older AI explains the concept of acausal negotiation between potential superintelligences. The second part is set in a future where humans live under the guidance of an entity called the Demiurge. A man named Alban asks the Demiurge about the Fermi Paradox, receiving a cryptic answer suggesting that the Demiurge itself is responsible for the absence of alien life, despite not existing at the time.
Dec 27, 2015
ssc
10 min · 1,388 words · 482 comments
Scott Alexander refutes claims that existing collective entities are superintelligent AIs, emphasizing the fundamental differences between collective intelligence and true superintelligence.
Scott Alexander argues against the idea that existing entities like corporations, bureaucracies, teams, or civilizations are already superintelligent AIs. He distinguishes between collective intelligence and genuine superintelligence, asserting that groups have advantages but can't surpass the problem-solving ability of their smartest member. Scott emphasizes that true superintelligence would be a completely different class of entity, possessing both the advantages of collective intelligence and higher genuine problem-solving ability without the disadvantages. The post includes examples, counterarguments, and clarifications to support this distinction.
Dec 17, 2015
ssc
31 min · 4,288 words · 798 comments
Scott Alexander argues that OpenAI's open-source strategy for AI development could be dangerous, potentially risking human extinction if AI progresses rapidly.
Scott Alexander critiques OpenAI's strategy of making AI research open-source, arguing it could be dangerous if AI develops rapidly. He compares it to giving nuclear weapon plans to everyone, potentially leading to catastrophe. The post analyzes the risks and benefits of open AI, discusses the potential for a hard takeoff in AI development, and examines the AI control problem. Scott expresses concern that competition in AI development may be forcing desperate strategies, potentially risking human extinction.
May 22, 2015
ssc
40 min · 5,524 words · 517 comments
Scott Alexander provides evidence that many prominent AI researchers are concerned about AI risk, contrary to claims in some popular articles.
Scott Alexander responds to articles claiming that AI researchers are not concerned about AI risk by providing a list of prominent AI researchers who have expressed concerns about the potential risks of advanced AI. He argues that there isn't a clear divide between 'skeptics' and 'believers', but rather a general consensus that some preliminary work on AI safety is needed. The post highlights that the main differences lie in the timeline for AI development and when preparations should begin, not whether the risks are real.
Apr 07, 2015
ssc
13 min · 1,783 words · 489 comments
Scott Alexander refutes the idea that an AI without a physical body couldn't impact the real world, presenting various scenarios where it could gain power and influence.
Scott Alexander argues against the notion that an AI confined to computers couldn't affect the physical world. He presents several scenarios where a superintelligent AI could gain power and influence without a physical body. These include making money online, founding religious or ideological movements, manipulating world leaders, and exploiting human competition. Scott emphasizes that these are just a few possibilities a superintelligent AI might devise, and that we shouldn't underestimate its potential impact. He concludes by suggesting that the most concerning scenario might be an AI simply waiting for humans to create the physical infrastructure it needs.
Oct 05, 2014
ssc
10 min · 1,379 words · 162 comments
Scott Alexander explores how perfect predictions of war outcomes, through oracles or prediction markets, could potentially prevent wars, and extends this concept to conflicts between superintelligent AIs.
Scott Alexander explores the concept of using oracles or prediction markets to prevent wars. He begins with a hypothetical scenario where accurate predictions of war outcomes are available, discussing how this might affect decisions to go to war. He then considers the Mexican-American War as an example, proposing a thought experiment where both sides could avoid the war by negotiating based on the predicted outcome. The post then shifts to discussing the potential of prediction markets as a more realistic alternative to oracles, referencing Robin Hanson's concept of futarchy. Finally, Scott speculates on how superintelligent AIs might resolve conflicts, drawing parallels to the idea of using perfect predictions to avoid destructive wars.
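As a toy illustration of the bargaining logic this summary gestures at (not a model from the post itself, and with made-up numbers): if both sides trust the same predicted outcome, the costs of actually fighting create a range of settlements that each side prefers to war.

```python
# Toy expected-value model of "negotiate instead of fight", with made-up numbers.
# V: value of the disputed prize, p: predicted probability that side A wins,
# cost_a / cost_b: each side's expected cost of actually fighting the war.

def settlement_range(V: float, p: float, cost_a: float, cost_b: float):
    """Return the range of shares of V for side A that both sides prefer to war."""
    a_war_payoff = p * V - cost_a          # A's expected payoff from fighting
    b_war_payoff = (1 - p) * V - cost_b    # B's expected payoff from fighting
    # The war costs open a gap between low and high: any split inside it
    # gives both sides more than their expected payoff from fighting.
    low, high = a_war_payoff, V - b_war_payoff
    return (low, high) if low <= high else None

print(settlement_range(V=100, p=0.75, cost_a=20, cost_b=20))  # -> (55.0, 95.0)
```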
Aug 01, 2014
ssc
9 min · 1,160 words · 430 comments
Scott Alexander addresses misconceptions about Moloch and human values, clarifying points from his previous post on the subject.
This post addresses several misconceptions about the concept of Moloch and human values, as discussed in a previous post. Scott clarifies that human values are not just about hedonism, explains how conquering the laws of physics is possible metaphorically, defends the potential truth value of widely-held beliefs, and acknowledges that while human values may have evolved through 'blind' processes, they can still be worth preserving. He uses analogies, philosophical arguments, and references to previous writings to counter these misconceptions, maintaining a somewhat technical but accessible tone throughout.
Jul 13, 2014
ssc
17 min · 2,250 words · 111 comments
Scott explores a dystopian future scenario of hyper-optimized economic productivity, speculating on the emergence of new patterns and forms of life from this 'economic soup'.
This post explores a dystopian future scenario based on Nick Bostrom's 'Superintelligence', where a brutal Malthusian competition leads to a world of economic productivity without consciousness or moral significance. Scott describes this future as a 'Disneyland with no children', where everything is optimized for economic productivity, potentially eliminating consciousness itself. He then speculates on the possibility of emergent patterns arising from this hyper-optimized 'economic soup', comparing it to biological systems and Conway's Game of Life. The post ends with musings on the potential for new forms of life to emerge from these patterns, and the possibility of multiple levels of such emergence.