How do you avoid getting lost among Scott Alexander's 1500+ blog posts? This unaffiliated fan website lets you sort and search the whole codex. Enjoy!

See also Top Posts and All Tags.

6 posts found
Jun 10, 2020
ssc
28 min · 3,621 words · 263 comments · podcast
Scott Alexander examines GPT-3's capabilities, improvements over GPT-2, and potential implications for AI development through scaling.
Scott Alexander discusses GPT-3, a large language model developed by OpenAI. He compares its capabilities to its predecessor GPT-2, noting improvements in text generation and basic arithmetic. The post explores the implications of GPT-3's performance, discussing scaling laws in neural networks and potential future developments. Scott ponders whether continued scaling of such models could lead to more advanced AI capabilities, while also considering the limitations and uncertainties surrounding this approach.
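For a feel for the scaling laws discussed in the post, here is a minimal sketch, assuming the power-law form from OpenAI's scaling-laws research; the constants are commonly cited fits used purely for illustration, not figures from Scott's post:

```python
# Illustrative neural scaling law: test loss falls as a power law in
# parameter count, L(N) = (N_c / N) ** alpha. Constants are assumed
# values for illustration only.
def power_law_loss(n_params, n_c=8.8e13, alpha=0.076):
    return (n_c / n_params) ** alpha

# Roughly GPT-2-sized vs. GPT-3-sized models:
for name, n in [("GPT-2 (1.5B)", 1.5e9), ("GPT-3 (175B)", 175e9)]:
    print(f"{name}: predicted loss ~ {power_law_loss(n):.2f}")
```

The shape of the curve is the point Scott draws out: loss keeps falling smoothly as parameter count grows, which is what makes "just scale it up" a live hypothesis.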
Jan 06, 2020
ssc
11 min · 1,343 words · 182 comments · podcast
Scott Alexander plays chess against GPT-2, an AI language model, and discusses the broader implications of AI's ability to perform diverse tasks without specific training.
Scott Alexander describes a chess game he played against GPT-2, an AI language model not designed for chess. Despite neither player performing well, GPT-2 managed to play a decent game without any understanding of chess or spatial concepts. The post then discusses the work of Gwern Branwen and Shawn Presser in training GPT-2 to play chess, showing its ability to learn opening theory and play reasonably well for several moves. Scott reflects on the implications of an AI designed for text prediction being able to perform tasks like writing poetry, composing music, and playing chess without being specifically designed for them.
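As a rough illustration of the underlying trick, chess games can be fed to a language model as plain text in standard algebraic notation, so "playing" reduces to continuing the move list. A minimal sketch using the Hugging Face transformers library; the stock gpt2 checkpoint here is a stand-in, since Gwern and Presser fine-tuned on large numbers of game transcripts, and the base model will not reliably produce legal moves:

```python
from transformers import pipeline

# Chess as next-token prediction: a model trained on game transcripts
# "plays" by continuing the text of the move list.
generator = pipeline("text-generation", model="gpt2")
prompt = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"  # a Ruy Lopez opening
result = generator(prompt, max_new_tokens=10, do_sample=True)
print(result[0]["generated_text"])
```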
Jun 20, 2019
ssc
2 min · 136 words · 109 comments · podcast
Scott Alexander humorously describes AI-generated content simulating humans pretending to be robots pretending to be humans on Reddit.
Scott Alexander humorously discusses the intersection of two subreddits: r/totallynotrobots, where humans pretend to be badly-disguised robots, and r/SubSimulatorGPT2, which uses GPT-2 to imitate various subreddits. The result is an AI-generated simulation of humans pretending to be robots pretending to be humans. Scott shares an example of this amusing output and expresses wonder at the current state of technology.
Mar 14, 2019
ssc
19 min · 2,349 words · 186 comments · podcast
Scott Alexander examines AI-generated poetry produced by Gwern's GPT-2 model trained on classical poetry, highlighting its strengths and limitations.
Scott Alexander reviews Gwern's experiment in training GPT-2 on poetry. The AI-generated poetry shows impressive command of meter and occasionally rhyme, though it tends to degrade in quality after the first few lines. Scott provides numerous examples of the AI's output, ranging from competent imitations of classical styles to more experimental forms. He notes that while the AI sometimes produces nonsensical content, it can also generate surprisingly beautiful and coherent lines. The post concludes with a reflection on how our perceptions of poetry might be influenced by knowing whether it's human or AI-generated.
Feb 19, 2019
ssc
27 min · 3,491 words · 262 comments · podcast
Scott Alexander explores GPT-2's unexpected capabilities and argues that it demonstrates the potential for AI to develop abilities beyond its explicit programming, challenging skepticism about AGI.
This post discusses GPT-2, an AI language model, and its implications for artificial general intelligence (AGI). Scott Alexander argues that while GPT-2 is not AGI, it demonstrates unexpected capabilities that arise from its training in language prediction. He compares GPT-2's learning process to human creativity and understanding, suggesting that both rely on pattern recognition and recombination of existing information. The post explores examples of GPT-2's abilities, such as rudimentary counting, acronym creation, and translation, which were not explicitly programmed. Alexander concludes that while GPT-2 is far from true AGI, it shows that AI can develop unexpected capabilities, challenging the notion that AGI is impossible or unrelated to current AI work.
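The "abilities nobody programmed in" point can be probed directly: GPT-2 was trained only on next-word prediction, yet a bare prompt can coax out, for example, rudimentary translation. A hedged sketch follows; the prompt format is illustrative, and results from the base model are hit-or-miss:

```python
from transformers import pipeline

# Zero-shot probing: no translation objective was ever trained, so any
# success comes from patterns absorbed during language-model pretraining.
generator = pipeline("text-generation", model="gpt2")
prompt = "English: good morning\nFrench:"
print(generator(prompt, max_new_tokens=5)[0]["generated_text"])
```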
Feb 18, 2019
ssc
20 min · 2,532 words · 188 comments · podcast
Scott Alexander draws parallels between OpenAI's GPT-2 language model and human dreaming, exploring their similarities in process and output quality.
Scott Alexander compares OpenAI's GPT-2 language model to human dreaming, noting similarities in their processes and outputs. He explains how GPT-2 works by predicting next words in a sequence, much like the human brain predicts sensory input. The post explores why both GPT-2 and dreams produce narratives that are coherent in broad strokes but often nonsensical in details. Scott discusses theories from neuroscience and machine learning to explain this phenomenon, including ideas about model complexity reduction during sleep and comparisons to AI algorithms like the wake-sleep algorithm. He concludes by suggesting that dream-like outputs might simply be what imperfect prediction machines produce, noting that current AI capabilities might be comparable to a human brain operating at very low capacity.
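The prediction process Scott describes can be made concrete in a few lines: at each step the model scores every token in its vocabulary, generation samples from that distribution, and the sampled token is appended to the sequence. A minimal sketch with the Hugging Face GPT-2 implementation; temperature, top-k filtering, and other decoding details are omitted:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tokenizer("The dream began in a", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(10):
        logits = model(ids).logits[0, -1]      # scores for every possible next token
        probs = torch.softmax(logits, dim=-1)  # convert scores to probabilities
        next_id = torch.multinomial(probs, 1)  # sample one token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
print(tokenizer.decode(ids[0]))
```

The dream analogy falls out of this loop: each token is only a local guess conditioned on what came before, so broad coherence with garbled detail is the natural failure mode.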