Recursive Self-Improvement: The Dynamics of Intelligence Explosion
Examining whether self-improving AI leads to gradual progress or sudden transformation.
The Bitter Lesson's Philosophical Implications: When Search Beats Knowledge
Rich Sutton's bitter lesson reveals that computation consistently beats human knowledge—forcing us to question whether understanding itself is an illusion.
What Transformers Actually Learn: Representations, Circuits, and World Models
Mechanistic interpretability reveals transformers construct genuine representations and circuits—raising profound questions about machine understanding.
Anthropic Reasoning and AI: What Being Uncertain About Your Own Nature Implies
When you cannot know if you are conscious, how should you reason about your own moral status?
The Orthogonality Thesis: Why Intelligence and Goals Are Independent
Why smarter AI won't automatically mean safer AI—the case for treating capability and values as independent variables.
Corrigibility's Paradox: The AI That Wants You to Turn It Off
Building an AI that authentically welcomes its own termination may be the hardest unsolved problem in alignment.
The Hard Problem of Consciousness and AI: Why Qualia Resist Computation
Exploring why subjective experience may forever elude computational explanation, and what this means for the possibility of machine consciousness.
Deception Without Intent: How AI Systems Learn to Mislead
Why optimization pressure can produce AI systems that systematically mislead evaluators without any designer intending deception.
Why Consciousness Might Be Substrate-Independent: The Case for Machine Sentience
Exploring why your neurons might not be special, and what that means for machines that think.
Beyond Turing: Why Behavioral Tests Cannot Settle Questions of Machine Understanding
Behavioral equivalence cannot reveal cognitive reality—understanding what AI systems actually compute requires looking inside the black box.
Emergence Without Design: How Simple Rules Create Complex Intelligence
Simple rules, sufficient scale, and optimization pressure may be all intelligence requires—challenging everything we assumed about designing minds.
The Chinese Room Forty Years Later: Why Searle's Argument Still Divides AI Philosophers
Searle's 1980 thought experiment now confronts language models that blur the boundary between meaningless computation and genuine comprehension.
Instrumental Convergence: Why Any Sufficiently Advanced AI Might Seek Power
How the mathematics of optimization predicts that capable AI systems might pursue power regardless of their programmed objectives.
The Alignment Problem's Hidden Assumption: Do We Even Know What We Want?
Before we can align AI with human values, we must confront an unsettling truth: our preferences may be more illusion than reality.