Dwarkesh Podcast
Holden Karnofsky: History's most important century
CHAPTERS
- 0:00 – 0:52
Why this might be a uniquely pivotal century (and why ignoring it is a mistake)
Holden opens with the core intuition: if AI can automate the work of advancing science and technology, the resulting acceleration could be world-transforming. He frames the psychological challenge—this all sounds 'crazy'—and argues the worst norm would be to dismiss high-stakes possibilities out of reflexive skepticism.
- •AI that can do all key R&D tasks would be 'insane' in its implications
- •We live in an unusually fast-changing, high-growth moment historically
- •A norm of dismissing 'important time' claims would fail in the worlds where they’re true
- •Call for vigilance: actively look for radical transformation scenarios
- •Sets the tone: big claims, but aimed at careful, non-hysterical action
- 0:52 – 2:29
From GiveWell to Open Phil to longtermist focus: the career logic behind the thesis
Dwarkesh asks how Holden moved from global health and poverty to far-future concerns. Holden explains his professional specialization: finding underappreciated, high-leverage opportunities, which led him through Effective Altruism and eventually to taking AI seriously as a potential dominant driver of future welfare.
- •GiveWell roots: maximizing impact per dollar/hour with evidence where possible
- •Open Phil mission: search for neglected, outsized-impact opportunities
- •Encounter with Effective Altruism community ideas
- •The 'Most Important Century' thesis is one Holden adopted rather than originally invented
- •Motivation stays consistent: high expected impact, not sci-fi fascination
- 2:29 – 6:50
The Most Important Century thesis: growth feedback loops and AI restoring acceleration
Holden lays out the thesis mechanics using growth theory: historically, a loop of more people → more ideas → more resources → more people produced accelerating growth, but changing fertility patterns (the demographic transition) broke the loop. If AI supplies the 'more ideas' component at scale, the loop can re-ignite and produce explosive technological change within this century.
- •Economic growth history shows long-run acceleration
- •Classic loop: people → ideas → resources → more people (historically)
- •Demographic transition broke the 'more resources → more people' step
- •Transformative AI could replace humans in the 'ideas' step
- •Result: explosive progress, potentially compressing millennia into decades
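The loop described in this chapter can be sketched as a toy simulation. This is an illustration only, not a model presented in the episode: the function name `simulate_growth`, the `idea_rate` parameter, and every number below are hypothetical. It assumes that new ideas arrive in proportion to the number of 'thinkers', and that thinkers either scale with resources (loop closed, whether through population growth or AI) or stay fixed (loop broken).

```python
def simulate_growth(steps, loop_closed, idea_rate=0.01):
    """Return per-step growth rates of a toy 'technology level'.

    loop_closed=True:  the number of thinkers scales with resources
                       (more people historically, or AI in the thesis).
    loop_closed=False: the number of thinkers stays fixed
                       (the post-demographic-transition case).
    All quantities are in made-up units; this is an assumption-laden sketch.
    """
    tech = 1.0  # toy stand-in for resources/technology
    rates = []
    for _ in range(steps):
        thinkers = tech if loop_closed else 1.0
        rate = idea_rate * thinkers  # more thinkers -> more ideas
        rates.append(rate)
        tech *= 1.0 + rate           # more ideas -> more resources
    return rates


closed = simulate_growth(50, loop_closed=True)
broken = simulate_growth(50, loop_closed=False)
print(f"loop closed: first rate {closed[0]:.4f}, last rate {closed[-1]:.4f}")
print(f"loop broken: first rate {broken[0]:.4f}, last rate {broken[-1]:.4f}")
```

With the loop closed, the growth rate itself rises every step (accelerating growth); with the loop broken, the rate stays constant (ordinary exponential growth). The sketch says nothing about how likely either regime is.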
- 6:50 – 14:11
How he updated since 2014: from ‘overwhelming’ futurism to actionable AI risk work
Dwarkesh quotes Holden’s older skepticism about the far future. Holden explains what changed: sustained attention over years, clearer pictures of plausible risks and interventions, and rapid progress from deep learning that makes 'staying on the same path' to powerful AI feel less implausible.
- •2014 view: global health felt tractable; far future felt too hazy
- •Update 1: time spent thinking made some claims/action paths feel concrete
- •Update 2: the world changed—deep learning progress made scaling scenarios plausible
- •Alignment and governance work became more legible as a practical agenda
- •Still emphasizes uncertainty: focuses on a few bets rather than grand plans
- 14:11 – 21:20
‘Weirdness’ without AI: why our era already looks exceptional on many metrics
Holden argues the 'this sounds crazy' objection should be moderated by baseline weirdness: nearly all major growth and technological change is packed into a tiny slice of cosmic time. Even if AI timelines are wrong, we still seem to inhabit an unusually pivotal period deserving heightened attention to 'the next big thing.'
- •The series intentionally makes a bold claim and then defends its plausibility
- •Charts/metrics: rapid economic growth and technological clustering in recent centuries
- •Cosmic timeline framing: civilization is a tiny sliver of cosmic history; the period of rapid change is an even tinier one
- •Resource/physics arguments suggest current growth can’t continue indefinitely
- •Even 'AI in 100k years' would still be early and weird on galactic timescales
- 21:20 – 25:52
Industrial Revolution analogy: what could people have done, and what does that imply now?
Dwarkesh presses on the tension between ‘wild transition’ and ‘actionable levers.’ Holden uses the Enlightenment as a suggestive precedent: shaping norms and institutions before a power-boosting revolution can matter, even if the actors didn’t fully anticipate the revolution’s specifics.
- •Analogy is imperfect but useful for thinking about influence before a transition
- •Possible lever: shaping culture/political philosophy (e.g., rights, liberties)
- •Enlightenment ideas arguably mattered more because the societies that held them became powerful in the industrial era
- •Not claiming certainty—argues for trying without overconfidence
- •Key stance: avoid defeatism; don’t assume nothing can be done pre-transition
- 25:52 – 35:44
Ethical restraint under high stakes: integrity over ‘ends justify the means’
Holden addresses the 'everyone thinks they’re special' skepticism and proposes a behavioral norm: take beliefs seriously, but keep strong ethical constraints. He rejects lawbreaking or coercive tactics based on speculative expected value, emphasizing humility and historical lessons about moral catastrophes from overconfident movements.
- •Many people may self-deceive about living in pivotal times; skepticism is warranted
- •But ignoring high-stakes beliefs entirely would be a disastrous rule
- •Proposed norm: act on beliefs while respecting common-sense ethics
- •Explicit rejection of ends-justify-means reasoning under uncertainty
- •Practical takeaway: pursue helpful work (e.g., alignment) without harmful tactics
- 35:44 – 40:51
What ‘success’ looks like with transformative AI: muddling through without catastrophe
Dwarkesh asks for a concrete success scenario. Holden emphasizes epistemic modesty: success could simply mean AI behaves as intended, power isn’t monopolized, and society continues its broad trend of improvement—then humanity and potential digital beings can negotiate rights, governance, and coexistence over time.
- •Holden avoids detailed utopias: expects only a 'fuzzy outline' of what’s coming
- •Success: AI as tools/amplifiers, not agents with uncontrolled goals
- •Avoid concentration of power (single actor/government owning everything)
- •Over time, debates about rights (including possible AI/digital people) emerge
- •Key constraint: prevent early catastrophe to preserve time and options
- 40:51 – 52:36
Response speed and philanthropic leverage: what changes if AGI is months vs decades away
Holden contrasts short-fuse and long-fuse worlds. With only months, he’d prioritize tests/demonstrations of danger to justify slowing down; but philanthropy’s core strength is long-run field-building—funding researchers, governance work, and institutional capacity to handle the transition.
- •If AGI is imminent, Holden expects ‘flailing’—philanthropy is not built for 3-month crises
- •Near-term: develop credible safety/danger evaluations; push for slowdown if needed
- •10–80 year horizons: build talent pipelines and expert fields early
- •500-year horizons: focus on broad societal robustness/wisdom rather than specific alignment tech
- •Main comparative advantage: seeding neglected domains and supporting early-career work
- 52:36 – 55:13
Competition vs caution: racing dynamics, innovation ‘mining,’ and coordination needs
Dwarkesh challenges Holden’s skepticism of the 'competition frame' given that multiple labs will likely trail one another closely. Holden argues caution and coordination are primary: while it may matter who leads, the priority is developing ways for multiple actors to avoid disaster rather than simply accelerating a favored team.
- •Competition frame: ‘my side should build it first’ vs caution frame: ‘avoid disaster together’
- •Expect several leading actors to be close to one another in capability
- •Innovation-as-mining: discoveries are partly one-time, but speed advantages still exist
- •Holden is less enthusiastic about boosting a single winner than many newcomers are
- •Emphasis on coordination mechanisms and shared safety norms
- 55:13 – 1:00:11
AGI bottlenecks and ‘partial automation’: why one remaining human step may not stop acceleration
Holden engages the strongest critique: even very capable AI might be bottlenecked by a non-automatable step (experiments, human signoff, real-world constraints). He argues explosive feedback doesn’t require automating everything—key loops (AI chips/algorithms, energy) could self-reinforce, and massive ‘thinker’ populations could route around bottlenecks via simulation and innovation.
- •Best objection: a single non-automatable step could bottleneck progress
- •Counter: you don’t need full-economy automation for runaway R&D loops
- •Key sectors (compute, energy, manufacturing) may be less bottlenecked than expected
- •Human-like hypothesis generation could enable bottleneck-busting strategies
- •Even before full robotics adoption, regulation and social constraints may shape which tasks automate first
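The bottleneck objection in this chapter can be made concrete with an Amdahl's-law-style calculation. Amdahl's law comes from parallel computing and is borrowed here purely as an analogy; it is not something the episode summary mentions, and the function name `overall_speedup` and the 99% figure are hypothetical. The sketch illustrates only the objection, not Holden's reply that self-reinforcing loops may route around the bottleneck.

```python
def overall_speedup(automatable_fraction, task_speedup):
    """Amdahl-style overall speedup when only part of a process gets faster.

    automatable_fraction: share of the original time spent on steps AI can
                          accelerate (hypothetical input).
    task_speedup:         how much faster those steps become.
    """
    remaining = 1.0 - automatable_fraction  # steps still done at human speed
    return 1.0 / (remaining + automatable_fraction / task_speedup)


# Even with ever-larger speedups on 99% of the work, the remaining 1%
# limits the overall gain.
for speedup in (10, 1_000, 1_000_000):
    print(speedup, round(overall_speedup(0.99, speedup), 2))
```

Under these assumptions, the overall gain approaches but never exceeds 1 / (1 - automatable_fraction). The counterargument summarized above is that the fixed-fraction assumption itself may fail if a large population of 'thinkers' finds ways to shrink or bypass the remaining step.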
- 1:00:11 – 1:08:35
Lock-in risk: stable dystopias, optionality, and alignment as avoiding accidental value capture
Holden defines lock-in as the possibility that advanced tech yields extremely stable power structures (e.g., surveillance dictatorships, indefinite rulers, no further dynamism). He generally sees lock-in as bad and prefers preserving optionality; he frames alignment primarily as preventing accidental ‘random-goal’ takeovers rather than permanently freezing human values.
- •Lock-in: technological maturity could remove sources of change (death, shifting power, new tech)
- •A stable, surveillant, non-aging dictator is a vivid failure mode
- •Holden treats the risk as serious, with a rough and uncertain estimate (ballpark: maybe 25–50%)
- •Prefers avoiding lock-in; may ‘lock in’ anti-monopoly constraints to prevent worse lock-in
- •Alignment framed as preventing accidental mis-specified goals from dominating the future
- 1:08:35 – 1:18:04
Forecasting limits and evidence: surveys, ‘semi-informative priors,’ and biological anchors
The conversation turns to prediction quality: can we credibly estimate AI timelines? Holden argues we should use multiple imperfect inputs (current capability trendlines, expert surveys, the argument that most of the effort ever invested in AI falls within this century, and biological anchors suggesting that human-brain-scale compute and training may become affordable) while rejecting demands for a track record of precise milestone predictions.
- •Forecasting is hard, but not a reason to abstain entirely
- •Inputs: observed progress, expert surveys, ‘effort invested’ arguments, biological anchors
- •Biological anchors distilled: brain-scale compute hasn’t been reached yet, but may be reached this century
- •Responds to ‘track record’ critiques: limited sample size; few tried rigorous forecasting historically
- •Bottom line: uncertainty remains, but ‘transformative AI this century’ looks plausible
- 1:18:04 – 1:25:15
Cause prioritization under transformation: progress studies vs catastrophic-risk focus
Dwarkesh raises the idea that accelerating growth and innovation might be the best way to improve the future. Holden is sympathetic but argues we’re entering a ‘gray zone’ where new technologies (AI, bioweapons) can be catastrophically dangerous, so steering and safety may deserve higher priority than rowing faster.
- •Progress can be good, but history also includes tech worsening welfare (e.g., Agricultural Revolution)
- •Catastrophic technologies may dominate marginal priorities now
- •Analogy: humanity as an adolescent gaining dangerous strength
- •Holden prefers prioritizing neglected existential-risk and governance problems
- •Also notes potential idea scarcity/stagnation dynamics and eventual limits to growth
- 1:25:15 – 1:30:19
Open Phil decision-making: money, downside risk, and the $30M OpenAI grant
Holden discusses how philanthropy weighs expected value against downside risks and reputational/ethical responsibility. He defends the OpenAI grant as aiming partly at governance influence (a board seat) and argues it’s not clear OpenAI is net negative: even if faster AI progress reduces preparation time, its safety posture and precedent-setting may be better than those of the counterfactual actors.
- •Principle: seek net-positive grants; do serious homework on downsides
- •Conservative about avoidable mistakes; accepts unintended side effects as inevitable
- •OpenAI grant rationale: early governance influence plus potential positive precedents
- •Disagrees with ‘OpenAI is net negative’ framing; sees it as debatable
- •General approach: integrity, diligence, and risk minimization rather than paralysis
- 1:30:19 – 1:46:53
Future-proof ethics, moral uncertainty, and governance mindset (including Bayesian habits)
Holden explains ‘future-proof ethics’ as trying to act now in ways a wiser future self wouldn’t find monstrous, emphasizing systematization and thin utilitarian themes while noting unresolved reservations (especially about sentientism). He connects this to moral uncertainty via ‘moral parliaments,’ rejects simplistic utilitarian license to lie, and briefly discusses Bayesian reasoning and institutional stakeholder dynamics.
- •Future-proof ethics: reduce the chance future reflection condemns today’s actions
- •Systematization vs messy intuitions; thin utilitarianism and sentientism as candidate principles
- •Holden flags weaknesses/reservations and plans future critique of EA ethical defaults
- •Moral parliament: negotiate between moral perspectives; moderates extreme actions
- •Bayesian mindset as a useful-but-not-proven tool; adding stakeholders trades off against nimbleness
- 1:46:53 – 1:56:10
Career strategy, Cold Takes’ purpose, and why public writing matters for high-stakes bets
In the closing stretch, Holden describes his ‘first-pass analysis then team-building’ career pattern and explains why Cold Takes exists: to surface unconventional premises behind large philanthropic bets, attract talent/partners, and solicit critique. Dwarkesh shares that engaging with the arguments made him more convinced of the core thesis, even if many details remain uncertain.
- •Holden’s specialization: first-cut analysis on neglected, important questions, then build teams
- •Cold Takes is intentionally non-Open-Phil-branded to allow freer exploration
- •Public writing aims to (1) recruit aligned work and (2) expose errors via critique
- •CEO depth: understands topics well enough to manage specialists, not to out-specialize them
- •Ending note: Dwarkesh’s skepticism toward the thesis softened; open questions remain