Dwarkesh Podcast

Adam Marblestone on Dwarkesh Patel: Why AI Uses Blunt Losses

Evolution-encoded cost functions give the amygdala a steering role; it labels cortex neurons with a per-stage curriculum that LLMs have no equivalent for.

Dwarkesh Patel (host), Adam Marblestone (guest)
Dec 30, 2025 · 1h 49m

CHAPTERS

  1. 0:00 – 2:37

    Why humans learn so efficiently: evolution’s “Python code” of reward functions

    Dwarkesh opens with the puzzle of why brains achieve more capability with far less data than today’s LLMs. Marblestone frames brain intelligence in ML terms (architecture, learning rule, initialization, and especially cost/reward functions) and argues AI has underweighted the complexity of biologically-evolved objectives and curricula.

    • Decomposing intelligence into architecture, learning algorithm, initialization, and loss/reward
    • Hunch: the brain’s key advantage is rich, staged, specialized cost functions
    • Evolution may encode curricula—many context-dependent “losses” for different circuits
    • ML prefers mathematically simple objectives; biology likely does not
  2. 2:37 – 6:32

    Cortex as a general prediction engine: omnidirectional inference vs next-token prediction

    Marblestone explores the idea that cortex might be optimized for flexible, omnidirectional prediction—inferring any subset of variables from any other subset—rather than a single direction like next-token prediction. He connects this to energy-based models and probabilistic inference perspectives associated with Yann LeCun.

    • Six-layer cortex structure vs “layers” in network connectivity
    • Hypothesis: cortex learns joint distributions enabling inference with arbitrary clamping
    • LLMs natively do next-token prediction; cortex may support missing-variable completion by design
    • Energy-based / probabilistic framing: sampling from conditionals in many directions
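To make "inference with arbitrary clamping" concrete, here is a minimal sketch, assuming a tiny discrete joint distribution stored as a table. The table values and the helper name `conditional` are hypothetical illustrations, not anything stated in the episode. The point is only that once a joint distribution is available, any subset of variables can be clamped at query time and the rest inferred, with no fixed prediction direction.

```python
import numpy as np

def conditional(joint, clamped):
    """Return P(unclamped variables | clamped variables) from a joint table.

    joint   : array with one axis per discrete variable; entries sum to 1
    clamped : dict mapping axis index -> observed value on that axis
    """
    # Keep only the slice of the table consistent with the observations.
    index = tuple(clamped.get(axis, slice(None)) for axis in range(joint.ndim))
    unnormalized = joint[index]
    # Renormalize so the remaining entries form a distribution.
    return unnormalized / unnormalized.sum()

# Hypothetical joint over three binary variables (A, B, C); axes are A, B, C.
joint = np.array([[[0.10, 0.05], [0.05, 0.20]],
                  [[0.15, 0.05], [0.10, 0.30]]])

# Clamp A = 1 and infer the joint over (B, C).
p_bc_given_a = conditional(joint, {0: 1})

# Clamp C = 1 instead, then marginalize B to infer A: the "reverse" direction.
p_a_given_c = conditional(joint, {2: 1}).sum(axis=1)
```

The same table answers both queries. A model trained only to predict C from (A, B) would need separate training to answer the second one.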
  3. 6:32 – 15:05

    How evolution links abstract concepts to primal drives: the steering subsystem and “thought assessors”

    The conversation turns to the hard problem Ilya Sutskever raised: how evolution encodes high-level desires (status, shame, social goals) without knowing modern contexts. Marblestone outlines Steven Byrnes’s model: a subcortical “steering subsystem” with innate heuristics, plus learned predictors (“thought assessors”) that connect abstract cortical representations to innate reward circuitry.

    • Problem: evolution can’t pre-specify modern triggers (e.g., embarrassment about experts)
    • Steering subsystem: innate heuristics + primitive sensory pathways (e.g., superior colliculus face/threat cues)
    • Amygdala/cortex learn predictors of steering variables, enabling generalization
    • Spider example: abstract cues (words/concepts) can trigger learned prediction of innate responses
  4. 15:05 – 19:14

    Could AI replicate omnidirectional inference? Masks, multimodality, and inductive biases

    Dwarkesh asks whether we can simply train models to predict “every token from every token” or enforce cross-modal mappings to achieve cortex-like flexibility. Marblestone distinguishes training on a fixed set of prediction tasks from systems that can choose which variables to infer at test time, and notes possible roles for better representations and architectural priors.

    • Training on fill-in-the-blank tasks isn’t the same as arbitrary test-time clamping
    • Representation/embedding quality may limit cross-domain abstraction and connection-making
    • Biological preprocessing (retina/V1-like transforms) might matter for sample efficiency
    • Architectural priors (e.g., surfaces/occlusion assumptions) as a route to data efficiency
  5. 19:14 – 22:16

    Sponsor segment: Dwarkesh’s multi-agent AlphaZero experiment built with Gemini

    Dwarkesh describes using Gemini to prototype and parallelize an AlphaZero-style population training setup to test whether splitting compute across many agents can outperform training a single agent. He reports preliminary results suggesting population diversity can yield gains even when each agent gets far less compute.

    • Hypothesis: co-evolutionary / population training may beat single-agent self-play under fixed compute
    • Gemini-assisted refactoring and benchmarking to parallelize self-play efficiently
    • Result claim (tentative): best agent in a population outperforms single-agent baseline despite less compute
    • Meta-point: AI coding assistants accelerate experimentation and iteration cycles
  6. 22:16 – 28:08

    Amortized vs “real” inference: test-time compute, sampling, and what brains might do

    They unpack amortized inference: using a feedforward network to approximate expensive Bayesian sampling. Marblestone notes probabilistic AI arguments for inherent test-time compute, while also observing perception is fast—suggesting some amortization—then connects this to what evolution “chooses” to bake into genomes versus learn during life.

    • Bayesian inference is intractable; sampling methods require heavy test-time compute
    • Neural nets as amortizers: mapping observations directly to latent causes
    • Brains may mix stochastic sampling with fast approximate forward passes
    • Evolution may “amortize” more into reward/bootstrapping functions than into pretrained world models
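The contrast between sampling-based and amortized inference can be sketched on a toy problem. Everything below is an assumed illustration, not a model from the episode: a latent `z ~ N(0, 1)` is observed as `x = z + noise`, so the exact posterior mean is `x / (1 + SIGMA**2)`. The sampler spends compute at query time. The amortizer spends it once up front, fitting a direct map from observation to latent estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.5  # assumed observation-noise standard deviation

def sample_based_posterior_mean(x, n_samples=20000):
    """Estimate E[z | x] by importance sampling: heavy test-time compute."""
    z = rng.standard_normal(n_samples)           # draw from the prior
    log_w = -0.5 * ((x - z) / SIGMA) ** 2        # Gaussian log-likelihood (up to a constant)
    w = np.exp(log_w - log_w.max())              # subtract max for numerical stability
    return float(np.sum(w * z) / np.sum(w))

def fit_amortizer(n_train=50000):
    """Fit a linear map x -> z on simulated pairs; inference is then one multiply."""
    z = rng.standard_normal(n_train)
    x = z + SIGMA * rng.standard_normal(n_train)
    slope = float(np.dot(x, z) / np.dot(x, x))   # least-squares slope through the origin
    return lambda x_new: slope * x_new

amortizer = fit_amortizer()
exact = 1.0 / (1.0 + SIGMA ** 2)                 # exact posterior mean for x = 1.0
slow_estimate = sample_based_posterior_mean(1.0)
fast_estimate = amortizer(1.0)
```

Both estimates should land close to `exact`. The difference is where the cost is paid: per query for the sampler, once during fitting for the amortizer.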
  7. 28:08 – 42:42

    What the genome encodes: why reward circuitry may be more bespoke than cortex

    Dwarkesh raises the information bottleneck: the genome is small relative to what brains do. Marblestone argues this fits a view where cortex is a relatively uniform learning substrate, while the steering/reward system contains many specialized, genetically wired circuits—possibly reflected in the diversity of cell types found in subcortical regions.

    • Genome likely can’t pretrain detailed world knowledge; it can encode compact wiring rules and objectives
    • Evidence: cell-type atlases suggest more diverse/bespoke types in steering regions vs cortex
    • Bespoke rewards may require special cell types or genetic wiring mechanisms (receptors/proteins)
    • Rapid hominid brain expansion could be “more of the same cortex” plus social-learning incentives
  8. 42:42 – 50:25

    What kind of reinforcement learning is the brain doing? Basal ganglia, dopamine, and RL-as-inference

    Dwarkesh compares RLHF-style credit assignment to human learning. Marblestone describes brain subsystems that resemble model-free RL (basal ganglia), dopamine as reward prediction error consistent with temporal-difference learning, and a cortical world model that can enable model-based planning and “RL as inference” by clamping high reward and sampling plans.

    • Modern LLM RL is surprisingly crude compared to value-function methods
    • Basal ganglia as a relatively small action-space RL controller/gater
    • Dopamine signals align with reward prediction error and TD learning ideas
    • Cortex can model rewards and plans, enabling inference-style planning via clamping reward
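The reward prediction error mentioned above is the quantity `delta = r + gamma * V(s') - V(s)` from temporal-difference learning. The following is a minimal TD(0) sketch on a hypothetical three-state chain with a single reward at the end. The environment, function name, and parameters are all assumptions chosen for illustration, and it makes no claim about how the basal ganglia implement this.

```python
def td0_chain(n_states=3, gamma=0.9, alpha=0.1, episodes=500):
    """Run TD(0) on a deterministic chain s0 -> s1 -> ... -> terminal.

    A reward of 1.0 arrives only on the transition into the terminal state.
    Returns the learned values and the prediction errors from the last episode.
    """
    values = [0.0] * (n_states + 1)  # final entry is the terminal state, kept at 0
    last_errors = []
    for _ in range(episodes):
        last_errors = []
        for s in range(n_states):
            s_next = s + 1
            reward = 1.0 if s_next == n_states else 0.0
            # Reward prediction error: what happened minus what was expected.
            delta = reward + gamma * values[s_next] - values[s]
            values[s] += alpha * delta
            last_errors.append(delta)
    return values, last_errors

values, last_errors = td0_chain()
```

After training, the values approach `gamma` raised to the number of steps remaining before the reward, and the prediction errors shrink toward zero. This matches the textbook account in which a fully predicted reward no longer produces an error signal.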
  9. 50:25 – 57:04

    Is biological hardware a limitation or advantage? Copyability, stochastic neurons, and co-design

    They debate whether brains would perform better on today’s hardware or whether biology offers unique advantages. Marblestone emphasizes the big disadvantage: brains can’t be copied or random-access edited, while noting biology may be well matched to low-power stochastic inference and memory/compute co-location, motivating algorithm–hardware co-design.

    • Major downside: brains aren’t copyable/editable like weight matrices
    • Potential upsides: energy efficiency, natural stochasticity, tight memory/compute integration
    • Neural stochasticity may make sampling-based inference cheaper than on digital hardware
    • Many cellular mechanisms may be implementation overhead rather than extra algorithmic ‘magic’
  10. 57:04 – 1:03:59

    Alignment implications: minimal drives for capability vs human social instincts, and the limits of our vocabulary

    Dwarkesh asks how different future intelligences could be if their “steering” differs. Marblestone suggests powerful optimization could arise with far fewer drives than human ethics/sociality (paperclipper concern), then discusses whether our current AI-derived conceptual vocabulary is adequate for neuroscience—or whether we need bottom-up primitives as some neuroscientists argue.

    • Capability may require only minimal drives (curiosity/exploration), not full human moral/social instincts
    • Steering subsystem understanding matters for alignment, not just performance
    • AI-to-brain reverse engineering may work partially, but bottom-up approaches may reveal new primitives
    • Debate over whether concepts like backprop/RL capture brain reality or mislead
  11. 1:03:59 – 1:23:28

    Why we need to map the brain: connectomes, timelines, costs, and the Human Genome Project analogy

    Marblestone makes the case that large-scale neuroscience data (connectomics plus molecular annotation) can constrain theories far faster than bespoke experiments. He outlines practical timelines (likely irrelevant to very short “AI 2027” scenarios), cost targets for mouse vs human-scale mapping, what a connectome includes, and why the focus should be technology cost reduction—mirroring genome sequencing’s cost curve.

    • Connectomes as a way to learn many ‘basic facts’ at once (wiring, cell types, projections)
    • Funding scale: hundreds of millions to low billions for major progress with a concerted push
    • E11Bio’s goal: reduce mouse connectome cost from billions to tens of millions; humans are ~1000× larger
    • Optical methods enable molecularly annotated connectomes; not identical to synaptic weights but richer constraints
    • Human Genome Project lesson: invest in technology to drive massive cost drops, not one brute-force map
  12. 1:23:28 – 1:38:17

    What value will automating math have? Lean, RL with verifiable signals, and provable software

    Marblestone explains Lean as a proof assistant that makes proofs mechanically verifiable, enabling strong RL-style training signals for theorem proving. He expects proof-search automation to progress quickly, while noting the harder frontier is conjecture generation and specifying what’s ‘interesting’; he also emphasizes applications to formally verified software and cybersecurity, plus the “specification problem” as a key bottleneck.

    • Lean enables click-to-verify correctness, improving collaboration and trust in proofs
    • Formal theorem proving becomes a near-ideal RLVR task with crisp rewards
    • Big impact: offload mechanical proof steps; humans shift to higher-level strategy and conjectures
    • Provable software/hardware: prove safety/security properties (memory isolation, invariants, etc.)
    • Core bottleneck: generating correct formal specifications for real-world systems
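For readers who have not seen Lean, here is a small sketch of what "mechanically verifiable" means in practice. The theorem names are hypothetical and the statements are deliberately trivial. It assumes Lean 4 with only the core library (no Mathlib). If the file compiles, the proofs are correct, which is the crisp pass/fail signal that makes theorem proving attractive for RL.

```lean
-- Lean 4, core library only (assumed; no Mathlib required).

-- A statement plus a proof term: the checker accepts it only if the term
-- really has the stated type.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- The same style of claim proved with a tactic script instead of a term.
theorem zero_add_example (n : Nat) : 0 + n = n := by
  exact Nat.zero_add n
```

Replacing either proof with an incorrect one makes compilation fail. The specification problem noted above is the separate question of whether the stated theorem says what was actually intended.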
  13. 1:38:17 – 1:44:54

    Brain architecture and open questions: representations, consciousness, continual learning, and attention mechanisms

    They close with broader neuroscience unknowns: whether the brain’s world model is symbolic or a messy high-dimensional latent space, what consciousness/binding might be, and what mechanisms support continual learning. Marblestone points to hippocampal replay/consolidation and multiple plasticity timescales as candidates, and speculates about thalamocortical gating as a possible analogue to attention-like routing.

    • Representation debate: cognitive maps, variable binding, and whether symbols exist neurally
    • Marblestone’s hunch: representations are messy; prioritize learning rules, architectures, and objectives
    • Conscious experience/binding: little consensus; may require new conceptual breakthroughs
    • Continual learning: hippocampus replay/consolidation, multi-timescale plasticity, synapses with many states
    • Attention analogues: multiple biological ‘attention’ types; thalamus may gate/route information
  14. 1:44:54 – 1:49:53

    The “gap map” and scaling science: focused research organizations as mini-Hubbles

    Marblestone describes Convergent Research’s ‘gap map’—a catalog of missing scientific infrastructure that could unlock whole fields, analogous to building a Hubble telescope rather than a single discovery. He notes the surprising breadth of infrastructure needs (including in math via Lean) and argues scalable tools complement, rather than replace, small-lab creativity.

    • Gap map: hundreds of ‘fundamental capabilities’ projects that unblock many downstream discoveries
    • FRO model: organized engineering + open, public-benefit infrastructure
    • Surprising finding: even math benefits from major infrastructure investments (proof tooling)
    • Scale is increasingly required across sciences, but should coexist with exploratory academic work
