Adam Marblestone on the Dwarkesh Podcast: Why AI Uses Blunt Losses
Evolution-encoded cost functions give the amygdala a steering role; it labels cortex neurons with a per-stage curriculum that LLMs have no equivalent for.
Episode Details
EPISODE INFO
- Released: December 30, 2025
- Duration: 1h 49m
- Channel: Dwarkesh Podcast
- Watch on YouTube
EPISODE DESCRIPTION
Adam Marblestone is CEO of Convergent Research. He’s had a very interesting past life: Research Scientist at Google DeepMind on their neuroscience team, with work spanning brain-computer interfaces, quantum computing, nanotech, and formal mathematics. We discuss how the brain learns so much from so little, what the AI field can learn from neuroscience, and the answer to Ilya’s question: how does the genome encode abstract reward functions? Turns out, they’re all the same question.
EPISODE LINKS
- Transcript: https://www.dwarkesh.com/p/adam-marblestone
- Apple Podcasts: https://podcasts.apple.com/us/podcast/adam-marblestone-ai-is-missing-something-fundamental/id1516093381?i=1000743205259
- Spotify: https://open.spotify.com/episode/5RD8lxJh0mGSlpEWWExQNG?si=srfZ9QBgRFqvOtJGX8EGqg
SPONSORS
- Gemini 3 Pro recently helped me run an experiment to test multi-agent scaling: basically, if you have a fixed budget of compute, what is the optimal way to split it up across agents? Gemini was my colleague throughout the process — honestly, I couldn’t have investigated this question without it. Try Gemini 3 Pro today https://gemini.google.com
- Labelbox helps you train agents to do economically-valuable, real-world tasks. Labelbox’s network of subject-matter experts ensures you get hyper-realistic RL environments, and their custom tooling lets you generate the highest-quality training data possible from those environments. Learn more at https://labelbox.com/dwarkesh
To sponsor a future episode, visit https://dwarkesh.com/advertise.
FURTHER READING
- Intro to Brain-Like-AGI Safety - Steven Byrnes’s theory of the learning vs steering subsystem; referenced throughout the episode. https://www.lesswrong.com/s/HzcM2dkCq7fwXBej8
- A Brief History of Intelligence - Great book by Max Bennett on connections between neuroscience and AI. https://www.abriefhistoryofintelligence.com/book
- Adam’s blog: https://longitudinal.blog/
- Convergent Research’s blog on essential technologies: https://www.essentialtechnology.blog/
- A Tutorial on Energy-Based Learning by Yann LeCun: http://yann.lecun.com/exdb/publis/pdf/lecun-06.pdf
- What Does It Mean to Understand a Neural Network? - Kording & Lillicrap: https://arxiv.org/abs/1907.06374
- E11 Bio and their brain connectomics approach: https://www.e11.bio/
- Sam Gershman on what dopamine is doing in the brain: https://gershmanlab.com/pubs/GershmanUchida19.pdf
- Gwern’s proposal on training models on the brain’s hidden states: https://gwern.net/aunn-brain
Relevant episodes:
- Ilya Sutskever: https://youtu.be/aR20FWCCjAs
- Richard Sutton: https://youtu.be/21EYKqUsPfg
- Andrej Karpathy: https://youtu.be/lXUZvyajciY
TIMESTAMPS
- 0:00:00 – The brain’s secret sauce is the reward functions, not the architecture
- 0:22:20 – What the genome actually encodes
- 0:42:42 – What kind of RL is the brain doing?
- 0:50:31 – Is biological hardware a limitation or an advantage?
- 1:03:59 – Why we need to map the human brain
- 1:23:28 – What value will automating math have?
- 1:38:18 – Architecture of the brain
SPEAKERS
- Dwarkesh Patel (host)
- Adam Marblestone (guest)
- Narrator (other)
EPISODE SUMMARY
This episode explores why brains outlearn AIs: hidden loss functions and steering systems. Adam Marblestone argues that current AI systems miss several fundamental ingredients that make biological brains sample-efficient, aligned, and flexible. He distinguishes between a "learning subsystem" (cortex-like, general world model) and a "steering subsystem" (subcortical, innate rewards and reflexes) and suggests evolution poured most of its complexity into the latter, especially rich, developmentally timed cost functions. This architecture may support omnidirectional probabilistic inference, continual learning, and the robust wiring of abstract learned concepts (like social status) to primitive drives (like shame or fear). Marblestone also discusses amortized vs non-amortized inference, neuromorphic hardware tradeoffs, formal methods in math and software, and a concrete roadmap for large-scale connectomics to ground AI and alignment in actual brain mechanisms.