Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs | Lex Fridman Podcast #426
At a glance
WHAT IT’S REALLY ABOUT
MIT linguist dissects language, thought, LLMs, and why legalese fails
- Lex Fridman interviews MIT psycholinguist Edward (Ted) Gibson about how human language is structured, processed, and learned, and how this contrasts with large language models. Gibson argues for a dependency-grammar view of syntax and shows that across languages, people strongly prefer short dependencies because long-distance links are cognitively costly. He distinguishes language (a communication system) from thought, citing brain-imaging and neuropsychology showing that high-level language and non‑linguistic reasoning use different neural systems. The conversation ranges from Pirahã number words and Amazonian color terms to the pathology of legalese, the limits of LLM “understanding,” and speculative ideas about communicating with animals and aliens.
IDEAS WORTH REMEMBERING
5 ideas
Human languages strongly minimize dependency length.
Across ~60 typologically diverse languages with parsed corpora, actual sentences consistently have much shorter word-to-word dependency distances than randomized but grammatically plausible alternatives, indicating a universal pressure to keep related words close for easier production and comprehension.
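The metric behind this claim is simple: sum the linear distances between each word and its head. A minimal sketch, with hand-assigned head indices for illustration (not the output of any real parser):

```python
# Dependency length: sum of |dependent - head| over all word-to-word links.
# Head indices below are an illustrative hand analysis, not a real parse.

def dependency_length(heads):
    """Total dependency distance for a sentence.

    `heads` maps each word index to the index of its head;
    the root word maps to itself and is skipped.
    """
    return sum(abs(i - h) for i, h in enumerate(heads) if h != i)

# "The dog chased the cat"
#   0:The -> 1:dog, 1:dog -> 2:chased (root), 3:the -> 4:cat, 4:cat -> 2:chased
heads = [1, 2, 2, 4, 2]
print(dependency_length(heads))  # 1 + 1 + 1 + 2 = 5
```

The corpus studies Gibson describes compare totals like this against the same metric computed on randomized but grammatical reorderings of each sentence.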
Center-embedding is universally hard for humans and LLMs.
Nested structures like “The boy who the cat that the dog chased scratched cried” massively increase dependency distances and working-memory load; both humans and large language models struggle to complete or process such sentences, suggesting shared constraints on processing form.
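The cost of center-embedding can be made concrete with the same distance metric. Below, the head indices for a singly center-embedded clause and a right-branching paraphrase are hand-assigned for illustration (an assumed dependency analysis, not parser output):

```python
# Comparing total dependency distance: center-embedded vs. right-branching.
# Head assignments are an illustrative hand analysis.

def dependency_length(heads):
    """Sum of |dependent - head| distances; the root maps to itself."""
    return sum(abs(i - h) for i, h in enumerate(heads) if h != i)

# Center-embedded: "The boy who the cat scratched cried"
#   0:The->1, 1:boy->6(cried), 2:who->5, 3:the->4, 4:cat->5,
#   5:scratched->1(boy), 6:cried = root
nested = [1, 6, 5, 4, 5, 1, 6]

# Right-branching: "The cat scratched the boy who cried"
#   0:The->1, 1:cat->2, 2:scratched = root, 3:the->4, 4:boy->2,
#   5:who->6, 6:cried->4(boy)
branching = [1, 2, 2, 4, 2, 6, 4]

print(dependency_length(nested))     # 15
print(dependency_length(branching))  # 8
```

Each added level of embedding pushes the subject further from its verb, so the nested version's total distance grows much faster than the right-branching one's.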
Legalese is difficult primarily because of extreme center-embedding, not jargon or passives.
Corpus and behavioral studies on contracts show unusually high rates of center-embedded clauses (e.g., definitions wedged between subject and verb), which severely hurt comprehension and recall for both laypeople and lawyers; low-frequency vocabulary matters somewhat, while passive voice has negligible effect.
Language and thought are neurally dissociable systems.
fMRI work (Fedorenko et al.) finds a stable, left-lateralized “language network” activated by sentences (spoken or written) but not by math, music, programming, or other demanding cognitive tasks, while patients with severe aphasia can still reason, play chess, and do arithmetic—showing that high-level thinking doesn’t require language.
Words people invent reflect communicative needs, not perceptual limits.
Groups like the Tsimane and Pirahã see the same colors and numerosities we do but have far fewer basic color terms and lack exact number words (even for ‘one’); experiments show they use approximate quantifiers (‘few/some/many’) and can match sets perceptually but can’t perform exact counting tasks, highlighting that lexical systems track what must be talked about, not what can be perceived.
WORDS WORTH SAVING
5 quotes
Language is an invented system by humans for communicating their ideas.
— Edward Gibson
I don’t see any limits to their form. Their form is perfect.
— Edward Gibson (on large language models)
We don’t think in language.
— Edward Gibson
Legalese is massively center-embedded. About 70 percent of sentences have a center-embedded clause.
— Edward Gibson
Naively, I certainly thought that all humans would have words for exact counting. And the Pirahã don’t.
— Edward Gibson
AI-generated summary created from a speaker-labeled transcript.