Noam Brown: AI vs Humans in Poker and Games of Strategic Negotiation | Lex Fridman Podcast #344
At a glance
WHAT IT’S REALLY ABOUT
AI Masters Poker and Diplomacy, Redefining Strategy, Trust, and Negotiation
- Noam Brown discusses his work building superhuman AI systems for complex strategic games: heads‑up and six‑player no‑limit Texas Hold’em (Libratus, Pluribus) and the negotiation-heavy board game Diplomacy (Cicero).
- He explains core ideas like Nash equilibrium, self‑play, counterfactual regret minimization, and the critical role of search, arguing that poker’s imperfect information makes it even more challenging than games like chess or Go.
- In Diplomacy, Brown’s team combines large language models with reinforcement learning and human game data to create an AI that can negotiate, form alliances, and build trust with humans in natural language at roughly top‑human level.
- They explore how such systems illuminate human irrationality, trust, deception, and the limits of self‑play, and how these ideas may transfer to future NPCs, training tools, and even real‑world negotiation and decision support.
IDEAS WORTH REMEMBERING
Game‑theoretic ‘balanced’ play can outperform human psychological exploitation.
Libratus crushed elite heads‑up poker pros by approximating a Nash equilibrium strategy that didn’t adapt to specific opponents or do ‘mind games’, undermining the belief that reading people always beats theory.
Search is at least as important as raw neural network strength.
Across chess, Go, and poker, planning ahead via search dramatically boosts performance; removing Monte Carlo tree search from Go AIs drops them from far‑superhuman to roughly human‑grandmaster strength.
Imperfect‑information games require optimizing action probabilities, not just actions.
In poker (and rock‑paper‑scissors), the value of a move depends on how often you do it; balancing bluffing and value bets so you are unpredictable is central, and Libratus explicitly optimizes these frequencies.
Six‑player poker shows equilibrium‑style methods can generalize beyond two‑player zero‑sum.
Although theory gives no guarantees, Pluribus uses depth‑limited search and equilibrium‑inspired self‑play to achieve superhuman performance in six‑player games, where cooperation and more complex dynamics appear.
Self‑play alone fails in social, cooperative settings; you must learn from humans.
In Diplomacy, a self‑play‑only bot develops an alien ‘robot language’ and inhuman conventions and is quickly ostracized and crushed by humans; Cicero instead anchors its policies and language to large human datasets.
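The point above about optimizing action frequencies rather than single actions can be sketched with regret matching, the building block behind counterfactual regret minimization. The following is an illustrative self-play loop for rock‑paper‑scissors only, not Libratus's actual implementation; all function names here are my own:

```python
# Regret matching in self-play for rock-paper-scissors: a minimal
# illustration of the idea behind counterfactual regret minimization
# (CFR). This is a sketch, not how Libratus is implemented.
import random

ACTIONS = ["rock", "paper", "scissors"]

def payoff(a, b):
    """+1 if action a beats action b, -1 if it loses, 0 on a tie."""
    if a == b:
        return 0
    wins = {("rock", "scissors"), ("paper", "rock"), ("scissors", "paper")}
    return 1 if (a, b) in wins else -1

def strategy_from_regrets(regrets):
    """Mix actions in proportion to positive regret; uniform if none."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    if total == 0:
        return [1.0 / 3.0] * 3
    return [p / total for p in pos]

def train(iterations=100_000, seed=0):
    rng = random.Random(seed)
    regrets = [[0.0] * 3 for _ in range(2)]       # cumulative regret per player
    strategy_sum = [[0.0] * 3 for _ in range(2)]  # running sum of strategies
    for _ in range(iterations):
        strats = [strategy_from_regrets(r) for r in regrets]
        picks = [rng.choices(range(3), weights=s)[0] for s in strats]
        for p in (0, 1):
            my, opp = picks[p], picks[1 - p]
            got = payoff(ACTIONS[my], ACTIONS[opp])
            for a in range(3):
                # Regret: how much better action a would have done this round.
                regrets[p][a] += payoff(ACTIONS[a], ACTIONS[opp]) - got
                strategy_sum[p][a] += strats[p][a]
    # The *average* strategy, not the last one, converges toward equilibrium.
    total = sum(strategy_sum[0])
    return [s / total for s in strategy_sum[0]]

avg = train()
print(avg)  # each probability close to 1/3, the unexploitable mix
```

The averaged strategy settles near the uniform Nash equilibrium: no single action is played too often, so no opponent pattern-reading can exploit it. This mirrors, in miniature, the frequency balancing of bluffs and value bets described above.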
WORDS WORTH SAVING
In any finite two‑player zero‑sum game, there is an optimal strategy that, if you play it, you are guaranteed to not lose in expectation, no matter what your opponent does.
— Noam Brown
One of the key strategies in poker is to put the other person into an uncomfortable position, and if you’re doing that, then you’re playing poker well.
— Noam Brown
We played our bot against four top heads‑up no‑limit hold’em poker players, and the bot wasn’t trying to adapt to them… it was just trying to approximate the Nash equilibrium, and it crushed them.
— Noam Brown
Diplomacy is a game about trust and being able to build trust in an environment that encourages people to not trust anyone.
— Noam Brown
War is an inherently negative‑sum game. There’s always a better outcome than war for all the parties involved.
— Noam Brown
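The first quote above is von Neumann's minimax theorem. In standard notation (mine, not from the episode), for a finite two‑player zero‑sum game with payoff matrix A and mixed strategies x, y over the players' action simplices:

```latex
\max_{x \in \Delta_m} \; \min_{y \in \Delta_n} \; x^{\top} A y
\;=\;
\min_{y \in \Delta_n} \; \max_{x \in \Delta_m} \; x^{\top} A y
```

Any maximizing x* guarantees the row player at least the game's value in expectation against every opponent strategy, which is the "guaranteed to not lose" property Brown describes (in a symmetric game like heads‑up poker over both seat assignments, that value is zero).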
High quality AI-generated summary created from speaker-labeled transcript.