Noam Brown: AI vs Humans in Poker and Games of Strategic Negotiation | Lex Fridman Podcast #344

Uploaded: 2022-12-06T12:00:00Z
Duration: 2 h 29 min 21 s
Description: Noam Brown discusses his work building superhuman AI systems for complex strategic games: heads‑up and six‑player no‑limit Texas Hold’em (Libratus, Pluribus) and the negotiation-heavy board game Diplomacy (Cicero).

Lex Fridman Podcast | Dec 6, 2022 | 2h 29m

Noam Brown (guest), Lex Fridman (host)

Nash equilibrium, game theory, and imperfect-information games
Design and evolution of poker AIs: Libratus (heads‑up) and Pluribus (six‑player)
Search vs. neural networks in games like chess, Go, and poker
Cicero: a Diplomacy AI combining language models with RL and human data
Trust, deception, and human‑compatible behavior in multi‑agent systems
Human‑like AI opponents, training tools, and cheat detection challenges
Potential real‑world implications for negotiation, diplomacy, and AI ethics

In this episode of the Lex Fridman Podcast, host Lex Fridman talks with Noam Brown about how AI came to master poker and Diplomacy, and what that means for strategy, trust, and negotiation.

AI Masters Poker and Diplomacy, Redefining Strategy, Trust, and Negotiation

Noam Brown discusses his work building superhuman AI systems for complex strategic games: heads‑up and six‑player no‑limit Texas Hold’em (Libratus, Pluribus) and the negotiation-heavy board game Diplomacy (Cicero).

He explains core ideas like Nash equilibrium, self‑play, counterfactual regret minimization, and the critical role of search, arguing that poker’s imperfect information makes it even more challenging than games like chess or Go.

In Diplomacy, Brown’s team combines large language models with reinforcement learning and human game data to create an AI that can negotiate, form alliances, and build trust with humans in natural language at roughly top‑human level.

They explore how such systems illuminate human irrationality, trust, deception, and the limits of self‑play, and how these ideas may transfer to future NPCs, training tools, and even real‑world negotiation and decision support.

Key Takeaways

Game‑theoretic ‘balanced’ play can outperform human psychological exploitation.

Libratus crushed elite heads‑up poker pros by approximating a Nash equilibrium strategy that didn’t adapt to specific opponents or play ‘mind games’, undermining the belief that reading people always beats theory.
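
To make the idea of an unexploitable strategy concrete, here is a minimal Python sketch that uses rock-paper-scissors instead of poker. It is not Libratus's code; the payoff table and the function name `best_response_value` are illustrative assumptions. It measures how much a perfectly adapting opponent could win, on average, against a fixed mixed strategy. The uniform strategy (the Nash equilibrium of this game) concedes nothing, while a rock-heavy strategy can be exploited.

```python
# PAYOFF[a][b] is the payoff to a player choosing action a against action b:
# +1 for a win, -1 for a loss, 0 for a tie. Order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def best_response_value(strategy):
    """Expected payoff of the best possible counter-move against `strategy`.

    `strategy` is a list of three probabilities (rock, paper, scissors).
    A value of 0 means no opponent can profit on average.
    """
    return max(
        sum(strategy[a] * PAYOFF[b][a] for a in range(3))
        for b in range(3)
    )


if __name__ == "__main__":
    print(best_response_value([1 / 3, 1 / 3, 1 / 3]))  # 0.0
    print(best_response_value([0.5, 0.25, 0.25]))      # 0.25
```

Against the rock-heavy strategy, always playing paper earns 0.25 per round on average; against the uniform strategy, no counter-move earns anything.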

Search is at least as important as raw neural network strength.

Across chess, Go, and poker, planning ahead via search dramatically boosts performance; removing Monte Carlo tree search from Go AIs drops them from far‑superhuman to strong human level, short of the very best professionals.

Imperfect‑information games require optimizing action probabilities, not just actions.

In poker (and rock‑paper‑scissors), the value of a move depends on how often you do it; balancing bluffing and value bets so you are unpredictable is central, and Libratus explicitly optimizes these frequencies.
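
One way to see what optimizing frequencies means is regret matching, the simple update rule that counterfactual regret minimization builds on. The sketch below is a toy rock-paper-scissors self-play loop, not the algorithm as implemented in Libratus. The function names, the rock-biased starting regrets, and the use of exact expected payoffs (rather than sampled play) are all assumptions made for illustration. Each player shifts probability toward actions it regrets not having played more often, and the average strategy over time drifts toward the balanced one-third mix.

```python
# PAYOFF[a][b]: payoff for playing a against b (rock, paper, scissors).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1 / 3, 1 / 3, 1 / 3]  # no positive regret yet: play uniformly


def expected_payoffs(opponent_strategy):
    """Expected payoff of each pure action against a mixed strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
        for a in range(3)
    ]


def train(iterations):
    """Run two-player self-play and return each player's average strategy."""
    # Assumption: player 0 starts biased toward rock so the dynamics are
    # not trivially stuck at the uniform strategy from the first step.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3 for _ in range(2)]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in range(2):
            payoffs = expected_payoffs(strategies[1 - player])
            value = sum(p * u for p, u in zip(strategies[player], payoffs))
            for a in range(3):
                # Regret: how much better action a would have done than
                # the strategy actually used this round.
                regrets[player][a] += payoffs[a] - value
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(50_000)):
        print(player, [round(p, 3) for p in average])
```

The strategy used on any single iteration can swing widely; it is the average over iterations that approaches equilibrium, which is why the loop accumulates `strategy_sums` instead of returning the final strategy.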

Six‑player poker shows equilibrium‑style methods can generalize beyond two‑player zero‑sum.

Although theory gives no guarantees, Pluribus uses depth‑limited search and equilibrium‑inspired self‑play to achieve superhuman performance in six‑player games, where cooperation and more complex dynamics appear.

Self‑play alone fails in social, cooperative settings; you must learn from humans.

In Diplomacy, a self‑play‑only bot develops an alien ‘robot language’ and inhuman conventions and is quickly ostracized and crushed by humans; Cicero instead anchors its policies and language to large human datasets.

Controlling language models with explicit ‘intents’ makes dialogue strategic, not just imitative.

Cicero separates deciding what actions it wants (for itself and others) from generating messages, conditioning a language model on those intents and filtering out messages that would backfire or reveal too much.

Trust and minimal lying are crucial even in a game famous for backstabbing.

The team found that frequent lying in Diplomacy reduces long‑run performance because humans stop cooperating; Cicero is explicitly regularized to be honest or at least not obviously deceptive most of the time.

Notable Quotes

“In any finite two‑player zero‑sum game, there is an optimal strategy that, if you play it, you are guaranteed to not lose in expectation, no matter what your opponent does.”
— Noam Brown

“One of the key strategies in poker is to put the other person into an uncomfortable position, and if you’re doing that, then you’re playing poker well.”
— Noam Brown

“We played our bot against four top heads‑up no‑limit hold’em poker players, and the bot wasn’t trying to adapt to them… it was just trying to approximate the Nash equilibrium, and it crushed them.”
— Noam Brown

“Diplomacy is a game about trust and being able to build trust in an environment that encourages people to not trust anyone.”
— Noam Brown

“War is an inherently negative‑sum game. There’s always a better outcome than war for all the parties involved.”
— Noam Brown

Questions Answered in This Episode

If Nash‑equilibrium–style play dominates in poker, where do human psychological skills still meaningfully matter, if at all?

How might the techniques behind Cicero transfer to real‑world negotiations or diplomacy without amplifying manipulation or deception?

What new kinds of video games or NPC interactions become possible once language models can negotiate, gossip, and build long‑term trust coherently?

How should we balance developing human‑like strategic AIs with the risks they pose for cheating, influence operations, or ‘deep’ persuasion?

Could the need for human‑compatible behavior in multi‑agent systems force us to rethink how we define and measure ‘intelligence’ in AI?

Transcript Preview

Noam Brown

A lot of people were saying, like, "Oh, this whole idea of game theory, it's just nonsense and if you really want to make money, you gotta, like, look into the other person's eyes and read their soul and figure out what cards they have." But what happened was we played our bot against four top heads-up no-limit hold 'em poker players, and the bot wasn't trying to adapt to them. It wasn't trying to exploit them. It wasn't trying to do these mind games. It was just trying to approximate the Nash equilibrium, and it crushed them.

Lex Fridman

The following is a conversation with Noam Brown, research scientist at FAIR, Facebook AI Research Group at Meta AI. He co-created the first AI system that achieved superhuman level of performance in no-limit Texas hold 'em, both heads-up and multiplayer. And now, recently, he co-created an AI system that can strategically out-negotiate humans using natural language in a popular board game called Diplomacy, which is a war game that emphasizes negotiation. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description, and now, dear friends, here's Noam Brown.

You've been a lead on three amazing AI projects. So we got Libratus, that solved, or at least achieved human-level performance on, uh, no-limit Texas hold 'em poker with two players, heads-up. You got Pluribus, that solved no-limit Texas hold 'em poker with six players, and just now you have Cicero (these are all names of systems), that solved or achieved human-level performance on the game of Diplomacy, which, uh, for people who don't know, is a popular strategy board game. It was loved by JFK, John F. Kennedy, and Henry Kissinger, and many other big famous people in the decades since. So let's talk about poker and Diplomacy today. First poker: what is the game of no-limit Texas hold 'em, and how is it different from chess?

Noam Brown

Well, no-limit Texas hold 'em poker is the most popular variant of poker in the world. So, you know, you go to a casino, you play, sit down at the poker table. The game that you're playing is no-limit Texas hold 'em. If you watch movies about poker like Casino Royale or Rounders, the game that they're playing is no-limit Texas hold 'em poker. Now it's very different from limit hold 'em in that you can bet any amount of chips that you want, and so the stakes escalate really quickly. You start out with, like, $1 or $2 in the pot and then by the end of the hand, you've got like $1,000 in there maybe.

Lex Fridman

So the option to increase the number very aggressively and very quickly is always there?

Noam Brown

Right. The no-limit aspect is there's no limit to how much you can bet. You know, you, in limit hold 'em, there's like $2 in the pot. You- you can only bet, like, $2. But if you've got $10,000 in front of you, you're always welcome to put $10,000 into the pot.
