No Priors Ep. 1 | With Noam Brown, Research Scientist at Meta

AI can beat top players in chess, poker, and, now, Diplomacy. In November 2022, a bot named Cicero demonstrated mastery in this game, which requires natural language negotiation and cooperation with humans. In short, Cicero can lie, scheme, build trust, pass as human, and ally with humans. So what does that mean for the future of AGI? This week’s guest is research scientist Noam Brown. He co-created Cicero on the Meta Fundamental AI Research Team, and is considered one of the smartest engineers and researchers working in AI today. Co-hosts Sarah Guo and Elad Gil talk to Noam about why all research should be high risk, high reward, the timeline until we have AGI agents negotiating with humans, why scaling isn’t the only path to breakthroughs in AI, and whether the Turing Test is still relevant.

00:00 Introduction
01:43 What sparked Noam’s interest in researching AI that could win at games
06:00 How AlexNet and AlphaGo changed the landscape of AI research
08:09 Why Noam chose Diplomacy as the next game to work on after poker
09:51 What Diplomacy is and why the game was so challenging for an AI bot
14:50 Algorithmic breakthroughs and significance of AI bots that win in No-Limit Texas Hold'em poker
23:29 The Nash Equilibrium and optimal play in poker
24:53 How Cicero interacted with humans
27:58 The relevance and usefulness of the Turing Test
31:05 The data set used to train Cicero
31:54 Bottlenecks to AI researchers and challenges with scaling
40:10 The next frontier in researching games for AI
42:55 Domains that humans will still dominate and applications for AI bots in the real world
48:13 Reasoning challenges with AI

Sarah Guo (host), Noam Brown (guest), Elad Gil (host)

Apr 25, 2023 · 1h 0m

WHAT IT’S REALLY ABOUT

From Poker Bots To Diplomacy: Noam Brown Redefines AI Reasoning Benchmarks

Noam Brown recounts his path from finance to AI research, focusing on game-theoretic agents that master imperfect‑information and negotiation-heavy games like poker and Diplomacy.
He explains why AlphaGo and large language models changed his expectations for AI progress, but argues that raw scale and next-token prediction are hitting limits without better reasoning and planning.
Brown details how Cicero, Meta’s Diplomacy bot, combined language models, human game data, and self-play to cooperate and negotiate with humans without being detected as an AI across 40 games.
Looking ahead, he sees the key research frontier as general-purpose reasoning and inference-time planning, with implications for theorem proving, complex code generation, negotiation agents, and real-world cooperative AI systems.

IDEAS WORTH REMEMBERING

5 ideas

Reasoning and planning are now more critical than just scaling models.

Brown argues that simply making neural networks larger yields diminishing returns; the large gains in poker and Go came from adding search and planning at inference time, not from spending on the order of 100,000x more training compute.
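As a rough illustration of that point, the sketch below adds lookahead search at decision time to a fixed, weak policy in a toy take-away game. Everything here is an assumption made for illustration: the game (remove 1 to 3 stones, whoever takes the last stone wins), the `blueprint_move` stand-in for a trained policy, and the helper names. It is not the method used in Brown's poker or Diplomacy systems.

```python
from functools import lru_cache

MAX_TAKE = 3  # hypothetical toy rule: remove 1 to 3 stones per turn


def blueprint_move(stones: int) -> int:
    """Stand-in for a fixed trained policy: always takes one stone."""
    return 1


@lru_cache(maxsize=None)
def wins(stones: int) -> bool:
    """True if the player to move can force taking the last stone."""
    return any(not wins(stones - k)
               for k in range(1, min(MAX_TAKE, stones) + 1))


def search_move(stones: int) -> int:
    """Look ahead at decision time; fall back to the blueprint if no
    winning move exists from this position."""
    for k in range(1, min(MAX_TAKE, stones) + 1):
        if not wins(stones - k):
            return k
    return blueprint_move(stones)


def play(first, second, stones: int) -> int:
    """Play one game; return 0 if `first` takes the last stone, else 1."""
    players = (first, second)
    turn = 0
    while True:
        stones -= players[turn](stones)
        if stones == 0:
            return turn
        turn = 1 - turn
```

In this toy setup, `play(search_move, blueprint_move, 10)` returns 0 and `play(blueprint_move, search_move, 10)` returns 1: the same weak policy wins from either seat once it is allowed to search before moving, with no change to the policy itself.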

Cooperation with humans is a frontier benchmark, not just competition.

Cicero’s success in Diplomacy shows that AI must model human norms, mistakes, and communication patterns to collaborate effectively, which is closer to real-world deployment than beating humans in zero-sum games.

Human data alone is insufficient for expert-level strategic play.

Supervised learning on human game logs plateaued below expert performance; combining human behavior models with self-play and planning was necessary to surpass strong humans in poker and Diplomacy.
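To make the self-play idea concrete, here is a minimal sketch of regret matching, a standard self-play procedure, on rock-paper-scissors. It is an illustrative assumption and not a description of how the poker or Diplomacy systems were trained: the game, the payoff table, the function names, and the biased `initial_regrets` (a loose stand-in for starting from an imperfect prior) are all hypothetical.

```python
ACTIONS = 3  # rock, paper, scissors

# PAYOFF[a][b]: payoff to a player choosing a against an opponent choosing b
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        return [1.0 / ACTIONS] * ACTIONS
    return [p / total for p in positive]


def self_play(iterations, initial_regrets):
    """Run two regret-matching players against each other and return
    each player's average strategy over all iterations."""
    regrets = [list(initial_regrets[0]), list(initial_regrets[1])]
    strategy_sums = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        # Both players fix their strategies before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(ACTIONS))
                for a in range(ACTIONS)
            ]
            expected = sum(strategies[player][a] * action_values[a]
                           for a in range(ACTIONS))
            for a in range(ACTIONS):
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


# Player 0 starts biased toward rock; player 1 starts with no preference.
average = self_play(10_000, ([1.0, 0.0, 0.0], [0.0, 0.0, 0.0]))
```

Although player 0 begins with a strong preference for rock, the average strategies of both players drift toward the uniform mix, which is the equilibrium of this game. The point of the sketch is only that self-play can correct a biased starting policy without further human examples.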

The Turing test is no longer a useful bar for intelligence.

Now that language models often avoid detection as bots in rich dialogue settings, passing a Turing-style test no longer correlates well with general intelligence or robust reasoning abilities.

Sample efficiency remains a key human advantage over current AI.

Humans can become strong at tasks like chess, Diplomacy, or market reasoning with far fewer examples than current systems, which matters in domains where data is scarce or environments are non-stationary.

WORDS WORTH SAVING

5 quotes

All research is high risk, high reward, or at least it should be.

— Noam Brown

The Turing test is no longer really a useful measure the way it was intended to be.

— Noam Brown

Everything I had done in my PhD up until that point was just a footnote compared to adding search and scaling search.

— Noam Brown

If you want truly general artificial general intelligence then this [reasoning] needs to be addressed.

— Noam Brown

It doesn’t seem crazy to me that you could have a model that can prove the Riemann hypothesis within the next five years if you can solve the reasoning problem in a truly general way.

— Noam Brown

Noam Brown’s career path from finance and economics to AI and game theory
Significance of AlphaGo, Deep Blue, and large language models for AI progress
Design and performance of AI systems for poker (heads-up and multiplayer)
Cicero and Diplomacy: natural-language negotiation, cooperation, and self-play
Limits of the Turing test and the need for new benchmarks and metrics
Scaling laws, data vs. compute bottlenecks, and inference-time planning
Future research on general reasoning, theorem proving, and code generation

High quality AI-generated summary created from speaker-labeled transcript.
