The Twenty Minute VC | Noam Shazeer: How We Spent $2M to Train a Single AI Model and Grew Character.ai to 20M Users | E1055
At a glance
WHAT IT’S REALLY ABOUT
Noam Shazeer on Scaling Character.ai, Cheap Training, and Billion Use-Cases
- Noam Shazeer, co‑founder and CEO of Character.ai, discusses how his 20 years at Google shaped his philosophy of building full‑stack, consumer-first AI products that can serve billions of people. He explains why Character.ai is pursuing a broad, horizontal use case—“a billion users inventing a billion use cases”—rather than narrow verticals, and how emotional support, entertainment, and companionship emerged organically as core user behaviors. Technically, he outlines how modern neural language models work, why compute budget (not just data or model size) is the primary constraint, and how Character trained a flagship model for roughly $2M in compute. He also shares views on AI’s future, hallucinations as a feature for some applications, privacy, open vs. closed ecosystems, and his personal philosophy on responsibility, religion, and choosing usefulness over short-term fun.
IDEAS WORTH REMEMBERING
5 ideas
Launch general-purpose AI directly to consumers and let use cases emerge.
Shazeer argues that when technology is versatile and simple to use, the best strategy is to ship to as many people as possible and observe what they do with it, rather than pre-picking narrow verticals.
Versatility and usability can coexist in AI products if you design around language.
By framing the core problem as next-word prediction on massive text corpora, Character.ai can support diverse use cases—from role-play to brainstorming—without hand-coded rules or hard specialization.
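The next-word-prediction framing can be sketched with a toy example. This is a minimal bigram counter over a hand-picked corpus, purely illustrative of the problem statement (modern models use neural networks over vastly larger corpora, and nothing here reflects Character.ai's actual implementation):

```python
from collections import Counter, defaultdict

# Toy illustration of language modeling as next-word prediction:
# count which words follow which in a tiny corpus, then predict
# the most frequently observed continuation.
corpus = "the cat sat on the mat and the cat slept".split()

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return bigram_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat": seen twice after "the", vs. "mat" once
```

A neural language model replaces the count table with a learned function over the entire preceding context, which is what lets the same objective support role-play, brainstorming, and everything in between.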
Compute budget is the main bottleneck in building smarter models, more than raw data.
Shazeer emphasizes that the critical variable is how much computation you can afford to spend (model size × training duration), noting Character’s current flagship model cost about $2M in compute and could be substantially improved with more and better hardware.
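As a rough illustration of how such a compute budget is estimated, here is a back-of-envelope sketch using the common approximation of ~6 FLOPs per parameter per training token. Every number below (parameter count, token count, GPU throughput, hourly price) is an assumed placeholder chosen for the arithmetic, not a figure from the episode:

```python
# Back-of-envelope training-cost sketch. All inputs are hypothetical
# assumptions, not Character.ai's actual numbers.
params = 100e9              # assumed model size: 100B parameters
tokens = 1e12               # assumed training set: 1T tokens
flops = 6 * params * tokens # common approximation: ~6 FLOPs/param/token

gpu_flops_per_sec = 150e12  # assumed sustained throughput: 150 TFLOP/s per GPU
gpu_cost_per_hour = 2.0     # assumed cloud price in USD per GPU-hour

gpu_hours = flops / gpu_flops_per_sec / 3600
cost = gpu_hours * gpu_cost_per_hour
print(f"{gpu_hours:,.0f} GPU-hours, ~${cost:,.0f}")
# roughly 1.1M GPU-hours, ~$2.2M under these assumptions
```

The point of the sketch is the structure of the estimate: total compute is the product of model size and data seen, so the budget, not either factor alone, is the binding constraint.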
User-generated interaction data is valuable but must be handled with strict privacy safeguards.
Character.ai uses aggregate behavioral signals to improve models while avoiding naïve training on raw conversations, which could otherwise leak intensely personal content back to other users.
Hallucinations can be a feature for some applications, not always a bug.
Because Character.ai leans into entertainment, emotional support, and creativity, they accept and even value imaginative outputs—while being explicit that models hallucinate—rather than over-optimizing for factual precision in every context.
WORDS WORTH SAVING
5 quotes
I like this sort of motto of a billion users inventing a billion use cases.
— Noam Shazeer
We consider [hallucinations] a feature… the use cases that emerge first will be ones for which hallucination is a feature.
— Noam Shazeer
You have this sequence of words… guess what the next word is. That problem is called language modeling.
— Noam Shazeer
This technology is just gonna get way, way smarter. We’re at a sort of Wright Brothers first airplane kind of moment.
— Noam Shazeer
It wasn’t a matter of, ‘Am I going to be having more fun being a startup CEO?’ It’s more like, ‘I want to push this technology forward. What’s the best thing I can do?’
— Noam Shazeer
AI-generated summary created from a speaker-labeled transcript.