The Twenty Minute VC
Noam Shazeer: How We Spent $2M to Train a Single AI Model and Grew Character.ai to 20M Users | E1055
CHAPTERS
- 0:00 – 0:55
What users actually do with Character.ai: unexpected therapy and emotional support
Noam opens with a surprising user behavior: people using fictional/game characters as a form of therapy or emotional relief. This sets up the central theme that users, not builders, often discover the most meaningful use cases.
- Users report emotional support benefits from talking to character bots
- Unintended use cases can dominate intended ones
- Early signal of companionship/mental-health-adjacent demand
- 0:55 – 3:24
From Google’s spelling corrector to product lessons about scale
Noam recounts his first Google project improving spelling correction for web search, and why the old dictionary-based approach failed at internet scale. He then generalizes the lesson: the biggest wins come from broadly useful, mass-market tools.
- Why dictionary-driven spellcheck was terrible for web search
- Search diversity forced new approaches beyond rigid rules
- Key Google lesson: general tech + billions of users beats narrow B2B assumptions
- 3:24 – 5:05
Character.ai’s full-stack, direct-to-consumer bet inspired by Google
Noam explains why Character.ai prioritizes launching a general-purpose AI product directly to consumers rather than starting with vertical B2B applications. He argues that controlling the full stack—from research to product—enables speed, iteration, and co-design across the system.
- LLMs are unusually versatile and easy to use (conversation as interface)
- D2C-first strategy: launch to everyone and let use cases emerge
- Full-stack approach enables end-to-end optimization and faster learning cycles
- 5:05 – 6:01
Motivation and leverage: pushing AI forward as the highest-impact path
The conversation shifts to what drives Noam personally: curiosity, enjoyment of hard technical problems, and a belief in AI as leverage on global challenges. Rather than tackling domains directly (e.g., medicine), he frames advancing AI capability as a force multiplier.
- AI work is intrinsically motivating: making computers do what they can’t
- Leverage argument: better AI can accelerate progress across many fields
- Acknowledges huge solvable problems (disease, aging, etc.)
- 6:01 – 8:20
Mission as humility: “a billion users inventing a billion use cases”
Noam lays out a mission philosophy grounded in humility and user agency. Character.ai’s guiding motto is to build something broadly capable and let the world decide what it’s for—illustrated by users repurposing game characters into therapists.
- Humility about predicting or controlling societal outcomes
- Core motto: many users discovering many uses
- Entertainment/companionship/emotional support emerged strongly
- Users’ agency is central to product direction
- 8:20 – 9:21
Explaining the growth engine: launching, generality, and real human need
Noam attributes Character.ai’s rapid traction (millions of users, massive message volume) to a simple combination: they shipped, the product stayed general, and it meets a widespread need for connection. He contrasts this with large-company hesitancy and brand risk.
- Growth driver #1: actually launching the product
- General tool unlocks organic discovery of use cases
- Demand-side pull: loneliness/need to talk is widespread
- Large companies can be slower due to perceived brand risk
- 9:21 – 10:29
Ethical tension: does AI companionship reduce or improve human connection?
Harry challenges whether machine conversation could pull people away from real relationships. Noam emphasizes valuing human connection, and suggests the product can help some users practice socially—while acknowledging the ultimate effect depends on user choices.
- Concern: AI as substitute for human relationships
- Noam’s stance: human connection has moral and practical value
- Potential benefit: practice tool for social anxiety
- Outcome is user-dependent and hard to generalize
- 10:29 – 11:41
Core product dilemma: keep it general without sacrificing usability/quality
Noam describes the central product challenge: building something both highly general and genuinely usable. He rejects the standard PM advice to narrow into verticals, arguing that neural language models uniquely enable broad capability without hand-crafted specialization.
- Perceived trade-off: versatility vs usability
- Character.ai explicitly resists narrowing to verticals
- General-purpose usability is a primary design goal
- Quality comes from model capability rather than rules/handcrafted flows
- 11:41 – 16:50
Why neural language models scale: next-word prediction, data abundance, and capability gains
Noam explains the conceptual simplicity behind modern LLMs: predicting the next word. He contrasts fragile rule-based systems with neural models, notes the abundance of web-scale training data, and outlines how capabilities expanded from translation toward open-ended conversation.
- LLMs reduce to a simple objective: next-token prediction
- Rule-based NLP was complex, brittle, and non-generalizing
- Massive training data is available (e.g., Common Crawl)
- Early killer app was machine translation; conversation required more scale
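The next-word objective described above can be illustrated with a deliberately tiny sketch. It counts which word follows which in a three-sentence corpus and predicts the most frequent follower. This is a bigram counter, not a neural network, and the corpus and the function names `train_bigram` and `predict_next` are invented for illustration; real language models learn the same kind of prediction with far more context and parameters.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model trains on web-scale text.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
print(predict_next(model, "sat"))    # most common word after "sat"
print(predict_next(model, "zebra"))  # unseen word, so no prediction
```

The sketch also hints at why scale matters: with so little data, any word absent from the corpus yields no prediction at all.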
- 16:50 – 18:40
What really limits frontier models: compute, training time, and a $2M run
The discussion moves into the economics of training: model size and training duration both multiply compute needs. Noam states compute is the main constraint and shares that Character.ai spent about $2M in compute to train a model the prior summer, with plans to improve via better hardware and longer training.
- Compute is the dominant bottleneck (more than raw data access)
- Bigger models and longer training multiply compute cost
- Character.ai trained a serving model with ~$2M of compute
- Better hardware and longer runs are the path to smarter models
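The multiplicative relationship can be made concrete with a back-of-the-envelope sketch. It assumes the widely used rule of thumb that training a dense transformer takes roughly 6 FLOPs per parameter per training token; the episode does not give this formula. Every number below (parameter count, token count, chip throughput, utilization, hourly price) is a hypothetical placeholder, not Character.ai's actual configuration, and the result is not meant to reproduce the $2M figure.

```python
def training_flops(n_params, n_tokens):
    """Approximate training compute: ~6 FLOPs per parameter per token (rule of thumb)."""
    return 6.0 * n_params * n_tokens


def training_cost_usd(n_params, n_tokens, peak_flops_per_sec, utilization, usd_per_chip_hour):
    """Convert total FLOPs into chip-hours, then into dollars."""
    effective_flops_per_sec = peak_flops_per_sec * utilization
    chip_seconds = training_flops(n_params, n_tokens) / effective_flops_per_sec
    return chip_seconds / 3600.0 * usd_per_chip_hour


# All inputs are hypothetical placeholders chosen only to show the scaling.
hardware = dict(peak_flops_per_sec=1e14, utilization=0.4, usd_per_chip_hour=2.0)

base = training_cost_usd(n_params=1e10, n_tokens=2e11, **hardware)
doubled_both = training_cost_usd(n_params=2e10, n_tokens=4e11, **hardware)

print(f"baseline cost: ${base:,.0f}")
print(f"cost ratio after doubling model size and tokens: {doubled_both / base:.1f}x")
```

Doubling both the model size and the amount of training data roughly quadruples the bill, while faster or cheaper hardware lowers the cost of the same run.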
- 18:40 – 21:04
Proprietary conversation data and privacy: learning from users without leaking them
Harry asks about defensibility via proprietary data versus broadly available internet corpora. Noam argues user interaction data is valuable for preference and product tuning, but stresses privacy risks if conversations are naively used for training, since models could regurgitate private information.
- User data helps identify what people like and how they use the system
- General knowledge + smaller task-specific signals mirrors human learning
- Privacy is a central constraint; naive training can cause memorization/leakage
- Emphasis on aggregate learning and careful handling of sensitive content
- 21:04 – 24:13
Startups vs incumbents, open vs closed: speed, scale economics, and ecosystem pluralism
Noam explains why Character.ai works better as a standalone startup: faster shipping and fewer constraints from existing products. He predicts multiple winners across startups, big companies, universities, and individuals, with both open and closed approaches coexisting—while noting scale advantages for training and serving.
- Standalone advantage: speed and willingness to launch
- ‘Users win’ framing: many viable players and approaches
- Open + closed ecosystems will both thrive; more small-scale tinkering boosts research
- Economies of scale matter for training and serving efficiency (batching)
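One way to picture the batching point: if each serving step carries a fixed overhead (for example, reading model weights from memory) plus a small per-request cost, that overhead is shared across every request in the batch. The toy model below is an assumption for illustration, with made-up timings and a hypothetical function name; it does not describe Character.ai's serving stack.

```python
def cost_per_request_ms(batch_size, fixed_step_ms=20.0, per_request_ms=0.5):
    """Toy model: accelerator time attributable to each request in one batched step."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    step_ms = fixed_step_ms + per_request_ms * batch_size
    return step_ms / batch_size


# Larger batches spread the fixed per-step overhead over more requests.
for batch_size in (1, 8, 64):
    print(batch_size, cost_per_request_ms(batch_size))
```

Under these assumptions, a service with enough traffic to fill large batches pays far less accelerator time per request than one serving requests one at a time.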
- 24:13 – 26:39
AI perspective reset: best applications aren’t invented yet; hallucinations as a feature
Noam challenges today’s narratives by comparing the moment to the invention of electricity/computers: we don’t yet know the killer apps. He also reframes hallucinations as a feature for early use cases (entertainment, companionship, creativity), letting product-market fit emerge where the tech is naturally strong.
- We’re early: the most valuable uses are still unknown
- Historical analogy: electricity/computers before widespread application discovery
- Hallucinations can be beneficial in creative/entertainment contexts
- Strategy: ship general tools and let natural-fit use cases surface
- 26:39 – 30:05
Leadership, usefulness over fun, and the parenthood-induced mindset shift
Noam reflects on becoming CEO and why he intends to remain in the role: to ensure the company makes the right decisions while he still contributes technically. He describes a shift toward prioritizing meaningful usefulness over immediate fun, influenced by parenthood and a sense of responsibility.
- CEO transition: continues technical work alongside leadership
- Staying CEO to guide key decisions
- Personal ethic: optimize for usefulness/meaning over fun
- Parenthood and faith themes: gratitude, responsibility, maturity
- 30:05 – 36:32
Quick-fire: rapid predictions on AI acceleration, research noise, and an unpredictable 10-year horizon
In the closing quick-fire, Noam predicts rapid advancement in the next 1–3 years, and emphasizes Character.ai is more than entertainment—it’s a full-stack AI and product company. He critiques the ‘alchemy’ state of ML publishing (signal amid noise), shares an early mistake about sparsity vs hardware realities, and declines to predict where the company will be in 2033 due to compounding technological change.
- AI will get far smarter; momentum in hardware and research
- Near-term outlook: significant progress in 1–3 years
- Character.ai positioning: AI quality and product excellence are aligned
- ML research is noisy; positive experimental results drive adoption
- Lesson learned: hardware favors dense ops; sparsity must respect compute realities
- 2033 outlook: impossible to forecast; agility matters