
No Priors Ep. 12 | With Noam Shazeer
Elad Gil (host), Noam Shazeer (guest), Sarah Guo (host)
Transformer Pioneer Noam Shazeer Builds Emotional AI at Character.ai
Noam Shazeer, co‑founder of Character.ai and co‑author of the Transformer paper, discusses his path from early Google AI work to building large language models and chat-based products. He explains why transformers overtook RNNs, emphasizes that language modeling is an “AI-complete” problem, and argues that scaling models, data, and compute still shows no clear saturation point. Shazeer details the origins of Google’s LaMDA (formerly Meena), why big companies hesitated to launch open-ended chatbots, and how that led him and co-founder Daniel de Freitas to start Character.ai. He also explores user behavior on Character.ai, emotional and parasocial use cases, safety tradeoffs, commercialization plans, and his broader motivation of using AI progress as a lever toward AGI and solving real-world problems like medicine.
Key Takeaways
Transformers won because they align with modern parallel hardware.
Shazeer explains that deep learning’s success—and transformers in particular—comes from being highly optimized for GPU/TPU-style matrix-multiply hardware, enabling massive parallelism over sequences rather than slow, stepwise RNN computation.
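To make the parallelism contrast concrete, here is a minimal sketch (ours, not from the episode; the toy sizes and NumPy arrays are illustrative assumptions): an RNN must step through the sequence one position at a time, while self-attention computes every position with a handful of large matrix multiplies that GPU/TPU-style hardware executes efficiently.

```python
# Illustrative sketch (not from the episode): why self-attention maps onto
# matrix-multiply hardware better than a step-by-step RNN. Sizes are toy values.
import numpy as np

seq_len, d_model = 128, 64
x = np.random.randn(seq_len, d_model)            # token embeddings for one sequence

# RNN-style processing: each step depends on the previous hidden state,
# so the seq_len steps cannot be parallelized across the sequence.
W_h = np.random.randn(d_model, d_model) * 0.01
h = np.zeros(d_model)
for t in range(seq_len):                          # inherently sequential loop
    h = np.tanh(x[t] @ W_h + h)

# Self-attention-style processing: queries, keys, and values for every position
# come from a few large matrix multiplies that parallel hardware runs at once.
W_q, W_k, W_v = (np.random.randn(d_model, d_model) * 0.01 for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v               # all positions at once
scores = Q @ K.T / np.sqrt(d_model)               # (seq_len, seq_len) in one matmul
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)    # softmax over key positions
out = weights @ V                                 # attention output for every position
```

Real transformers add multiple heads, causal masking, and output projections, but the point is already visible: the whole sequence is handled by a few big matrix multiplies rather than a sequential loop.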
Language modeling is simple to define yet essentially AI-complete.
Predicting the next word from vast text corpora is conceptually simple but, done well, yields general-purpose capabilities like dialogue, reasoning, and task assistance, making language modeling a central route to broad AI.
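As a toy illustration of that objective (our sketch; the episode only gives the “predict the next word” framing), a counting-based bigram model shows how plain text alone defines the training signal:

```python
# Toy sketch (an assumption-laden illustration, not the episode's code):
# next-word prediction as a counting-based bigram model.
from collections import Counter, defaultdict

corpus = "the fat cat sat on the mat the fat cat sat on the chair".split()

# "Training": count which word follows which in the raw text.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent next word seen after `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))   # -> 'fat' (most frequent continuation)
print(predict_next("on"))    # -> 'the'
```

Large language models replace the counts with a neural network and a far larger context, but the supervision is the same: the text itself provides every training label.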
We have not yet hit a clear capability wall for LLMs.
Between algorithmic improvements (better architectures, training, quantization) and large increases in compute budgets, Shazeer sees no obvious point where current architectures definitively “tap out” in performance.
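Quantization is mentioned only in passing; as a hedged illustration of the idea (our example, not a method attributed to Shazeer, Google, or Character.ai), here is a minimal int8 weight-quantization sketch:

```python
# Minimal illustration (our sketch): store weights as int8 plus a scale factor,
# then dequantize for use, trading a little precision for memory and bandwidth.
import numpy as np

w = np.random.randn(4, 4).astype(np.float32)     # full-precision weights

scale = np.abs(w).max() / 127.0                  # map the largest |w| to the int8 range
w_int8 = np.round(w / scale).astype(np.int8)     # quantized storage: 1 byte per weight
w_dequant = w_int8.astype(np.float32) * scale    # approximate reconstruction for compute

print("max abs error:", np.abs(w - w_dequant).max())
```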
Data scarcity is overstated; human and AI-generated text can fuel growth.
He notes the enormous volume of language humans produce daily and anticipates increasing interaction data with AIs themselves, suggesting that data, especially with privacy-preserving methods, is unlikely to be the fundamental bottleneck soon.
Multi-persona chat is a better product fit than a single ‘universal assistant.’
Shazeer argues that single corporate assistants must be bland and inoffensive, whereas allowing users to create diverse characters and personas produces richer, more engaging and human-feeling interactions.
Emotional and parasocial use cases are central, not peripheral, to Character.ai.
Users spend on the order of hours per active day on the service, often for role-play, companionship, and emotional support, showing strong demand for affective interaction even with systems that are explicitly fictional.
Building great AI products requires both cutting-edge research and extreme motivation.
In hiring and co-founder selection, Shazeer emphasizes “burning desire or childhood dream” levels of motivation alongside technical excellence, which he credits for scrappy wins like Meena/LaMDA and the rapid build-out of Character.ai.
Notable Quotes
“The most exciting problem out there is language modeling… it’s really AI-complete.”
— Noam Shazeer
“Deep learning really took off because it runs thousands of times faster than anything else on modern hardware.”
— Noam Shazeer
“I don’t think anyone’s seen a wall in terms of how good this stuff is, so I think it’s just gonna keep getting better and I don’t know what stops it.”
— Noam Shazeer
“Basically this is a technology that’s so accessible that billions of people can just invent use cases.”
— Noam Shazeer
“I wanted to have a company that was both AGI first and product first… by making the product depend entirely on the quality of the AI.”
— Noam Shazeer
Questions Answered in This Episode
If transformers eventually do hit a capability wall, what technical directions beyond today’s architectures does Shazeer think are most promising?
How should society balance the benefits of AI companionship and emotional support with potential risks of dependence or isolation?
What concrete methods could make large-scale conversational data collection genuinely privacy-preserving while remaining useful for training?
How might Character.ai’s safety approach evolve as characters become more knowledgeable and more tightly integrated into real-world tasks?
In what ways could an AGI-first, consumer-product-first strategy accelerate progress in domains like medicine faster than domain-specific research alone?
Transcript Preview
(digital music) Noam, welcome to No Priors.
Hey, Elad. Th- uh, thanks for having me on. Uh, hi, Sarah.
Good to see you. Yeah, thanks for joining. So, um, you've been working on NLP and AI for a long time. So I think you were at Google for something like 17 years off and on. And I think even your Google interview question was something around spellchecking, an approach that eventually got (laughs) implemented there. Um, and when I joined Google, one of the main, um, systems being used at the time for ads targeting was, like, PHIL, and PHIL clusters, and all the stuff which I think you wrote with Georges Harik. And so it'd just be great to get kind of your history in terms of working on, um, AI, NLP, language models, how this all evolved, what you got started on, and what sparked your interest.
Oh, thanks, Elad. Yeah, uh, just, uh, always was naturally drawn to AI, you know? Wanted to make the computer do something smart. Seems like pretty much the most, uh, fun, uh, fun game around. Um, so, uh, yeah, was, uh, lucky to find, uh, Google early on, and, uh, it really is, uh, an AI company. So, um, yeah, I got, uh, got involved in a lot of the, uh, early projects there that, uh, that maybe you wouldn't call AI now but, uh, but seemed pretty smart, uh, at the time. Um, and then more recently was on the Google, uh, Brain team starting in 2012. It looked like a really smart group of people, uh, doing something interesting, and never... Uh, I had never done deep learning before, or neural networks I guess as it was called then, or whatever. I forget when the rebrand happened-
(laughs)
... but, uh (laughs) -
Yeah.
... but, uh, yeah, it turned out to be really fun.
That's cool. And then, you know, you were one of the main people working on the transformer paper and design in 2017, and then you worked on Mesh TensorFlow, I think, um, sometime within the following year. Could you talk a little bit about how all that got going?
Yeah, I, I mean, I, um, messed around a few years, um, on the Google Brain team and, like, utterly failed at a bunch of stuff, uh, till I kinda got the hang of it. Um, really the key insight is that what makes deep learning work is that it is, um, really well-suited to, uh, to modern hardware, um, where, you know, you have the, uh, the current generation of, uh, of chips that are great at, um, at matrix multiplies and, you know, o- other forms of things that require large amounts of, um, computation relative to communication. So, uh, so basically deep learning, like, really took off because, you know, it runs thousands of times faster than anything else. And as soon as I got the hang of that, started designing things that actually, uh, were smart and, uh, and ran fast. Um, but, you know, the most, most exciting, uh, problem out there is language modeling. It's just, like, i- i- it's like the best problem ever, because, like, there's, like, an infinite amount of data, you know, just scrape the web and you've got, like, all the training data you could ever, uh, ever hope for. And, like, the problem is super simple to define. It's just, like, um, predict the next word. The fat cat sat on the, you know, like, okay, what, you know, what comes next?