The Twenty Minute VC

Noam Shazeer: How We Spent $2M to Train a Single AI Model and Grew Character.ai to 20M Users | E1055

Noam Shazeer is the co-founder and CEO of Character.AI, a full-stack AI computing platform that gives people access to their own flexible superintelligence. A renowned computer scientist and researcher, Shazeer is one of the foremost experts in artificial intelligence (AI) and natural language processing (NLP). He is a key author of the Transformer, a revolutionary deep learning model enabling language understanding, machine translation, and text generation that has become the foundation of many NLP models. A former member of the Google Brain team, Shazeer led the development of spelling corrector capabilities within Gmail and the algorithm at the heart of AdSense.

Timestamps:
(0:00) Intro
(0:43) Noam's Google Experience and Introduction to Character
(6:18) Character.AI's Vision, Growth, and Ethical Considerations
(14:13) Technical and Business Aspects of AI and Machine Learning
(23:19) Business Strategies and AI Philosophy
(30:05) Quick-Fire Round

In Today’s Episode with Noam Shazeer We Discuss:

1. Entry into the World of AI and NLP: How did Noam first make his way into the world of AI and come to work on spell corrector with Google? What are 1-2 of his biggest takeaways from spending 20 years at Google? What does Noam know now that he wishes he had known when he started Character?

2. Model Size or Data Size: What is more important, the size of the data or the size of the model? Does Noam agree that “we will not use models in a year that we have today”? What is the lifespan of a model? Does Noam agree that the companies that win are those that are able to switch between models with the most ease? With the majority of data being able to be downloaded from the internet, is there real value in data anymore?

3. The Biggest Barriers: What is the single biggest barrier to Character today? What are the most challenging elements of model training? Why did they need to spend $2M to train an early model? What are the most difficult elements of releasing a horizontal product with so many different use cases? Where does the value accrue in the race for AI dominance: startups or incumbents?

4. AI’s Role in Society: Why does Noam believe that AI can create greater, not worse, human connections? Why is Noam not concerned by the speed of adoption of AI tools? What does Noam know about AI’s impact on society that the world does not see?

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on Twitter: https://twitter.com/HarryStebbings
Follow Noam Shazeer on Twitter: https://twitter.com/NoamShazeer
Follow 20VC on Instagram: https://www.instagram.com/20vc_reels
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#NoamShazeer #CharacterAI #HarryStebbings

Noam Shazeer (guest), Harry Stebbings (host)
Aug 31, 2023 · 36m

EVERY SPOKEN WORD

  1. 0:00-0:43

    Intro

    1. NS

      What we hear a lot more from users is, "I'm talking to a video game character who's now my new therapist, and this makes me feel better." (laughs) We- we had no idea that was going to go on.

    2. HS

      Welcome to 20VC, the show that interviews the best founders and investors in the world. And today, we're joined by one of the foremost experts in AI and NLP, natural language processing, Noam Shazeer. Noam is the co-founder and CEO at Character.AI, a full stack AI computing platform that gives people access to their own flexible superintelligence. Before we dive into the show today, it would make such a difference, if you like 20VC, if you would click the Subscribe link beneath this video. (electronic music)

  2. 0:43-6:18

    Noam's Google Experience and Introduction to Character

    1. HS

      Noam, I'm so excited for this. I heard so many great things from many different people. Eric Schmidt, Sarah Wang, Prajit. So thank you so much for joining me today.

    2. NS

      Thank you. Yeah, great to be on, Harry.

    3. HS

      Now, I would love to start with some context because few people spend 20 years at Google in the height of Google's scaling and trajectory. First, I wanna go back to the beginning. I heard there's a story to your joining. What happened, spelling corrector? Can you give me the story?

    4. NS

      Um, yeah, that was, uh, yeah, that was, like, the first project that I, uh, that I worked on at Google. Yeah, I guess at the time, we, you know, Google had a spelling corrector that was, uh, you know, it, it was some third-party software, it was, uh, you know, based on maybe what you'd find in a word processor at the time. So there was, like, some human compiled dictionary of maybe about 50,000 words. And any word that wasn't in the dictionary that was in the query, it would say, you know, "Did you mean such and such?" And this worked great for spelling correction but it was, like, absolutely terrible for web search because people searched for such a wide diversity of things on web search, like most of them are just not in the dictionary. So, like, you'd search for turbot, like TurboTax, and it would say, "Did you mean turbot axe?" Like, and people just learned to ignore the thing. So, first project, like, we were just looking at, like, why are people, like, not happy using Google and, like, spelling correction was, like, the, you know, number one, uh, number one issue. So I was like, okay, let me, let me help out with this. And, you know, there was someone working on this, Paul Buchheit, who, uh, you know, who, who's, uh, you know, gone on to, uh, do a lot of, uh, a lot of illustrious things in his, uh, career. He's also one of our investors here at, uh, uh, Character. But, uh, he was going on, uh, on vacation for a couple weeks, uh, you know, o- over the winter holiday.
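The failure mode described in this turn can be illustrated with a minimal sketch. It is not Google's original third-party software: the tiny `DICTIONARY`, the function name `did_you_mean`, and the use of Python's `difflib` for fuzzy matching are all assumptions made for illustration. The point is only that a fixed word list treats any out-of-dictionary query term, such as a brand name, as a misspelling.

```python
import difflib
from typing import Optional

# Hypothetical stand-in for a small human-compiled dictionary.
DICTIONARY = {"turbot", "tax", "axe", "search", "engine"}

def did_you_mean(query: str, dictionary=DICTIONARY) -> Optional[str]:
    """Return a 'Did you mean' suggestion, or None if every word is known."""
    suggestion = []
    changed = False
    for word in query.lower().split():
        if word in dictionary:
            suggestion.append(word)
            continue
        # Any word missing from the dictionary is assumed to be a typo and
        # is replaced by its closest dictionary entry, if one is close enough.
        close = difflib.get_close_matches(word, sorted(dictionary), n=1, cutoff=0.6)
        if close:
            suggestion.append(close[0])
            changed = True
        else:
            suggestion.append(word)
    return " ".join(suggestion) if changed else None

# A legitimate product name is "corrected" to an unrelated dictionary word.
print(did_you_mean("turbotax"))
```

Because web queries are full of names and new terms that no hand-built word list contains, this kind of corrector produces suggestions that users learn to ignore.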

    5. HS

      What are one to two of the biggest takeaways from your time at Google? 20 years is a long time. How did it impact you?

    6. NS

      I, I'd say one of the big takeaways is, um, that, you know, if you have a technology that is, like, really, really general and has billions of use cases and, like, ordinary people can use, like, launch it to, launch it to billions of people... I remember when I joined Google, there were, like, uh, a lot of people working on this enterprise search appliance, which, you know, it was okay. Like, I think maybe somebody had this conventional wisdom that, like, um, B2B is the only way to make money. But, like, what, what it actually turned out was, like, the much bigger thing was, uh, you know, wa- was, uh, B2C, you know, like, just serve something to, uh, serve something to everybody.

    7. HS

      How did that change how you think? Go for bigger? Go-

    8. NS

      Um, yeah, well, well, I, I think, like, right now, I, I've, uh, you know, started this, uh, this company, Character, and we're taking this large language model technology and, you know, we are just direct to consumer first. Like, here is something that is even more, uh, more versatile and more easy to use than even web search. You know, in that, like, okay, you can use it to be your friend or do your homework or, like, uh, brainstorming or get ideas or, like, a billion things. We haven't even thought of the best use cases yet. And then it's massively use- usable, like, all you have to do is talk to it. So it has these two properties, and to me, that means go and, like, launch it to, like, everyone in the world and let everybody in the world use it. Where I think some o- some of the other, uh, you know, some of the other companies are taking a more, uh, a more, like, B2B sort of approach with, you know, like a, uh, you'll have a foundational model company and then, like, verticalized application companies on top of it. So I'm really, um, like, inspired by the Google model of, like, full stack, end-to-end, all the way from, like, basic research to, um, you know, to launch a product directly to consumers. It's super fun, it's super motivating, you know, because, like, engineers like building stuff and then launching it and having, like, everyone use it, uh, immediately. And, uh, and then it also lets you do all this co-design of, like, you get, you get to affect every part of the stack, which is, uh, you know, whi- which is hugely powerful and fun.

    9. HS

      Can I ask, we asked about you joining Google. I, I think so many people are shaped by their past. When you think back on yours, how do you think about what you're running from?

    10. NS

      Yeah, why did I start working on artificial intelligence? Like, maybe, well, par- partially 'cause it's just, like, fun and what I do for fun anyway. Uh, like, just what could be better than trying to get the computer to do something that it currently can't do? But then, you know, the other thing is just to push technology forward. There, you know, there are so many, like, technological, uh, you know, problems in the world that could be solved. You have, like, 50 million people a year, like, dying from stuff like old age and cancer and heart disease and, like, all kinds of stuff that we could potentially, uh, you know, potentially find cures for. So rather than directly working on, say, medical research or something, um, I think I've got a lot more leverage, like, let's push AI technology, and then, you know, that can help with, uh, a lot of the rest of it.

    11. HS

      So how do you think about Character's, uh, mission and vision? When you think about, as you said, the world's greatest problems there, from climate change to wealth inequality to natural resources, how do you think about Character's vision and mission? 'Cause I think people misunderstand this, if we're honest.

  3. 6:18-14:13

    Character.AI's Vision, Growth, and Ethical Considerations

    1. NS

      ... yeah. Okay. So what, what is, what is our mission and, uh, uh, and our vision? Well, I think we just need to have a lot of humility that, like, you know, we, you know, we don't exactly... You know, we're not in charge of the world. You know, God's in charge of the world. (laughs) Like, um, you know, we're not even in charge of the government. We're not, uh, you know, we're not in charge of what individuals do. Like-

    2. HS

      When you get up in front of the company and you sh- y- you've- you- there's many people in the team now, what do you chest-bump as the mission?

    3. NS

      I like this sort of motto of, you know, a billion users inventing a billion use cases, because that- that's sort of the superpower of this technology. And, you know, it sort of puts our company in, you know, in the right place of, we can't really, you know, guess, you know, what are the best uses- uses of this technology. And, you know, we've just, you know, observed time and time again, like, you put one thing out there, and that's not really what people want, and somebody else out there, like, finds something better to do with it. We put up as an example, like, a psychologist character, like may- maybe you wanna, you know, ta- talk to something and feel better, you know, and, you know, that gets a little bit of use. But then what we hear a lot more from users is like, "I'm talking to a video game character who's now my new therapist, and this makes me feel better." (laughs) W- we had no idea that was go- that was going to go on. Um, but like ... And then, you know, there's this huge, like, use case in, like, some mix of, like, entertainment and companionship and emotional support. We were totally not experts in this stuff. Like, you know, th- like, w- our job is just to put out something general and, like, just respect the agency of, like, our users, of everybody out there to, uh, you know, to, to do what they want with this stuff.

    4. HS

      When you think about the incredible growth that you've seen, 450 million messages a day, 20 million users, what do you think have been the one or two biggest elements that have driven that growth?

    5. NS

      Uh, one is that we launched. (laughs) Like, th- that's definitely been a, a frustration, uh, eh, you know. In the past, things seem, uh, you know, potentially too much brand risk at larger companies to like actually launch and get it out there. Another aspect is, you know, we launch something general. We sort of let, let people find the use cases. And then the other is, there are like massive needs out there in the world, like, okay, there are billions of people who, you know, feel like they need someone to talk to. So, okay, combine those elements, provides people with something general that they can use, and there are people out there with needs, a- an- and they're going to find it.

    6. HS

      Can I be d- I'm fascinated to hear your thoughts. I, I totally agree with you in terms of the horizontal use case. I'm fascinated by all the different ways they talk to it.

    7. NS

      Yeah.

    8. HS

      Do you not worry that we're losing touch with other humans? We- they've got no one to talk to, and so they talk to a machine.

    9. NS

      I think there's, like, you know, huge, like, value in the connections between people, and, you know, moral value as well, right? The last thing I wanna do is to, you know, take people away from fr- from human connection. In a lot of ways, um, you know, we wanna help with, you know, human connection. A lot of the people who, who don't have friends, and who, you know, are not as well connected, one big, uh, source of that is just social anxiety. And, you know, like, you know, there are the ... Yeah, there are huge numbers of people who are like uncomfortable, and we've got, you know, we've gotten testimonials of people who said tha- that they were uncomfortable, you know, talking to other people, and like, this is great practice, is- that actually helped them, uh, build up practice in other social s- situations.

    10. HS

      Do you think so, or do you think it honestly just builds up habit? You get used to talking to someone who's not a human?

    11. NS

      Uh, that, that is gonna be ultimately up to the users.

    12. HS

      What do you think is the hardest product challenge you face today? It's a difficult product paradigm that you face, so many different use cases, so many different people's needs. What do you think are the hardest product paradigms for you as a team to face?

    13. NS

      Um, the main things we'd need to do is, um, make it, make it very general so it, you know, so we're not like cutting down on the use cases, make it usable, right? People think of those two things as being in opposition to each other, being versatile and being usable. And, like, you know, we talked to like some, uh, you know, potential product managers early on, and they all say the same thing, "Oh, yeah, pick your verticals, narrow it down to ma- to make it usable." And we're like, "No, we're not going to hire these people. That's like the opposite of what we want to do. We want to build something that is usable, but very, very general purpose." So there's, there's sort of that dichotomy.

    14. HS

      I just don't get how you d- I'm s- I'm naive here, so I'm asking for education. But it's like, I'm trained on the thought that the more you specialize, the deeper, the richer conversation value you can provide. And so, how do you provide quality high enough with such generalization?

    15. NS

      Yeah. And that's been the magic of, uh, of, of neural language modeling, you know, the previous systems were all these rule-based systems that, you know, like fantastically complicated, you know, systems with millions of handwritten rules, and, you know, like, uh, you know, really, really complicated. Uh, uh, required knowing something about linguistics and about state of mind, and like all, all kinds of stuff, right? The new way of doing things with neural language models has none of that. Like... I could know, like, pretty much zero about language in particular, um, other than it's, like, a sequence of words. So it has nothing to do with understanding language at all and there are n- not millions of rules. It's, it's actually relatively simple, kind of like a big black box. It, it all boils down to this one beautiful, simple problem of you have this sequence of words that's, like, the beginning of your document, guess what the next word is. Like give me odds on what the next word is in the sequence. And that, that problem is called, uh, language modeling. Just guess what the next word is based on the previous ones. You know, so I got involved with this, you know, around, like, 2015, uh, that, you know, the, um, you know, there were some other folks at Google, uh, you know, working, you know, working on this problem. They're like, "How, how good can we make it?" And this struck me like, hey, this is the best problem ever because it is so simple to, um, to state. And there's a huge amount of free training data. You can just, like, download the text of the web off of, like, you know, whatever you want, Common Crawl. You've got, like, billions to trillions of training examples of how, you know, guess the next word. And then if you can do it well, then this thing can just talk to you. It can be, you know, the better you do it, the smarter it gets. It's hugely, hugely, um, general and useful, super simple t- to state and now we just have to do it well. 
People were, you know, started building better and better, uh, neural networks, which I guess got renamed deep learning as some sort of rebrand, you know, neural networks had a bad name 'cause, uh, the, the hardware wasn't good enough. So people were using deep learning for, um, for language modeling and, you know, the bigger and better you made these things, the smarter it got. And,

  4. 14:13-23:19

    Technical and Business Aspects of AI and Machine Learning

    1. NS

      like, around 2016, like, the, you know, the m- the most useful application, like the killer app, uh, for this was, uh, machine translation. It was about smart enough to take, you know, take English and translate to French or s- or something like that, which is, you know, massively, massively useful, it lets everyone in the world communicate with each other, but, you know, still not smart enough to, like, carry on an interesting conversation or do your homework or, like, a- any of those things. But there seemed to be a pretty clear path, hey, let's just make this thing smarter, like, bigger, better, smarter and it's going to get these capabilities.
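The language modeling problem Noam describes (given the words so far, give odds on the next word) can be sketched in a few lines. This is a deliberately tiny stand-in, not a neural network and not anything used at Google or Character: it counts word pairs in a made-up corpus and conditions only on the single previous word. The function names and the corpus are assumptions for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(text: str):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_odds(counts, prev: str) -> dict:
    """Return the estimated probability of each possible next word."""
    followers = counts.get(prev.lower())
    if not followers:
        return {}  # word never seen during training
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}

# Hypothetical toy corpus; real models train on web-scale text.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram_model(corpus)
print(next_word_odds(model, "the"))
```

A neural language model solves the same prediction task, but it conditions on the whole preceding sequence and learns its probabilities from far more data, which is where the gains from "bigger and better" come from.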

    2. HS

      Can I ask you, when you talk about kind of working back then in 2015, 2016, these are very different, uh, excitement cycles to where we are today. I remember 2015, 2016, we had kind of a chatbot phase when there was, like, super excitement very quick, in like a month period. But there wasn't a sustained belief that we have today in AI transforming the whole way society works. And I guess my question to you is, like, where we are today, is that the result of technological progress recently, very recently, or is it a result of investors and society catching up with what's been developing over a much longer period?

    3. NS

      I, I'd say, I'd say it's both. Like, uh, you know, I, I think there's been a lot of technological progress, uh, both quantitatively and qualitatively. The models that were there in, like, 2016 or something were, you know, were too dumb to be fun. The, the neural models. Then there were, was this, all of the chatbot stuff you heard about back then was these rule-based systems that, you know, were just highly fragile and not going anywhere. You just needed more and more rules and there's, like, no way to think of, like, all the things that could come up and they just don't generalize. So that wasn't going to work, but at the same time, the, um, you know, we- we- we were progressing on, on the, on the neural network solutions, which, which were going to scale. It took some amount of time. I'd say around 2020 was when sort of, like, really impressive stuff was a- was sort of in the lab, but not launched. So my co-founder, D- uh, Daniel de Freitas, like, he's, he's a very cool, smart, scrappy guy. He, he's, like, on this lifelong mission to do chatbots. You know, since he was a kid in Brazil he's wanted to build, like, open domain chatbots.

    4. HS

      You said the more and more you do it, kind of the better it gets and the better responsiveness and accuracy it gets. I'm always cons- not concerned, but just, like, um, questioning 'cause you hear so much. What's more important? Is it the size of the data or is it the size of the model?

    5. NS

      Yeah, probably the size of the model is the, is the bigger challenge. We can get a lot of, we can get a lot of data, but, like, really, actually the number one thing that's important is how much computation you do to train it. So you want to train a bigger model and you want to train it for longer. The two things are both important, but what the real constraining factor has been is how mu- how many operations of computation it takes to train it, because if you make it bigger and you train it for longer, both of those multiply into how long the thing takes to train. So people have been building better and better, uh, essentially supercomputers to, you know, to train these models.
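The point that model size and training length "multiply" can be made concrete with a back-of-the-envelope sketch. The factor of roughly 6 floating-point operations per parameter per training token is a commonly used approximation from the scaling-law literature, not something stated in the interview, and the parameter and token counts below are hypothetical.

```python
def training_flops(num_parameters: float, num_tokens: float,
                   flops_per_param_per_token: float = 6.0) -> float:
    """Rough training cost: operations scale with model size times data seen."""
    return flops_per_param_per_token * num_parameters * num_tokens

# Hypothetical configurations, chosen only to show the scaling behavior.
base = training_flops(1e9, 20e9)     # baseline model and training run
bigger = training_flops(2e9, 20e9)   # twice the parameters
longer = training_flops(1e9, 40e9)   # twice the training tokens
both = training_flops(2e9, 40e9)     # twice of each

print(bigger / base, longer / base, both / base)
```

Doubling either factor doubles the compute, and doubling both quadruples it, which is why computation rather than data availability ends up as the constraint.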

    6. HS

      What are the biggest constraints on your models today, do you think?

    7. NS

      Basically, that's it, is, uh, is computation. So, um, you know, the model we're serving now we trained last summer and spent about, about $2 million worth of compute cycles, uh, doing it. We will do a lot better in, uh, you know, i- i- y- you know, in the near future if we... get a lot more, better (laughs) hardware, which we, which we are getting, and spend, you know, spend longer, uh, training the thing, we can train something smarter. You know, back in, say, 2016, you could train something that was, like, smart enough to, like, translate languages, but not smart enough to, like, answer questions or, or, or be fun.

    8. HS

      If that's models, on the data side, uh, how do you, how do, how do you think about proprietary versus non-proprietary data? You said there about kind of in the early days you could download kind of the, the s- the data of the internet, so to speak. Character is producing a ton of proprietary data within your, uh, conversations as are-

    9. NS

      Oh, yeah.

    10. HS

      ... many verticalized solutions. It, be it in medical, be it in finance, be it wherever. To what extent is the value in proprietary solu- like, data ownership, versus it will still and always be downloadable by everyone?

    11. NS

      The data that you get from users is great because it tells people, like, what, you know, what users like, or, like, what users like in, in some particular, um, application. It's, it's kind of like a, you know, training a human, like, most of what's important is, like, you have, um, you know, decades of experience training your brain on stuff that is not really specific to your task at hand or your occupation, but you've kind of gotten a generally good understanding of the world and, and gotten generally intelligent. And then you can, uh, improve on that dramatically by getting a smaller amount of training in the task you're doing. B- both, both will contribute, and we, we do have, like, a huge amount of data flowing in from users, like, just how, you know... Ob- obviously, we're v- you know, very careful to not compromise anyone's privacy, but just based on, you know, in aggregate, how, how people are, you know, using the service, we can learn to make it better.

    12. HS

      I, I, I totally get you, and, uh, well done for ticking the compliance box on (laughs) that one. Um, (laughs) you always have to tick the compliance box. You always have to do it. Um-

    13. NS

      Well, it, it's ver- it's very important because, like, uh, you know, if you just were to do the naive thing of take, like, every conversation you had with everyone and just train on it, then you could just be, like, spitting out, like, someone's private, you know, private life to s- you know, to somebody else. Like, people talk to these, you know, talk to these bots and, like, you know, pour out their heart. You know, like, uh, you, you, you wouldn't... you wouldn't... wouldn't want to just make that, like, freely shareable with everybody.

    14. HS

      Why is Chara- and I mean this in that why is Character a standalone company, and not a product of Facebook? If you think about a natural extension of Facebook into the metaverse, social, the extension of your physical friendship group into the metaverse or the, you know, uh, non-physical world, why does Character need to be a standalone, and why isn't it within Facebook or Snapchat or social platforms?

    15. NS

      Like, my, my experience, like, uh, you know, coming from Google is that a startup can just move way faster than a big company, and can launch products in ways that, you know, uh, you know, l- large companies are just going to move too slow because they're worried about, you know, compromising their, you know, their, their existing products.

    16. HS

      I to- I totally get you on the speed difference there. Do you think startups win then as a result in this next wave of AI innovation entrepreneurship? 'Cause a lot of people I have on the show now say t- it's, eh, I'm not... less so on the Facebooks of the world, but more the Microsofts and the Adobes. When... who wins, startup or incumbent, if you have to pick a side?

    17. NS

      I'd say the users win. (laughs)

    18. HS

      (laughs)

    19. NS

      The users are going to have a lot, uh, you know, a lot of options. But, you know, on the, uh, on the business side, I think there can be, you know, a lot of winners. You know, the- there's going to be room for, like, multiple, multiple players in there, you know, big companies doing what big companies are, are good at, startups doing what startups are, are good at. We- we're, we're gonna try to move our company from being a startup to being a big company as, you know, a- as fast as we can there, but a lot of just individuals and universities and such, you know, in that, like, the hardware is progressing so fast that what you could do at a big company, you know, one year, a few years later you're gonna be able to do, you know, at a university lab or in your garage.

    20. HS

      I totally get you and agree. In terms of those individuals and universities, I had Yann LeCun on the show, who's fantastic, and he said about, you know, the future of open versus closed and why he's such a protagonist for open. Do you agree in terms of open being the dominant method and mechanism of the community, or do you think actually closed wins? Many others have said closed

  5. 23:19-30:05

    Business Strategies and AI Philosophy

    1. HS

      wins.

    2. NS

      I mean, as there has been, there, there will be a, uh, big ecosystem of open and closed, of, you know, people who aren't sharing their secret sauce with... and people who are sharing all of their secret sauce, and the ability to mess around with things at a small scale is going to lead to way, way more research being published, even if some of the larger entities are no longer, you know, publishing research. Obviously, though, there's also, like, economies of scale of both training the best models and serving. Like, if you want to serve a product, you, you can do it p- maybe 100 times more efficiently if you're serving many, many people at once and kind of batching things together versus you're, you know, serving, like, an individual or, you know, or you, you've got your own rig to run your language models in your basement or something.
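Why batching makes serving cheaper per user can be shown with a toy cost model. The assumption here, which is not from the interview, is that each model step pays a fixed cost (for example, reading the model weights from memory) that is shared by every request in the batch, plus a small per-request cost. Both timing constants are hypothetical, so the resulting ratio is illustrative and should not be read as a measurement.

```python
def seconds_per_request(batch_size: int,
                        fixed_step_s: float = 0.020,
                        per_request_s: float = 0.0002) -> float:
    """Toy model: one step costs a fixed amount plus a little per request."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    step_time = fixed_step_s + batch_size * per_request_s
    # The fixed cost is amortized across everyone served in the same step.
    return step_time / batch_size

single_user = seconds_per_request(1)     # a lone rig serving one person
shared = seconds_per_request(100)        # a service batching 100 requests
print(f"speedup from batching: {single_user / shared:.1f}x")
```

In this model the advantage grows with batch size until the per-request cost dominates; real systems also have to trade batching against latency and memory limits.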

    3. HS

      Can I ask you, you, you sit at the center of the ecosystem, and you have done for many years. What do you think society believes that you would like to change their perspective on, on AI? Uh, you do interviews, Noam. You get asked the same questions, you see the same headlines. What do you, like, "God, I wish, wish people would change their mind that AI's gonna kill everyone," or, "AI's gonna replace all jobs," or any of the clickbait that we see everywhere? What do you-

    4. NS

      (exhales)

    5. HS

      ... wish society would change their minds on with regards to AI?

    6. NS

      The best applications just haven't even been invented yet. That, you know, uh, we, uh, you know, we're, we're still at, like, a invention-of-electricity kind of moment, or invention of the computer wh- wh- where we don't really, um, you know, we don't really know what, what the coolest things are going to be.

    7. HS

      In conversation sometimes, your friend will do something a bit wacky. Yeah, they'll do something kinda cool. And models can do the same with hallucinations and introduce a bit of creativity (laughs) . Are hallucinations a feature, or are they a bug?

    8. NS

      We consider them a feature. Um (laughs) , or, or at least... Okay, basically, our goal, uh, you know, our, our strategy is, like, launch something general. Let people do what they want with it. And if these models are hallucinating, which we, which they certainly are and we advertise that they, that they are, then the use cases that emerge first will be ones for which hallucination is a feature. You know, I'm happy to have, like, entertainment and emotional support and fun be the first use cases. You know, I'm happy to have, sort of, productivity be the first use case. Let's just let that happen naturally based on what the technology's good at.

    9. HS

      If you think about Google, which is like helping people find information faster, better, more efficiently-

    10. NS

      Yeah.

    11. HS

      ... what do you think will be Character's? 'Cause as you said, like, a million people doing a million things. I mean it respectfully, do you know and will we ever know?

    12. NS

      I think you could ask the same thing about any company that is selling some very general tool, like a company that's selling computers or, uh, or, or phones, or, like, phone service, or electricity for that matter. What is electricity for? Is it for, you know, is it for fun? Is it for productivity?

    13. HS

      What's been the hardest aspect of building Character for you? Have there been any elements which really stand out (laughs) ?

    14. NS

      Yeah, th- thank God everything's, uh, you know, e- e- everything's going well so far. I mean, we, we've got a, uh, really great, uh, really great bunch of, uh, bunch of people, like some of the best, uh, sort of researchers and engineers from, uh, from the industry.

    15. HS

      Have you enjoyed that transition to CEO and scaling CEO?

    16. NS

      Yeah, uh, like, I, I have. I still do a lot of, uh, a lot of, uh, technical work and, uh, and leadership, which, uh, which is big. I'm gonna stay CEO 'cause, uh, I, I want the company to make the right decisions. Um, so, you know, I, I don't really judge, um, you know, what I do by how much fun it is. Uh, it's, it's more like, what's the most useful? So very, very happy to, uh, you know, to, to be, uh, doing what I can.

    17. HS

      Can you unpack that for me? I'm sorry, I'm just interested. "I don't judge it by how much fun but by how useful."

    18. NS

      Yeah, it wa- it wasn't, like, a, a matter of, like, "Am I going to be having more fun being a startup CEO than an ML researcher at, uh, at Google?" It's, it's more like, "Hey, I want to push this technology forward. Like, what's the best thing I can do?"

    19. HS

      I guess I didn't know if you'd find, like, utility and fun come together. I love what I do and I have a lot of fun doing it, and it's the most useful.

    20. NS

      You know, a lot of things about parenthood are, are, are absolutely terrific and super fun, but I, I think it made me more, more religious. I decided to take a change of attitude from, like, "What is fun right now?" to, "I should be thankful for having the opportunity, you know, to do something important and me- and meaningful." So I, I think that it's, it's kinda been a big, uh, attitude, uh, you know, shift in, in growing up.

    21. HS

      If you could phone up yourself the night before your first child was born and give yourself a piece of advice, what would you say to yourself?

    22. NS

      Um, uh, "Get some sleep first." (laughs)

    23. HS

      (laughs)

    24. NS

      (laughs)

    25. HS

      Really, wha- really, what do you know now that you're like, "You know what? I would've told myself..."

    26. NS

      Neural language modeling. "Get into neural language modeling." (laughs)

    27. HS

      An exa- an, an example is, I just, uh, I had one of the world's most famous, uh, hedge fund managers on the show, and they said, "The only thing that matters is my wife. The children don't matter. I don't matter. The only thing that matters is looking after my wife. And if I look after her, she'll look after the children and she'll look after me."

    28. NS

      Yeah, not everything in the world is your responsibility. You should understand what is your responsibility and what isn't your responsibility. And I think that, that works really, really well in, you know, in, uh, in marriage and, uh, a- a- and in parenting. I mean, I think religion's a lot ab- about that as well. Like, what is, you know... sort of beliefs about what is, you know, what, what's your responsibility and, you know, a, a, and what's not. What should I be, you know, what, what do I need to do? What, what should I be concerned about? And what, what, and what

  6. 30:05-36:32

    Quick-Fire Round

    1. NS

      should I not?

    2. HS

      Uh, listen. I want to move into a quick fire, Noam. So I say a short statement, you give me your immediate thoughts. You're like, "Dude, what the fuck? I, I come on to talk about AI." And now, you're-

    3. NS

      Oh, no, no. This is-

    4. HS

      And now, we're-

    5. NS

      ... this is, this is great. I mean, uh, like, uh, l- luckily, we're recording a lot and you can like, cut out the crazy parts. (laughs)

    6. HS

      So what do others not know that you know to be true?

    7. NS

      You know, the, this technology is just gonna get way, way smarter. I think we're at a sort of Wright Brothers first airplane kind of moment in, uh, you know, in the AI, that there's like, a lot of momentum going on both in, uh, building better hardware and in research. So, you know, what, whatever really amazing applications you're seeing now, it's probably nothing compared to what's gonna happen in the future.

    8. HS

      What do you think that adoption timeline is? Is that like one to three years or is it like industrial age where actually it took kind of 20 to 30 years to see machinery and optimization happen widespread?

    9. NS

      Oh, I, I think, I think things are gonna move, uh, very fast. Like, I think we'll see like a lot of, a lot of very cool stuff happening in the next one to three years.

    10. HS

      What do people not understand about Character that you wish that they did?

    11. NS

      I think like externally, it like, looks like, um, entertainment app, but, you know, really, you know, we are a full stack company where we're like an AI first company and a product first company. Having that is a function of picking a product where the most important thing for the product is the quality of the AI so we can be completely focused on making our products great and completely focused o- on pushing AI forward and those two things, a- and those two things align.

    12. HS

      Tell me, what single element would you most like to change about the AI community? I wish there was a transparent log where you could see legitimacy of authors and you could say AI-

    13. NS

      Oh.

    14. HS

      ... like publications or blogs. But I think so many quickly profess to be experts, and actually just determining discovery-wise who is Yann LeCun, who is Yoshua Bengio, who is Noam Shazeer versus who is someone who moved from Web3 and crypto and is now an AI expert.

    15. NS

      Uh, there's so much stuff being published, it's hard to, hard to know what's good, and I think a lot of that has to do with the fact that, you know, th- this field is kind of alchemy r- right now. Like, no one knows exactly what is going to work, so, you know, you have a lot of people trying lots and lots of different things and, you know, you, you can come up with hits by having like a good intuition of what will work on the ML side kind of combined by like a good mathematical understanding of what will run fast on hardware that you can buy or that you can build. So there are some hits that come out and, you know, people will adopt and it's combined with a lot of noise. Um, like negative results are not useful because they could be negative 'cause somebody just made a mistake and there could be, you know, a bug, you know, that it didn't work for some other reason. What is interesting is positive results that are proven out by experimentation, like, you know, if somebody can say, "Hey, I did something and, you know, did better at this well-known problem," then that gets interesting. Then you, then everyone tries to figure out, okay, why does this work? How can I adopt it?

    16. HS

      What did you believe that you turned out to be wrong on?

    17. NS

      You know, when I started getting into deep learning, I, uh, you know, around 2012 had a bunch of early, uh, failures trying to, um, do sparse computation, you know. I was like, okay, like, you must be able to do something, you know, better and more efficient by building a sparse network. And that was, that was so wrong because I, uh, I did not understand that the reason this whole field is working so well is because now we have this magic hardware that's great at these dense matrix multiplications and so you can do them like orders of magnitude faster than you can do anything that involves poking around in memory, and there was no one there to like explain that to me when I got started with deep learning. Um, so okay, as soon as I sort of understood that part of it, it's like, okay, let's do sparsity, but let's build it out of these dense building blocks so it'll, so that it'll run fast and then, you know, publish this, uh, mixture of, uh, experts, this spar- sparsely gated mixture of experts idea that's, uh, only now getting like a lot of adoption. But, you know, that, that was back in, uh, uh, 2016 and then have had like a string of hits ever since, which I will attribute to, uh, to divine intervention but also to, um, you know, to, to understanding like the, the hardware mechanics and the, uh, sort of quantitative computation aspects of the field.

    18. HS

      Noam, 2033, 10 years from now, where will Character.ai be then?

    19. NS

      On Mars. (laughs)

    20. HS

      (laughs)

    21. NS

      Uh, no, uh, n- not, uh, I have, uh, absolutely, uh, no idea. Like, we, we will see what technology is like then, but, you know, it's just important for us to be, uh, to be agile. Like it's, it's kind of like if you were in, um, you know, in, uh, 1900 and like asking, you know, w- where some company would be in 2000. There will be such technological, uh, improvement before then that, that it, it's roughly impossible to predict where any company will be.

    22. HS

      I think this has been unlike any interview you've done for you. (laughs) I feel like the questions have stretched boundaries of parenthood that, that people didn't ask you before. I really enjoyed having you on, Noam, and I hope you've enjoyed it too. (laughs)

    23. NS

      Very much so.
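
Editor's sketch (not from the episode): in the quick-fire round, Noam describes doing "sparsity, but ... out of these dense building blocks." The snippet below is one minimal way to picture that idea, assuming plain top-k gating: each token picks a few experts, and each expert then runs a single dense matrix multiply over the batch of tokens routed to it. The names `moe_layer`, `gate_w`, and `expert_ws` are hypothetical, and the noise term and load-balancing losses used in published mixture-of-experts work are left out, so this should be read as an illustration rather than the method he published.

```python
import math
import random

def matmul(a, b):
    """Dense matrix product of a (n x d) and b (d x m), as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(xs):
    """Numerically stable softmax over a short list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_layer(tokens, gate_w, expert_ws, k=2):
    """Route each token to its top-k experts (hypothetical helper).

    The routing is sparse, but every expert does one dense matmul
    over the batch of tokens that were assigned to it.
    """
    n_experts = len(expert_ws)
    d_out = len(expert_ws[0][0])
    logits = matmul(tokens, gate_w)  # dense: (n_tokens x n_experts)

    # Sparse step: keep only the k largest gate logits per token.
    assignments = [[] for _ in range(n_experts)]  # expert -> [(token_idx, weight)]
    for t, row in enumerate(logits):
        top = sorted(range(n_experts), key=lambda e: row[e], reverse=True)[:k]
        weights = softmax([row[e] for e in top])  # renormalise over chosen experts
        for e, w in zip(top, weights):
            assignments[e].append((t, w))

    out = [[0.0] * d_out for _ in tokens]
    for e, assigned in enumerate(assignments):
        if not assigned:
            continue  # this expert does no work for this batch
        batch = [tokens[t] for t, _ in assigned]  # gather routed tokens
        result = matmul(batch, expert_ws[e])      # one dense matmul per expert
        for (t, w), y in zip(assigned, result):   # scatter back, scaled by gate weight
            for j in range(d_out):
                out[t][j] += w * y[j]
    return out, assignments

# Tiny demo with random weights (all sizes are arbitrary assumptions).
random.seed(0)
d_in, d_out, n_experts = 4, 3, 4
tokens = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(6)]
gate_w = [[random.gauss(0, 1) for _ in range(n_experts)] for _ in range(d_in)]
expert_ws = [[[random.gauss(0, 1) for _ in range(d_out)] for _ in range(d_in)]
             for _ in range(n_experts)]
out, assignments = moe_layer(tokens, gate_w, expert_ws, k=2)
print([len(a) for a in assignments])  # how many tokens each expert processed
```

With `k=2` and four experts, each token touches only half of the expert parameters, while the work each expert does is still an ordinary dense multiply over a contiguous batch, which is the property the answer attributes to running fast on the hardware.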

Episode duration: 36:32


Transcript of episode w149LommZ-U
