EVERY SPOKEN WORD
80 min read · 16,041 words
- 0:00 – 0:27
Introduction
- EGElad Gil
(instrumental music) . Today on No Priors, we're having a special episode of Sarah and me just talking. Hello, Sarah. How are you?
- SGSarah Guo
Hey, Elad. What's going on? I see you a lot.
- EGElad Gil
Not much. Good to see you.
- SGSarah Guo
Let's talk about models. What's going on in the model world?
- EGElad Gil
Yeah. Um, I guess there's a lot of hand models that are emerging, so I was thinking of maybe trying to do that-
- SGSarah Guo
Mm.
- EGElad Gil
... eventually.
- 0:27 – 5:01
Model news and product launches
- SGSarah Guo
It's almost as good of a business as, as investing.
- EGElad Gil
I know, right? Um, yeah, so there's been a lot that's happened in the model world, uh, recently. Obviously, Google launched Gemini, which I think had a few interesting characteristics both in terms of, uh, performance, but also the huge context window, right? It was a million token context window. Uh, companies like Magic, I think, in the past have actually put out, like, a 5 million token context window model and things like that, but it's really exciting to see that. And I think for certain application areas like biology, longer context windows actually seem to be quite important. And so for example, if you're doing a protein folding model and you have a short context window, you're often actually not encapsulating much of the protein, right? The average, uh, protein is, I think, something like 300 amino acids long, at least in the human genome, but there are things that are dramatically larger than that and so you just can't capture it in some of the context windows being used for biological models. And so I do think this is gonna be one of those areas that will end up being more important than people think, at least in the short run. Um, but Gemini 1.5 seems to have some really interesting performance characteristics. There's obviously Sora from OpenAI, which was, um, the video model that, uh, you know, is just beautiful to watch. You know, there's other model companies like Pika and others that I think are doing exciting things as well. And then, um, Mistral or Le Miz launched, uh, Le Chat, which is really the name of the product.
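Elad's context-window point for biology can be made concrete with a quick sketch. The tokens-per-residue figure and the large-protein length below are illustrative assumptions (roughly one token per amino acid, and a titin-scale outlier in the tens of thousands of residues), not the tokenizer details of any model mentioned here:

```python
# Toy check of whether a protein sequence fits a model's context window,
# under the simplifying assumption of ~1 token per amino-acid residue
# (real tokenizers differ).

TOKENS_PER_RESIDUE = 1  # assumed

def fits(protein_length_aa: int, context_window_tokens: int) -> bool:
    """True if the full protein sequence fits in the context window."""
    return protein_length_aa * TOKENS_PER_RESIDUE <= context_window_tokens

average_protein = 300      # ~average human protein length, as quoted
large_outlier = 34_000     # titin-scale outlier, order of magnitude only

for window in (512, 2_048, 1_000_000):
    print(window, fits(average_protein, window), fits(large_outlier, window))
```

The average protein fits even small windows, but the large outlier only fits once windows reach the million-token scale Gemini 1.5 advertises.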
- SGSarah Guo
Le Big Model.
- EGElad Gil
Le Big Model. Le Big Mac.
- SGSarah Guo
I believe they call it Mistral Large. Yes. (laughs)
- EGElad Gil
Yes. Le Large. Mistral Large. (laughs) They launched that, and the thing that's really, really impressed me about Mistral is just the velocity of shipping. It's incredibly impressive. They went from basically starting the company to almost GPT4 level in less than a year, right?
- SGSarah Guo
Nine months, yep.
- EGElad Gil
It's amazing. And they have, uh, you know, small performant models. They have Le Big Mac or, you know, the large model. They have chat. They have multiple languages. It's just- it's very impressive execution. So... And then I think the other thing that they just launched or announced was that deal with Microsoft where, you know, they're- they're now being licensed onto Azure, and so I think the main models in Azure now are OpenAI, LLaMA, Mistral, and then some of the Microsoft models. So again, that's striking as well. So, uh, just very impressive progress by that company so far.
- SGSarah Guo
I think the design space for what you actually want from models is certainly going to include state-of-the-art capability, and Mistral is very much going, uh, after that, and they've said so. But I- I- I think, like, from the beginning, the company has talked about efficiency, um, and latency and the ability to serve different use cases with that and, um, and also, you know, being long-term proponents of retrieval, right? Like, one of the big debates in the research world right now... I don't know how much of it is a debate, but people are talking about it. I'm on one side of this. Is that, um, like, RAG and retrieval is dead with sufficient context. And, uh, curious what you think here, but I'm- I'm more of the belief that it just opens up the set of trade-offs you can make between, um, retrieval, more sophisticated retrieval and model reasoning by having a larger context window versus saying, like, "We don't need any, um, ability to, uh, work with a specific dataset versus just retrain or, um, stuff something into context."
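The retrieval-versus-long-context trade-off Sarah describes can be sketched in a few lines. Everything here is a toy: the word-overlap scorer, the corpus, and the word-count budget stand in for a real embedding retriever and token limit; none of it is any specific product's API:

```python
# A toy retriever: instead of stuffing the entire corpus into the prompt,
# score chunks against the query and pack the best ones into a fixed
# context budget.

def score(query: str, chunk: str) -> int:
    """Crude relevance: number of lowercase words shared with the query."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def build_prompt(query: str, chunks: list[str], context_limit_words: int) -> str:
    """Greedily pack the best-scoring chunks into the context budget."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        n = len(chunk.split())
        if used + n > context_limit_words:
            continue  # budget exhausted for this chunk; try smaller ones
        picked.append(chunk)
        used += n
    return "\n---\n".join(picked + [f"Question: {query}"])

corpus = [
    "Mistral Large is served on Azure alongside OpenAI models.",
    "The average human protein is roughly 300 amino acids long.",
    "Gemini 1.5 shipped with a one million token context window.",
]
prompt = build_prompt("How long is the Gemini context window?", corpus, 20)
print(prompt)
```

A larger context budget lets you skip the ranking entirely and stuff everything in; a smaller one forces the retriever to choose, which is exactly the trade-off space being debated.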
- EGElad Gil
Yeah. We're gonna have both, in my opinion. The other thing I think that's very under-discussed, and this could lead into agent stuff, but I'd like to, um, also spend a little bit of time on Gemini before I move to agents, is if you look at a lot of the optimizations that are done for, um, areas where you had, uh, human-related sort of reasoning or other components, pre-LLM-based reasoning, uh, a lot of it was happening at inference time, right? So when you were doing- when you were trying to build a better poker AI, a lot of what you did was, um, you know, certain types of tree searches or other things when you hit inference time, right? You built the model, but at inference it did a lot of extra work. And I think that's also a little bit under-discussed in terms of probably a lot of what's gonna happen in the future, particularly when we get into agents and reasoning, is stuff that's happening at that point of inference and then it's used to sort of feed back over and, um, sort of continuously train or retrain a model over time, because I think that's the other piece of it is, you know, from a model perspective, you, uh, spin up a giant data center and you spend $100 million over 12 months overall between all the different work that you do and everything to launch your next model, and then you have a file, and then you use that file, right? For the next year as you train the next model versus saying you're gonna do some sort of continuous upgrading or training. And so, all these things are gonna shift over time. I think it's early in the technology cycle,
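The inference-time computation Elad describes can be illustrated with a toy game player. The "model" here is just a uniform random policy, and all of the playing strength comes from spending extra compute on rollouts at decision time. The game and the rollout counts are our own illustrative choices, not anything from the poker systems he mentions:

```python
import random

# "Race to 21": players alternately add 1-3; whoever reaches exactly 21
# wins. choose_move spends compute at inference time, estimating each
# legal move's win rate by playing out many random games.

TARGET = 21
MOVES = (1, 2, 3)

def rollout(total: int, my_turn: bool) -> bool:
    """Finish the game with uniformly random moves; True if 'we' win."""
    while total < TARGET:
        move = random.choice([m for m in MOVES if total + m <= TARGET])
        total += move
        if total == TARGET:
            return my_turn
        my_turn = not my_turn

def choose_move(total: int, n_rollouts: int = 2000) -> int:
    """Inference-time search: pick the move with the best rollout win rate."""
    best_move, best_rate = None, -1.0
    for m in MOVES:
        if total + m > TARGET:
            continue
        if total + m == TARGET:
            return m  # immediate win, no search needed
        wins = sum(rollout(total + m, my_turn=False) for _ in range(n_rollouts))
        rate = wins / n_rollouts
        if rate > best_rate:
            best_move, best_rate = m, rate
    return best_move

random.seed(0)
print(choose_move(16))  # the theoretically optimal move here is 1 (reach 17)
```

The underlying policy never improves; the extra strength comes entirely from search at the point of inference, which is the pattern he's pointing at.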
- 5:01 – 8:23
Google enters the competitive space with Gemini 1.5
- EGElad Gil
and so all these things are gonna happen. Um, you know, one of the companies that has a lot of capabilities to do interesting things over time, of course, is Google. So I'm a little bit curious if Gemini has changed your opinion of sort of the AI model race and what role Google plays in the future. You know, has it not changed your mind much?
- SGSarah Guo
I think the question on, like, whether or not, uh, Google has the ability to do the research work to have a competitive product, uh, has been answered, right? Gemini is a very impressive model. I think the, um, the capabilities that they have internally that they haven't released yet around, um, additional, like, function calling and multimodality are also really, really impressive. And so the questions around Google are less about do they have... Like, they have all of these extraordinary advantages, and you're- you're the ex-Googler. Like, I wanna hear your opinion. But they have the distribution. They have the cus- the consumer behavior. They have all the data on, like, what the search behavior is. They have the data on what queries are valuable and which they would peel away and turn into, like, an answer. Um-... uh, they know how to build, like, advertising auction systems, and they have a great research team and enough GPUs and, uh, um, and the model capabilities.
- EGElad Gil
Do you think it was progressive enough though?
- SGSarah Guo
Do I think the models are progressive enough?
- EGElad Gil
Yeah.
- SGSarah Guo
One might actually ask if they're perhaps a little too far in that direction.
- EGElad Gil
Mm-hmm.
- SGSarah Guo
Right?
- EGElad Gil
Interesting.
- SGSarah Guo
Um, and, and so (laughs) I, I think, like, the question is actually can they steer Google to, like, focus on being competitive versus the many other demands from their employee base, um, and to, like, different missions that are not, um, brokering the world's information and, like, market cap?
- EGElad Gil
Mm-hmm. Yeah. It's interesting because, um, the launch of 1.5 has made me more bullish on Google, uh, and I, I was always actually quite positive on them, right? Like, um, I think I wrote a blog post a year, a year and a half ago basically about the model world, and one of the things that I mentioned at the time was I felt like Google was kind of a sleeping giant. And once, once it awoke, you know, um, it could really make enormous progress quickly. And just as Mistral has executed from scratch as a startup, which is extremely hard to do, right? You're literally building everything from the ground up, um, although obviously there's open source to support you and all these other things, but fundamentally, you're just building an entire company. It's pretty amazing, right? Um, uh, Google has, uh, really accelerated its efforts, and it's had a series of launches over the last two, three months that have been quite impressive in terms of the velocity from cold start to, to having things that are externally accessible, and they have all the resources that one would need in order to do extremely well in AI, right? They have the compute. They have pr- unique proprietary data as well as all the data from the web, all the data from YouTube. Um, they have specialized data that you could potentially opt into, like, you know, all your emails and your Google Docs, and, you know, they have this immense corpus of really valuable information, um, and then they have amazing talent. And so really, I think the, the thing that was, um, lacking until recently was the will.
- SGSarah Guo
Mm-hmm.
- EGElad Gil
And it seems like now, because of the competitive dynamic, the will has been reborn, right? Um, and so it, it really feels to me like they are gonna make really big strides going forward,
- 8:23 – 10:22
Biology and robotics using LLMs
- EGElad Gil
and, um, you know, it's always possible the velocity only increases from here for them.
- SGSarah Guo
If I think about the domains in which these, um, general LLMs are still not as capable... I mean, it's every domain, but, um, in particular, not as capable as we want, um, like, two of the areas, one you already mentioned, um, uh, that I, I'm, I'm excited about include, like, biology and then robotics, right? So maybe let's talk about that for a second. As a, as a task, for example, if you ask ChatGPT to design a DNA sequence that can express CRISPR-Cas9, it can't do that yet, right? And i- if we think about cell design, protein design, protein optimization, a lot of these, um, are areas where you have researchers showing, like, really exciting progress in use of transformers and diffusion models to, um, get to much better predictions for, for example, um, drug discovery and, um, target identification. And so I think, you know... I- I've seen a number of companies in this area of better understanding of biology that really feels like a different type of reasoning, a different type of dataset, and as you said, even, um, like, specific context window constraints. A- and so I, I think that's an interesting one, and then on the, um... I don't know if you wanted to mention the robotics side or if that's something you've been looking at too.
- EGElad Gil
The robotics stuff seems super interesting. It's a little bit earlier, um, than some of the other models, in part due to data constraints, but it seems like there's pretty reasonable ways to generate some of that data now, so, um, it seems like the... You know, in general, I wouldn't be surprised if 2024 and 2025 is the year of proliferation of models, um, where we're gonna start to see an expansion in terms of the different types that are covered, you know, chemistry, material sciences, et cetera, et cetera. Robotics will be part of that, biology will be part of that, maybe physics and math. Um, I think maybe the last thing that
- 10:22 – 14:22
Agent-centric companies
- EGElad Gil
is happening from a model perspective is I think the last few weeks have seen, um, a lot of different sort of agent-centric companies get up and running, and, um, I think that's been an inter- a really interesting wave. And some of them, again, are taking very different approaches from the traditional, "Let's just build a giant LLM," and they're looking at things like AlphaGo or some of the game-centric, uh, work that has been done in the past. You know, how do you build a better poker player? How do you build Diplomacy? How do you build Go? And there, you have a very strong notion of acting sequentially based on changing information. You have some forms of what's known as self-play. You know, you, you have the machine play itself a billion times, and it learns new patterns based on that. You have really interesting approaches and heuristics and algorithms at time of inference versus training, and so I think that, that corpus of knowledge is about to hit the world in the context of new products. And it'll take time for those products to emerge, you know, six months, 12 months, a year, but, um, it does feel like that's another wave that's coming, where you're taking a fundamentally, uh, different approach that involves reinforcement learning, but is just different in terms of how you think about what you're actually doing in architecting and what you're inferencing and all the rest. So th- that's the one other area on the model side that I think is very exciting.
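The self-play idea referenced here can be sketched minimally (AlphaGo-style in spirit only, with our own toy game and hyperparameters): the same policy plays both sides of "race to 21" many times, and the accumulated win statistics for each (state, move) pair gradually steer it toward stronger play:

```python
import random
from collections import defaultdict

# Self-play on "race to 21": players alternate adding 1-3; whoever
# reaches exactly 21 wins. Win/play counts per (total, move) are shared
# by both sides and improve the policy over time.

TARGET, MOVES = 21, (1, 2, 3)
wins = defaultdict(int)    # (total_before_move, move) -> mover-side wins
plays = defaultdict(int)

def pick(total: int, explore: float) -> int:
    """Epsilon-greedy move selection over observed win rates."""
    legal = [m for m in MOVES if total + m <= TARGET]
    if random.random() < explore:
        return random.choice(legal)
    def rate(m):
        n = plays[(total, m)]
        return wins[(total, m)] / n if n else 1.0  # optimism for untried moves
    return max(legal, key=rate)

def self_play_game(explore: float) -> None:
    """One game of the policy against itself; credit wins to the winner's moves."""
    total, player, history = 0, 0, [[], []]
    while total < TARGET:
        m = pick(total, explore)
        history[player].append((total, m))
        total += m
        if total == TARGET:
            winner = player
        player = 1 - player
    for p in (0, 1):
        for state_move in history[p]:
            plays[state_move] += 1
            if p == winner:
                wins[state_move] += 1

random.seed(1)
for _ in range(20_000):
    self_play_game(explore=0.3)

print(pick(16, explore=0.0))  # after training, greedy play moves to 17
```

No human data is involved: the billion-games-against-itself pattern is exactly this loop scaled up, with a neural network in place of the lookup table.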
- SGSarah Guo
One thing that I've seen, um, here is that people are getting much smarter about agents as part of systems versus expecting to, um, uh, simply, like, instruct an agent and have it work with compounding failure across a bunch of tasks, right? Uh, in a gene- gen- general environment across any type of software, right? And so if it is operating in an environment that supports reinforcement well, like a game environment... or even a, a web application environment, um, but one that is constrained to particular tasks, or agents working in domains that better support sampling and validation, like code generation. Like, I'm really excited about that, and I, I feel like I've begun to see the glimpse of some of those things work. Whereas, a very real question you could have asked in Q3, Q4 of last year would be, like, "Does... Is any of this stuff useful, right? Is it anything?" And I, I think now it's like, it is. Yeah.
- EGElad Gil
Yeah. People went too broad too early versus just saying, "I'm just gonna focus on a handful of targeted use cases or domains, and I'm gonna figure out how do you create feedback loops in those domains so I can actually train effectively." And so, you know, the very ear- early versions of this, even pre-dating this LLM wave was, um, you know, "Hey, we're gonna have a browser plugin and it'll watch everything you do and then it'll do everything you do." Which is a very different from- problem from saying, "Hey, we're gonna make our PA better, we're gonna make it better, we're gonna make customer support better, we're gonna make, you know, XYZ thing better." Um, so I think the targeted approach makes a lot of sense.
- SGSarah Guo
Yeah. And I, I think some of the teams working on this have also... They've actually experimented with post-training in environments where you can pay for, um, for, uh, human feedback data, right?
- EGElad Gil
Mm-hmm.
- SGSarah Guo
And if you do that, then you actually understand, like, the, um, the distribution of data you need, the scale of data you might pay for. And that's very exciting because it turns, like, the agent problem from one that is, um, open-ended and untenable to just, like, "How much is it gonna cost to make a particular task work?" And I- I'm massively oversimplifying here, but that is a very different proposition when scoped than, uh, like, as you described, the initial set of forays into agents, which is, like, you know, "We'll try to do anything."
- EGElad Gil
Yeah. That makes sense.
- SGSarah Guo
And I think we'll still get there, but there's, um, like rapid success on this front. (instrumental music) NVIDIA. Everybody's talking about earnings.
- 14:22 – 17:29
NVIDIA earnings
- SGSarah Guo
What do you make of it?
- EGElad Gil
I think earning money is an excellent idea. How 'bout you?
- SGSarah Guo
I think Jensen understands this better than everybody else. I think one thing that people have been talking about is whether or not this was a, like, short-term phenomenon, right? Like, if, um, there was only so much demand and once the supply chain caught up a little bit, um, there would be less insane growth. And I, I think now people are pretty confident, especially hearing Jensen's comment, that they expect to continue to be supply constrained through the rest of the year. Demand is just, like, much, much larger than I think most people expect on the CapEx side. Um, and, and I think it's, like, worth understanding the upgrade cycle that drives that, right? Because there's this huge efficiency incentive to upgrade from A100s to H100s to H200s to B100s. I was talking to one of my portfolio companies that's buying in the tens of thousands of GPUs and is skipping to B100s because they described it as, like, free money in terms of training efficiency. Uh, it's funny when somebody describes spending hundreds of millions of dollars as free money, but free money in terms of training efficiency if you can actually get access to a cluster of a certain size. And so if, if others feel that way, it, it is, um, wild how much this expen- like, expands the, the server market.
- EGElad Gil
Yeah. It's probably a good time to run a hedge fund. I think in general, um, one thing that's a little bit under-discussed is a lot of the emphasis on startups and startup rounds and, "Oh, look. The startup raised $100 million," or whatever, and the reality is a lot of the spend is the big hyperscalers and then other clouds that are building out right now. And then I think the other thing is that if you were to look at at least enterprise adoption of AI, it's still really, really, really early days. And despite that, if you look at Microsoft Azure revenue in the last quarter, they mentioned that, um, revenue grew by 5% from AI-related products, which if I'm doing the math right, if it's a $25 billion a quarter, uh, Azure sort of, um, revenue, then that means they're adding something like one, one and a half billion a, a quarter in new spend due to AI, right? So that's five or six billion annualized. And so, um, you know, one thing that is a little bit, uh, uh, perhaps not talked about is there's a lot more stuff coming. And over the next two years, three years, et cetera, as enterprises really adopt this at scale, we should anticipate as well that, um, you know, the need for compute will continue to grow, so... It's really interesting to think about this replacement cycle you're talking about, the massive spend by big tech on, um, on LLMs, because they're driving most of the spend on LLMs because they're, they're the big rounds, right? The big rounds aren't venture capitalists investing billions of dollars. It's the big tech companies. It's Amazon and Google and Microsoft and the- and, uh, Salesforce and NVIDIA actually, right? Um, and then there's the enterprise adoption, which is still TBD. So yeah, there's a lot going on.
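Elad's back-of-the-envelope math here checks out, using his own assumed figures (not official Microsoft disclosures): ~$25B of quarterly Azure revenue with ~5 points of growth attributed to AI products:

```python
# Checking the arithmetic as stated in the conversation: 5% of an
# assumed ~$25B quarterly Azure run rate, annualized.

azure_quarterly_revenue_b = 25   # $B per quarter, assumed round number
ai_growth_share = 5              # "revenue grew by 5% from AI-related products"

ai_quarterly_add_b = azure_quarterly_revenue_b * ai_growth_share / 100
ai_annualized_b = ai_quarterly_add_b * 4
print(ai_quarterly_add_b, ai_annualized_b)  # 1.25 5.0
```

That lands at $1.25B a quarter, or $5B annualized, squarely in the "one to one and a half billion a quarter, five or six billion annualized" range he quotes.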
- SGSarah Guo
On this point, if
- 17:29 – 20:43
ROI in AI
- SGSarah Guo
you look bas- um, among, you know, AI years are like dog years, so a year to the Meta earnings beat at the end of January, uh, did you, did you see this article, uh, that David Cahn wrote at Sequoia, the 200... Uh, like, AI's $200 billion question?
- EGElad Gil
Uh, is this where he basically said based on the spend if you think of the ROI you need, then you need to generate hundreds of billions of dollars in return-
- SGSarah Guo
Yeah.
- EGElad Gil
... in order to justify all the, uh, yeah, all the spend that you had, yeah.
- SGSarah Guo
Yeah, very succinct summary. And, um, I was like, "Okay, yeah, that is the question," and I feel like the Meta earnings beat was the, like, one-day answer to that question, right? So to your point, they're one of the large spenders, um... Uh, they said they're gonna spend 30 to $37 billion on, uh, CapEx in 2024, driven by, like, AI, driven by servers, right? Um, Mark has this great, like, quote where he's talking about 600K, uh, H100 equivalent units of compute and saying, like, there's no room for other people. But the response to all of the investment that has, um, been made in capex for, um, training and inference at Meta over the prior years has been, like... a huge earnings beat from better targeting, leading to better conversion, better recommendations, leading to better engagement, better advertising tools, leading to better ROI, um, as well as, like, the cost controls that the rest of the industry is doing. And so they had this one day... I thought it was really nice that the number was exactly this too, that this one-day add of $197 billion of market cap, biggest single-session add before NVIDIA. I forget where NVIDIA ended up landing after their beat, uh, but, like, that's the answer, right? Like, you know, $197 billion, um, of increase in enterprise value on, what, 25, 30 billion of capex. Like, you should keep doing it.
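Putting numbers on Sarah's point, using only the figures quoted in the conversation (this is her framing, not a valuation model): the one-day market-cap add versus Meta's guided 2024 capex range:

```python
# Ratio of the single-session market-cap gain to the guided capex range,
# figures as quoted in the conversation.

one_day_market_cap_add_b = 197       # $B, the single-session gain quoted
capex_low_b, capex_high_b = 30, 37   # $B, guided 2024 capex range

ratio_vs_high = one_day_market_cap_add_b / capex_high_b
ratio_vs_low = one_day_market_cap_add_b / capex_low_b
print(f"one-day add is {ratio_vs_high:.1f}x-{ratio_vs_low:.1f}x the capex range")
```

Roughly five to six and a half times the entire year's guided capex recovered in a single trading session, which is the "keep doing it" signal she's describing.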
- EGElad Gil
Yeah. Yeah, it's kind of amazing. It's, it's kind of a related question because I remember Yuri Milner showed me this chart, which basically he looked at the aggregate increase in startup market cap and the aggregate increase of what at the time was, like, FAANG market cap, and obviously now there's like the Magnificent Seven or Eight or whatever it is. Um, and so if you looked at the top tech companies at the time, they added, like t- I don't remember what it was, five or ten times the market cap of all the startup ecosystem combined during the same period of time. And to some extent you could argue we're going into the same thing, at least in the short run for AI and we still haven't seen the monster AI companies emerge from scratch and undoubtedly those will exist. Um, but at least for the next few years, it seems like where we're gonna see that really huge market cap incremental add, um, maybe companies like OpenAI and some of the model companies, but also it, it seems like increasingly it's just gonna be existing companies adding huge amounts of, uh, uh, revenue and earnings and, um, compute and (laughs) everything else along the way. So it's back to, like, maybe the right thing to do right now is just start a hedge fund.
- SGSarah Guo
I, I think that also begs a question of, um, how to think about, like, all of the other companies, like tech and not, in terms of, um, amount of impact from AI. I actually
- 20:43 – 25:45
Impact from AI
- SGSarah Guo
think it would be, like, a really fun lens to run a hedge fund, um, (smacks lips) uh, with because you can take a, you can take a very long term view of something that feels very secular and just classify companies this way and long short, like, take that strategy as the only lens. Um, because, like, I, I do think that there are a number of services companies that are, um, squarely in the sights of things that you will be able to significantly automate. And the only question is which of these management teams is going to have the, um, investment capability, technical talent, guts, conviction to invest the way Mark did through, you know... People were really mad about the capex spend for a few years at, at Meta, right? And I think the answer is mostly, especially some of these services firms, like, um, maybe they partnered to get there, but they mostly will not make the transition, I think.
- EGElad Gil
The other thing that isn't really discussed is the impact it's already having on some businesses. So obviously ServiceNow had, like, a blowout quarter in part due to AI, so we're starting to see a little bit of that enterprise adoption. Um, one of the folks from Klarna posted, uh, today that they built an AI assistant that's powered by OpenAI that in its first four weeks handled 2.3 million customer service chats for them. And so it ended up handling two thirds of all their customer service inquiries. It was on par with humans in terms of customer satisfaction. It was higher accuracy, so it led to a 25% reduction in repeat queries. Um, customers resolved their errands in two minutes versus 11 minutes. It's live 24/7 in over 23 markets communicating in over 35 languages, and it performed the equivalent job of 700 full-time agents. And so basically, Klarna in, you know, a few months or a year or however long it took them to build this, built this customer service chat product and it replaced 700 people's worth. And they say that at this point they have something like 3,000 full-time agents, and so it cut the agents needed by about 25%. Right? And so, uh, it's this really interesting post from Klarna where they announce this and then one of the things they announce as part of that is, you know, longer term society needs to think about what this means for society (laughs) , uh, because this technology seems to be so good for certain human level tasks. And this is back to that point of AI adoption in the enterprises just starting, but how many years is it before every enterprise realizes that they can cut customer support dramatically, at least for certain types of products just through, just through adding, you know, simple apps, you know? And so, uh, I think that's the other thing that is kinda happening in the background that isn't talked about that much but, you know, is already starting to really show its face in, in pretty interesting ways.
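The Klarna figures Elad quotes are internally consistent, which is worth a quick sanity check (all figures as stated in their post, not independently verified):

```python
# Sanity-checking the quoted Klarna numbers: 2.3M chats in four weeks,
# two-thirds of all inquiries, equivalent work of 700 agents against
# ~3,000 total agents.

chats_handled = 2_300_000
weeks = 4
share_of_inquiries = 2 / 3
agents_equivalent = 700
total_agents = 3_000

chats_per_day = chats_handled / (weeks * 7)
implied_total_inquiries = chats_handled / share_of_inquiries
agent_reduction = agents_equivalent / total_agents

print(round(chats_per_day))            # ~82,000 chats a day
print(round(implied_total_inquiries))  # ~3.45M total inquiries in the period
print(f"{agent_reduction:.0%}")        # ~23%, close to the "about 25%" quoted
```

So the "about 25%" cut is really 700 of roughly 3,000 agents, about 23%, and the assistant is absorbing on the order of 80,000 conversations a day.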
- SGSarah Guo
Yeah. Well, I, I do think you're gonna get this accelerated adoption that goes use case by use case, right? Where, like, in, in any market you have early adopters that build it in house or go get these solutions and are willing to take the risk when you don't actually know, like, what the impact will be, how well it will work. But as soon as one payments company does that and it's a better experience for the customer or it has real, like, impact on operating cost, I think, like, you switch very quickly over to the entire sector being like, "We have to adopt it in order to be competitive on both fronts."
- EGElad Gil
Oh, yeah. Yeah, this stuff tends to happen slowly and then suddenly all at once, and I think we're in the slowly phase right now. And, um, I actually had my team go and take, um, uh, global services and look at that, right? And so if you look at, uh, spend on software in the US right now, it's about a half trillion dollars in software spend a year. If you look at, uh, human-centric services, just payroll, for things where gen AI can probably impact things, it's three and a half to 5 trillion. So if you convert just 10% of that spend into AI revenue, you've effectively recreated the entire US market software industry and market cap, right? And so these are huge trends that are coming, and you can kind of imagine vertical by vertical what are those things gonna be. And then you can ask, is it gonna be built as internal tools for companies? Is it gonna be a new company that emerges that serves these things? Or is it gonna be an incumbent who figures it out and adds it? And so this sort of customer support chatbot thing, you know, you would have thought that there's a company doing this for everyone, and it looks like in this case they're, um, they just did it internally or in-house. Uh, but you could also imagine an existing company like a Zendesk or somebody adapting to this, and the real question is which of those three scenarios is gonna happen, at least from a startup perspective? But from a technology wave perspective, this is massive, right? And you can build in the feedback loops really easily for this type of product, right, because you can have the customer rate it or thumbs up, uh, thumbs down at the end of the session, et cetera. So you have a really good sort of, um, RLHF or some sort of training set for it as well. So it, it's a, it's a product that should get better and better and better over time as you use it more.
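Elad's conversion math here, using his own estimates rather than official statistics, works out like this: ~$0.5T of annual US software spend against $3.5-5T of addressable human-services payroll, with a 10% conversion assumption:

```python
# Back-of-the-envelope as stated: converting 10% of services payroll
# into AI revenue versus the size of the US software market.

us_software_spend_t = 0.5                                    # $T/year, estimated
services_payroll_low_t, services_payroll_high_t = 3.5, 5.0   # $T/year, estimated range

converted_low = services_payroll_low_t / 10    # 10% conversion
converted_high = services_payroll_high_t / 10

share_low = converted_low / us_software_spend_t
share_high = converted_high / us_software_spend_t
print(f"10% conversion = ${converted_low}T-${converted_high}T, "
      f"i.e. {share_low:.0%}-{share_high:.0%} of software spend")
```

Converting one tenth of that services spend yields $350-500B a year of AI revenue, 70-100% of the entire existing software market, which is the "recreate the industry" claim made quantitative.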
- 25:45 – 29:09
Building effective AI tools in house
- SGSarah Guo
Yeah, I think one of the things that is an indicator of, uh, like, where that services spend might be that gets externalized is actually, like the big tech companies actually have, you know, they're tech companies, but they have broader businesses than, um, I think sometimes they're given credit for, right? Like Facebook Meta interacts with SMBs as advertisers. If you look at anybody who has this, like, large commerce, um, type customer base, so as you just mentioned, Klarna or Square or Meta or Shopify, like they've all done this now and it's working-
- EGElad Gil
Mm-hmm.
- SGSarah Guo
... right? And, and so, uh, I think the fact that these are the companies that have the technical teams that are capable of doing it in-house is a nice indicator for, like, well, if it's that effective, everybody else should too. And the question is, I think not every segment of customer, like retailers with, um, enough of a technical team to build an e-commerce presence may not build this themselves. Then it's a more likely, uh, scenario that either an incumbent or a new company, be it Sierra or something else, ends up owning that customer service segment.
- EGElad Gil
Yeah, 100%. Yeah, we, we have a long list internally of, like, the companies that I think should exist in this space, right? 'Cause there's, there's so many obvious ones, and very few companies exist for most of them, if any companies. And so I think it's, it's back to this idea that there are these human capital waves happening in AI, and the very first wave we saw was researchers, and they built, uh, early model companies, and they built some of the early applications like Perplexity and Harvey, and all these things were actually started by people who were working on models initially, and they were just closest to the technology so they knew what to do. And then the second wave of human capital was, like infra people 'cause they were the, the second closest to LLMs. And then the third wave, of course, is gonna end up being application builders, but many of them were not aware that any of this stuff was important until ChatGPT came out 15 months ago, and they're just starting to show up, right? It takes them nine months to quit their job and a few months to figure out what to do and find a co-founder and a few months to build a prototype. And so we haven't seen anything yet really in the app wave. You know, all the apps or many of the apps so far were started by people who were very close to the research community, and then it's kind of permeated into other areas with, with some things growing really fast, right? There's like half a dozen medical scribing apps that all seem to be growing at a pretty good pace, or there's, um, a few other application areas where it seems like there's a number of people working, but then there's lots and lots of spaces where it seems like nobody's doing anything, which is, which is kind of weird honestly.
- SGSarah Guo
Yeah. There's a joke that the foundation model companies, um, are here to replace all the jobs, but they don't understand what any of the jobs are, and I think there's, like, a little, a little bit of truth in the sort of exposure to, uh, what happens in, you know, a, a broad range of companies in terms of functions and outsourced services, and so I think that is the opportunity, right? Like, now it's a race for people who are just great engineers smart about a domain to go experiment on the fringe of that, and I, I still think there's opportunities around. Like you and I have talked about, um, the, uh, domain areas where you might want specific models or verticalized companies still, and we should, we should talk about that, but I, uh, I, m- m- my team and I just gave a presentation at this AI in production conference about how if 2023 was the year of infrastructure, like '24 is the year we begin to see applications,
- 29:09 – 33:23
What would it take to compete with NVIDIA
- SGSarah Guo
so I think we're pretty aligned there. I do wanna ask you, like, one thing before we move away from all of the earnings stuff, which is, um, the most obvious place somebody's already making money is either, like, cloud providers, inference providers, or just NVIDIA as a chip maker. What would it take to compete with NVIDIA, to have, like, a second source?
- EGElad Gil
I think there's a few different approaches, right? I mean, fundamentally, if you look at what people claim is the defensibility of NVIDIA, it's a mix of chip performance, CUDA, and interconnect. You know, NVIDIA bought Mellanox, an Israeli company, back in 2019 basically to provide the interconnect side. That was a multibillion-dollar acquisition, so it was quite large relative to NVIDIA's market cap at the time. Um, and then obviously CUDA has been developed over many years, and obviously they've iterated really well on these different generations of chips. So minimally you at least need some form of silicon with that kind of performance (laughs), and then you need to make sure you're actually able to use it effectively, and then you're able to scale it, which is the interconnect side. Um, and there's the incumbent side of it, right? AMD is obviously working on this, Intel is trying to, et cetera. And then there's the startup side of it, where we've seen things like, uh, Groq emerge, where they have very fast inference for open source models as well as other language models, which is pretty striking. You have Cerebras, which has taken a fundamentally different approach on the chip side as well, so there are a few startups that I think have some interesting early hardware, and there are some new companies like X that have talked publicly about how they're really focused on transformer-based models and architectures for the chips they're building. So there will be this potential wave of second sourcing over time, but in general, if you look at many of the most advanced chip markets historically, there's tended to be a winner, or I should say a leader, and there's tended to be a second-place party. In the microprocessor world, that was Intel, and then AMD was number two.
And, um, you know, in mobile it kind of morphed a little bit, right? You had Qualcomm and Arm doing different things but both quite successfully, and I think Qualcomm was always, at least for a period of time, the bigger company, although Arm is much larger now. I should actually check that in terms of market cap. Yeah, Qualcomm is 176 billion and Arm is 140 billion, so they're pretty close now. There used to be a pretty big disparity between the two; in part, that's changed because Arm is being used now in broader ways. So you tend to see these market structures in semiconductors where there's a leader and then a second place, and I think part of that is traditional Moore's Law, chip-generation-related stuff. I don't know how that will hold up or how that'll morph in AI. I don't know if you have an opinion on that.
- SGSarah Guo
Yeah. The way Jensen has described it, advancements in chip performance tend to be more about, um, memory management and new techniques versus just, like, transistors fitting on a particular die size, and I think somebody else at NVIDIA called it Jensen's Law: the ability to get performance from the full system. But the only thing I'd add to your description of competitiveness here is that manufacturing is also a big deal, even for these fabless chip design companies, right? So you gotta do what you said, design something better including interconnect, build an entire software ecosystem, and CUDA's been around since 2006, but after that you have to go get capacity at TSMC, right? And then you need to get yield up, and then you need it all to be competitive in terms of pricing. I think the economic pressure, given two trillion of market cap and more demand than NVIDIA can support, is higher than ever, but I think the moat is actually really, really deep. And so when I think about what could be enough to go disrupt that, I'm sure you've seen many of these companies, but I've seen a few different approaches. It could be a chip and system designed, like, specifically very much around latency,
- 33:23 – 35:42
The architectural approach to compute
- SGSarah Guo
um, but the other thing that you said, right, something, for example, optimized for transformers as an architecture: you're taking a bet on how much stability there is around a particular architectural approach, and I think that's felt like a quite good bet for a while now, but for the first time in a long time, there is some interest in things like state space models, with companies like Cartesia and some alternatives, right? Um, if you're a really big company with your own use case, right, if you're Meta or you're Google, and you have, like, the entire ad system, recommendation serving, spam, et cetera, or search and your own cloud, then you don't need to make everything work on the software ecosystem side. You just need to make one application work, and, you know, these companies also acquired teams in. But that's how you end up with, like, TPUs and Trainiums and all that. Uh, but I would love to meet companies in this area and still haven't seen something that's gotten me over the edge, even in a place that is so obviously economically fertile.
- EGElad Gil
Yeah. I think one thing you pointed out that's interesting to expand on a little bit is TSMC and the whole fabless semiconductor world, where you're basically outsourcing the manufacturing of the chip to a handful of players, TSMC being the biggest, but there are one or two others that are big enough to at least handle some volume. And, you know, there's been this push to try and repatriate semiconductor manufacturing to the US, and it's run into all sorts of obstacles that are pretty avoidable: environmental reviews that go on endlessly, or other things that have prevented people from actually starting to build these things that take many years to build. Um, and it's been interesting to watch that in Japan they're starting to have really interesting development of fabs specifically for this purpose, and so I'm increasingly wondering whether Japan emerges as sort of a second-source location, in part to geopolitically hedge Taiwan. Um, but I think that's something to watch in terms of where you're actually seeing fabs go up and how you think about that geographic distribution, but also why the US is in some sense getting in its own way on something that has pretty broad-based strategic importance on multiple levels,
- 35:42 – 38:30
The roadblocks to chip production in the US
- EGElad Gil
you know, including national security, so.
- SGSarah Guo
If you listen to the TSMC CEO on this, he talks as much about, um, the human capital and the cultural elements of human capital required to make a place like TSMC work as about the CapEx spend, right, the access to equipment, the need to actually build the fab. Um, I think that's pretty interesting because, like, you know, we can invest a great deal, but it's very hard to change culture, and so I do think there's one version of this where maybe you have fabs in, um, Japan or Mexico or Southeast Asia, like a broader global supply chain for chip production, or maybe you have robots making chips, right?
- EGElad Gil
Yeah, I mean, that's all true, but the flip side of it is Intel has manufactured chips in the US for a long time. TI did historically, right, but Intel still does, so I don't think there's a complete lack of human capital. Obviously it's concentrated in part in Taiwan and to a secondary extent in, um, Korea right now. But I do think there's the capability to do it, and I think there are other things that are getting in the way even before that.
- SGSarah Guo
Yeah.
- EGElad Gil
Like, can you even break ground on the plant? Maybe step one, right? (laughs)
- SGSarah Guo
Yeah.
- EGElad Gil
Maybe we should start with the basics, and then we can deal with culture when we actually have a fab.
- SGSarah Guo
Yeah. Well, I guess I'm very willing to believe that these companies and industries wouldn't exist in the places they do without, like, great leaders, for TSMC or otherwise, and so maybe it's not a solvable problem. Like, I'd be curious if you believe in the Intel fab business that they're, um, trying to push to other customers now. But to me it's not binary. It's like, of course we can make chips in America. The question is can we make them without the churn, and with the yield and cost to make them competitive? But maybe it's so important that you don't need them to be competitive for some period of time.
- EGElad Gil
Yeah. And also my point is, um, we're already doing that with Intel, right? Intel's fab business is in the US. Not the fabless, TSMC-style business, just making their own chips. They've been doing it for decades in the US. It's been fine. It's been high yield, you know?
- SGSarah Guo
Yeah. It's been, it's been fine-
- EGElad Gil
Yeah.
- SGSarah Guo
... but it's also been, um, behind in terms of, uh, process technologies, right? But maybe, maybe that's not a human capital issue. Maybe that's other, other issues at Intel.
- EGElad Gil
Yeah, it seems like it's some other issue, yeah. (laughs) I think my general take on the whole market is the more I learn, the less I know in AI, and it's the opposite of every other field I've ever been in. Usually-
- SGSarah Guo
Really?
- EGElad Gil
... the more you learn about something... Yeah, usually the more you learn about something,
- 38:30 – 42:17
The virtuous tech cycles in AI
- EGElad Gil
the more you can create sort of these straight-line hypotheses, or, you know, what you know kind of compounds and it's static. And I feel like in the AI world, every week there are, like, so many new things that your entire world model shifts.
- SGSarah Guo
In, like, a fun way.
- EGElad Gil
(laughs) Yes. It's, uh, fun to be exhausted. But I think, um, you know, there's just so much going on, and the pace of innovation... It really feels like, you know, that early slope into the exponent that is a singularity, or however you wanna phrase it, but it really feels like this self-reinforcing loop of new stuff. And honestly, a lot of it was kind of held back in the larger tech companies, and now it's flourishing externally, and that's creating competitive pressure on the larger companies, and the larger companies are reacting, and that's spawning more startups, and it's just this really interesting virtuous cycle. Uh, and to some extent, the big tech companies are helping fuel it all by funding the companies that are working at very late stages with huge rounds, and they're funding a lot of the compute in the industry in a way that's, you know, at least an order of magnitude, maybe two orders of magnitude, more than what the venture community is doing. And so it's this really interesting virtuous cycle: startups come out, that accelerates big tech doing stuff, that causes some people to leave big tech to do interesting things externally, they then get funded by big tech, and that accelerates both themselves and big tech, and you have this kind of interesting cycle happening right now, so it's very exciting days.
- SGSarah Guo
Yeah. I drew, um, I drew a slide that has, like, as you might hope, like, a bunch of reinforcing cycles. It's very fancy.
- EGElad Gil
Mm-hmm.
- SGSarah Guo
And the one I would add to that is, like, what we started talking about, which is when something begins to work, if it is actually valuable, like the Klarna thing that you described, at some point if it's valuable and it moves the needle in the business, you have to do it as part of the competitive set.
- EGElad Gil
Mm-hmm.
- SGSarah Guo
And so I think, like, we started with this narrative-driven thing where, um, you know, CEOs would say that they're gonna do AI because, like, the markets believed that was the future, and it was very generic, and you see that show up in the spending numbers, or at least the expectations around spend, right? I was looking at this, um, survey from one of the investment banks that says, like, Fortune 1000, um, IT budgets go up 5 to 8% this year instead of the usual 3 to 5%, and it's all because of AI. Like, that's pretty big, right? That's, like, two (beep) X. And, like, if that's true, then that's also part of the reinforcement cycle here, 'cause if the companies start to work, then they get to continue building these products, and VCs, you know, or investors like us, will keep trying. So I think it's pretty exciting.
- EGElad Gil
Yeah, it's RLPAF. Yeah, RLPAF.
- SGSarah Guo
(laughs) Rolls right off the tongue.
- EGElad Gil
Reinforcement learning through product adoption feedback. You're welcome.
- SGSarah Guo
Well, I'm just gonna plug that into ChatGPT and have it write the paper, but, um, I will be sponsoring author if you'll be first author.
- EGElad Gil
Hmm. Yeah, I'll see if I include you. (laughs)
- SGSarah Guo
Academic violence.
- EGElad Gil
(laughs)
- SGSarah Guo
Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you wanna see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.
Episode duration: 42:14
Transcript of episode E4TldCRLyoo