EVERY SPOKEN WORD
- 0:00 – 0:24
Introduction
- Sarah Guo
(music plays) Hi, listeners. Welcome to No Priors. Today, Elad and I are just hanging out. We're gonna talk about LLM consolidation, what's going on in chips, an interesting dynamic around what type of risk you should take as an AI company in pushing the envelope, and some big transactions. So let's get into it. First topic. Elad,
- 0:24 – 2:18
LLM market consolidation
- Sarah Guo
are we done here? Is it too late? Is the LLM market consolidated?
- Elad Gil
It's such an interesting question, right? Basically, what we're seeing is a number of model companies are having their teams join larger enterprises, so that may be parts of Inflection or parts of Character or parts of other companies, parts of Adept to AWS. And so there's sort of one dynamic going on there on the model side. And many of those companies are continuing to exist, right? Like, some of the products are still running and being used in different ways. At the same time, there are enormous capital moats emerging to get to the biggest scale for foundation models. And so if you look at it, these companies are now raising billions or tens of billions of dollars, often either from hyperscalers, so the Amazons and Microsofts of the world, or from sovereigns, right? Because those are the only people who can actually give you billions and billions of dollars. The venture capital industry is just too small to actually be able to support the next round for these companies, so everybody's kind of partnering up. And so it's a really interesting question to ask, well, for all the other players in the market, where are they gonna get these ever-rising amounts of capital, and who do they partner with? Does Apple end up with a partner? Does Samsung end up with a partner? Does XYZ other company end up with a partner? And so you can kind of map all the potential partners to all the model companies and just ask, how does all that fall out? And then in parallel, the big hyperscalers have an incentive to fund these companies simply because, in some cases, it also translates over to more cloud utilization for the industry in general. The incentives start to diverge between VCs, clouds, other strategics, and sovereigns in terms of what they want to do.
It does seem like it's increasingly hard to think that most companies will end up being competitive outside of a fundamental breakthrough in the model architecture or cost of actually training and then running inference on the model, or doing the post-training side. So I think it's a really interesting open question, but it does feel like we're moving into a stage of more and more consolidation. I don't know. What do you think about it?
- Sarah Guo
Most
- 2:18 – 3:58
Competition and decreasing API costs
- Sarah Guo
of that makes sense to me. I would argue that the market has become more competitive, not less, over the last, like, year and a half. Maybe it's competitive between a set of players that have, as you described it, a capital moat, where there's some breakaway scale. But the dynamic now, at least from the consumption side, is there are continual and aggressive performance increases, competition on the benchmarks, price decreases, and also, you know, real open-source players. And so you can have consolidation and people not necessarily making money yet.
- Elad Gil
I think you actually raise a really interesting point which shouldn't be under-discussed, which is that API costs have dropped something like 200x in the last 18 to 24 months, or something along that order of magnitude. David on my team actually pulled together a chart of the pricing for all the various models and what that looks like over time, and it's dramatic in terms of how cheap the dollars per million tokens have gotten. And related to that, if you have a 200x decrease in the cost of running these things, or inferencing these things, and part of that is distillation, part of that is what generation of GPU you're actually using, et cetera, then the actual margin available and the revenue available are increasing from the perspective of usage, but it's becoming harder and harder to just go out and compete with a model, at least as an API business, right? And so that kind of pushes you into specialization, into other areas of doing either bespoke specialized models or specific types of post-training or vertical applications or things like that.
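To make the dollars-per-million-token framing concrete, here is a minimal sketch of per-request API cost. All prices and token counts are illustrative round numbers, not any provider's actual rates; the 200x ratio is simply chosen to match the figure mentioned above.

```python
def request_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Cost in dollars for a single API call, given $/1M-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# Hypothetical earlier-era pricing vs. pricing after a ~200x drop.
old = request_cost(2_000, 500, price_in_per_m=60.0, price_out_per_m=60.0)
new = request_cost(2_000, 500, price_in_per_m=0.30, price_out_per_m=0.30)

print(f"old: ${old:.4f} per call, new: ${new:.6f} per call, ratio: {old / new:.0f}x")
```

At these made-up prices, a 2,500-token call drops from about 15 cents to well under a tenth of a cent, which is the dynamic that squeezes pure API businesses.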
- Sarah Guo
I think the other way
- 3:58 – 8:20
Innovation in LLM productization
- Sarah Guo
that you could look at the consolidation is just, like, what is the argument for capital at that scale from a business perspective? And the mostly unsaid thing is, really, AGI is still the business, right? There will be emergent behaviors and capabilities in the model, where it will figure out how to make money for us, or it will be obvious how valuable it is. And I'm not even saying that's wrong, but I'd say in the more immediate, two-to-three-year horizon, you have consumer as a business, either apps, advertising, or subscriptions, and nobody's gone the advertising route in anger yet, or enterprise as a business. Both of these are real today. But I also think it's become much more of a product fight, right? You have the big hires of Mike Krieger and Kevin Weil at Anthropic and OpenAI. I think you see players trying to build moats on top of the capital moat and the research that they've done, to make more than just chat as an interface. And so I think, on both sides, you're gonna see providers try and push customers down a more locked-in path in terms of APIs, right? We saw this with AWS. You're gonna have sophistication beyond "it's just a storage bucket": you'll have prompt caching and JSON output interfaces, fine-tuning. And that actually makes it much less of a commodity market if people adopt it, 'cause it's easier.
- Elad Gil
Yeah, there's a lot more features being built in, and I think you're referring to what Anthropic has done on the caching side, which is a really interesting move in terms of how that impacts cost and timing and everything else, or latency.
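The caching economics can be sketched in a few lines. The 1.25x cache-write and 0.10x cache-read multipliers mirror the ones Anthropic published when the feature launched, but the base price and token counts below are illustrative placeholders, not real rates.

```python
BASE_IN = 3.00        # $/1M input tokens (illustrative placeholder)
CACHE_WRITE = 1.25    # premium the first time a prefix is written to cache
CACHE_READ = 0.10     # discount on every subsequent cache hit

def cost_without_cache(prefix_tokens, n_requests):
    # Every request re-sends and re-bills the full shared prefix at the base rate.
    return n_requests * (prefix_tokens / 1e6) * BASE_IN

def cost_with_cache(prefix_tokens, n_requests):
    # First request writes the prefix at a premium; the rest read it at a discount.
    write = (prefix_tokens / 1e6) * BASE_IN * CACHE_WRITE
    reads = (n_requests - 1) * (prefix_tokens / 1e6) * BASE_IN * CACHE_READ
    return write + reads

# 50 calls that all share a 100k-token document as context.
no_cache = cost_without_cache(100_000, 50)
cached = cost_with_cache(100_000, 50)
print(f"no cache: ${no_cache:.2f}, cached: ${cached:.2f}")
```

Under these assumptions, caching takes the same workload from $15.00 to $1.85, which is why it doubles as both a cost lever and a soft lock-in for whoever holds the cached context.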
- Sarah Guo
Yeah, and I'm pretty excited, as a consumer, about what we should expect from interfaces, right? Not that chat goes away, but you can imagine much smarter chat with automatic context and different surfaces. So there's a question about whether or not you can compete with the consolidation, and there's a question of how the big players compete. But the challenges that people are trying in the market today, or the reasons you could still go after it, would be if people are taking very different reasoning approaches, where you can collect the amount of capital required to get to competitive scale, which decreases when you're repeating work that has already been done, and because you have the benefit of the hardware progress that continues to be made. Here, you have people working on math and code for self-play, and I think that's interesting. It's not necessarily a purely different architecture, but what the next level of scaling is. And then distillation and the relevance of small-model fine-tuning I think is another open question of how people are gonna really use these things.
- Elad Gil
Yeah, and I think it's important to clarify that we're talking specifically about language models, right? There's lots of other model types that will be coming over time in physics and biology and materials science, and image gen of different forms, et cetera. In some cases these things are going multimodal, but in many cases you're gonna have unique models for each, and really what we're referring to right now is this core large language model market and how it evolves over time. And then, to your point, there's other pieces on top of that that could be used either for language models or for other types of models: reasoning models, agentic flows. It's almost like an orthogonal axis. And then the third piece of it is the differentiation within the infrastructure around the models. You mentioned caching as an example, and then there's long context windows, there's RAG, there's all sorts of other things as well. So as we think about how all this stuff evolves, we're gonna see evolution across all three axes. And the real question that we're trying to address right now is simply the core LLM market: you're building better and better large language models, and how does that evolve? There, it does feel like things have consolidated a bit, but, you know, it's funny. You look at the history of social networks, and everybody thought this company
- 8:20 – 11:40
Comparing the LLM and social network market
- Elad Gil
called Friendster was gonna win, and then everybody thought MySpace was gonna win, and then Facebook emerged. And by the time Facebook emerged, everybody said, "Well, it's just a commodity market, and there'll never be a long-term differentiator." And then Facebook won sort of the core social piece, and even after that, you had Instagram and you had Twitter and you had Snapchat and eventually ByteDance and TikTok, right? So there are these ongoing waves of stuff even after people called the end of social. And I think the same thing is likely to be true here, where there'll be certain people who start to grab parts of the market. You know, LinkedIn became the enterprise identity social network, or whatever you wanna call it, right? Your resume social network. But then Facebook became one core piece, Insta became one core piece, Twitter became one core piece, et cetera. And Twitter was kind of news and real-time information. The same thing should happen here over time.
- Sarah Guo
Do you think other domains, let's say video or audio or other model domains, go in the direction of this commoditization?
- Elad Gil
I think the reality is that it's gonna be general-purpose models for certain things and then specialized ones for other application areas, and that could be wrong, right? It comes down to what degree of generalizability you have, not only in the model capabilities, but also in the tooling around it, and then does the tooling need to be vertically integrated with the model? Say, for example, you have a really good image gen product, and it may have artistic applications, it may have graphic design applications, it may have UI design applications. Is that all one model? Is it fine-tunes or post-training on one model? Or is it one big model for one aspect and then a bunch of fine-tuned models, or I should say specialized models, for other things? I think that's a really big open question. And there's similar discussions to be had around AGI, or more general-purpose intelligence, right? If you look at the way the brain works, it's a set of reasonably specialized modules for vision and vision processing, for different aspects of emotion. There are these really interesting cases in the psychological literature where somebody will literally have, like, a steel beam accidentally driven through their head on a construction site, and suddenly they'll lose a very specific type of emotional functioning or reasoning, but everything else is fine, right? So the question is, how specialized will these models be and how generalized? And I think that's also true for things like image gen, you know? Will you have a different model for graphic design than what you're using for artistic expression? I don't know. It's an interesting question, and I think time will sort of tell on that as well. 
I do think one thing that's been interesting is, in the last couple months, it does feel like the image gen market has started to heat up, right? Before, it just felt like Midjourney was gonna be the default independent player that wins, and then maybe you'd end up with some multimodal stuff around DALL-E and OpenAI, or some of the stuff Gemini was doing. But there's an increasing number of companies now emerging that seem really interesting in terms of the fidelity of their models.
- Sarah Guo
One of the things that makes me feel a little bit silly is if I have a belief, like, you know, video and image models, audio models, will tend toward rapidly increasing capability and some commoditization, and then I'm still surprised by the pace, right? And so I do think that there's,
- 11:40 – 13:21
Increasing competition in image generation
- Sarah Guo
like... When Sora came out, for example, it was an amazing research advance, but there was also a sense of, like, who's really going to be able to catch them? And you could argue now that you have a handful of companies showing really amazing video generation capabilities, where it's not actually the case that the smaller players are a step function behind, between Runway and Pika and even... You know, if you go from, as you said, image to video, you have very small players or mid-stage players, like the Ideograms and Hotshots of the world. It's impressive to me how many times I see researchers come out with a five-person team and not that much capital and, versus the narrative of the AI market five, six months ago, say, "Oh, you know, we can produce something really competitive." Even the stuff that Luma Labs has been coming out with, right? And so I think that's been an update for me mentally.
- Elad Gil
I think one thing that's striking is the size of the models is shrinking relative to performance over time, too, right? That may be through distillation, that may be through other things. But across the board, we're seeing more and more performance off of smaller and smaller models, which I think is the other thing that, a priori, wasn't as expected, say, a year or two ago when all this stuff was kicking off. Like, you knew there was room to sort of effectively compress certain things, but it's been striking how far you can go in some cases.
- 13:21 – 14:43
Trend in smaller models with higher performance
- Elad Gil
and again, the brain may be a good example of what is possible, because you have a 20- or 30-watt device running in your head that's pretty good in terms of being general purpose and doing image identification and other things. (laughs) You know, we have very cheap hardware running. So, from that perspective, (laughs) there's still quite a bit of room to go, and it's in a compact space.
- Sarah Guo
I think we're gonna see really, really cool experiences on the image, video, and audio side, because, as you say, if the models get smaller and they get better, and there are different architectures, like what Cartesia's working on, you're gonna get much more real-time. And I don't think we have a lot of real-time applications in production at scale today. And the difference in experience, like Marc talked about this, of being able to generate images as you speak, is a very different one than the "I'm an artist making an output" experience. And so I think that will happen over the next couple months.
- Elad Gil
There's sort of two areas of innovation relative to the stuff we're talking about, and we should probably touch on both of them. One is the chip layer and how that may further accelerate certain things. And then secondly, it's a little bit of, how do you think about what you actually do in terms of the output, the data you train on, et cetera? And how much do you push the envelope on that? So for
- 14:43 – 17:33
Areas of innovation
- Elad Gil
example, say you go back to the early days of Google, and there were huge controversies around Google, because what Google was doing was indexing the web. So it was taking all this content that was distributed around the world, and, from the perspective of some of the folks back then, they were effectively scraping the web, right? They were taking all the news content, they were taking everything that everybody had written and posted, and they were indexing it, and then they were making money off of it. And one of the things they would do is they'd have this small thing called a snippet, right? If you look at a Google result, there's a little blob of text and then the link. And that blob of text, some people claimed, wasn't covered by fair use under copyright law. There's this concept of fair use: can you use a small excerpt without having to pay the copyright owner? And so there were all these lawsuits and people coming after Google, both on the news side as well as over the fact that they were displaying these snippets that some people viewed as copyrighted information. And where that all netted out, years later, was a few things. Number one, the industry standardized on something known as robots.txt, which is a file that you put on your website that tells a web crawler like Google or Bing or whoever whether or not they're allowed to index your information or crawl it. Number two, people decided these snippets fell under fair use from a copyright law perspective. Number three, there were some content deals that were struck, particularly around very specialized content, where Google was getting feeds and then incorporating them into their oneboxes and things like that. And then the fourth thing that happened was that some people realized it was better to be in Google than not, because they'd get attribution.
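As a concrete illustration of the robots.txt mechanism Elad describes, a minimal file might look like the following. The `/private/` path is a made-up example; `GPTBot` is OpenAI's published crawler user agent, included here to show how the same opt-out mechanism is now being applied to AI training crawlers.

```
# Let all crawlers index the site, except one private path
User-agent: *
Disallow: /private/

# Opt one specific crawler out of the entire site
User-agent: GPTBot
Disallow: /
```

The file is purely advisory: well-behaved crawlers fetch it from the site root and honor it, but nothing technically enforces it, which is part of why the surrounding norms and lawsuits mattered.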
And so a lot of the news parties that pulled themselves out of the Google index and said, "Just remove me from the index," realized they lost a ton of traffic by doing that. And so they went back and said, "Actually, you can start indexing us again, because we realize it's a bigger financial penalty not to do it than to do it." Right? And this took maybe a decade or 15 years to play out, right? It was kind of this ongoing arc. And Google, by pushing the envelope and being very thoughtful about the legalities of these things, they had a really sharp team focused specifically on copyright and other areas, kind of threaded the needle, right? And made it out reasonably unscathed by all this. How do you think that evolves for other companies, in terms of the places that perhaps are seeing some questioning of approaches, the image gen companies, some of the audio companies, et cetera? How much risk do you think a startup should take, and how should they think about the various approaches? And again, we have this really interesting set of past case examples that may be informative or related to this.
- Sarah Guo
So we have these businesses as historical examples that absolutely pushed the envelope, like Airbnb and Uber, that challenged the concepts of, you know, restrictions
- 17:33 – 24:19
Legacy of Airbnb and Uber pushing boundaries
- Sarah Guo
on leasing, and medallions as regulatory capture, right? And these services that so many consumers love wouldn't exist if those companies hadn't said, "Well, you know, we think consumers want this business. We're going to try to get to scale and try to understand the risk profile as we go. And then at some point, when we have more market power because people are actually using the business, we will address some of these issues and come up with a policy point of view." And I do think a lot of companies operating in the AI space will have to wrangle with these questions along the way. I think a really common question for many companies is, is Google likely to come after you for scraping YouTube data, for example? Because I'd be shocked to find out there are ways to get to video data at scale that don't involve some YouTube data. And I think the overall orientation toward this should be a business-risk one, right? If you think about the story you just told about fair use and Google, and their general attitude toward scraping, I'd ask, "Well, why do they allow certain businesses to exist? Is taking a certain stance on YouTube legally hypocritical relative to their core business?"
- Elad Gil
There's also some examples of companies in the past that were completely obliterated by going too far. So Napster would be an example of that, where the music industry sued it-
- Sarah Guo
Uh-huh.
- Elad Gil
... basically into the grave, right? And in the music industry in particular, there's lots of examples of companies that have died due to lawsuits. And I guess there are a few types of risk. First, there's legal and lawsuit risk, which aren't always the same thing, right? The second type of risk is regulatory: are you doing something that's pushing the regulatory envelope, or where the regulations are very unclear? Crypto would be an example of that, but there's some examples in AI right now. And then the third is almost like reputational risk: what outputs are you willing to allow? And I think Grok has been really interesting from that perspective, in terms of they're explicitly saying, "We're not gonna police the output that much," right, relative to what all the other parties are doing. And that includes what images they'll allow to be generated, and what sort of text they allow to be generated, or the kinds of responses they allow. And to some extent, that's probably a closer mimic to human behavior than what many of these companies have been doing, right? A lot of the companies have really been actively focused on preventing lots of different types of output from these models, and in some cases it feels like it's trying to do the right thing by users, and in some cases it feels very politically driven in terms of the orientation. And so I think that's a really fascinating experiment that's ongoing right now: how much does society care about the output of the model, in terms of what you allow and don't allow, relative to the norms for speech or creative expression that already exist in society? And I think a lot of the companies have actually curtailed it more than the norms for much of society, right? 
There may be a slim part of society that feels a certain way, but for most of society, there tends to be, or it looks like there's, broader tolerance for certain types of things. And obviously there are things you never wanna have, content outputs that are truly disturbing or illegal. But I think it'll be interesting to watch how that all goes.
- Sarah Guo
I think there's a philosophical question as well of whether the restrictions should be on generation or on distribution, right? Because I think there's a much stronger argument that if you own a platform, controlling certain types of distribution is a responsibility. Generation feels a little bit more like free speech. But I think it is a complicated question. Uh, should we talk about semis?
- Elad Gil
So, I guess the other piece we talked about touching on here: one was content and risk, and how you think about the degree to which you should or should not push the envelope as a company. The other piece was semiconductors. And since semiconductor performance underlies a lot of everything that's happening in AI right now, be it training, be it inference, et cetera, how do you think about the coming wave of semiconductor startups, or system startups, that have really started to emerge again? I think there was a prior wave maybe six, seven years ago, which was Groq and Cerebras and a few other folks, and now it seems like we have a new wave between MatX and Etched and a few other companies, some of whom are gonna participate in this podcast reasonably soon. What do you think is interesting in this market? What's going on?
- Sarah Guo
So, as you said, that wave was more than five years back now. And I really admire the foresight of some of those companies saying, "We're going to have a different workload, and that AI workload requires a different type of computation." But making a bet that far in the future on chip and systems design is a very hard thing, right? Seven years ago, it was not abundantly clear that transformers at scale were going to be such a big piece of the workload, so I'd say the market has evolved in very unpredictable ways. And now you have a cluster of companies that is very focused on optimizing for transformer architectures and, like, area allocated to matrix math. So I think it'll be an interesting question of whether or not you can surpass the economics of AMD and Nvidia, the performance-per-dollar, which have been really strong, high-speed innovators to date, with some argument that AMD is making progress, especially with the investment in that ZT acquisition as well. But I'd say the whole thing with chip investing is, what architectural bet are you willing to make, because you have to run on a multi-year cycle, and then pace of delivery, and then price-performance, right? But it feels like that bet is worth making. And I think you also know a lot about the shape of demand that has emerged, right? You have a lot of sovereign cloud demand as well, which is an interesting opportunity for companies.
- Elad Gil
Yeah. What do you make of the AMD
- 24:19 – 25:49
AMD Acquires ZT
- Elad Gil
ZT acquisition? How do you (laughs) think about why they did it? What's the purpose? What is this move by AMD right now?
- Sarah Guo
I'd say the market is pretty divided about whether or not AMD can become competitive. If you look at the pieces that they need: they need better software, if you think about competing with CUDA. And so they bought this company Silo AI a little while back, which is essentially a $600 million acqui-hire of hundreds of AI engineers and researchers who've done a lot of work on AMD, so there's that layer. Then you have the networking piece, so they're part of UALink, the open-standard competitor to NVLink. And then the theoretical missing component that ZT fills: you can think of it as a $1-2 billion acqui-hire of a thousand systems-design folks to support the rack-scale and data-center-scale AI business, instead of individual chips or components, because the thing that NVIDIA's really selling now is full systems, through a multi-year strategy of delivering what are essentially data centers for research labs. And the question is, can AMD assemble the pieces to go do that? But, you know, one could argue these are all the components.
- Elad Gil
Cool, I think we're out of time.
- Sarah Guo
All right, well, I'm excited to talk to Etched and MatX and Cerebras and some of the companies that are working on this in the next wave.
- Elad Gil
Yeah, it should be very exciting, and I'm
- 25:49 – 26:27
Elad's looking for a robot
- Elad Gil
gonna do a quick PSA. (upbeat music) So, I'm very interested in buying either a humanoid robot, or a Spot or something else from Boston Dynamics, one of those really interesting robots. So if you have any suggestions or advice, ping me, or if you have one for sale, let me know.
- Sarah Guo
Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you wanna see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen. That way, you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.
Episode duration: 26:27
Transcript of episode c83JI8zYZxw