EVERY SPOKEN WORD
50 min read · 10,022 words
- 0:00 – 0:32
Intro
- Elad Gil
(instrumental music) Today on No Priors, we are going to have a host-only discussion. There's so much going on over the last couple of weeks in AI. We just thought it would be good to take a big deep breath and a step back, and talk through some of the really big changes that seem to be happening in the landscape. Sarah, there's been a lot of new models that have come out over the last, even just week or two. There's Claude, Grok, Databricks, a variety of folks have launched things. What do you think? What's going on?
- Sarah Guo
Yeah, I think it's a huge
- 0:32 – 3:21
How to think about scaling in 2024
- Sarah Guo
update, um, for most people's priors versus a year ago, right? I think it's very, um, it's very likely at this point that you end this year with a handful of GPT-4-level models, and that some of those are open source, right? And so, I think Mistral first, but then also, um, Databricks with DBRX, they- they changed the point of view on what you can do with, you know, a relatively small amount of compute, tens of millions of compute, and then also, uh, from a, from a scale perspective. The Databricks team in particular just declared a, uh, a very strong point of view that they call Mosaic's Law, where a model of a certain capability will require a quarter of the, um, you know, dollar capital investment every year due to a bunch of, uh, improvements on the, um, hardware and algorithmic side. And I don't know if that's grounded, uh, in any particular technical belief, but I- I- I- I do think that the model landscape completely shifts versus what people expected to be... I think most people expected it to be quite, um, monopolistic or at least oligopolistic a year ago, right? And I- I think that the... There's still a really big question at the state-of-the-art, because if you go up one level of scale in terms of, um, capital investment if you're still, you know, the dominant factor is- is compute, compute scaling, um, I think that question remains. But there's an awful lot you seem to be able to do with the GPT-4-level model. So, I think, like, the net impact of that, uh, is pretty good from the application or the sort of enterprise adoption side.
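A quick sketch to make the "Mosaic's Law" compounding concrete. The quartering rate is the claim as stated above; the $100M starting figure is purely hypothetical, chosen for illustration:

```python
# Sketch of "Mosaic's Law" as described above: the dollar cost to train a
# model of fixed capability falls to a quarter of the prior year's cost.
# The $100M starting cost is a hypothetical illustration, not a quoted figure.

def fixed_capability_cost(initial_cost: float, years_out: int) -> float:
    """Training cost after `years_out` years if cost quarters annually."""
    return initial_cost / (4 ** years_out)

start = 100_000_000  # hypothetical $100M training run today
for year in range(4):
    print(f"year {year}: ${fixed_capability_cost(start, year):,.0f}")
# year 0: $100,000,000
# year 1: $25,000,000
# year 2: $6,250,000
# year 3: $1,562,500
```

At that claimed rate, a frontier-scale budget would reach "tens of millions of compute" territory within a year or two, which is consistent with the DBRX framing above.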
- Elad Gil
Yeah, it definitely feels like, um, you know, the most cutting edge, smartest models in some sense are gonna end up with an oligopoly at least in the next couple of years, just because of the scale of capital needed. But also, just how far ahead you start to be as you have a model that can help you build the future models, right? Even just things like data labeling or certain forms of reinforcement learning through AI feedback or other things like that. And so, as you get better and better model capabilities, you start bootstrapping the next generation of models, although obviously you have to do other breakthroughs to- to get there. Um, and then to your point, I think under that, you have this broader swath of different models and companies and things that are available. And one could argue part of what that's gonna do is just kind of flip some of the- the value capture, the revenue, the margin, the people, whatever metric you want to use over to the clouds, because they're gonna be hosting all these things, right? So, whether it's LLaMA or whether it's Claude or whether it's one of these other entrants, there's just gonna be a lot of room, I think, for the clouds to make money over time as well, which I think is a little bit under-discussed in terms of, you know, who captures value in this market besides the model providers. Related to the clouds, um, how do you think about the recent Inflection-Microsoft deal?
- Sarah Guo
I think the- the first reaction is, like, they're true believers
- 3:21 – 5:28
Microsoft/Inflection deal
- Sarah Guo
in AI at Microsoft and saw it as a live player, right? And so I think the- the sort of obvious, um, observations with, uh, Microsoft here would be they both see a- a product... They- they r- see a product opportunity that they need, uh, AI-aware product leadership and research leadership to go after across Microsoft properties. Despite all of the, you know, initial real traction around Copilot in the code domain, uh, I- I think we're still far short of what revenue Microsoft actually expects to drive in terms of across this productivity suite and in search. And they're ambitious to go after that, and I- I think this is a leadership change that supports that. Now, they're clearly still working with OpenAI given, like, direct statements from both companies and the Stargate, um, data center effort. But it's also hard from the outside not to see this as somewhat of a hedge, right? Not in a criticism of OpenAI, but, you know, if you are a true believer, this is the most important technical driver for your company and then you're reliant on an outside player, um, that's not a position that a- a trillion dollar company likes to have. You know, Mustafa has had more capital and more compute available to him than the vast, vast majority of entrepreneurs and, uh, research teams, and I think one- one big argument you can make for Microsoft is just, like, you have direct access to that if you're focused on the research, right? And so I think it supports what you- what you said where, um, the spend required at the, um, perhaps not even this generation but the next generation really requires a- a certain level of sponsorship that is, um, challenging for most independent players. Um, but it's- it's a directly opposing view to, um, the- the, for example, the Databricks narrative today. Can I- can I ask you, like, a more, um, domain-specific question? So, um, OpenAI just announced voice cloning, right? And the interesting thing here is,
- 5:28 – 7:02
Voice cloning
- Sarah Guo
um, you have companies like, uh, ElevenLabs with really great traction. Uh, um, other competitors out there focused on, um, different feature sets like latency that are progressing, but let's go ahead and assume for argument's sake, given, uh, the OpenAI announcement of voice cloning that both OpenAI and DeepMind and maybe others have very, very good voice video image song models. Um, and the question has been, like, you know, will they release that and what does that do to the market beyond, uh, beyond text APIs?
- Elad Gil
Yeah, I mean, it seems like a- a lot of the hesitancy, as far as I can tell, for these companies to go aggressively after the voice side is just, um, regulatory slash societal concerns, right? I think one of the concerns people have on the voice cloning side is, do you end up with different types of deepfakes or other things, where it's much harder to tell with the voice what's going on? There's obvious ways around that. I think you can do an attestation, where when you upload a voice for the first time, the person actually who, like, owns the voice in some sense, whose voice it is, can actually do some form of attestation or other things. Um, or there's other ways to do verification. My sense of the market is that multiple players, um, have this technology, but they've been holding it back. And in some cases, they may have had it for a year or two now, uh, because, you know, there's also been open source versions of this, like, um, Tortoise and things that the Suno team was working on earlier and things like that. So, I'm surprised in some sense by how little competition there's been.
- Sarah Guo
Okay, let's- let's characterize, like, the rest of the investing landscape, and then Elad himself, um, you know, driving the rest of the landscape. Like, do investors keep funding, uh, general foundation model training efforts from here or more- more specialized ones? Can you talk
- 7:02 – 12:50
Investing climate
- Sarah Guo
about what you think those dynamics will be?
- Elad Gil
You know, it's interesting, because if you look at the scale of capital that's gone into foundation models, uh, venture capitalists have put hundreds of millions of dollars into individual companies. But then the big cloud providers or, uh, big tech companies, including NVIDIA, have put billions into companies. And so most of the funding of this market is actually being done by the hyperscalers and a few other big tech companies, and that's true in China as well, right? It's- it's the really big, um, preexisting internet companies that are funding everything. And so the VCs are almost at bootstrap, and the bootstrap is sometimes tens of millions, and sometimes it's hundreds of millions, but to get to real scale, it comes from other places. And, uh, to the points earlier on the cloud side, there's a strong incentive for the clouds to keep funding these things as long as it drives cloud revenue. So, for example, Microsoft's last quarter, they mentioned Azure revenue, uh, which is about 25 billion for the quarter, grew by, I think, it was 5% due to AI-related products, which would be another billion, billion and a half a quarter in revenue off of AI. And so that's, you know, s- six billion, five billion annualized, and it's probably still growing. And so if you look at it from that perspective, there's a strong incentive to fund these things, because they're driving so much utilization and usage. So, I- I definitely think we'll see more funding going into the market. I think on the foundation model side from a venture capital or angel investor perspective, um, I think we're gonna see fewer new language models, but we should see models in a lot of other areas. And, you know, we have new things happening in music. We talked a little about text-to-speech with Eleven, but then there's a bunch of other areas around video; image gen; um, physics models; biology models; uh, material science; robotics; et cetera.
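The back-of-envelope above can be checked directly. The ~$25B quarterly Azure figure and the ~5 points of AI-driven growth are the numbers as cited in conversation, not Microsoft's exact reported breakdown:

```python
# Rough check of the Azure arithmetic above, using the figures as cited:
# ~$25B of quarterly Azure revenue, with ~5 percentage points of its
# growth attributed to AI-related products.
quarterly_azure = 25e9   # ~$25B per quarter, as cited
ai_points = 0.05         # ~5 points attributed to AI

ai_quarterly = quarterly_azure * ai_points   # ~$1.25B per quarter
ai_annualized = ai_quarterly * 4             # ~$5B per year

print(f"AI-attributed revenue per quarter: ${ai_quarterly / 1e9:.2f}B")
print(f"Annualized: ${ai_annualized / 1e9:.1f}B")
```

That lands squarely in the "billion, billion and a half a quarter" and "five, six billion annualized" range quoted above.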
And so there's this broad swath of other types of foundation models that are starting to get funded or who are accelerating in terms of the funding cycles there. And so one can anticipate we'll see a similar thing there, where we'll probably have venture capitalists, um, do the first set of rounds, and then it'll s-shift over time to large strategic players who really view that as things that are beneficial. And then there may be other areas where, you know, people are doing really interesting things. Applied Intuition is a good example of a company that's doing, um, simulation software, and, you know, uh, they've been doing really interesting things in terms of, like, uh, modeling behavior there for years now, right? So, uh, I just think there's a lot of, um, a lot of room to still do lots of interesting things on the foundation model side. But I- I do think it's gonna continue to shift over time. What domains do you find most interesting, or what's your framework for figuring out which of these things are gonna be not only important societally, but also good businesses?
- Sarah Guo
One basic, uh, way to look at this, which is, what are the capabilities that, uh, we are still missing or struggling with, right? And so one, um, one thing that I've been interested in for a long time is just, how do you operate on time series with, uh, more general knowledge and reasoning, right? There are so many ways in which, uh, being able to better understand time series would be really, really valuable, right? And it's a very unsolved problem. If you look at anomaly detection, anything that is an infrastructure monitoring, a security, a healthcare, um, a consumer behavior use case, there's domains and then there's, um, sort of, uh, other- other dimensions, like context windows and how you handle, um, a particular type of data. That's one domain where I just feel like it's, there's huge commercial applicability and, um, interesting architectural approaches that could allow you to break through. Then I think there's, like, we take the existing, um, uh, advancements in language models, and all people from all fields applying machine learning are now paying attention to this and, um, you know, working at some of the best labs. And if they go look at the domains that they're- they've been traditionally focused on, so for example, like, robotics and biotech are two areas where I've been spending a bunch of time, you know, there's something in the water, where a bunch of very smart people are showing leading results on traditional benchmarks, um, with these approaches. And, you know, the- the sort of core of it is that you get, um, several smart teams at once thinking that a domain can be solved with, um, a foundation model, um, that has more generality and then some cleverness in approach, right? And I- I don't mean that in a trivial way, because, like, for example, in robotics, you know, most machine learning people will look at it as a- a data collection problem, where your internet data, even video data of a bunch of actions just isn't, um, enough. 
We need, uh, embodied action data, like, you know, controls data in some way, and then people who have very, very different ideas on data collection. Both, um, you mentioned simulation, but also, like, real world efficient collection is kind of a core question for a number of these companies. And then also different ideas in terms of how to split up the value chain from, are you doing software that will apply to many different types of hardware? Are you doing a, um, you know, a verticalized company? And, uh, I, uh, I am inclined to believe that, you know, a lot of these domains are going to be solved, and it's just a question of, like, picking the product path through.
- Elad Gil
I think there's a set of things that are just intellectually interesting...I think, for example, the biotech models were really cool where it seemed like in some examples, long-context windows make protein folding easier, which is really neat. And then there's the societal implication side, and there may be some things that never make money, but are incredibly societally useful. Uh, and then lastly, there's, like, what are big commercial applications? The thing that, that I find very fascinating is, um, you know, there's one or two of these areas where I've seen, you know, a dozen teams all enter at once, and there's so much white space in AI right now. There's so
- 12:50 – 16:36
Whitespace in AI
- Elad Gil
many different things to do. I've actually started incubating a few things again just because there's, you know, people just aren't working on certain areas that seem kind of obvious. And I've looked for companies. I'd rather back somebody than, you know, incubate something, which is much harder. But it's surprising to me that half a dozen people, or a dozen people, who are all really smart and really talented and really vetted in the field will all jump on one thing, and then there are these wide open opportunities somewhere else. And so it's a very odd market right now, where you don't see the fast follows for certain things that are clearly working that you'd expect.
- Sarah Guo
What, what's your explanation of that?
- Elad Gil
I think it's a mix of what people view as societally significant, and therefore they want to work at it, but also I feel like any set of startup waves always have these memetic actions where, uh, there's almost these, these memes of what to build that spread, and that's happened in prior technology waves too. And the memes are often correct. Not always, but often. I mean, a great example of that would be in the mobile wave. I, I knew of literally a dozen different photo apps where people had pho- photo upload from their phone, and they'd go viral and they'd, you know, grow like crazy and then they'd all die, because there was no sustainability. And then Instagram came out with, like, the compact format, the filters. There's actually a company called Camera+ that had filters before that, but they were charging 20 bucks to download, uh, because they just wanted to monetize it. It was sort of this indie dev shop that never wanted to grow very big that was doing it. And so, you know, Instagram, it similarly started working on filters, a common format, and then a feed and more of a social product, and that's something that became sticky and s- and, you know, sustained. But there was at least a dozen other ones I knew of where I knew the founders who were doing it, right? And so that was memetic, but correct, but it took the right product substantiation to do it. In robotics, the correct, uh, product substantiation is harder, or in biotech, it's harder, or in LLMs, it's harder, you know, 'cause these are very big mass markets, and I think people are just excited by the scale of commercial opportunity for these things too, right? At the same time, there's other markets where there just aren't that many players, and so the question is, why is that and what's the difference? And... You know, it's, it's fascinating to watch this happen again, you know? And again, memetic things are often correct. 
So, you know, there was a dozen search engines before Google, there was a dozen social networks before Facebook, et cetera.
- Sarah Guo
It's really interesting because it's an entire market driven by technologists right now, right? Like, everybody's getting nerd sniped into a, a few areas. Um, and that, as you said, may be right, but I, I think driving factors for me is just, like, how much cross-pollination there is between, uh, people who would understand the edge of research and then, like, particular domains, right? The ability to operate on financial or accounting data with these models seems like a very, um, a very useful commercial capability, and actually perhaps much more tenable, uh, and like a sort of linear path to commercial value than some of the things that we're describing. Like, uh, if you go talk to an experienced executive in the last generation of great biotech platforms or, um, you know, uh, successful robotics players, these are not... They're, they're not easy industries, traditionally, right? Um, uh, but there's a lot of accounting and finance software in the world. There's a lot of, you know, professional accounting services, and I, I just think it's like there's just not that much interaction between, um, accountants and controllers and, like, you know, and systems engineers and research scientists. So, that's part of my explanation of it.
- Elad Gil
Yeah, it's also just there's been a few technology breakthroughs across these different fields that suggest that suddenly some of these things are more tractable than people thought. And so, I think there's also a technology why now, or at least improved points that make people excited to go and, uh, you know, build out
- 16:36 – 19:54
AI video landscape
- Elad Gil
these things that have now been discovered and shown to be possible. What do you think are, um, some of the other areas that, you know, is worth, uh, looking at big changes? I mean, video may be an example. I know you're involved with a couple of companies there. Do you want to talk about the video landscape and some of the shifts you're seeing?
- Sarah Guo
Yeah. I actually think that, um, it is, uh, kind of a mistake to look at it as just, um, a modality to be solved, and like we should... We're go- we're going to have, uh, a, a D... And the rest of the Sora team from OpenAI on, um, No Priors, so we should certainly ask them. But, uh, I, I think video, um, and the control of video for a commercial application is a little different than video generation understanding as belongs in a general, uh, multimodal model. And we'll see, like, what the, you know, if the persistence of that is two years versus 10 years. This is one of these areas where, um, it is so obvious that the demand is unbounded, right? For, um, uh, commercially viable or just shareable high-quality video generation. Uh, and then it's not one form, right? Like, many people cut it into A-roll and B-roll, and today, despite the extraordinary advances from one second, you know, very small, uh, clip with obvious artifacts to where we are with things like Pika and Sora today, um, we still have a very w- very long way to go, both in terms of, um, interfaces and controllability, like, uh, um, length, quality, et cetera. And so I, I think it's an area that, like, deserves a, a lot more investment, um, but I, I think the set of things that you might want to...... make those assets com- commercially viable. It's actually, like, a very deep product problem with, like, specific research involved and, uh ... And as you mentioned, one of the companies that I'm an investor in is a company called HeyGen, um, that has grown really, really quickly over the last year, uh, with sort of prosumer and commercial traction focusing just on video avatars or the ability to have, like, you know, people, a clone of you or a spokesperson for an organization, um, speak on film to a camera. Uh, film, camera, right? Generated pixels.
But it, it's very cool to see the specific, uh, ways that you can progress this if you are focused on it, because we get so many requests from end users who have very little idea how or no idea how any of the technology behind the scenes works and, um, you know, people are very creative and they want full control. And so, like, one of the releases from the company this past week was, uh, I, you know, I have video of an avatar moving, walking around, going through gestures, and I want to replace what they're, uh, what they were saying. Like, one of the things that was very much learned, you mentioned the last generation of, um, like, mobile image apps. One thing you learn from the last generation of video applications, uh, was, like, if you take, um, certain, uh, uh, dimensions of creativity, uh, away, or you, you just control them, like, a- and you,
- 19:54 – 22:21
Agentic user experiences
- Sarah Guo
uh, lower the required quality of, uh, different dimensions, you make it much, much easier for people to create. Right? And so I- I think that's one path that, um, companies like HeyGen are going down. I wanna ask you about, uh, one thing that I- I think, like, um, Devin and Cognition has really, uh, woken up a lot of engineers and product teams to, which is, like, how much space there is in exploring agents and, um, and different, uh, model user experience.
- Elad Gil
Yeah. I ... You know, it's interesting. I was actually working on a post on this in terms of it feels like there's this shift in terms of, uh, agentic UIs and what they look like, because I think a lot of what people were doing before was modeled on either ChatGPT or Copilot, where it was either forms of, like, chat or just, like, auto-complete of different nature or sort of inline or things like that. And I think the Devin UI was really interesting from the perspective of it was a new way to think about how you display information in terms of what an agent is doing. And even just seeing ... Uh, you know, in Devin, they have four tabs, right? There's the, um, the plan that you're doing, the shell, the code that's being written, and then there's a chat interface, right? "Here are the steps that I'm doing," and so you know what's coming. Um, you can see the code that's being written and you can kind of redirect the agent to do other things if you think it's going down the wrong path or what it's browsing. Um, it'll prompt you for things like API keys or tokens, um, and so you can interact with it and re-steer it along the way. And I'm starting to see that UI now pop up in other use cases where people whose products were demoed to me a week or two before have suddenly shifted more down this direction if they're doing anything agentic because you realize that most people don't want to just sit there and wait and wonder if the agent is actually doing what, what they want. They wanna be able to see it and maybe interrogate it or interfere, and put things on the right path so it gets it done faster, and it's ... The way to think about agents today, I feel, is almost like a junior intern. They're very eager, they're trying really hard to please (laughs) , but they, they still have a lot to learn or you kinda need to give some direction. And so this is a mechanism by which you kind of almost get that update email from the intern saying, "Hey, here's what I'm up to." 
And you say, "Oh, actually, could you go do this other thing a little bit different?" Or, "Have you considered doing these three things?" And so I think there's a lot of really interesting things that'll be coming as we rethink UIs and then eventually, the entire UI will go away once agents get good enough, right? And so I think this is kind of the intermediary step of human in the loop, and eventually as agents get smarter and smarter, and that's gonna be through breakthroughs in the base models but also breakthroughs
- 22:21 – 26:02
Prosumer as the first wave of application AI
- Elad Gil
in reasoning and other areas, um, we'll start to see actually, I think, some of the UI go away over time. And so I think we're in the- the sort of early, uh, form of this stuff, and it's very exciting to just see these, these new paradigms being, uh, created and happening. You know, one thing that you mentioned that I think is kinda interesting is that a lot of the big use cases in AI surprisingly are prosumer ones, right? That's the, that's, um, HeyGen, that's ChatGPT, that's Perplexity, that's a variety of things.
- Sarah Guo
Suno now.
- Elad Gil
Suno. How do you think about the whole prosumer market? Like, do you think AI ... The- the first real wave of AI is prosumer in some sense, or no?
- Sarah Guo
I think it structurally has to be, right? Just based on, like, the pace of ability to adopt things in the enterprise. Prosumer applications, they get to carve this path of direct user value, which we think we can create a lot of with AI capabilities, that is neither, like, oh, we need to fight existing incredibly strategic and embedded, for example, like, consumer social networks with their network effects, um, but actually have these capabilities that generate their own distribution. And it's not that you don't need to be smart about distribution to consumers and prosumers, but really, like, these companies are growing on the backs of just great product that people want. It is very, very hard to get to millions of enterprise users in a year simply because of the, um, decision-making and, like, security processes and roadmap involved in getting a large customer to change something internally, and the risk tolerance of all of those versus, like, I want to do something that is $10 a month valuable to me or makes me more productive as a ... as a much faster decision. So I think structurally it's something we should expect. I also think it's just interesting that the Canva numbers are, um, are quite public now and, uh, you know, a billion of its 1.7 billion in ARR is prosumer. Um, and so I- I think, you know, I'm- I'm just sort of respecting the data here in, um ... Uh, the argument for some of these prosumer companies is that, um, you can often grow into a more professional set of use cases, and the increase in capabilities is creating huge markets where they didn't exist before. Right? And you should see many more of these companies, and I- I do believe that argument. I think they're gonna ... I think they're gonna be sneaky big markets.
- Elad Gil
Yeah. It's kinda notable because if you look at the first internet wave, if you look at the '90s, the first wave was all consumer and then the second wave was B2B, and so they used to talk about it, about B2C versus B2B, right? And I think we have this odd, um, parallel or analog here where the very first adopters of this technology are consumers and prosumers and then there is some enterprise related stuff like what Harvey's doing or others, but, you know, the initial waves seem to be more driven by people who are using it in their personal and professional lives first, which is very similar to what happened with email and the internet more broadly. And so, it's- it's an interesting parallel and maybe this is just the dynamic of truly fundamental technology shifts. And, you know, you could argue that was also a lot of the wave of, uh, mobile, right? All the really big new companies, or most of them, were consumer companies, uh, between Uber and Instacart and all the rest, right? And so, uh, WhatsApp, et cetera. So, very interesting parallels. (instrumental music)
- Sarah Guo
Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you wanna see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way, you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.
Episode duration: 26:02
Transcript of episode RLUI2AujblE