a16z: The Future of Software Development - Vibe Coding, Prompt Engineering & AI Assistants
- 0:00 – 0:48
Introduction to Infrastructure
- JLJennifer Li
Infrastructure never goes away, it just gets layered
- MCMartin Casado
A new piece of infrastructure changes the way that you program computers and it changes the stack that's around it. We're building systems to build other systems
- MBMatt Bornstein
Developers are making a lot of the decisions, and a lot of the, like, marketing and sales to developers looks more like consumer these days than it used to
- JLJennifer Li
Who is going to earn the distribution and earn sort of the developer attention is going to be a different game
- MCMartin Casado
From an application standpoint, we've abdicated logic
- MBMatt Bornstein
This is a pretty big deal. It's by far the biggest thing that I've seen happen sort of in my life
- MCMartin Casado
Software was always the disruptor. One of the most exciting thing about the AI wave is, like, software's being disrupt- like we're being disrupted, right?
- JLJennifer Li
[laughs]
- MCMartin Casado
And so we're like, "Ah." [laughs]
- JLJennifer Li
The AI's disrupting software.
- MCMartin Casado
That's right.
- JLJennifer Li
The AI is eating software. [laughs]
- MCMartin Casado
I know. I know, we're like, "Ah." [upbeat music]
- 0:48 – 2:27
Defining Infrastructure and Its Components
- ETErik Torenberg
Hey guys, we're here today to discuss the state of infra. We've done a consumer podcast. We- we've done a consumer team. We've done it with the AD team. Now excited to do it with the infra team. First, can we get a definition of, of infra? Where, where does infra differ from enterprise? How, how do we think about it in- internally?
- MBMatt Bornstein
I would say infra is basically what makes software work, right? We'll probably get pretty deep into a set of technical definitions. You mentioned sort of networking, storage, compute. Where does AI fit into that? But I, I think at the simplest possible level, if you want software, infra is what engineers are using behind the scenes to, to make all this possible.
- MCMartin Casado
And our, our formal definition internally is technical buyer, right? So it's the stuff you use to build the stuff, the stuff you use to build apps, and if it's used by a technical user, we consider it infrastructure, whereas something like, let's say, vertical SaaS could be used by a flooring company or by a marketer or by sales, that would not be con- considered infrastructure.
- MBMatt Bornstein
And, and technical user, for the record, is developer, data scientist, analyst-
- MCMartin Casado
Administrator
- MBMatt Bornstein
... cybersecurity professional. Yeah, right. There's, there's a, you know, DevOps, right? There's a wide range of, of sort of, like, people. These are our people, right? Like-
- ETErik Torenberg
Yeah
- MBMatt Bornstein
... the, the kind of nerds behind the scenes.
- JLJennifer Li
[laughs] And in the system terms, we can think about it as, uh, consum- c- compute, uh, networking, and storage, uh, but also all the tooling that goes around a developer's day of, like, what they're using to build software and what are the, um, tools and products that are operating, uh, this ever-growing and more complex software as well, um, all the way to semi-technical users that may want to, you know, either prototype or tinker with, um, building applications. We're very interested-
- ETErik Torenberg
Yeah
- JLJennifer Li
... in anything in the technical domain and used by technical people.
- ETErik Torenberg
So you mentioned compute, um, networking, storage.
- 2:27 – 6:34
The Fourth Layer: AI Models
- ETErik Torenberg
How should we think about models? Uh, is, is this the fourth layer of infra? How, how do they interface? How, how should we think about that?
- JLJennifer Li
I certainly think of it as a fourth l- uh, layer of infrastructure. It's, um, you know... it certainly leverage and build on top of all the three pillars we're talking about. It has a lot of, um, demand of compute and, of course, it's, uh, trained and, um, also producing a large amount of data. Um, and to leverage and use these models for our purposes, you know, uh, latency and networking capabilities is also very important. Um, but it's going to be as prevalent, um, you know, as, uh, any piece of, uh, infrastructure software. I, I don't know, like, the, the ana- analogy these days anymore. Is it a database? Is it sort of like a new form of compute? So really, to me, it's like a fourth, uh, pillar that incorp- incorporates everything, but also provides intelligence for the software we're using and building today.
- MCMartin Casado
I think it's probably worth asking, like, why a piece of infrastructure is a piece of infrastructure.
- ETErik Torenberg
Yeah.
- MCMartin Casado
Um, and generally, a new piece of infrastructure changes the way that you program computers and it changes the stack that's around it. Like, it's got different memory requirements. It's got different latency requirements, and so it just requires rethinking how we build software and how we build infrastructure. You know, I would say in addition to compute, networking, storage, I would say distributed systems-
- ETErik Torenberg
Mm-hmm
- MCMartin Casado
... would also be included just because, like, things like state consistency require you to think about, like, proximity and guarantees. I would say databases probably did too, because it changed our programming model. You have different guarantees. Uh, and these models very much fit in that, uh, for a couple of reasons, and I think Jennifer's exactly right. Like, you just build different data centers and different chips if you want to build these models, so it, it, it has that impact. But programming them is, like, non-obvious. Like, [laughs] we're just still trying to grapple, like, how you program with them. Like, they don't really listen to you. Sometimes they do the coding themselves. And if I were to try and distill, like, what is the one biggest difference that these models provide to infrastructure, it's the following. I don't remember ever in, like, the history of computer science where we've, like, from an application standpoint, we've abdicated logic.
- ETErik Torenberg
Hmm.
- MCMartin Casado
Like, actual, like, like, like, like application... Like, in the past, we've abdicated resources. Like, we're like, "Give me compute. Give me storage." Like, you know, these abstracted resources, but the logic, the yes or no, like, the what it's doing always came from the programmer. But in these ones, we're like, "Come up with the answer for me." And so it's requiring us to rethink what does it mean to be a programmer? What does it mean to be software? Et cetera. So I... You know, it's just clearly very fundamental to computer science, and I would say, you know, again, to Jennifer's point, it's very, very much a new, a, a new piece of infrastructure.
- MBMatt Bornstein
And I think a lot of people are trying to reason by analogy. You sort of alluded to this, Jennifer. It's like, oh, is it like a database because it can answer queries, or is it like a network 'cause it's sort of non-deterministic and we need to handle retries and weird edge cases? But, like, I, I think people are really just trying to figure out how to program these things, which you sort of said, Martin. But, like, we've kind of got to start from a blank sheet of paper, which is what makes our jobs, like, really exciting right now, 'cause there's a lot of people trying to figure it out and coming up with new ideas.
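[Editor's sketch, not part of the conversation: one way to picture the "like a network" analogy is to treat a model call as an unreliable remote call whose output is validated and retried. `FlakyModel` is a hypothetical stand-in for a real model client, and the JSON shape with an `answer` key is an assumption made only for illustration.]

```python
import json
from typing import Callable


class FlakyModel:
    """Hypothetical stand-in for a model client that sometimes ignores the format."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        if self.calls < 3:
            # The first two replies are chatty free text, not the JSON we asked for.
            return "Sure! The answer is yes."
        return json.dumps({"answer": "yes"})


def call_with_retries(call: Callable[[], str], max_attempts: int = 5) -> dict:
    """Call the model until its output parses as a JSON object with an 'answer' key."""
    last_error: Exception = ValueError("model was never called")
    for _ in range(max_attempts):
        raw = call()
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue
        if isinstance(parsed, dict) and "answer" in parsed:
            return parsed
        last_error = ValueError("output is missing the 'answer' key")
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


model = FlakyModel()
result = call_with_retries(lambda: model("Is the sky blue? Reply as JSON."))
print(result, "after", model.calls, "calls")
```

[The validation step is the part with no clean analogue in ordinary networking: the call can "succeed" and still return something the program cannot use.]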
- MCMartin Casado
Well, I have to add on to this.
- ETErik Torenberg
Please.
- MCMartin Casado
So, okay. So listen, many of us have been in computer science for a long time, right? We've been in our schools and in our, you know, uh, operating lives and in our investing lives, and, like, software was always the disruptor, right?
- MBMatt Bornstein
[laughs]
- MCMartin Casado
Like, we disrupt, [laughs] like, we, we disrupt the taxis, or we disrupt, like, sales, or we disrupt the back office, or we disrupt everything. Like- One of the most exciting thing about the AI wave is like software's being disrupted, like we're being disrupted, right?
- JLJennifer Li
Yes.
- MCMartin Casado
And so like [laughs]
- JLJennifer Li
AI is disrupting software.
- MCMartin Casado
That's why I said-
- JLJennifer Li
AI is eating software.
- MCMartin Casado
I know. I know, like, ah. [laughs] So we have to think about it.
- ETErik Torenberg
Software is a self-lay- eating beast, right?
- MCMartin Casado
Yeah, that's right.
- JLJennifer Li
Yeah. [laughs]
- MCMartin Casado
It's like honestly, I think this is the first time I can honestly say that like the profession that I've, I've, I've dedicated my entire life to is being disrupted, and it's very exciting 'cause it's kind of eating itself in a way.
- ETErik Torenberg
And it's tempting to be a curmudgeon, right?
- MCMartin Casado
[laughs] Yeah.
- ETErik Torenberg
Because we've all been doing this our whole lives. [laughs]
- MCMartin Casado
That's right.
- ETErik Torenberg
So it's like really being open to and embracing the new stuff-
- 6:34 – 17:46
The Evolution of Infrastructure
- ETErik Torenberg
What's different from past super cycles versus what, what can we learn, uh, as, as we enter this new one?
- MCMartin Casado
There's two things that happen. So one of them is often when you bring the marginal cost of something down, like with compute, you know, we did it for computation, and for the internet we did it with distribution. It, it, it increases the TAM a whole bunch. So, so for one you almost always see this massive TAM expansion, and part of that tends to be because the TAM is bigger, you've got new users. Um, and because you have new users, like there's normally like a new behavior that happens, right? Like this is very much the case with the internet, right? Which is like people weren't used to going to a computer and talking to everybody around the world, uh, on top of the internet. And existing companies don't know really how to think about new behaviors. Like, they've built these sales motions and the o- operating things around like the old behavior. And so you see TAM expansion, you see new behaviors. Those new behaviors provide white space for challengers, new like startup companies to come and to go ahead and fill those models, and I think we're seeing exactly that happen with this one as well. Like, clearly this market is massive if you look at how successful these model companies are. But also you're seeing use cases that like computers just never really have done before, and you're seeing that too. And so in that way I think it rhymes very much with, say, the internet, and it rhymes very much with, with probably even the microchip.
- JLJennifer Li
Maybe I'll answer that question just from my personal experience. I, I'm always like sort of a tools person and also wanting to like having tools for fulfilling certain, uh, creativity, 'cause I kind of came to coding and computer science as a late bloomer, uh, after, you know, my 20s, and really enjoyed all the tools available for me at that point of just like, you know, building software and like learning computer science as well. Now we just have like, you know, massive and massive leverage in trying to create anything as long as you have a good idea. Like going from, uh... Martin laughs at me about this. Like I was a big, like, low-code, no-code champion [laughs]
- MCMartin Casado
[laughs]
- JLJennifer Li
For let's say the last, you know, five, 10 years. Uh, because again, these are tools for people who have good ideas but may not be like, you know, educated in the computer science term, but, you know, the, the Retools of the world, the Wix and Squarespaces, like you can build applications, can build software easily with these tools. But now you're given like the next level of thought partners, tools to really, um, as any role in the company, prototype, um, you know, software interfaces for your end customer, for your end user as long as you know what they need, what they want to see. Like you can really realize these ideas really quickly in like at your fingertips.
- ETErik Torenberg
So low-code's finally happening. It just takes a lot of code.
- MCMartin Casado
That's right.
- JLJennifer Li
It just turns out the code is natural language, you know? [laughs]
- MCMartin Casado
[laughs] I know, it's so funny 'cause it when, when, when Jennifer joined the team, she was very excited about low-code, but from my view, low-code is like Python. [laughs]
- JLJennifer Li
[laughs]
- ETErik Torenberg
[laughs]
- MCMartin Casado
It's like the scripting language. Like, you know, you write less code.
- ETErik Torenberg
Interpreted languages. Yeah.
- MCMartin Casado
NCM is low code-ish. [laughs]
- ETErik Torenberg
Right.
- MCMartin Casado
So we kinda had to bridge that gap.
- JLJennifer Li
[laughs]
- MCMartin Casado
And it was kind of in a way like a bit irreconcilable like, you know, uh, until AI came out, and it's very, very clearly like what the promise of low-code was. And so you're right. I mean, it really is disrupting software.
- MBMatt Bornstein
So when I was a kid, right, the internet was sort of a new thing. I just remember really vividly there was this movie with Sandra Bullock called The Net.
- MCMartin Casado
Mm-hmm.
- MBMatt Bornstein
And she orders a pizza from her computer.
- MCMartin Casado
Yeah.
- MBMatt Bornstein
And this was like completely mind-blowing.
- MCMartin Casado
Yeah. Yeah.
- MBMatt Bornstein
And like now actually this is a common user behavior, but like what we're dealing with now is just so much bigger than that, right? I just think it's really hard to... Like, like I think some of these points about how the infrastructure evolved, like how will the companies adapt and things like that, are probably transferable, but this is, this is like a, this is a pretty big deal. It's by far the biggest thing that I've seen happen sort of in my life.
- MCMartin Casado
Yeah. Yeah.
- ETErik Torenberg
Let's zoom out and take this long view, and I- Martin, you're actually the perfect full circle because were, were you the first infra investment ever, uh, uh, as a portfolio founder?
- MCMartin Casado
I think I was either... It was either me or Okta, but I will, I will say I was like, you know, m- me and Todd were like the infra portfolio. [laughs]
- ETErik Torenberg
[laughs]
- JLJennifer Li
[laughs]
- 17:46 – 21:27
Developer Tools and the AI Wave
- ETErik Torenberg
go deeper into the present a little bit. Can you guys share how, how we think about sort of the, the, those different subcategories or landscape that, that kinda makes up Infra? We could also plug some examples of portfolio companies or, or spaces where we made, made, made some bets.
- MCMartin Casado
Here's a few important categories. Um, uh, developer tools, meaning anything developers use to make their lives better, easier, faster, more efficient. Cursor is probably our top developer tool company right now. Um, but we've done a bunch- And, and before that, GitHub, right? I mean, like- Yeah, GitHub before that. And Jennifer, you've actually backed a bunch of interesting dev tools companies in the last few years too.
- JLJennifer Li
Yeah, from Mintlify, Stainless, um- And there was a long time where the, the dev tools was kind of like written off by VCs, right? Like they were, it was sort of like-
- MCMartin Casado
Oh, yeah
- JLJennifer Li
... yeah
- MCMartin Casado
The, the TAM was too small.
- JLJennifer Li
Yeah, exactly. [laughs]
- MCMartin Casado
I mean, at the time of GitHub, would you ever have imagined a repository would be a huge company?
- JLJennifer Li
Right.
- MCMartin Casado
Like, it was like almost a joke. Yeah, yeah.
- JLJennifer Li
Right. And people aren't sure about the business model-
- MCMartin Casado
Yeah
- JLJennifer Li
... and you know, all these things.
- MBMatt Bornstein
So small TAM is the classic like red flag for, for infra-
- JLJennifer Li
Yeah. [laughs]
- MBMatt Bornstein
... investing. You know, like, like if you're at home listening to this and someone tells you it's small TAM, like run, run in the other direction.
- MCMartin Casado
They're, they're not an infra investor.
- MBMatt Bornstein
Yeah, yeah.
- MCMartin Casado
Infra creates TAM. If there's one takeaway from this thing is inf- is, is TAM creative. So yeah, so like I think dev tools, you've got core infra, which is compute networks and storage, right?
- JLJennifer Li
Yep.
- MCMartin Casado
This is like to IT and then, and then you tend to actually be quite a bit above the core infra stack. So maybe you talk through kind of the areas you focus on.
- JLJennifer Li
Well, I, I think about both like how to, how developers are like using tools to improve their efficiency, but also how customers are getting value out of that as well. So a lot of, you know, packaging maybe dev tools into SaaS forms, um, like I, I mean, invested in this company called Pylon. It is a SaaS company that does like, you know, customer support, but fundamentally it's like doing a data pipeline. So that's like core infrastructure that's very good at connecting with systems and providing context to a lot of like agents, AI models. So, uh, to me, that's, that's infrastructure. Um, and we're spending a lot of time on, of course, like the, the cutting edge AI research, so a lot of foundation model investing.
- MCMartin Casado
I will say the, the reason that we're a little bit skittish on this question is early in super cycles, it's very hard to distinguish between an infra company and like the application companies. And the reason is 'cause the TAM is so small and so new, the new technology becomes the app, right?
- JLJennifer Li
Right.
- MCMartin Casado
So let, let me refer to like, like, you know, the original super cycle that started the firm-
- JLJennifer Li
Right
- MCMartin Casado
... which was, which was the internet. Like, like Netscape, I remember it when it came out. Like it was like a consumer thing or at least-
- JLJennifer Li
Yeah, yeah
- MCMartin Casado
... a student and school thing, right? And so this is one company which, you know, everybody was downloading from an FTP server Netscape and using it as individuals. You know, the, the enterprise didn't know what to think about it and banned it or whatever and the same company that was like building, that built JavaScript, that's building this core technology is also doing the browser, and over time it kind of matures, and then of course you have all of these internet companies and, uh, uh, and then all the applications show up. We're seeing the same thing in the AI wave. Like is Midjourney, is that a- an infra company? They build a model, or is that an app company? Well, it's kind of both in this sense, and so I do think that at this stage it's very hard to distinguish.
- JLJennifer Li
Right. It's, it's exactly right. Like it's very hard to answer if OpenAI is a, is a app company or it's a infra company. It's liter- liter- literally it's building like infrastructure that's like a cloud running these models for different, uh, sector and different use cases, but at the same time building a consumer app that's ChatGPT. So we, we think of foundation model companies are similar, like ElevenLabs, they're a voice AI provider, and they have the creator application that can use the studio to create voices, but at the same time they're also supplying the voices to these large scale enterprise use case that are like, you know, fine-tuning, cloning your own voice and, you know, distributing that through API. So it's sort of, you know, both.
- 21:27 – 22:11
Data Engine Systems
- MCMartin Casado
Yeah.
- MBMatt Bornstein
Another area we've done a lot of work is data systems, and there's sort of been these two branches. One is kind of this like kind of backend data eng driven big data systems, you know, Spark, Hadoop sort of thing, and the other is this sort of data analyst, um, kind of more Tabular, Snowflake kind of, kind of thing. And, um, I think as a firm we've been super, super active and super aggressive, you know, on, uh, investing in companies like Databricks. I mentioned Fivetran, dbt, which, which you guys invested in together, Hex, which is doing really, really well, uh, Tabular, which was, which was acquired by Databricks. So, um, we're still, I think, really, really bullish on this. Unfortunately, AI has kind of sucked the air out of the room for a lot of data companies from a bunch of different angles, but, but I think we'll continue to do, do more of this a lot, uh, as well.
- ETErik Torenberg
How,
- 22:11 – 25:28
Defensibility in AI Infrastructure
- ETErik Torenberg
how do we think about defensibility for AI companies, whether it's a, whether it's an app, app at the app layer or the, or the model layer, um, do they all have their respective sort of areas of defensibility, or how has the sort of our notion of defensibility evolved?
- MBMatt Bornstein
So we once wrote a blog post that there was no defensibility anywhere in the stack. [laughs]
- MCMartin Casado
For any, for anything. [laughs]
- MBMatt Bornstein
For anything. [laughs] Like, like-
- MCMartin Casado
And yet people make lots of money in-
- MBMatt Bornstein
Yeah. And, yeah. [laughs]
- MCMartin Casado
[laughs]
- MBMatt Bornstein
Um, uh, you know, the argument at the time was like, okay, like Nvidia has sort of a moat because chip designs are, are hard to copy, but if you go sort of up or down, they're all sort of manufactured at the same place, right?
- MCMartin Casado
Right.
- MBMatt Bornstein
At TSMC. You know, if you go down, if you go up, the cloud providers provide effectively the same product. Like the models are training on the same data, have an- have like similar capabilities. You know, the apps are all kind of like all using the same model. So, so that was sort of the naive theory I think when we were just trying to understand this at the beginning, and I think maybe it was true at the time. Um, what Martin was sort of alluding to a second ago is like meanwhile, every company at every layer of the stack is doing like fantastically well right now. And during this kind of initial phase of industry development, which, which we sometimes call the Brownian motion phase, like I actually think it's really hard to make sort of pronouncements like this about what's gonna work, what's not gonna work, where's value gonna accrue, et cetera, et cetera. It's like app companies are doing really, really well, and like we're, we're like pretty clear past the sort of wrapper phase. Like I don't think there are any wrappers anymore. Like building good products with AI is really hard.
- MCMartin Casado
Yeah.
- MBMatt Bornstein
And the founders doing it now have like really good kind of intuition for how to do it.
- MCMartin Casado
Yeah.
- MBMatt Bornstein
You know, the models are clearly like pushing the whole industry forward, and like they've built huge companies from out... You know? So it's like it, it's kind of all working right now, and I think you could actually make a case for how defensibility will work. Um, it's, it's quite different from the way defensibility's worked before, so, and, and maybe you guys wanna add onto that.
- JLJennifer Li
Yeah. On how defensibility worked for infra companies before, largely like it came from... It, it's really hard to, let's say if you're pr- building like a new database, like a new framework, um, it just takes a, a lot of expertise in the domain of understanding, you know, what has happened in the past, where did the, you know, the, the field come through, and what are the, you know, um, uh, innovations that needs to happen to polish, let's say, like a software into, um- this new, um, abstraction to provide to developers, um, maybe one example I-I'm thinking of is like DuckDB. It's such a high performance, small but really nimble, uh, database that it took like four years for the team to write, to write it. To replicate that is really hard, and that's generally what happened in the past infrastructure space is like it takes these experts a lot of time to build like a new piece of security software if it's like, you know, YubiKeys or if it's, um, again, like Databricks o-on Spark. Um, but now, um, you know, as, uh, the AI infrastructure comes through, like I think a lot of those defensibilities still, uh, stays and st-still are true 'cause these are earned s-secrets of like what are the downfalls and guarantees, um, other software ran into. But I, I do feel like the adoption phase is just like really massive. Like who is going to earn the distribution and earn sort of the developer attention is going to be a different game.
- MCMartin Casado
The industry tends to go through these expansion and contraction phases. Think of like the Big Bang or something. Like it expands and then it contracts. So
- 25:28 – 27:09
Expansion and Contraction Phases
- MCMartin Casado
what happens when it expands? When it expands, zero-sum thinking is deadly because you're just get-getting more market. We're clearly in an expansion phase, [laughs] right? Like everybody's like, "Oh, NVIDIA can't like, you know, sell more chips," but they keep selling more chips. "Oh, like the hosting platforms can't continue to get margins," yet they keep continuing to get margins. I mean, Matt said it perfectly. I totally agree. So if you're in the expansion phase, then there's just more to sell. You should be aggressive investing. So what happens in the collapse phase? So look at any layer of the stack. So in the collapse phase, in which things start to consolidate again, you have consolidation, but what is the end state of consolidation? The end state of consolidation will always be an oligopoly or a monopoly. It's not like layers ever go away. If you have an oligopoly, like say the clouds, then you have what is effectively price fixing, but it's tacit, right? Which is everybody's like, "We're gonna price it this and we're gonna maintain our 30% margins." So you still have value there. You have margins. Or in the case of like a monopoly, you'll end up with like, say, like an Intel at the time, then you can also maintain margins. So in none of this do you lose margins, right? And I just think this is why people think so sloppy about this. People use the words like commoditization and no defensibility. That tends to be a battle between layers of the stack, but the only way you can do that is actually, like actually move down the stack and enter somebody else's layer, which is incredibly hard to do. And you do see it. Like so of course, you know, Google is going to build their own chips and they start moving down the stack, but that's a very, very different layer than somehow Google playing the different layers off, uh, uh, uh, against each other.
And so I, I would encourage anybody that does invest at least in infrastructure to not think zero-sum and to realize that historically
- 27:09 – 28:32
Challenges in AI and Infrastructure
- MCMartin Casado
every layer of the stack has maintained some level of value and margin, and if not, it was because a layer above them managed to kind of verticalize themself, but then it's that one player against the rest of the world.
- MBMatt Bornstein
Yeah, I mean it's, it's like faster than light speed travel's just been invented, right?
- MCMartin Casado
Yeah.
- MBMatt Bornstein
And we're sending all the spaceships out in all directions, and there's plenty of planets like and stars to claim for everybody. Like we don't need... You know, [laughs] we're not even close enough to each other to like fight over the AI.
- MCMartin Casado
For the, for the AI wave. Yeah, for sure.
- MBMatt Bornstein
Yeah, yeah.
- MCMartin Casado
Like this is a trend-
- MBMatt Bornstein
Yeah, in, in, yeah, in general
- MCMartin Casado
... but it will, it will slow down and then the consolidation will happen, but I guarantee you'll just end up with these great companies that maintain margin. Like AWS still has great margins. You know, Google still has great margins.
- MBMatt Bornstein
Databricks too is growing at an incredible speed for that-
- MCMartin Casado
Yeah, at scale
- MBMatt Bornstein
... at scale. You know, I think people underestimate how hard the, these problems are in, in many cases, right? You're sort of imply, uh, applying consumer thinking 'cause this is how we live most of our lives and like, oh, wouldn't it be relatively easy to move to a different part of the stack or take out your competitor or someone, a customer could just switch back and forth.
- MCMartin Casado
[laughs] Yeah.
- MBMatt Bornstein
And, and, and it's just different laws of physics I think, I think in infra.
- MCMartin Casado
Yeah. I totally agree.
- JLJennifer Li
It just turns out in general the switching cost of infrastructure piece is so much higher. Even with like API business, people tend to think you can just like switch over to another API. There's so much logic embedded in calling the API in the software itself. Like there's, uh, a lot more switching costs compared to, you know, your regular SaaS software or consumer software.
- MBMatt Bornstein
Yeah, totally, 'cause you're actually integrating systems, right? It's, it's not necessarily a person who can just have a preference for one thing or another. You're sort of integrating.
- JLJennifer Li
For sure.
- 28:32 – 30:59
AI Models and Generalization
- ETErik Torenberg
S-Sam Altman once had the advice to startups, uh, last year. He was like, "If you're worried about us improving our models, you're in a tough spot. But if you get, if you get more excited about your business, um, by us sort of improving our models, then, then you're in a good spot." Do you think that's a helpful framework or?
- MCMartin Casado
I think it's helpful for OpenAI, for people [laughs] who believe that.
- JLJennifer Li
[laughs]
- MBMatt Bornstein
No comments.
- JLJennifer Li
Uncertainty.
- MCMartin Casado
I, I, I wanna say that too.
- JLJennifer Li
[laughs]
- MCMartin Casado
Like, you know, like if, you know, a16z, if you think us investing in this company isn't good for you, then like- [laughs]
- JLJennifer Li
[laughs]
- MBMatt Bornstein
[laughs] Yeah. Yeah.
- MCMartin Casado
You know? But like if you wanna buy from our custom- [laughs] from our companies, like that's great.
- MBMatt Bornstein
[laughs]
- MCMartin Casado
I mean, that's a very open question that's actually a technical question, it isn't a business question, which is how much does general training generalize, right? So we know in like the pre-training world it generalized really well. So you'd create one model, and that model was just as good at code as it was at like writing a poem or, right? So we know that it was very general. And, and in that world, sure, as the models get more powerful, then they can do all of the things, so they compete with all of the things, right? But it, it seems clear to me, and again this is an observation and it may not be correct, that as we get more into the RL world, that you make some trade-offs. And then let's say I, I RL something for code, it's not gonna be as good as something else, and like you're making these trade-offs. And then in that world, then it's not the case that the model's gonna generally be good. So I think it's great to compete at the model layer. And so again, I think this is maybe a, a, a reasonable rubric certainly for OpenAI to have people believe, maybe a reasonable rubric if you believe that these models are gonna be generally great, but I just don't think it holds up to how things are gonna play out.
- JLJennifer Li
This was a debate we had two years ago, I think, whether the general model and the most capable model will rule or, or a lot of, uh, small, small, medium-sized models that are very good at specific tasks, um, that's going to be the future. It turns out both are true.
- ETErik Torenberg
Both, yeah.
- JLJennifer Li
When we talk about complex systems, you cannot just like use one model that drives everything, at least not today. Uh, but you can compose like, you know, very capable and powerful models to like, you know, take certain tasks and also chain together in a process, from like processing a document to like feeding it into a model to have some reasoning and give, give you back clean, uh, and structured data, um, to make decisions and put into your application and to serve end users. Like that's a complex system that invokes many model calls instead of just like one big, um, model's task.
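A minimal sketch of the kind of chained system described here, assuming three stages: document processing, a reasoning step, and a step that returns clean, structured data. Every name in it (`extract_text`, `reason_over_text`, `to_structured`, `Invoice`) is hypothetical, and the "model" stages are plain-Python stand-ins rather than real model calls.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Clean, structured output that an application could consume."""
    vendor: str
    total: float


def extract_text(raw_document: bytes) -> str:
    # Stage 1 (hypothetical): stands in for a document-processing model or OCR.
    return raw_document.decode("utf-8")


def reason_over_text(text: str) -> dict:
    # Stage 2 (hypothetical): stands in for a model call that pulls out fields.
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def to_structured(fields: dict) -> Invoice:
    # Stage 3: validate and coerce the loose fields into typed data.
    return Invoice(vendor=fields["vendor"], total=float(fields["total"]))


def run_pipeline(raw_document: bytes) -> Invoice:
    # The system is the composition of the stages, not any single call.
    return to_structured(reason_over_text(extract_text(raw_document)))


invoice = run_pipeline(b"Vendor: Acme\nTotal: 41.50")
print(invoice)
```

Each stage could be swapped for a different model, or for ordinary code, without changing the others, which is one way to read "many model calls instead of one big model's task."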
- ETErik Torenberg
Did you, um, guys have any reactions,
- 30:59 – 34:09
Thoughts on Andrej Karpathy's Talk on AI
- ETErik Torenberg
or is it worth talking at all about, uh, Karpathy's, uh, talk? Did, did that framing resonate with you? Did you have any sort of, uh, differences of how you would frame certain things or how you... You know, one, one thing he mentioned is that he thinks it's not the year of agents, but the decade of a- of agents. Perhaps it's not as immediately, uh, you know, um, you know, upcoming as, as we might, might have thought.
- MCMartin Casado
So this, this idea of a prompt engineering that people have talked about, and, um, and s- and somebody, it wasn't Karpathy, but Karpathy piled on top, is this, it's really not about prompt engineering, it's context engineering. And so what is context engineering? So if you're gonna call a model, you kind of have to know what to put in the context in that prompt. And what tools do you have to do that? Well, you could use other models, but at some point you're probably gonna use traditional computer science. You're gonna use like things like indexes, you're gonna have to do prioritization, et cetera. And to really drive the best performance out of those models, you do want the context to be corre- uh, correct. And I, I do think it's probably the right framing of this problem. And the next step is, is in as much as we're gonna provide formalism to how to use these models, to how to use existing tools, to how do you improve the performance, you should be thinking about what's the right way to get the right context into those models. And, and I bring this up because, you know, like we said before, new infrastructure pieces create new patterns and new methods of software and building systems, and this is a great example of that kind of emerging before our eyes and people reasoning about it. And I, I truly believe in five years we'll look back, we'll come up with a whole new, you know, set of formal ways to build software and they will have strong guarantees and we'll understand them and, you know, there'll be all the tools for it, et cetera.
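As a rough illustration of the "indexes" and "prioritization" mentioned here, the sketch below builds a tiny inverted index, scores documents by how many query words they share, and packs the best ones into a fixed word budget. The function names, the scoring rule, and the word-count budget are all assumptions made for the example, not a description of any particular product.

```python
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Inverted index: word -> set of ids of documents containing that word."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def build_context(query: str, documents: dict, index: dict, budget_words: int) -> list:
    """Pick document ids to place in the prompt, best match first."""
    # Score = number of distinct query words the document shares.
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for doc_id in index.get(word, ()):
            scores[doc_id] += 1

    # Prioritize: highest score first, ties broken by id for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))

    # Greedily pack documents until the budget is used up.
    chosen, used = [], 0
    for doc_id in ranked:
        length = len(documents[doc_id].split())
        if used + length <= budget_words:
            chosen.append(doc_id)
            used += length
    return chosen


docs = {
    "billing": "refund policy for billing errors",
    "shipping": "shipping times and carrier policy",
    "careers": "open roles on the infra team",
}
index = build_index(docs)
selected = build_context("billing refund policy", docs, index, budget_words=8)
print(selected)
```

A real system would likely use token counts and a better relevance signal, but the shape is the same: traditional retrieval and ranking decide what the model sees.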
- JLJennifer Li
The way I think it-- that relates to our world is if you think about what is the, like, new form factor of infrastructure that needs to become part of this context en-engineering, it goes back to, you know, a lot of, uh, well, we're obsessed about it, with data pipeline, how do you feed the right data and context into the models or into, um, the context and how do you have agencies, tools, or infrastructure that will provide a discovery and, you know, guarantees of, of observability of these tools as well. Like it's the classic infrastructure problem that's still unsolved, so it's very exciting time.
- MBMatt Bornstein
There's a few different types of infra founders, right? And, and we could probably come up... You know, like there's the sort of infra founder who loves solving really messy, long tail, kinda like nasty problems, right? There's the type that kind of just gets fed up with a problem and they're like, "I'm gonna solve this finally," like, "I'm sick of this." Then there's the type that just kinda sees the world in a new way, right? And it's kind of like this is actually how we should marshal these resources and like, it can change how... You know, like React is a great example of this. We had all these like kind of progression of front-end development frameworks, and finally React kind of was like the way that stuck and, and for years now has been sort of the default front-end. So like I think what Karpathy's talking about is kind of trying to figure that out, right? And he... I think his Software 2.0 thing was really interesting. We were investing in a bunch of like traditional ML companies at the time. Um, and I think Software 2.0... I think he's sort of right about that too in sort of directionally. And, and I think he... My hope is it'll spa- inspire a lot of new infra founders to kinda do this work, like see the world in a new way and like figure out how, how these primitives should really be kind of arranged.
- 34:09 – 36:18
AI and Human Expectations
- MCMartin Casado
One of the difficulties of having any conversation around AI is it just exploits this weakness in the human imagination to like dump all of our fears and hopes and dreams into this anthropomorphic fallacy, right? It's just... And this goes all the way back to the Promethean legend. And so, I mean, let's talk about even this context, right? Like, you know, we're building systems to build other systems. Those systems have constraints, right? And so you can fail on either side of this when it comes to this anthropomorphic fallacy. On one side, you could be like, "This stuff doesn't work. You shouldn't use it. You should only use traditional things," which, okay, that's clearly not the case. They seem very useful. But on the other side, you can kind of like, you know, believe they'll solve all of our problems and you don't need formalism and you just kind of like go to the beach and you come back when AGI is done and it'll do it for you type thing. And so a part of our job, and what we spend a lot of time talking about, is trying to find that pragmatic, non-blinkered, non-pessimistic middle despite all of the rhetoric. And then you... I mean, you hear all the rhetoric, right? The stuff is gonna like, you know, "No, we're not gonna have to work and we're all gonna be on, um-
- JLJennifer Li
The beach. [laughing]
- MCMartin Casado
[laughing] Right. Or, or, you know, like it'll kill us. Like I mean, I, I mean, you know, the, the whole thing. And, and I think where, where we've landed is this is a real disruption. It is changing all of software. It'll look something place different, um, but it is still gonna ri- require professionals. And I do think that like n- like the statement it'll require professionals is a very meaningful one. It means that you actually still need people that understand the specifications of the systems. And, and, and not everybody agrees with that. Some people are out there like, "Listen, you will never need a programmer again because people are gonna just say some high-level thing and it'll show up." And the only one, one statement I'll say to that is formal systems came out of natural languages for a reason, and like either you care about specifying what you're designing or you don't, and if you do, you need to be a professional. And that's why every professional discipline, even though they started with a normalit- formal or, uh, natural language, has ended up with a formal system.
- 36:18 – 40:31
The Role of Developers in the AI Era
- ETErik Torenberg
And what is your mental model on, on coding specifically? Will there be like fewer engineers who are just higher powered or sort of, uh, whether le- fewer junior engineers? H- how should we think about
- MBMatt Bornstein
I think the best way to think about this is simply that we're going to have more developers. I, I think it's very unlikely that, like, we're going to shrink development teams because we have amazing new tools. Like, that's just kind of not how these markets have worked in the past. I think exactly the opposite is going to happen, is like we're gonna be creating so much great software, and it's gonna be so accessible to so many people who may work at a big company or may just be sort of hacking on the weekend. You can't anthropomorphize these models, right? A, a model is a file on a hard drive in a computer somewhere, and when you run a Python script, you can transform one piece of data into another piece of data. Like, that, that's what this is, and programmers, um, you know... Programming is a fundamentally creative job, right? You are literally creating things in the most strict sense of the word, which is you're creating software that doesn't exist before, and that's something only a person can actually do at some level of abstraction. So I, I, I personally think this is a huge boon for, for programmers, and you have to change the way that you're working, and it's, like, a huge productivity boost and, and I think that's, like, more... It creates more, not less.
- JLJennifer Li
I cannot agree more. Uh, I feel like for many of the developers and programmers I talk to today, they are like going to Disneyland, uh, just because how many great tools there are to help them move faster, like, you know, build things they have always wanted to, to, um, uh, as both side projects and also their main job. Um, I, I do think it also changes the dynamic of, like, how people are picking up new languages, picking up new frameworks. It, it is, again, uh, sort of a, a next, uh, level of iteration speed given what we're seeing with AI agents and also AI coding tools.
- MCMartin Casado
He- here's another, I think, useful mental model on this stuff. Like, I think it's worth asking the question, why do people buy software? Like, why does someone buy some random SaaS tool? Is it because it's so hard to build it? No. I mean, like, most SaaS tools are, like, crud. Like, they're just these basic kind of read/write databases. They're all kind of the same. So, so why, why do people buy them? And, and, and Aaron Levie, uh, who's the CEO of Box, I, I think said this so beautifully. The reason people buy software is 'cause somebody else made the decisions of what the workflow should be, and what the operational logic should be, and what data is important, how you use that data is important. Like, creating a product is a lot of understanding what is being used and guiding the user along that direction. So if not, I'll just give you a compiler. [laughs] Like, you do whatever you want, or I'll just give you a database, and you do whatever you want. I mean, there's a reason that we have a proliferation of vertical SaaS, and it's, it is this kind of, this articulation or, you know, this, this transfer of domain understanding. And that just doesn't go away independent of how you create the software. And so we will still need to design products based on whatever problem is being solved, guide people so that they're the most effective with them, and they can understand the best. And, and, and, and, you know, we did this with Assembly before, and then we did it with high-level languages, and then we did it with high-level frameworks, and then we're gonna do it with AI, but, like, the fundamental process of that articulation will not go away.
- MBMatt Bornstein
And it's totally orthogonal to creating the software itself, right?
- MCMartin Casado
Oh, yeah, of course.
- MBMatt Bornstein
It turns out to be a much harder problem to go out and collect requirements from an unknown set of users with an unknown set of needs, and, like, figuring out what to build, that turns out to be much harder than actually building it.
- MCMartin Casado
So what, what do you, what do you think is the, the, the average number of lines changed in a PR in the industry?
- MBMatt Bornstein
Two.
- MCMartin Casado
Yeah. [laughs]
- JLJennifer Li
[laughs]
- MCMartin Casado
But it shows you to the point. Like, it literally is understanding the need from the business and the need from the user and making some minor tweak. That is the long tail that goes into software. And by the way, it, it turns out it's two.
- MBMatt Bornstein
Oh, really?
- MCMartin Casado
Well, I think that's the median. It's the median is two.
- MBMatt Bornstein
Okay.
- MCMartin Casado
Yeah, so [laughs]
- JLJennifer Li
[laughs]
- MBMatt Bornstein
I've written a lot of, uh, you know, like, cheap PRs in my life. Um, just for the record, by the way, Martin said crud. CRUD is a technical term, right? Create, read, update, delete. We actually think-
- MCMartin Casado
Oh, sorry
- MBMatt Bornstein
... applications are great, right? We don't think they're cruddy.
- JLJennifer Li
[laughs]
- MBMatt Bornstein
They just, like, literally are crud.
- MCMartin Casado
No, sorry. Crud, yeah, crud, crud, crud is-
- MBMatt Bornstein
Just, you know, foot, you know, footnote. [laughs]
- MCMartin Casado
[laughs]
- JLJennifer Li
[laughs]
- ETErik Torenberg
All right.
- 40:31 – 43:17
Synthetic Data
- ETErik Torenberg
Good, good to know. The, um, Jennifer, you were mentioning earlier how w- you know, two years ago we were having this debate on generalization, and sort of it, we, we've learned from it. What, what are the debates we're having now, i- internally or with, with, with your peers, or what are the main questions that, that we're asking that we, we can't wait to, to see how they're going to reveal themselves in the next few months or next year that are going to, you know, i- impact our business?
- JLJennifer Li
Oh, gosh, every, every week is different. Uh, what are some recent ones? Uh, definitely, like, how, how realist- how realistic, uh, are agents today, like, really producing production, uh, level, uh, software? Um, I mean, that's more on the coding agent side, but also just in, in general, like, the agent, um, evolution. Where, where are we in going from demo-ware to, you know, producing real value, tangible value?
- MBMatt Bornstein
Synthetic data is one we talk about a lot.
- JLJennifer Li
Only. Yep.
- MCMartin Casado
We've been talking about them for 10 years. [laughs]
- JLJennifer Li
[laughs]
- MBMatt Bornstein
Yeah. I mean, this, this is what's so great about infra. You can have the same debates for 10 years, and they never quite go away. [laughs]
- MCMartin Casado
Just, like, change the background. It's, like, everything else is the same.
- MBMatt Bornstein
Yeah, consistency versus availability. Literally every system has those trade-offs. Um-
- JLJennifer Li
[laughs]
- MBMatt Bornstein
Uh, so, so the synthe- synthetic data thing, right? Is, is, like, can... Like, it's almost an information theory question. It's like, can you make models meaningfully better without introducing new information to the system? And I think it's now pretty clear you can do a, y- y- you can do a little bit, but, but the question is, like, does this lead to sort of like a self-improving utopia of models or, or not? And, and I think we have some pretty strong opinions on the not side of that. Um, generalization, Martin mentioned, is a, is a pretty interesting one. Like, if you train a model to be really good at math, does that mean it's gonna be really good at other things, or is it just, like, really good at math? Which I get excited about. [laughs] Not everybody, not everybody gets excited about this.
- MCMartin Casado
I, I... You know, another one that we talk a lot about is, like, like, what these things are actually good for, um, just because the path we came from used a lot of AI. Pre- pre-gen AI used a lot of AI, so there's a lot of AI-shaped holes in the enterprise, like chatbots and this and that, and, like, that's very much on the brain, and it seems to sometimes confuse the discussions from the new use cases we're seeing. Like if you look at the most common use cases of something like ChatGPT, I think the top one is like companionship and therapy, and then it's like managing my schedule, and it's like, it's like the top of the pyramid of need stuff. And then number five is professional development, not like low-code development or whatever. And so I think what is happening is we have this idea of what we thought AI was gonna do and like, you know, just kind of the stilted attempts previously, and like Jennifer actually, you know, ran product for, um, you know, a, a chat company, uh, prior, and then what it really is good for. And clearly there's some overlap and convergence, but it's not nearly as big as, as people say. And so we do spend a lot of time just trying to be very honest with ourselves or like, what is the new behavior being created? Where is this stuff getting used? Like, what is it just being trying to cram into places it's not actually quite good at?
- 43:17 – 45:16
AI Agents
- ETErik Torenberg
To that point, w- how do we think about agents right now? Or where, where, you know, what, what are they good for, going to be good for soon enough? What is sort of the, the, the s- the state of them? How do we think about the broader conversation?
- MCMartin Casado
I mean, coding agents are awesome. Like th- they're amazing.
- MBMatt Bornstein
It's, it's really amazing. I-- So, so I have a very simple way to think about this, um, and I'm like the anti-agent guy, by the way. I think it's kind of a marketing thing. This is the other thing about infra people, we like are like allergic to marketing, which, which is not always a good thing.
- ETErik Torenberg
[laughs]
- MBMatt Bornstein
Which is a good thing you're here, Erik.
- ETErik Torenberg
Yes.
- MBMatt Bornstein
Yeah.
- ETErik Torenberg
I always ask, "What do you mean?"
- MBMatt Bornstein
Yeah, yeah, exactly.
- ETErik Torenberg
[laughs]
- MBMatt Bornstein
Yeah. Can you explain? Um, if you take the simplest definition that a, basically an agent is a, is an LLM running in a loop, um, a very simple way to think about this is errors propagate throughout the loop, right? So if you have a small error, it gets worse and worse. And, and this is why a lot of agents doing, say, uh, general web browsing don't, don't perform very well yet. On the flip side, if you have a way to correct those errors in the loop, which is one thing that you have in code, right, you can lint, you can interpret, you can even try to compile and things like that, um, you actually do see good performance over time. So that, that's a very simplistic and, and maybe not quite the right way to look at it, but like if you can do this kind of error correction, I think you're seeing a lot of sort of improvement from this, this iterative approach.
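A small sketch of both halves of this argument. The first function is back-of-the-envelope arithmetic that assumes each step succeeds independently with the same probability, which is a simplification introduced for this example. The rest is a toy "LLM in a loop" where Python's built-in `compile` plays the role of the linter or compiler; `fake_model` is a hypothetical stand-in for a real model call, hard-coded to fix its mistake once it sees feedback.

```python
from typing import Optional, Tuple


def success_without_correction(per_step: float, steps: int) -> float:
    # If nothing catches mistakes, every step must go right (independence assumed).
    return per_step ** steps


def fake_model(task: str, feedback: Optional[str]) -> str:
    # Hypothetical model: the first draft is missing a colon; it repairs the
    # draft once the checker's error message is passed back in.
    if feedback is None:
        return "def add(a, b)\n    return a + b\n"
    return "def add(a, b):\n    return a + b\n"


def check(source: str) -> Optional[str]:
    # The in-loop error signal: None if the code compiles, else a message.
    try:
        compile(source, "<agent>", "exec")
        return None
    except SyntaxError as err:
        return f"{err.msg} (line {err.lineno})"


def run_agent(task: str, max_steps: int = 3) -> Tuple[str, int]:
    # Generate, check, and feed the error back until the check passes.
    feedback = None
    for step in range(1, max_steps + 1):
        candidate = fake_model(task, feedback)
        feedback = check(candidate)
        if feedback is None:
            return candidate, step
    raise RuntimeError("no valid candidate within the step budget")


print(round(success_without_correction(0.95, 20), 3))
source, steps = run_agent("write add(a, b)")
print(steps)
```

The loop only converges because `check` gives a cheap, reliable signal; for tasks like general web browsing there is often no equivalent of `compile` to call.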
- MCMartin Casado
I mean, it's incredible. I'm actually on the like the, the, the GitHub mailing list of a lot of the companies that I work with, I mean, just, just mostly for, for interest. And I have even in the last week seen a bunch of kind of Cursor-like agent commits and like even, even in the last twenty-four hours, the Slack integration, seeing it come from Slack. And so I, I do think for kind of bite-sized, well articul- tasks you can articulate very well, we're starting to see them really work. So in, in, in the coding space, they, they... I would say I'm a convert. Um, but to Matt's point, like the, you know, go, you know, wander out in the woods and bring back a bear, I think we're- [laughs]
- ETErik Torenberg
[laughs]
- MBMatt Bornstein
[laughs]
- MCMartin Casado
I think we're a, I think we're a long way from that kind of agent.
- MBMatt Bornstein
There are a lot of bears you could run into in that, in that instruction.
- 45:16 – 47:29
Vertical Integration vs. Horizontal Specialization
- MBMatt Bornstein
[laughs]
- ETErik Torenberg
Will we see more vertical integration or more horizontal specialization?
- MCMartin Casado
You know, historically, we've seen both, and what's interesting is we're already seeing both now, right? Like Apple, of course, has just been historically vertically integrated.
- ETErik Torenberg
Yeah.
- MCMartin Casado
Uh, Microsoft and Intel historically horizontally. Uh, and often companies will start horizontal and then go vertical. So like, um, you know, Google is horizontal. I mean, it was built on top of normal servers, but then they built their servers, and then [chuckles] you know, they, you know, they built their own chips, they built their own networking gear. And so I, I think you always get a mix of the two. What's interesting about now is we're actually really seeing both. I mean, I would say that OpenAI is very much a vertically integrated company now with ChatGPT-
- ETErik Torenberg
Yeah
- MCMartin Casado
... driving a lot of it. I would say Anthropic, a lot of the usage really is more horizontal, and they're doing a great job of that. I think we're seeing this on the model layer too. I mean, a very interesting discussion we haven't had, but it's a very interesting one, is like open source, quote unquote, really seems to work with these models just because you can't, as a user, re- you know, recreate it. So like if you look at like BFL, they've done a great job building kind of like a horizontal layer for these models. Um, but then you've got companies like Ideogram, which have built a great kind of vertical experience as well. And so I would say for AI, we've got already this early on great examples of both, uh, and I, I don't see any reason that that will change.
- JLJennifer Li
I think from the business front, it just poses, um, new interesting questions and challenges too of how do you capture the value. Of, of course, horizontally can capture the value through being able to address every single use case by providing that to developers or enterprises. But if you vertically integrate, you kind of have to pick the lane of like, do I wanna focus on image models? Do I wanna focus on, you know, graphic designers? Do I wanna focus on, you know, people who are generating pho-photographies? Um, you kind of have to understand the market and the user, uh, personas, use cases pretty well to capture the maximum value, where you probably can, you know, take a easier path to just provide APIs so everybody can use it. [upbeat music]
Episode duration: 47:29
Transcript of episode EIPxf7rgIPI