Building the Real-World Infrastructure for AI, with Google, Cisco & a16z
EVERY SPOKEN WORD
30 min read · 5,987 words
- 0:00 – 1:16
Intro
- JPJeetu Patel
The good news is infrastructure's sexy again, so that's kinda cool. This is like the combination of the build-out of the internet, the space race, and the Manhattan Project all put into one, where there's a geopolitical implication of it, there's an economic implication, there's a national security implication, and then there's, um, just a speed implication that's pretty profound.
- AVAmin Vahdat
I mean, I think it's easy to say, uh, I've seen nothing like this. I'm fairly certain no one's seen anything like this. The internet in the late '90s, early 2000s was big, and we felt like, "Oh my gosh, can't believe the, uh, build-out, the rate." This makes it... I, I mean, 10X is an understatement. It's, uh, 100X what the internet was.
- SPSpeaker
[on-hold music]
- SPSpeaker
Hello, hello. [laughing]
- AVAmin Vahdat
Hello.
- JPJeetu Patel
Hello.
- SPSpeaker
All right. What better time and place to talk infrastructure? All right. [laughing] So we were back in the green room and just as, um, the first question was getting answered, I got cut off, so this could be an entire repeat for all I know.
- JPJeetu Patel
[laughing]
- SPSpeaker
So but anyway, let's go, right? [laughing] Um,
- 1:16 – 3:00
The Scale of the AI Buildout
- SPSpeaker
the first question is similar. So both of you, firstly, welcome and thank you for being here.
- JPJeetu Patel
Thanks for having us.
- SPSpeaker
And, uh, hope, uh, you'll have a great day and a half as well. Um, both of you've been in the industry for a while, and both of you have lived through many infrastructure cycles, right? So have you seen anything like this cycle from your vantage point? Not from an investor vantage point, but from your internal, um, vantage point where you are responsible for building things and, and planning for things and so on. Any one of you, where do you wanna start? You wanna start, Amin?
- JPJeetu Patel
Go ahead, Amin.
- AVAmin Vahdat
I, I, I mean, I think it's easy to say, uh, I've seen nothing like this. I'm fairly certain no one's seen anything like this. The internet in the late '90s, early 2000s was big, and we felt like, "Oh my gosh, can't believe the, uh, build-out, the rate." This makes it... I, I mean, 10X is an understatement. It's, uh, 100X what the internet was. Um, I think the upside is a-as big as the internet was. Uh, same thing, 10X and 100X. Yeah. Noth-nothing like it.
- JPJeetu Patel
Yeah, I'd agree. I don't think there's any priors to this size and this speed and scale. Um, I'd, I'd say, um, the good news is infrastructure's sexy again, so that's kinda cool.
- SPSpeaker
[laughing]
- JPJeetu Patel
Um, [laughs] it was a long time where it wasn't sexy. Um, the, um, the thing I would say that's, that's really interesting is this is like the combination of the build-out of the internet, the space race, and the Manhattan Project all put into one, where there's a geopolitical implication of it, there's an economic implication, there's a national security implication, and then there's a, um, you know, just a speed implication that's pretty profound. So, uh, yeah,
- 3:00 – 5:56
CapEx, Demand Signals, and the Power Bottleneck
- JPJeetu Patel
we've... You know, none of us have ever seen it, um, at this size and scale. On the other hand, um, I think we're grossly underestimating... Like, there's... The most common question I'm asked right now is, "Is there a bubble?" I think we're grossly underestimating the build-out. I think there's gonna be much more needed than what we are putting the, um, you know, projections towards.
- SPSpeaker
So that's... The follow-on question is, where are we, do you think, in the CapEx spend cycle? But more importantly, what are the signals that you guys use internally, right, in your thinking? I mean, you have to plan data centers, whatever, four, five years in advance. You gotta buy nuclear reactors and whatnot. So how do you think about b- the demand signals as well as your technology signals? And Jeetu, the same thing for you, but from the point of view of enterprise and neo clouds, et cetera. Amin?
- AVAmin Vahdat
Uh, we're early in the cycle is, uh, what, what I would say r- certainly relative to the demand that we're seeing. So our internal users are... Uh, we've been building TPUs for 10 years, uh, so we have now seven generations in production for internal and external use. Our seven and eight-year-old TPUs have 100% utilization.
- SPSpeaker
Oof.
- AVAmin Vahdat
Right? And that, that just shows what the de- the demand is. Everyone would, of course, prefer to be on the latest generation, uh, but it... whatever they, they can get. So this tells me that the demand is tremendous, but also, w- who we're turning away and the use cases that we're turning away. It's, it's not like, "Oh yeah, that's kinda cool." It's, "Oh my gosh, we're actually not going to invest in this, and there's no option because that's where we are on the list." Same with many of you in the room, right? We're, we're working with, uh, many of you in the room and many of you are s- are telling me directly, and thank you, um, "We need more earlier." Right? Now, the challenge here though is, as you said, the... we're limited by power. We're limited by transforming land. We're limited by, um, permitting.
- SPSpeaker
Yeah.
- AVAmin Vahdat
And we're limited by, uh, backup delivery of, uh, lots of things in the supply chain. So one worry I have is that, uh, the supply isn't actually going to catch up to the demand as quickly as, uh, we'd all like. I h- I heard, uh, in the previous session some of the discussions of the, um, trillions of dollars that we're gonna be spending, which I think is accurate. I'm not sure that we're gonna be able to cash all those checks. Like, in other words, literally you all have so much money you can't spend it all as fast as you want. I think that's going to extend for three, four, five years.
- SPSpeaker
Wow. And how do you deal with the depreciation cycles that are involved there?
- AVAmin Vahdat
Uh-
- SPSpeaker
Does the demand curve and the depreciation cycle curves-
- AVAmin Vahdat
We-
- SPSpeaker
... match up?
- AVAmin Vahdat
Well, fortunately we buy, uh, just in time, but the nice thing is... Or just in time for the hardware. The depreciation cycle for the space and power is more like, uh, somewhere between 25 and 40 years. So we have, uh, benefits there.
- SPSpeaker
Okay.
- JPJeetu Patel
I think if you think of on the networking side and you look at both,
- 5:56 – 8:18
Data Centers, Scarcity, and Global Power Constraints
- JPJeetu Patel
um, um, enterprise and the hyperscalers as well as neo clouds, I think the story is quite different. So the, the n- the enterprise is pretty nascent in its build-out of true infrastructure. Um, I just don't think that the data centers... Like, if you assume that 100% of the data centers at, at some point in time need, you know, will need to get re-racked, and you will need a very different level of power, um, requirement per rack that's gonna be there compared to what used to be there in the traditional data centers. I just don't think that, um, the enterprises are far enough along. Maybe the few enterprises that are at super high scale might be there, but I don't think the enterprises are far enough along. Hyperscalers and neo clouds is a completely different story. And, uh, to Amin's point on this notion of scarcity of power, compute, and network being the three big kinda constraints in this thing, um, I, I would say right now that because there's not enough power singularly in one location, data centers are being built where the power is available rather than power being brought to where the data centers are. Um, and that's why you're seeing a lot of projects that are being built out all throughout the world. The other point, though, is the, um, the, the lion's share of the constraints that we're gonna have I, I think are gonna be sustainable for a, for a long period of time. And as you have data centers that are being built farther and farther apart, one, there's gonna be a huge demand for scale-up networking so that you can have a rack that gets more and more networking for scale-up. The second is you're gonna have a lot of demand for scale-out, where you have multiple racks and clusters that need to get connected together.
But we just launched a, a new piece of silicon as well as, um, a new chip and a system for scale-across networking, where you might have two data centers that act as a logical data center that could be up, up to eight, nine hundred kilometers apart. Um, and, and you will see that just because there's not gonna be enough concentration of power in a single location, so you'll just have to have different architectures that get built out.
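A back-of-envelope check on the distances Jeetu mentions: at eight to nine hundred kilometers apart, physics alone puts a floor under the latency between the two halves of that logical data center. A minimal sketch, assuming light travels at roughly 200,000 km/s in silica fiber (about two-thirds of c) and ignoring switching, queuing, and non-ideal fiber routing:

```python
# Propagation delay between two data centers acting as one logical
# data center, counting only speed-of-light-in-fiber (no switching,
# queuing, or path stretch).

SPEED_IN_FIBER_KM_PER_S = 200_000  # ~2/3 of c in silica fiber

def one_way_latency_ms(distance_km: float) -> float:
    """Milliseconds for light to cover distance_km in fiber."""
    return distance_km / SPEED_IN_FIBER_KM_PER_S * 1000

for km in (100, 500, 900):
    ow = one_way_latency_ms(km)
    print(f"{km:>3} km apart: {ow:.2f} ms one-way, {2 * ow:.2f} ms round-trip")
```

Even this propagation floor at 900 km (4.5 ms one-way) is orders of magnitude above intra-rack scale-up latencies, which is why scale-across calls for a different architecture rather than just longer cables.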
- SPSpeaker
Actually, that brings us, uh, to the next topic that I wanted to discuss, the future of systems and networking and so on and so forth. So Google
- 8:18 – 10:08
Rethinking Systems and Networking
- SPSpeaker
built, built the first, or at least large-scale, scale-out commodity servers in production for the web revolution, and now NVIDIA is bringing back the mainframe in a different form. So what do you think happens next? I mean, is, is this the new style of coherent cluster-wide computing that we need, and there's gonna be shared memory and all sorts of things? Or do you think the pattern changes again?
- AVAmin Vahdat
I, I don't think we're quite to, um, back to mainframes in that, uh, it is still the case that people are running on, uh, scale-out architectures across these pools. In other words, whether you have, uh, GPUs or TPUs, you're not necessarily saying, "Hey, that's my GPU supercomputer." You're saying, "I've got sixteen thousand three hundred eighty-four GPUs."
- SPSpeaker
Yep.
- AVAmin Vahdat
And maybe I'm going to go grab some subset. Now I've got a uniform multi-hub connectivity, uh, in many cases, which is fantastic. Same with TPUs. It's not like I say I have a nine thousand chip pod and I have to make my job fit on that. Maybe I actually only need two hundred fifty-six. Maybe I need a hundred thousand. So I do think that actually the s- uh, software scale-out is, uh, still going to be there. Uh, um, I'll note two things, though. One, you're absolutely right that, say, about twenty-five years ago, uh, at Google and other places simultaneously, there was really a transformation of computing infrastructure. Like, the notion that actually you would scale out on commodity PCs, essentially, the same ones that you could buy off the shelf, running a Linux stack, and that's what you would do for disk, that's what you would do for compute, that's what you do for networking. I mean, you all take it for granted that this is sort of... It was radical. There were many people who thought that this was a-
- SPSpeaker
Yep
- AVAmin Vahdat
... terrible idea that wasn't gonna work. I think the exciting thing about this moment right now is actually that we're gonna be reinventing... I'm, I'm not saying Google. We are gonna be reinventing computing. And five years from
- 10:08 – 12:18
Scale-Out vs. Mainframe Architectures
- AVAmin Vahdat
now, whatever the computing stack is from the hardware to the software, right, is gonna be unrecognizable. And by the way, there was this co-design, because if you think about it, I'll use Google examples because I know those best. Bigtable, Spanner, GFS, Borg, Colossus, they were hand-in-hand co-designed with the hardware-
- SPSpeaker
Mm-hmm
- AVAmin Vahdat
... the cluster scale-out architecture. And it was really the com- I mean, you wouldn't have done the scale-out hardware if you didn't have the scale-out software.
- SPSpeaker
Yep.
- AVAmin Vahdat
Same thing is gonna happen in this moment. So I, I think actually the mainframe, um, it's gonna look very, very different.
- SPSpeaker
Okay.
- JPJeetu Patel
Yeah, I, I do think there'll be, like, this ex- uh, extreme demand for an integrated system because, like, right, right now we are very fortunate at Cisco where we do everything from the, um, from the physics to the semantics. You know, you think about the silicon to the application. Um, and the... Other than power, one of the constraints is how well integrated are these systems, and do they actually work with the, the least amount of, um, lossiness a- across the entire stack? And so that, that level of tight integration is gonna be super important. And what that means the industry will have to evolve into is we will have to work like one company, even though we might actually be multiple companies that actually do these pieces. And so when we work with hyperscalers like Google or others, um, there's a deep design partnership that actually, you know, goes on for months and months together, uh, ahead of time before we actually even do the, uh, deal. And then once the deal is done, of course, there's a tremendous amount of pressure to make sure that the... You're, you're moving pretty fast. But I think the industry's muscle of making sure that you operate in an o- open ecosystem and not be a walled garden is gonna get important at every layer of the stack.
- SPSpeaker
Oh, completely agree. And so let's talk about the... Disaggregate the stack a little bit. One of the most interesting topics is processors, right? Clearly, there's an amazing vendor producing an amazing processor that has massive market share today, right?
- SPSpeaker
And we see startups all the time doing all sorts of processor architectures. You've got an amazing processor inside, um,
- 12:18 – 14:36
The Next Wave in Processor Innovation
- SPSpeaker
your fortress. [laughs] What do you think happens next in processor land?
- AVAmin Vahdat
Yeah, we're, uh, huge fans of NVIDIA. Uh, we, we sell a lot of, uh, NVIDIA, uh, products and chips. Mm-hmm. Uh, customers love them. Uh, we're also huge fans of our, uh, TPUs. Uh, I think the future is actually really exciting, and actually, uh, we're... it's not that... I don't think that we've hit the point of, okay, there's TPUs, there's GPUs, there's whatever, Trainiums or, or something else. We're really seeing the golden age of specialization, and that, that's my observation. In other words, if you look at it, a TPU, I'll use that example again 'cause I know it best, for certain computation is somewhere between ten and a hundred times more efficient per watt, and it's this watt that really matters- Mm-hmm, mm-hmm ... than a CPU. Uh, that's hard to walk away from, right? Ten to a hundred X. And yet we know that there are other computations that would benefit if you built even more specialized systems for them, not just niche computations, computations that we run a lot of at Google, right? For example, uh, maybe for serving, maybe for agentic workloads that would benefit from an even more specialized architecture. So I think that actually one bottleneck is how hard is it and how long does it take to turn around a specialized architecture? Right now it's forever. Yeah. Right? For the best teams in the world, really from concept to live in production, speed of light is two and a half years. Yep. I mean, that's, that's if you nail everything, right? And there are f- a few teams that do. But how do you predict the future two and a half years out for building specialized hardware? So A, I think we have to shrink that cycle. Mm-hmm. But then B, at some point when things slow down a little bit, and they will, I think we're gonna have to build more specialized architectures because the power savings, the cost savings, the space savings are just, uh, too dramatic to ignore.
- JPJeetu Patel
And this will actually have a really interesting implication on geopolitical structures as well, because if you think about what's happening in China, China actually doesn't make two-nanometer chips. They make, you know, seven-nanometer chips. Um, and, uh, and so if you think about what... But they have an unlimited amount of power, um, and they have an unlimited amount of engineering resource. And so what they can do is do the optimization on the engineering side, keep the seven-nanometer chips, and make sure that they give people an unlimited amount of power.
- 14:36 – 16:14
Specialized Chips, Power Efficiency, and Geopolitics
- JPJeetu Patel
We might have a different architectural design where you have to get- Yep ... extremely power efficient. You don't have as many engineers as you might enjoy in China, and you can actually go to two-nanometer chips, and those might be power efficient in some ways, but they might have thermal lossiness in other ways. Like, there's a whole bunch of things that have to get factored in, um, on the architecture that'll get more specialized even by geo and by region. And then depending on how the regulatory frameworks evolve, uh, you know, how that, that geo then expands. Like if China expands to different regions in the world, you will have a very different architecture that ta- plays out than if America expands to different regions in the world. So this is a very interesting kind of game theory- Mm-hmm ... exercise to go through on what happens in the next three years in, in tech in general, and no one knows right now.
- AVAmin Vahdat
Yeah. That's the beauty of the world that we live in.
- SPSpeaker
Yeah, yeah. So we'll soon be measuring systems by engineers per token in addition to [laughs] watts per token. Um, all right, so let's turn to another topic which Raghu-
- AVAmin Vahdat
Engineer per kilowatt. [laughs]
- SPSpeaker
Engineer per kilowatt. In the US. Um, networking, right? Obviously, you alluded to it, um, scale up, scale out. In your case, you mentioned scale across. So it seems to me that networking is also gonna get reinvented in a fairly significant way. So what are the leading signs that you're seeing, and the signals that you're seeing, on the direction networking is gonna take?
- AVAmin Vahdat
Yeah, networking is gonna need a transformation, uh, for certain. In other words, uh, it,
- 16:14 – 18:52
Networking Evolution and Scale Challenges
- AVAmin Vahdat
the amount of bandwidth that's needed at scale within a building is just astounding. I mean, and, uh, and it's, it's going up. The network is becoming a primary bottleneck, uh, which is, uh, scary. So more bandwidth tran- translates directly to more performance. And then d- given that the network winds up actually being a small power consumer, that delivered utility you get per watt, like it's a super linear benefit. Like spend a little bit here, get way more there. Yeah. So I think that, uh, that side i- is absolutely there. Um, I'll put in a plug here in that w- in this, for these workloads, we actually know what the network communication patterns are, a priori. So I think this is a massive opportunity. In other words, do you then need, uh, the full power of a packet switch when actually you know what the rough circuits are gonna be? And I'm not saying you need to build a circuit switch, but there is an optimization opportunity. The other aspect of this here is these workloads are just incredibly bursty. Yeah. And, and we're to the point where, uh, and we've written about this, uh, power utilities know this when we're doing network communication relative to computation at the scale of tens and hundreds of megawatts, right? Like massive demand for power, stop all of a sudden- Mm-hmm ... and do some network communication, and then burst back to computing. So how do you build a network that needs to go at a hundred percent for a really short amount of time and then go idle? Yeah. And then same actually for the scale across use case, which, uh, we're, we're absolutely seeing. You don't run large scale pre-training across all your wide area data center sites twelve months of the year. So, and then you're gonna... This is a problem I think about a lot, is let's say you build the latest, greatest chips in these three data center sites. How long are you gonna be there before you migrate to the latest, latest chips in three other sites?
And then what do you do with the network that you left behind? People are gonna run jobs on them. Yeah. But you're not gonna need nearly the network capacity- That's right ... that you did for large scale training, pre-training anyway. So the shift of needing massive networks for like five percent of the time, it, I, I don't know how to build a network like that.
- SPSpeaker
[laughs]
- AVAmin Vahdat
So if, if any of you do, please, um, uh, please, please let me know.
- SPSpeaker
Amin, if you don't know how to build this, there's nobody that knows how to build this.
- AVAmin Vahdat
We're, we're trying to figure it out. It actually is a fascinating problem.
- SPSpeaker
Yeah.
- AVAmin Vahdat
Yeah.
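One way to see why the five-percent problem Amin poses is so hard: the amortized cost of every byte a link actually delivers scales inversely with its duty cycle. A toy model with made-up numbers (the $1M/month and 1,000 PB/month figures are purely illustrative):

```python
# Amortized cost per petabyte actually delivered on a link that is
# only busy a fraction (duty_cycle) of the time. All numbers invented.

def cost_per_delivered_pb(capex_per_month_usd: float,
                          full_capacity_pb_per_month: float,
                          duty_cycle: float) -> float:
    delivered_pb = full_capacity_pb_per_month * duty_cycle
    return capex_per_month_usd / delivered_pb

CAPEX = 1_000_000   # $/month, amortized (hypothetical)
CAPACITY = 1_000    # PB/month at 100% utilization (hypothetical)

for duty in (0.95, 0.50, 0.05):
    cost = cost_per_delivered_pb(CAPEX, CAPACITY, duty)
    print(f"duty cycle {duty:>4.0%}: ${cost:>9,.2f} per delivered PB")
```

At a 5% duty cycle, each delivered petabyte costs 19x what it does at 95% on the same link, which is the economic shape of the problem even before asking how to engineer such a network.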
- JPJeetu Patel
I, I do think, like, if, if you think of... If power is the constraint and if compute is the asset, I think network is gonna be the force multiplier.
- AVAmin Vahdat
Mm-hmm.
- JPJeetu Patel
Because, you know, if a, if a packet... If, if you have low latency and high performance and high energy efficiency,
- 18:52 – 21:00
Building Networks for AI: Power, Bursts, and Bottlenecks
- JPJeetu Patel
then the pack- uh, the... Ev- every kilowatt of power you save moving the packet is a kilowatt of power you can give to the GPU-
- AVAmin Vahdat
Yeah
- JPJeetu Patel
... um, which is, you know, super important. Um, the, the other thing is, you know, when you think about, um, scale-up versus scale-out versus scale ac-across, you'll also need, especially on inference versus training, there are different things that get optimized. Like, you might optimize for latency much more on training runs. You might optimize much more for memory on inferencing. Um, there, there's, uh, there's architectural... And so I, I also feel like the way that networking will evolve is rather than it being, um, a training infrastructure that then gets applied to inferencing, you might have-
- AVAmin Vahdat
Yeah
- JPJeetu Patel
... inferencing native infrastructure that gets built, um, over time. And so there, there's, there's good considerations to look at on, like, how all of the architectural components are, um, are moving. But, um, in, in my mind, like, if, if I were to say strategically one of the biggest things that's happening in networking from our vantage point is if you're just a wrapper around Broadcom, then you've got a monopoly that's gonna be a very predatory one. Um, and so one of the big reasons where Cisco is, um, super relevant is you don't just have a Broadcom world with people just wrapping Broadcom, meaning, kind of, that their systems are on Broadcom, but you will actually have a choice of silicon, and that choice and diversity of silicon is gonna be super important, uh, especially for high volume, you know, kind of consumption patterns.
- SPSpeaker
So last question on the system since you brought that up, and we'll move to use cases. Um, inference, both of you have mentioned, I mean, you talked about it in the context of the processors. You just started talking about the architecture. Are you deploying today's specific a-architectures for inference, Amin, or is it still shared workloads?
- AVAmin Vahdat
We are deploying s- uh, specialized architectures
- 21:00 – 24:00
Inference Architecture and Cost Reduction
- AVAmin Vahdat
for inference, and I think as much software as, uh, hardware, but the hardware is also, uh, deployed in different configurations, is, uh, the way I would say it. And then the other aspect of inference that is becoming really interesting is, uh, reinforcement learning-
- SPSpeaker
Yeah
- AVAmin Vahdat
... uh, especially on the critical path of serving because latency just becomes absolutely critical. Uh, and I think that... So how you would build your system and how you would connect it up, uh, to one another, and of course, networking plays a, a key role there, uh, becomes i-increasingly interesting.
- SPSpeaker
But are there singular choke points that, uh, if removed, would accelerate the thousandfold reduction in the cost of inference that we need, or is this just a natural curve that we are riding down?
- AVAmin Vahdat
So, so we're massive. I mean, two things here. One, uh, again, maybe many of you are f-familiar with this, prefill and decode on inference-
- SPSpeaker
Yeah
- AVAmin Vahdat
... look very, very different. So actually, ideally, uh, if you've, uh, you would have different hardware actually. Uh-
- SPSpeaker
Yep
- AVAmin Vahdat
... the balance points are different. Uh, so that's, that's one opportunity. It comes with downsides. Uh, we can talk about that. Uh, what I would say, though, is that may-maybe something people don't realize is that we're actually driving massive reductions in the cost of inference, I mean 10Xs and 100Xs. The problem or opportunity is the c-community, the user base keeps demanding higher quality-
- JPJeetu Patel
Mm-hmm
- AVAmin Vahdat
... not better efficiency.
- JPJeetu Patel
Mm-hmm.
- AVAmin Vahdat
So, uh, just as soon as we deliver, um, all the efficiency improvements we're looking for, the next generation model comes out and it is the, whatever, um, intelligence per dollar is way better, but you still pay more and it costs more-
- SPSpeaker
Yeah
- AVAmin Vahdat
... relative to the previous generation, and then we repeat the cycle.
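A toy roofline calculation of why prefill and decode "look very, very different" and have different hardware balance points, as Amin says. The sketch below uses invented numbers for a hypothetical accelerator; the structural point is that a dense layer's weights must be read once per pass regardless of how many tokens are processed together, so prefill (whole prompt at once) tends to be compute-bound while decode (one generated token per step) tends to be memory-bandwidth-bound:

```python
# FLOPs per byte of weight traffic for a dense d_model x d_model layer,
# as a function of how many tokens are processed in one pass.
# All hardware numbers are hypothetical.

def arithmetic_intensity(tokens_per_pass: int, d_model: int) -> float:
    flops = 2 * tokens_per_pass * d_model**2   # one matmul over the batch
    weight_bytes = 2 * d_model**2              # fp16 weights, read once per pass
    return flops / weight_bytes

D_MODEL = 4096
prefill = arithmetic_intensity(2048, D_MODEL)  # 2048-token prompt at once
decode = arithmetic_intensity(1, D_MODEL)      # one generated token per step

# Hypothetical accelerator: 400 TFLOP/s compute, 2 TB/s memory bandwidth.
machine_balance = 400e12 / 2e12  # FLOPs per byte at the roofline ridge

print(f"prefill: {prefill:.0f} FLOPs/byte -> compute-bound (above {machine_balance:.0f})")
print(f"decode:  {decode:.0f} FLOPs/byte -> bandwidth-bound (below {machine_balance:.0f})")
```

This asymmetry is why disaggregated serving designs put prefill and decode on differently balanced hardware, with the handoff of state between the two phases as one of the downsides Amin alludes to.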
- JPJeetu Patel
And it's almo-almost like the longer, um, the reasoning-
- AVAmin Vahdat
Yeah
- JPJeetu Patel
... that you have, the more impatient the market gets, right? So for example, if you have a 20-minute reasoning cycle, like for example with deep research, you could have autonomous execution for about 20 minutes. That was interesting. Now you have, you know, most of the coding tools that can go up to seven hours to 30 hours of, you know, duration of autonomous execution. When that happens, there's actually a greater demand for saying compress that time down. Um, and so you la... It's, it's kind of a self-fulfilling prophecy where you need to have more performance because of the fact that you've been able to go out and do things for a longer autonomous amount of time. And so it's almost a never-ending loop where you'll, you'll need to have more performance for inference-
- AVAmin Vahdat
Yeah
- JPJeetu Patel
... in perpetuity.
- SPSpeaker
Yeah. Though intelligence, intelligence per dollar is a business model metric, so it is not just the processor capability.
- AVAmin Vahdat
No, it's end-to-end. Absolutely.
- SPSpeaker
Yeah. So okay. So let's, uh, change topics and talk about actual usage, right? So both of you have massive organizations. Where are the key wins that you're getting today with, with applying all the AI that's available to you? And then we'll talk about what your customers are doing, but I'm actually curious
- 24:00 – 27:30
AI Inside the Enterprise: Code Migration and Productivity
- SPSpeaker
about what are you doing internally?
- AVAmin Vahdat
Wi- within the teams?
- SPSpeaker
Yeah.
- AVAmin Vahdat
Yeah. So, so I mean, coding is the obvious one, and that's actually picking up, uh, increasing traction and inc-increasing capability. Uh, we just actually in the last, uh, couple of days, uh, published a paper that showed how we applied AI techniques to, uh, do instruction set migration. So in other words, we actually had a fairly massive migration from X86 to ARM, making our, uh, entire code base, and at Google it's a very, very large code base, uh, uh, sort of instruction set agnostic, including to, you know, future RISC-V or whatever else might come along. Uh, tens of thousands, hundreds of thousands of individual CLs.
- SPSpeaker
Your entire code base, you're gonna make it agnostic
- AVAmin Vahdat
Entire code base, because we, we, um, want, want and need all of our code base to be agnostic
- SPSpeaker
Man, that's a crazy-ass project
- AVAmin Vahdat
Yeah. So, so we, we... [laughing] It, it was. And the, the motivation though for this actually was a few years ago, uh, we had this, uh, amazing, uh, legacy system called Bigtable, and then a new amazing system called Spanner, and we decided to tell the company, "Hey, everyone needs to move from Bigtable to Spanner." And by the way, Bigtable was amazing for its time, but Spanner was better. The estimate from doing that migration for Google was seven staff millennia.
- SPSpeaker
How much? [laughing] How much?
- AVAmin Vahdat
Seven staff millennia.
- SPSpeaker
[laughing]
- AVAmin Vahdat
That... We, we had a new unit that we had to actually-
- SPSpeaker
[laughing]
- AVAmin Vahdat
... to see what... And, and it was... It wasn't, like, made up, people were being lazy. It's like, this is, this is what it was gonna be
- SPSpeaker
It's endearing that they came up with that, though
- AVAmin Vahdat
And you know what we decided? Long live Bigtable.
- SPSpeaker
[laughing]
- AVAmin Vahdat
I decided, what? It just wasn't worth it-
- SPSpeaker
Yeah
- AVAmin Vahdat
... honestly. Like, the opportunity cost was, uh, too high. So the... And we have these sorts of migrations, uh, TensorFlow, uh, to JAX. We actually... I mean, again, somewhat private, but not-
- SPSpeaker
Yeah
- AVAmin Vahdat
... not too secret. We, we've, uh, effected this internally with AI assist, went integer factors faster. Now, there are other tasks which, um, the tools probably aren't quite yet up to the, um, whatever standard for, but the, the area under the curve is getting bigger and bigger and bigger.
- JPJeetu Patel
So we're seeing probably, like, three or four really good use cases, and then we're seeing some use cases which are not working yet. And so what is working, code migrations is working relatively well. So far we use largely a combination of Codex, Claude, and, um, and Cursor, some Windsurf. And so, um, code migrations tends to work pretty well. Um, debugging, oddly enough, has actually been very, very productive with, um, with these tools, and especially with CLIs. Um, the, um, um... Where we've not done as good a job... And then front-end zero-to-one projects tend to do extremely well, like, the engineers are super productive. When you go to code that's older, um, and especially further down in the infrastructure stack, much harder to go out and get that to happen. But the challenge that we have to orient our engineers on, this is actually much more of a cultural reset problem than it is a, um, just a technical problem, which is if someone uses something and says, "This isn't working right," um, you can't put it back on the shelf saying, "This doesn't work," for another six or nine months. You have to come back to it within four weeks and see if it works again, because the speed at which these tools are m- uh, kind of advancing is so fast that you almost have to kind of get... Like, so I was with 150 of our distinguished engineers
- 27:30 – 29:40
Rewiring Culture Around Rapid AI Adoption
- JPJeetu Patel
today, and what I had to urge them to do is, um, assume that these tools are gonna get infinitely better within six months-
- SPSpeaker
Yeah
- JPJeetu Patel
... and make sure that you get your mental model to where that tool is gonna be in six months and what are you gonna do to be best in class in six months rather than assessing it for where it is today and then putting it aside for six months assuming that that's not gonna work for the next six months. I think that's a big strategic error. So, like, we've got 25,000 engineers. I'm hoping that we can get at least, um, two or three X productivity within a very short amount of time within the next year. Um, and we, we... else we, we'll be able to see what... if, um, if that happens. The second... Like, a couple other big areas that we are starting to see some good responses is in sales. Preparation going into an account call-
- AVAmin Vahdat
Mm-hmm
- JPJeetu Patel
... really good. Legal contract reviews, actually much better than what we had thought. Um, and then, uh, the last one is not super high inference volume, but product marketing. Um, I think the first ChatGPT take on competitive is always better than what my... any product marketing person comes up by themselves.
- SPSpeaker
[laughing]
- JPJeetu Patel
So we should never start from a blank slate. Just start from ChatGPT and then go from there.
- SPSpeaker
Okay. Raghu, we could be talking about the topic for a long time, but they showed me the two-minute warning, so I wanna focus on one last question here. So we've got a lot of founders here, right? Building amazing companies. So what is the most interesting development they should look forward to in the next calendar year, let's call it, or the next 12 months, A, from your company, and B, from the industry, if you are looking at your crystal ball?
- AVAmin Vahdat
I mean, I think to build on the point, uh, these models are getting more spectacular, um, by the month. And they'll be from whatever companies you like, uh, a bunch of really exciting ones, including ours.
- SPSpeaker
Oh, I forgot to say, you're not allowed to say models will get better.
- AVAmin Vahdat
Yeah.
- SPSpeaker
Everybody knows.
- AVAmin Vahdat
The models, the models are gonna get-
- SPSpeaker
Yeah
- AVAmin Vahdat
... uh, but, but I mean, they're getting scary good is, uh, the part that I would say. Um, but I think that then the agents that get built on top of them and
- 29:40 – 31:55
Startups, Models, and Intelligent Routing Layers
- AVAmin Vahdat
the frameworks for making that happen are also getting scary good. So the ability to, um, have things go right for quite a long time over the coming 12 months is gonna be transformative.
- SPSpeaker
Anything... Do you wanna leak any aspect of your roadmap? Next 12 months?
- AVAmin Vahdat
Not, not, not, so not right now. Yeah.
- SPSpeaker
Okay. [laughing] Jeetu?
- JPJeetu Patel
I, I'd say the big shift, and what I would urge startups to do, is don't build thin wrappers around other people's models. I think the combination of a model working very closely with the product, and the model getting better as there's feedback from the product, is gonna be super important. So you are gonna need foundation models, but if you just have a thin wrapper, I think your business will be very, very, um, you know, short-lived.
- SPSpeaker
Mm.
- JPJeetu Patel
So that, that would be something that I would, I would urge you on. And I think that intelligent routing layer of some sort that says, "I'm gonna use my models for these things. I'm gonna probably use foundation models for other things."
- AVAmin Vahdat
Mm-hmm.
- JPJeetu Patel
And dynamically keep optimizing. That will be, uh... I think Cursor does that pretty well. Um, but that'll be a good way that the software development life cycle will evolve. Um, what you should expect from Cisco is... Look, truth be told, for the longest time people thought Cisco was a legacy company, like they were a has-been. And I think in the past year, hopefully you've paid attention, there's a level of momentum in the business. There's a spring in the step of the employee base. So, uh, you should expect, like I said, from the physics to the semantics, in every layer from silicon to the application, a fair amount of innovation in, uh, silicon and networking and security and observability and the data platform, uh, as well as applications, um, you know, from us. And, um, we're excited to work with the startup ecosystem, and, um, so if you ever feel like you wanna work with us, make sure that you reach out to us.
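The routing layer described here can be sketched, very loosely, as a dispatcher that classifies each request and picks a model accordingly. This is a toy illustration only: the model names and the `classify` heuristic below are invented placeholders (not anything Cisco or Cursor actually ships), and a real router would also adapt its choices from product feedback.

```python
# Toy sketch of an "intelligent routing layer": send each request either
# to an in-house domain model or to a general foundation model, based on
# a simple task classification. All names here are hypothetical.

def classify(prompt: str) -> str:
    """Placeholder heuristic: tag a request as a domain task or a general one."""
    p = prompt.lower()
    if "security" in p or "network" in p:
        return "domain"    # tasks a fine-tuned in-house model handles best
    return "general"       # everything else goes to a foundation model

def route(prompt: str) -> str:
    """Pick a model for this prompt; returns the chosen model's (made-up) name."""
    if classify(prompt) == "domain":
        return "in-house-model"
    return "foundation-model"

print(route("summarize this network security incident"))  # in-house-model
print(route("draft a marketing email"))                   # foundation-model
```

In practice the interesting part is the feedback loop: logging which model handled each request and how well it did, then updating the classification over time rather than hard-coding it.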
- SPSpeaker
Were you gonna say something, Amin?
- AVAmin Vahdat
I mean, one aspect that I, I wanna highlight about the models is, um, wh- where we were with, let's say, text models two and a half, three years ago. They were fun. Like, "Hey, write me a haiku about Martin." Did a great job. Now they're amazing. I think that what's gonna happen in the next 12 months is the same thing is gonna be happening with
- 31:55 – 33:10
The Future of AI Models, Agents, and Media
- AVAmin Vahdat
input and output of images and video to these models. And-
- SPSpeaker
Mm-hmm
- AVAmin Vahdat
... to the extent that e- even for images, imagine them as productivity and educational tools, not just, okay, here's Martin as Superman on a... Like, that's cool, too, right?
- SPSpeaker
[laughs]
- AVAmin Vahdat
But u- using it for productivity gains and learning, I think, is gonna be really, really transformative.
- SPSpeaker
Awesome.
- AVAmin Vahdat
Yeah.
- SPSpeaker
So on that note, we'd like to end this session. Thanks for a great conversation, Amin. Thanks, Jeetu. [audience applauding] [gentle music]
Episode duration: 32:47
Transcript of episode OsLRf6r5U9E