The Twenty Minute VC

Andrew Feldman, Cerebras Co-Founder and CEO: The AI Chip Wars & The Plan to Break Nvidia's Dominance

Andrew Feldman is the Co-Founder and CEO @ Cerebras, the fastest AI inference + training platform in the world. In Sept 2024 the company filed to go public off the back of a rumoured $1BN deal with G42 in the UAE. Andrew is the leading expert for all things inference.

In Today's Episode We Discuss:

(00:00) Intro
(00:56) Where Was the AI Landscape in 2015 When Cerebras Was Founded
(02:34) NVIDIA's Biggest Strength Has Become Their Biggest Weakness
(04:07) What Happens to the Cost of Inference?
(06:24) Why Are AI Algorithms So Inefficient?
(20:58) Why is it Total BS That We Have Hit Scaling Laws?
(25:26) What Will Be the Ratio of Synthetic to Human Data Used in 5 Years?
(36:50) What Specifically Was So Impressive About DeepSeek?
(37:16) Why is Distillation Not Wrong and Why Does OpenAI Need to Look in the Mirror?
(38:07) Where Will Value Accrue in a World of AI?
(40:13) How Will NVIDIA's Market Position Change Over the Next Five Years?
(48:18) Why is the CUDA Lock-in for NVIDIA BS? What is Their Weakness?
(49:11) Why is Trump Better for Business than Biden?
(01:01:22) Do We Underestimate China in a World of AI?
(01:05:23) Quick-Fire Round

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on X: https://twitter.com/HarryStebbings
Follow Andrew Feldman on X: https://twitter.com/andrewdfeldman
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

#20vc #harrystebbings #andrewfeldman #cerebras #ceo #founder #ai #nvidia #chips #cuda #deepseek

Andrew Feldman (guest) · Harry Stebbings (host)
Mar 24, 2025 · 1h 14m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–0:56

    Intro

    1. AF

      Our AI algorithms today are not particularly efficient. In a GPU, most of the time it's doing inference, it's 5 or 7% utilized. That means it's 95 or 93% wasted. We won't be as dependent on transformers in three years or five years as we are now, 100%. The fundamental architecture of the GPU with off-chip memory is not great for inferencing. Now, they will continue to do well in inference, but it can be beaten and I think they know it.

    2. HS

      Ready to go? (upbeat music) Andrew, it is such a pleasure to meet, man. I've wanted to do this one for a while. I've heard so many good things from Eric for a long time, so thank you so much for joining me.

    3. AF

      Oh, ha- Harry, thank you for having me. I appreciate it.

    4. HS

      Not at all. This will be a fantastic conversation. I have my pen ready.

    5. AF

      (laughs)

    6. HS

      I feel like this is gonna be a learning experience for me.

  2. 0:56–2:34

    Where Was the AI Landscape in 2015 When Cerebras Was Founded

    1. HS

      Um, I wanna go back to 2015. What did you and the team see in the AI landscape in 2015 that led to the founding of Cerebras?

    2. AF

      We, we saw the rise of a new workload, and th- this is every computer architect's dream. We, we, we saw a, a new problem to solve, and what that means is, is m- maybe you can build a new machine better suited to that problem. And so in 2015, and the credit goes to Gary and Sean and JP and Michael, my co-founders, they, they saw on the horizon the rise of AI. And what that meant was there'd be a new problem for computers, that what the AI software would ask from the underlying chip, the processor, would be different. And we came to believe that we could build a, a better machine for that problem. That's what we saw. You know, obviously, we didn't see it exactly right. I underestimated it. You know, this is my fifth startup and the, the first time I, I underestimated the size of the market by a lot. (laughs) Um, you know, it, uh ... But what we did get right was that this was gonna be big and it would put a different type of pressure on a processor, and that it would put pressure on the memory bandwidth, that it would put pressure on the communication structure. So, that's what we saw. We dove in. Uh, it's been an extraordinary nine years.

  3. 2:34–4:07

    NVIDIA’s Biggest Strength Has Become Their Biggest Weakness

    2. HS

      It's been an extraordinary nine years. Can you just help me understand, how does the movement into an age of AI change the requirements from a chip perspective of what is needed for a provider, and how that then resulted in how you built Cerebras?

    3. AF

      The way to think about a, a, a chip is it does two things. It does calculations and it moves data. Right? Th- th- this is wha- what, what, what a chip does. And, uh, sometimes along the way, it stores data. And so, uh, what AI presented was a very unusual combination of challenges. First, the underlying calculation is trivial. It's a matrix multiplication. And an FMAC can be developed by any second-year electrical engineering student. So, you say to yourself, "Holy cow, this has a huge number of very, very simple calculations." The hard part with AI work is results and intermediate results have to be moved a lot. And th- therein is the most complicated part. They have to be moved to memory and from memory, and they have to be broken up and moved among GPUs. And what we saw was that this was gonna be the hard problem. And that if we could solve for that problem, we would build an AI computer that was faster and used less power.
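
To make that concrete: in a plain matrix-vector product with 16-bit weights, each weight is read once and used for exactly one multiply and one add (one FMAC), so the compute-to-traffic ratio is fixed and low. A minimal sketch, with illustrative dimensions (not from the episode):

```python
# Arithmetic intensity of a matrix-vector product: the multiply-accumulate
# is trivial; feeding it with data is the hard part. Illustrative sizes only.

def matvec_stats(n_rows: int, n_cols: int, bytes_per_weight: int = 2):
    """FLOPs vs. bytes of weight traffic for one matrix-vector product."""
    flops = 2 * n_rows * n_cols                        # one multiply + one add per weight
    weight_bytes = n_rows * n_cols * bytes_per_weight  # every FP16 weight read once
    return flops, weight_bytes, flops / weight_bytes

flops, moved, intensity = matvec_stats(8192, 8192)
print(f"{flops:.2e} FLOPs, {moved:.2e} bytes moved, {intensity:.1f} FLOPs/byte")
# ~1 FLOP per byte: the chip spends its time moving data, not multiplying.
```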

  4. 4:07–6:24

    What Happens to the Cost of Inference?

    2. HS

      When we think about how we're gonna build and what we're building for, the- there's, to me-

    3. AF

      Mm-hmm.

    4. HS

      ... kind of a couple of core elements, which is like where you're gonna focus.

    5. AF

      Right.

    6. HS

      Are you focusing on, you know, fine-tuning? Are you focusing on, you know, training? Are you focusing on inference? Three. You chose all three.

    7. AF

      Yeah. (laughs)

    8. HS

      (laughs) Why? And I'm sorry for my base questions.

    9. AF

      Sure.

    10. HS

      But I thought like GPUs were specialized towards training and they weren't specialized towards inference. Can you have a mono architecture that does all three best?

    11. AF

      The first step in, in computer architecture is deciding what you're not gonna do, right? (laughs) I- that- that's, "What are we not gonna be good at?" is really the first important question. I think to answer your question, you say, i- is the computational work for training from scratch different from fine-tuning? And the, the answer is, it's not different. It's approximately the same. Now, inference and training have some different requirements. And generative inference, in particular, has some very challenging requirements on exactly the communication dimension that I mentioned. In generative inference, you have to move all the weights from memory to compute to generate a single word, and you have to move them again to generate the next word, and again and again. So, if you have a 70 billion parameter model, not a giant model, and each weight is 16 bits, you're moving, what, 140 gigabytes of data to generate one word. This is an enormous amount of data movement across memory, and what that consumes, what that needs, is memory bandwidth. And if you have an architecture like we saw in the GPU, w- w- that is your fundamental limitation. It's a fundamental architectural limitation. And that was what we went to, uh, wafer scale to solve.
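
A rough sketch of that bandwidth arithmetic, assuming 16-bit weights; the bandwidth figures are purely illustrative, not any vendor's spec:

```python
# Generative inference: every weight crosses the memory bus once per token.
params = 70e9                              # 70B-parameter model
bytes_per_token = params * 2               # 16-bit weights -> 140 GB per token

# Illustrative bandwidth figures only, not vendor specs:
for name, gb_per_s in [("off-chip HBM-class", 3_000), ("on-wafer SRAM-class", 20_000)]:
    tokens_per_s = gb_per_s * 1e9 / bytes_per_token
    print(f"{name}: upper bound of ~{tokens_per_s:.0f} tokens/s per device")
```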

  5. 6:24–20:58

    Why Are AI Algorithms So Inefficient?

    1. AF

      They use memory, a memory called HBM, it's a type of DRAM, and it is phenomenal memory, but it's slow. It is slow and high-capacity. And when they set the architecture for graphics, that's wh- that's what you wanted. You didn't have to go back and forth to memory very often. SRAM, on the other hand, is unbelievably fast, but has low capacity. And so we wanted to use SRAM, but if you build a normal-sized chip, you can't hold a model. And so by going to wafer scale, we were able to put down a huge amount of SRAM and get the benefits of speed and enough capacity. If you build a normal-sized chip with SRAM and you wanna do a 400 billion parameter model in inference, you might need 4,000 chips.

    2. HS

      (laughs)

    3. AF

      Or if you wanna do a DeepSeek 671, you might need 6,000 or 8,000 chips. What an administrative nightmare.

    4. HS

      I'm sorry again, but...

    5. AF

      And if you, you can keep it on, on as much as you can on one wafer, two wafers, or four, or 10, y- y- you get all the benefit of the SRAM. And because you've been able to use the wafer, you get this tremendous capacity as well.
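
The chip counts here follow from simple capacity arithmetic. A sketch, assuming an illustrative ~200 MB of SRAM per conventional die and 16-bit weights:

```python
# Capacity arithmetic behind "you might need 4,000 chips".
# ~200 MB of SRAM per conventional die is an illustrative assumption.
def chips_needed(params: float, sram_bytes_per_chip: float) -> float:
    return params * 2 / sram_bytes_per_chip   # model bytes / SRAM per chip

print(f"{chips_needed(400e9, 200e6):,.0f}")   # ~4,000 dies for a 400B model
print(f"{chips_needed(671e9, 200e6):,.0f}")   # ~6,700 dies for DeepSeek-671B
```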

    6. HS

      Can I ask you, first, wh- I totally get you're on HBM and kind of the slowness of it. Why is it then that, bluntly, so much of the market just continues to use it, and 40% of NVIDIA's revenue is those, those chips used for inference?

    7. AF

      Look, uh, there wasn't really... Unless you went to wafer scale, there isn't really a credible other choice. Um, this is the way GPUs had always been made. It's called the graphics processing unit. Right? Th- that, that's, that's the way they were built. And th- they... It was part of their advantage against the CPU, was they were built this way. But now there are dedicated chips like ours, and what used to be their advantage is now their weakness, and that's, that's a fun market to be in when, uh, over a very short period of time, what you're good at becomes your weakness.

    8. HS

      With a market cap like they do, and with Jensen as good as he is, which I'm sure we both agree with-

    9. AF

      Right.

    10. HS

      ... they must know-

    11. AF

      World-class.

    12. HS

      Yeah, but they must-

    13. AF

      They do know this. They do know this.

    14. HS

      So, so what do they do, Andrew?

    15. AF

      There's not a lot of choices. Wh- I mean, look, A, they don't make memory, so they're a consumer of other people's memory, right? And that's SK, right, the Hynix guys, or Samsung, or, I mean, Micron. There are, there are only s- three or four or five companies that make huge amounts of, of memory. Not many choices. But it's, it's part of a, a complex architectural trade-off. You know, the flip side, you could say it's worked really well for them, right? Lo- look at where it's taken them. Um, but in comparison to those of us who do wafer scale, it's a small set. It's a set of one, us, right? We have a, we have real advantage ag- against them on, on, on inference.

    16. HS

      How do LPUs fit into this? We've got HBM, we've got SRAM with you and, uh, bluntly, having many more of them to make it work and scale. Where do LPUs fit into this mix?

    17. AF

      In our business there, there, there are a lot of ways to, to skin a cat. And, uh, you know, our way is, is different than NVIDIA's way, it's different than the TPU, it's different from Trainium, it's... They're different. Um, right now, and every day since August 26th when we launched Inference, our way has been the fastest way, uh, across a whole set of models tested by Artificial Analysis and others.

    18. HS

      Can I ask, when we think about kind of that speed, I am interested, you said that kind of you're one of one with wafer and kind of the architecture associated. What does that mean in terms of cost? With such efficiency, is it inherently more expensive, and what does that look like from a cost profile?

    19. AF

      This isn't our, our, our first dance.

    20. HS

      (laughs)

    21. AF

      We, we've been building computers (laughs) for, for a long time. And when you make a, a choice like wafer scale, you, you have to weigh the trade-offs. We use less power. We use less power because one of the most power-hungry things on a, on a chip are the I/Os, right? Are moving data off chip, and so if you are moving data off chip frequently, you're using more power than if you can keep it in the silicon domain on chip. So, we knew we would use less power. We knew if you went to wafer scale that you had to solve some problems that people said were impossible to solve, like yield. So, we had to invent techniques that allowed us to yield wafers. In fact, we invented techniques that allow us to yield as well or, or better than others who are building much smaller chips.

    22. HS

      Can I just interrupt and ask-

    23. AF

      Sure.

    24. HS

      ... what is yield and why is it impossible to solve (laughs) ?

    25. AF

      Okay. Uh, a wafer begins as a, a, a 12-inch, uh, diameter circle, a slice of, of silicon. And ch- your, your chip is punched out of this, the way your mother might take a cookie cutter and cut out cookie dough. And, uh, during the process at some point, just like your mom might have done, she lifts up the edges and all the little bits are removed and what's left are just the cookies. Those are your chips. Um, now what happens is, there are a set of naturally occurring flaws, and that's like your mother closing her eyes and throwing up a handful of, of M&M's. Now, the bigger the cookie, right, the higher probability of hitting an M&M. The bigger the chip, the higher the possibility that you have a flaw. And traditionally, what you did when you had a flaw was you threw away the chip, or you sold it as a less valuable part. You shut down part of the chip and sold it as a less valuable part, something called binning. So, every wafer's gonna have flaws. The bigger your chip, the higher probability you hit a flaw and the more way- the more silicon is wasted when you throw it away. This is what everybody thought was known truth. And one of the, the things our, our team realized was that there are other ways to handle flaws. Like what, what if instead you built your computer, you built your processor out of hundreds of thousands of identical tiles. And say there was a flaw, so you just shut down that tile and worked around it. Say you had a row or a column of redundant tiles, that when you needed them you could just pull in. Now, that had been traditionally the, the technique used in memory-making, and the memory yields are extraordinary. And so it occurred to us that if we could build a computer, build a processor, built of hundreds of thousands of identical tiles, we could use redundancy such that when there was a flaw, we could just leave it there. Shut it down, work around it, and pull in one of the redundant tiles. And that had never been done in a, in a computer before, and th- that's at the heart of our architecture. That allowed us to yield and deliver whole wafers. Nobody had ever been able to do that in the 70-year history of our industry. R- r- really, really smart people struggled. I mean, Gene Amdahl, one of the fathers of, of our industry, had a company called Trilogy that, that crashed and burned trying to do this. Um, and, uh, we, we figured it out.
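
The redundancy argument can be sketched with a toy Poisson defect model; the defect density, area, and spare count below are illustrative assumptions, not Cerebras figures:

```python
# Toy Poisson defect model: why redundant tiles rescue wafer-scale yield.
import math

def monolithic_yield(area_cm2: float, d0: float) -> float:
    """Classic yield model: a monolithic die survives only with zero defects."""
    return math.exp(-d0 * area_cm2)

def tiled_yield(area_cm2: float, d0: float, spares: int) -> float:
    """Wafer survives if defective tiles <= spare tiles (Poisson approximation,
    summed in log space to stay numerically stable)."""
    lam = d0 * area_cm2                    # expected number of defective tiles
    return sum(math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
               for k in range(spares + 1))

area, d0 = 460.0, 0.1                      # ~wafer-scale area (cm^2), defects/cm^2
print(f"monolithic die:     {monolithic_yield(area, d0):.1e}")   # ~1e-20, effectively zero
print(f"tiled + 200 spares: {tiled_yield(area, d0, 200):.3f}")   # ~1.000
```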

    26. HS

      When you speak about kind of being the fastest and across all benchmarks being the fastest, what matters the most? Is it being the fastest? Is it being the most efficient? Is it being the least costly? How do you think about the stack of prioritization for your customers?

    27. AF

      I think it varies. I, I think, uh, um, look if, if, if you go to, to, to, to get a cancer diagnosis on, God forbid, your mother or, uh, your wife-

    28. HS

      Mm-hmm.

    29. AF

      ... I think, uh, 93% accuracy is just plain not as good as 94% accuracy. And you pay a lot and wait another week to understand what the accuracy is, right? You pay a lot.

    30. HS

      Sure.

  6. 20:58–25:26

    Why is it Total BS That We Have Hit Scaling Laws?

    1. HS

      come, and we're seeing that come down.

    2. AF

      Yeah.

    3. HS

      But are we equipped from an energy and a data center standpoint to deliver the inference requirements for a population that is as AI hungry as we are?

    4. AF

      I, I think a couple things. I think the, the first thing is to admit that, that this is a power-intensive problem. We, we consume, our industry consumes an enormous amount of power. The second thing to say is, therefore the burden is on us to deliver exceptional value as an industry. I mean, that's, that, that's, you, you take both, the good and the bad, right? In order to make it worthwhile from a societal perspective to, to expend all this power, you better deliver the goods. Right? We, we, we better use AI to find cures for, uh, for diseases. We better use AI to find a bunch of different, uh, solve a bunch of different societal problems. So, that, that's the, the macro view. Um, do I think that, uh, we are equipped? I think we are in a very unusual situation in the US where we have plenty of power, but it's in all the wrong places, right? We, we, we have power in Niagara. (laughs) We have power, what we, what we don't have is power where you wanna build data centers where we have good fiber. And what we don't have is a national way to relax the local regulations that make getting power difficult. And so when you go to Silicon Valley, if you wanna build a data center, you're dealing with local government and installed interests. And that is not an efficient way to decide if you wanna build a power plant or, or put a new data center in, um, especially if it's large. And, uh, I think those places that have ripped out some of that burden, uh, in Texas, for example, are getting a huge amount of data centers built.

    5. HS

      You know, when I spoke to, you know, Jonathan at Groq before, he said there were a huge amount of data centers being built that were not actually really equipped properly, and that we've seen this massive supply of data centers that are really kind of done by tourists, so to speak. And that is a massive problem, and that the provisioning of these data centers isn't there. Do you agree?

    6. AF

      I, I think the following. I, I think, uh, a data center is a, is a construction project to begin. It's access to power, and then it's a construction project, and it's got a design engineering component. I think there's been a, a huge push for new construction data centers. And I, I think, uh, we, we will see. We, we don't know if they're gonna be good enough. I, I think many of them will be fine. I think the guys who were there early were some of the Bitcoin mining companies, like TeraWulf, the guys at Crusoe, and, and others, uh, guys in Europe. Uh, uh, they, they were early in building buildings near low-cost power in order to run compute, uh, that used a lot of power. And they are some of the leaders now in some of the largest projects. Now, those are certainly not tourists. Those are extremely sophisticated data center builders. Now, sure, there, there are some tourists, but th- there are a lot of, of very, very knowledgeable data center builders building huge facilities right now. I mean, gigawatt-scale facilities, both domestically and internationally.

    7. HS

      How do you think about how the cost of inference goes down? With the surge of demand that we mentioned, you know, over 100X, does the price reduce 100X? Does it follow Moore's Law continuously? How do we think about the ever-reducing price of inference?

    8. AF

      Look, there, there are... Uh, the, the cost of inference is built up of, of several pieces, right? There's the power and space that is consumed to generate the response, right? That, that, that's a data center cost, that's an OpEx item, number one. Number two, there's the, uh, cost of, of the, the computer. We can drive down the cost of the computers with each generation by driving up their performance, etc. But the other thing we can do is we can develop more efficient algorithms.

  7. 25:26–36:50

    What Will Be the Ratio of Synthetic to Human Data Used in 5 Years?

    1. AF

      Our AI algorithms today are not particularly efficient. There's a tremendous amount of room. Uh, in a GPU, most of the time, it's doing inference, it's 5 or 7% utilized. That means it's 95 or 93% wasted. So over time, I think, as an industry, w- we get better at things. We can drive the cost of compute down. We can build more efficient data centers with lower PUEs. And our algorithms will get more efficient, so that our utilizations on our now cheaper computers are higher, so you get a higher percentage of the maximum number of FLOPs. You get more tokens per unit time for the same power.
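
A back-of-envelope view of what that utilization figure means for throughput, assuming a hypothetical 1 PFLOP/s accelerator and roughly 2 FLOPs per weight per generated token for a 70B model:

```python
# What 5-7% utilization means in tokens per second. All figures illustrative.
peak_flops = 1e15            # hypothetical 1 PFLOP/s accelerator
flops_per_token = 2 * 70e9   # ~2 FLOPs per weight per generated token, 70B model

for util in (0.05, 0.07, 0.50):
    tps = peak_flops * util / flops_per_token
    print(f"{util:.0%} utilized -> ~{tps:,.0f} tokens/s from the same silicon")
# Raising utilization yields more tokens per unit time from the same hardware.
```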

    2. HS

      When you look at the inefficiency of the algorithms, as you mentioned there-

    3. AF

      Sure.

    4. HS

      ... and what that means for the utilization of the chips, why are people suggesting that we're at scaling laws already? That seems to suggest that there is so much room for improvement. How do you think about what you just said in conjunction with the idea that scaling laws, we're, we're hitting this asymptote point? How do you reconcile the two?

    5. AF

      I think that, uh... I, I don't think there's a lot of, uh, of debate among, uh, sort of senior ML thinkers that, that, that we have tremendous room for algorithmic improvement. I, I, I don't think there's, uh, a lot of debate there. There's even debate about whether the scaling laws are over, whether we run out of, ran out of mojo to keep making data or gathering data to fill these ever-bigger models. But OpenAI's work on O1 shows me that the scaling laws, certainly for inference, are, are fully functional, right? The more compute you put on inference, the better answer you get. And so, uh, I, I think that, uh, you know, many of the leading models are now MoEs, right? So they, they're not presenting all of, uh, all of the weights, uh, to, to, to, to, to each token, and, and that's one way to do it. Present the important stuff, not the unimportant stuff. There are other ways to do it that we will invent and learn over time. But, uh, I, I think we know the... We have, we have sort of human models that aren't all-to-all connected. Many of our models today are all-to-all connected. That, that's a lot of unnecessary connections, connections that don't produce anything that we still end up doing math over.

    6. HS

      I'm sorry, what does all-to-all connected mean?

    7. AF

      In many of the layers in a neural network, every element is connected to every other one. And that's not the way, actually, the, the, the learning happens. Uh, some are more valuable and some are not valuable at all, right? And imagine you get, you read 50 books, you wanna learn something. You can read all 50 books, or you could read three books that are really important, or you could read summaries of the three books that are the most important. The problem is, we don't know which they are at the beginning, and there's a process that you could learn. There's things called dropout and a- all these other techniques to, to use sparsity to help, help solve these problems. We are early in the evolution of AI. Plays right into this point that we'll get better at these algorithms. You know, transformers aren't the end of the world, right? We, we, we'll get better. Better will mean faster, more accurate, and more efficient. And I, I think that's what's exciting about an ever-changing industry. I mean, that's not... That's why I'm not in all these other industries that don't change quickly. They're the same today as they were nine years ago.
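
A sketch of the mixture-of-experts idea mentioned above, with illustrative layer sizes (not any specific model's): routing each token to a few experts means only a fraction of the weights are "presented" to it.

```python
# Dense vs. mixture-of-experts feed-forward block. Sizes are illustrative.
d_model, d_ff = 8192, 32768
n_experts, k_active = 16, 2                # route each token to 2 of 16 experts

ffn_params = 2 * d_model * d_ff            # up-projection + down-projection
moe_total = n_experts * ffn_params         # weights that exist in the layer
moe_active = k_active * ffn_params         # weights a given token actually touches

print(f"dense FFN: {ffn_params/1e6:.0f}M params, all touched per token")
print(f"MoE:       {moe_total/1e6:.0f}M params total, "
      f"{moe_active/1e6:.0f}M ({moe_active/moe_total:.0%}) touched per token")
```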

    8. HS

      But this show is kind of strange for me, because I speak to a lot of people, and they think about the three pillars, and they're like, you know, "compute, algorithms, and data." And, you know, actually, a lot of the, uh, common refrain is that, actually, we're very far along in all of them, um, and that has been the refrain. And when I hear you, it's like, actually, it's very exciting, 'cause I'm like, "Shit."

    9. AF

      I, I think they're wrong. I think they're wrong. Uh, I, I, I don't think we're very far along, and it's very difficult to say that we are early in an industry but we're far along on all its underpinnings. Th- then what? Right? I think we are early in all of them.

    10. HS

      If we just take them one by one, in five years' time, how much synthetic versus human data will be used to train models, if you were to put a percent on it?

    11. AF

      Almost all synthetic.

    12. HS

      And the utility value of synthetic is the same as human?

    13. AF

      I think this. I think when you teach a pilot to fly in a simulator-... right? Th- there is a lot of potential data that isn't very useful in teaching a bird to fly, right? The- they spend a lot of time going straight doing nothing, as a pilot. Now, takeoff and landings are where you wanna spend your time, and that's why when we put them in simulators, that's what we have them doing. And in simulators, we can create data where engines blow, where there are a whole set of problems where learning can take place. That's simulated data. And in the same way, as we think about creating data, uh, whether it's for self-driving, whether it's for other forms of AI, what we want is the data that's hard to gather (laughs) , right? Otherwise, we just have a bunch of data of people driving straight on a freeway. Not, not difficult. We've been able to do that for a decade. What we want is an unprotected left turn in the snow. It's snowing, it's hard to see, you've got an unprotected left turn. That's a difficult thing, and you want that thousands of different ways, millions of different ways. That's the sy- where the synthetic data comes along, is to use it to fill in the empty parts where it's really expensive or painful to get that type of data. Think of the pilot, right? You want them spending a huge amount of time on things that are rare in their training. Same with a surgeon, a huge amount of time on things that are rare. Most of the time, it's carpentry, but their expertise is only when it's rare. Something happens, the unexpected occurs, right? That's when their mettle is shown. And I think we, we will get better at synthetic data by, by a great deal.

    14. HS

      I love it. I get it. From a consumer perspective and from an expectations perspective, if we move the needle on compute algorithms and data, what does that mean for the experience of AI?

    15. AF

      It gets faster-

    16. HS

      Faster and cheaper?

    17. AF

      ... and cheaper. Faster and cheaper. Faster and cheaper is the first answer. The second is, is when things become faster and cheaper, new applications emerge. So, it, i- it's used everywhere, right? When, uh, when computers became faster and cheaper, suddenly they were in cars, and then they were in your pocket, and then they were in your dishwasher and in your TV and... Right? That's what happens. I mean, we... 30 years ago, you're like, "I need a computer in my TV. Are you kidding me? I need one in my pocket." Now, you've got powerful computers in your pockets. You've got them in your TV, you've got in your kids' toys, you've got in the car. That's what happens. Diffusion of innovation accelerates-

    18. HS

      (laughs)

    19. AF

      ... when you make things faster and cheaper.

    20. HS

      This is Jevons paradox and Thatcher's belief then, isn't it?

    21. AF

      Yeah. Yeah, that, uh... I know in the VC community you gotta, you gotta cite 19th century English economists. Uh, I, I think Jevon (laughs) -

    22. HS

      I'm English. I, I'm English. Now, this, come on. Right? If, if I'm not allowed to cite an English philosopher, what am I here for?

    23. AF

      Look, I mean, I, I think, uh-

    24. HS

      Are you just like, "Oh, these fucking VCs." Just being like, "Oh, Jevons paradox."

    25. AF

      (laughs) No, that's right. It's like, look, make stuff cheaper and faster. There are no... There are very few examples in our industry, actually none in compute in 50 years, in which by making things cheaper and faster, the market got smaller. Market always gets bigger. Um, always.

    26. HS

      Can I ask, from an architectural standpoint... You mentioned transformers there. Is there a world where we move past transformers?

    27. AF

      There is a world.

    28. HS

      Where transformers 100% does-

    29. AF

      We, we won't be as dependent on transformers in three years or five years as we are now, 100%. They're not the end-all-

    30. HS

      Wh-

  8. 36:50–37:16

    What Specifically Was So Impressive About DeepSeek?

    1. AF

      The problem is, right, i- is you gotta be a little bit consistent. Uh...

    2. HS

      Well, Sam's been a guest many times, and we hope he will be again, so I'm not gonna ask any-

    3. AF

      Well, I, I, I hope he is too. No, I'm, I'm, I, I think neither are wrong actually. But I think you have to be consistent.

    4. HS

      Well, the thing is, to put it bluntly, DeepSeek is open.

    5. AF

      That's true.

    6. HS

      So everything that they did innovate on, OpenAI can learn from and take too.

    7. AF

      Y- Th- That's... Look,

  9. 37:16–38:07

    Why is Distillation Not Wrong and Why Does OpenAI Need to Look in the Mirror?

    1. AF

      I think there are few examples of an open source anything having the sort of immediate impact that model had. Right? I mean, that, that model had a giant impact in a technical community of really smart people. And there are very few examples of other open source, uh, software projects that had that type of impact in that amount of time. Usually, you, you know, you're, you're in the business of betting on these guys. They, they, they ramp up, and they, "Oh, look, 10,000. Now it's 100,000 users. It's now a million users. We better, better start a company around that." Right? Get those grad students. But th- this had a loud boom, uh, in the industry immediately. It was like, "Whoa."

  10. 38:07–40:13

    Where Will Value Accrue in a World of AI?

    2. HS

      The, the thing I have to think about as a venture investor is, where is enduring and defensible value, simply? And how do I get in early and, and build that over time?

    3. AF

      In hardware, Harry? (laughs)

    4. HS

      Well, this was my question, which is like, well... I mean, you have to be a very smart investor like Aravishia to do hardware, to be clear. Um, but on the model side, do you think there is value when you look at the sheer number of players all with relatively comparable models?

    5. AF

      I think to, to demonstrate enduring value, you need both immediate value and a trajectory for more. Right? I, I think the problem is, in some industries, you, you are capable of demonstrating a leadership position for a short period of time. And then someone else, maybe the next generation, they generate the next, and the next generation, the next. And I, I think that ends up in the software world being, you know, you, you're competing against other people's release cadences. You're four months ahead, they're six month... That, if that's really where you are, there's not a lot of value. But if you can stay at the top, uh, over years, right, even if you're not the best, even if you're, you know, top decile over years, and the people above you are changing constantly, I think there's a lot of value. I think, uh, very large Silicon Valley companies have been built, uh, with sort of not the most compelling technology. It might have started as the most compelling technology, and then it got to a point where it was good enough. It was easy enough to use. It was... Well, g- Right? That's, that's when you're at the mature market. But we're a long way from there right now. Right now, we are in the early phases. You're eith- you know, you, you characterized my position exactly right. Data, compute, algorithm. I think we have a ton of room for improvement on all

  11. 40:13–48:18

    How Will NVIDIA’s Market Position Change Over the Next Five Years?

    1. AF

      of them.

    2. HS

      You said that, you know, compute and hardware, that's where the value is. How does that value distribution shake out? You know, we've obviously got the 800-pound gorilla that is NVIDIA. How do you think about how the distribution of value shakes out in hardware and in compute over the next five years?

    3. AF

      Hi- historically, um, one, one of the... One of the barriers to entry was s- the capital intensi- intensity of a prod- of a project. And in, in the world of building chips, there's both scarce resources and expertise, and it's very expensive. Um, a- a- a- And historically, it, it hasn't fit very comfortably in a software company. And the things that software, modern software companies value are not entirely conducive to chip-making. And so when I look down the road, I mean, I, I think, uh, who has endured in, uh, in, in much of infrastructure tech? Uh, people who build systems. Cisco, Juniper, endured. Um, uh, chipmakers have endured. There's a reason that Apple and NVIDIA are among the most valuable companies on Earth. What they do is hard. And I, I think it's, th- th- that's why it's worth challenging. Right? That, that's (laughs) ... If it weren't hard, if it wasn't enormous and difficult, you know, why spend time being the underdog and, and challenging it?

    4. HS

      A lot of people place defensibility around NVIDIA's kind of CUDA lock-in. To what extent is that real versus hype?

    5. AF

      In inference, it's not real at all. There's no CUDA lock-in in inference. None. Uh, w- you can move from OpenAI on an NVIDIA GPU to Cerebras to Fireworks service on something else to Together to Perplexity with 10 keystrokes. I mean, what the, a- anybody who actually uses AI knows there's no CUDA lock-in in inference. I think there is, there was a fundamental effort to dis-intermediate CUDA, first by some grad students with Caffe and some of these early efforts, but later by Google with TensorFlow, and then Facebook or Meta with PyTorch. I, I think today most, most AI is written in PyTorch. You ought to be able to compile it and run it on, on your hardware. Uh, I, I think NVIDIA has many moats. I think when you are a dominant market share leader, that in itself is a moat. That you're the default solution is a moat. That everybody learns to, to, to think about AI in your structures, those are moats. Um, the software, you know, it, compilers are hard but th- they're tractable.
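
As a sketch of how small the switching cost is in practice: most inference providers expose OpenAI-compatible HTTP endpoints, so moving between them is roughly a base-URL and model-name change. The URL and model id below are illustrative assumptions; check each provider's docs for current values.

```python
# Switching inference providers on an OpenAI-compatible endpoint is roughly
# a base_url and model-name change. URL and model id are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",   # swap this line to change providers
    api_key="YOUR_API_KEY",
)
resp = client.chat.completions.create(
    model="llama-3.3-70b",                   # provider-specific model id
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```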

    6. HS

      I completely agree with you in terms of kind of being the leader is a moat in itself.

    7. AF

      It is.

    8. HS

      When you-

    9. AF

      And it's never talked about that way. (laughs) And, um, I mean, look at it this way-

    10. HS

      Would you put OpenAI in that same... It is the leader, everyone's mother knows ChatGPT, therefore...

    11. AF

      Let, let's look at Intel, right? Intel has made, until hiring Lip-Bu, prior to that, nearly a, a decade of catastrophic decisions, right? And they still own 80% of the x86 market. (laughs) 75% of the market, right? AMD has worked up to, like, 25% or 30%. And after a, a decade of screwing up, and you ask yourself, right, "Th- that's a moat," (laughs) right? How big's my moat? I can make a bunch of bad decisions for a decade and only lose 20% share. (laughs) Uh, that's extraordinary. The moat was just unbelievable. Um, we'll see, I mean, I'm a huge fan of Lip-Bu's, he's an investor in our company, I, I, I wish him well. Um, and I, I think if anybody can change that company, he can. But, uh, I, I think we rarely talk about what being the market share leader means in terms of a moat in, in the right context. Because as a challenger we have to think about it exactly, because it's exactly that that we need to, we need a bridge for, right? It's exactly th- these characteristics of the moat that we need to get over.

    12. HS

      In five years' time, though, is it Uber or is it, like, AWS and cloud? And what I mean by that is, like, cloud is an interesting market where, like, you know, a couple of players, well, several players have relative segments, 25, 30%, and it's shared relatively evenly between them. Not exactly, but relatively. Or is it one like Uber, where Uber has 90%, Lyft has five-

    13. AF

      Right.

    14. HS

      ... and then there's alternative providers with the other five?

    15. AF

      I think it's gonna be between those two. I, I think, uh, in five years from now NVIDIA's gonna have 60. Somewhere between 50 and 60% of the market, right? I think right now they have approximately all of it. Um, I think they will come down over time.

    16. HS

      Of NVIDIA's usage, what percent will be training versus inference?

    17. AF

      I think they will continue to have a, a meaningful business on both sides. I think they're exceptional at, uh, uh, at, at, at training. I, I think, uh, they will not roll over and play dead (laughs) in inference. I, I think they are, uh, a really, look, they're a world-class company. I mean, they've had one of the great decades of any company in history. I mean, from 2014 they were worth, what, 10 billion? To where they are right now, that's one of the great decades in corporate history. Uh, and I, I don't think that, that they're gonna roll over and, and... Oh yeah, we're, we're not gonna be in the, in the inference market. That's not gonna happen. They're gonna have meaningful share. But the market's growing and we'll have a piece, I think others will have a piece. I think there'll be some very big companies made in this 100x growth, some very big companies.

    18. HS

      Do you think chip providers will be far larger than model providers in terms of enterprise value?

    19. AF

      In the five-year timeframe? Yes.

    20. HS

      How does that prediction change in a different timeline?

    21. AF

      I think in a shorter timeline, I think, you know, when you price an option, variance and uncertainty increases the option's value, right? If y- if you look at the, the way Black-Scholes works or if you look at any option pricing model, un- uncertainty is a, a, a friend, variability is a friend of the value of the option. And when people are paying these extraordinarily high prices for, uh, model companies right now, I, I think part of that is this extraordinary uncertainty, is this wild variance. Um, and so in the shorter run, it, it, it might not be the case. But in the longer run as markets mature, as we begin to understand the value of these models, we understand w- what their businesses look like, what their long-term, uh, net profitability looks like... What did Warren Buffett say about markets? In the short term they're a voting mechanism and in the long term they're a weighing mechanism, right? At, at some point the weighing kicks in, and, uh, usually it's in the public markets. And then investors say, "Which, which is likely to give me better growth in the

  12. 48:18–49:11

    Why is the CUDA Lock-in for NVIDIA BS? What is Their Weakness?

    1. AF

      future?"

    2. HS

      I mean, listen, you mentioned the word public there, I do wanna just zone in on your business. You're cashflow positive in a world where everyone else literally bleeds cash. Help me understand, what do you do to make you cashflow positive when everyone else is bleeding or hemorrhaging cash?

    3. AF

      Traditionally your, your, your, your gross margins were a measure of, of your technical differentiation, right? And I, I think if you're running a, a negative gross margin business, I, I think you're ... it speaks for itself. You're, you're selling commodity. You're, you're not, uh, your, your value creation isn't being recognized in the market. And so, I, I think our technology is creating an opportunity for us to maintain margins where some others can't.

  13. 49:11–1:01:22

    Why is Trump Better for Business than Biden?

    2. HS

      A lot of your revenue is concentrated to the G42 deal. To what extent is that a strength or a weakness?

    3. AF

      It's both. Look, the way you, the way you catch three large customers is to catch one first. The way you build three large strategic partners is learn to be a strategic partner. That, that's a learned skill. I, I think, uh, we, we didn't arrive knowing how to be a strategic partner at G42. And now that we've worked at it and worked at it, it's a muscle we can replicate. We could be a better partner to any of a dozen different companies in the world.

    4. HS

      What have you learned in the G42 relationship build process that, that, that makes you now a good partner in a way that you weren't?

    5. AF

      Oh, look, we, we deployed, uh, tens of exaflops of compute, vastly more than, than anybody else that, that, that isn't AMD or NVIDIA, right?

    6. HS

      Mm-hmm.

    7. AF

      I mean, at a, a huge amount of compute. Um, we, uh, our software's been hardened on some of the largest AI clusters in the world. Uh, we've gone through the growing pains of increase in manufacturing, 2X and 5X and 2X again.

    8. HS

      Mm-hmm.

    9. AF

      I mean, through unbelievable growth in manufacturing. We've worked with our supply chain partners to, uh, to be sure that they're ready for this extraordinary growth. We've ... I mean, I, I think whe- when you, uh, work with a strategic partner, um, of this size, uh, your organization comes out different on the other side. And there are things you've learned and there are mistakes you've made, and, and, you know, you ... I hadn't done a big relationship in the Middle East. There was a huge amount to learn. And, um, you know, I think you come out a, a much better company and much better prepared to do, uh, business with a hyperscaler, to do business with another massive partner, to do business with, uh, another sovereign. But it, it's ... it takes real work. And your team has to learn.

    10. HS

      You said you'd come out better.

    11. AF

      Yeah.

    12. HS

      Why go public when you did? When this happened, I was like ... It seemed preemptive, respectfully. And my question now to companies is, why go public at all? You know, bluntly.

    13. AF

      (laughs)

    14. HS

      There is so much private capital. The Collisons have shown, I think, very clearly that you can stay for a lot longer than you plan to.

    15. AF

      Databricks has certainly shown that, right?

    16. HS

      Why go public?

    17. AF

      I mean, they're ... those were historically, uh, public market valuations, and, uh, you know, the valuations that Anthropic and OpenAI and some of the others are getting are historically public market only valuations.

    18. HS

      And like you said, your, your S1's live. Anyone can read it. Uh, I wouldn't want people reading mine. (laughs)

    19. AF

      (laughs) We have nothing to hide. I mean, I, I think-

    20. HS

      No, but your, your competitors have got asymmetric information.

    21. AF

      Yeah. We've got asymmetric technology, right?

    22. HS

      Yeah.

    23. AF

      I mean, I, I think you, you have to be pretty ... T- t- to be public, I think there are ... uh, you, you have to be ready organizationally. You have to- b- be ready with your processes. You need to be ready to, to forecast and predict, to be held accountable, uh, in, in a way that, uh, private companies historically haven't been. We think that there's tremendous value. Uh, we think that we will be among the first in the category. We think we, uh ... that some of our, uh, our largest targets, uh, would have, uh, a stated preference for doing business with public companies. Large enterprises in the US have, have done that historically. Um, those were some of the reasons that led us to.

    24. HS

      How many G42 relationships, à la G42, will you have in the next 24 months? How fast can you ramp them?

    25. AF

      Um, that's a good question. Several.

    26. HS

      Sorry, remind me. How big is the G42? It's 87% of revenue. I know that.

    27. AF

      It was big. I mean, when we announced it, it was ... some estimated it was north of a billion.

    28. HS

      Wow. Well done. That must be a bit of a high five, isn't it? (laughs)

    29. AF

      Look, I, I think, uh-

    30. HS

      (laughs) Come on.

  14. 1:01:22–1:05:23

    Do We Underestimate China in a World of AI?

    1. AF

      than money.

    2. HS

      Do you think we fundamentally underestimate the Chinese's-

    3. AF

      100%.

    4. HS

      ... capabilities?

    5. AF

      100%. I, I think, and it is one of the most obvious and frequent, uh, errors in judgment, is that you underestimate the other side. I think, uh, you, you have to, to look carefully at what they're doing, and their investment in infrastructure has been extraordinary. Uh, the rate at which they generate engineering talent is exceptional. The government's ability to have a policy and implement it, you know, th- th- they're... You know, that's not a democracy. They, they weren't designed to have checks and balances there. Right? Um, the, uh, funding that f- flowed into the development of, of AI technology, that their venture capitalists were backstopped by their government, that, uh, they have national champion companies, that they've developed, uh, a belt-and-suspenders strategy to sort of make much of the third world dependent, uh, o- o- on them and their technologies. I, I, I think they absolutely should not be underestimated. They have a lot of people, and we see a tiny fraction o- of it. I, I think they have produced industrial policy that, that has moved their nation forward.

    6. HS

      What was the most significant, do you think?

    7. AF

      I think they, the creation of economic zones like Shenzhen w- was clearly a visionary move. Um, they had, uh, they, they knew that, that their own system was in the way, and so they created zones that, that relaxed their own system.

    8. HS

      Could the US learn from them in that way?

    9. AF

      We did some of the same things, uh, in the first Trump administration, right? What, what did we do? We relaxed our, our, uh, our own rules in the, the development of vaccines. We, uh, we knew that in, in this time, it would be very difficult to go through the, the steps that we always go through, and we tried to implement some thoughtful shortcomings, or workarounds, rather. You know, why are they committed to trains as a mode of transportation, and, and we can't build a decent train system in the US, or in California? Or why do we have three different standards for train rails, while the rest of the world can, can build extraordinary high-speed trains linking important cities? Um, wh- why is it, w- w- what are we doing wrong in the building of, of our infrastructure, that, that our bridges and our freeways are in disarray? Um, I think those are, are questions we gotta ask ourselves when we see other people doing it differently. I mean, you know, if, if, if, if, if you watch a good football team, and you say, "Whoa, that's an interesting offense," and you're not thinking to yourself, uh, w- "How could our team learn? What, what could we do? What... Why did that work? What was it about the people they had, or the talent, or the structure, or something that made that a successful series of plays? Um, and w- what, what, what can I take away from that? How can that inspire me to do, to do better?" I mean, I'm, I'm always looking for, for inspiration in, in others, in competitors, in, uh, partners, but I think, you know, we... I mean, we have... You know, some of th- our partners at G42, I mean, the work ethic is unbelievable. It inspires me. Um, and the, the scope of the challenge they're undertaking inspires me. And, and I, I think I'm always looking

  15. 1:05:23–1:14:26

    Quick-Fire Round

    1. AF

      for that.

    2. HS

      Andrew, I could talk to you all day. I do wanna do a quick-fire with you. So I-

    3. AF

      All right.

    4. HS

      ... say a short statement. You ready?

    5. AF

      Yeah, sure.

    6. HS

      What do you believe that most around you disbelieve?

    7. AF

      I think we're closer to peace in the Middle East than people believe because I, I think, uh, th- there is a rise of a, of a moderate business-focused Arab state that wasn't there 25 or 30 years ago. And I, I think, uh, if you visit the UAE, or Qatar, or, or even KSA, uh, what you see is amazing transformation. And I, I think, uh, th- there is a, a desire for, uh, to, to be included in the West in their own way, but also to, to, to enjoy the benefits of it. Um, I, I think it's, uh... Yeah, I, I think there's, um... We are closer than, than maybe people think.

    8. HS

      What's the most underrated threat to NVIDIA's market share dominance?

    9. AF

      The fundamental architecture of the GPU, uh, with off-chip memory is not great for inference. Now, they will continue to do well in inference, but it's... It can be beaten, and I think they know it.

    10. HS

      What's a crazy AI prediction you have that most people would call science fiction? You know, Dario at Anthropic said, "We'll live to 150."

    11. AF

      I don't think we're gonna live to, to-... uh, 150. I, I don't think, what, uh, 90% of our code will be written by machines this year. Um, but I do think that within a year or two, AI's penetration will be approximately the same as telephones, cell phones.

    12. HS

      What have you changed your mind on in the last 12 months?

    13. AF

      Oh, I, there are lots of things. Um, many decisions I made (laughs) turned out to be wrong. I mean, I, I think-

    14. HS

      What was, what was the most wrong decision?

    15. AF

      There, there are two ways you can be wrong. You can actively be wrong, or you can fight against what was right. Um, in, uh, 2016, JP, one of our co-founders and chief system architect, laid out a plan that would have us doing water cooling for, for our systems. And nobody else was doing it, and I fought so hard, and I was so wrong. Um, JP was right. Uh, about a year or two later, Google announced that the TPUs were gonna be water cooled. You know, we, we were first, and, uh, now NVIDIA's only selling water-cooled parts. I mean, I, I was dead wrong, and JP was right. I mean, I, I think many, many instances when you make a lot of decisions every day where, where you're wrong. I've been wrong about people. Um, people I thought were pretty good turned out to be extraordinary. People I thought would be extraordinary, uh, were really smart, but couldn't finish projects and get stuff done. Um, I, I think I- if you're not prepared to be wr- wrong a fair bit, you ought not to be making a lot of decisions because it comes with the territory.

    16. HS

      As a venture capitalist, I'm never wrong, so I don't need to be.

    17. AF

      As a venture capitalist, you're wrong nine times in 10, and everybody forgets as long as you're really right.

    18. HS

      And I get a picture of you signing the term sheet with me and we've got

    19. NA

      (laughs)

    20. AF

      That's right.

    21. HS

      ... protected and I go-

    22. AF

      That's right. I mean, I, I think yours is a perfect industry in which nobody cares about the average. On average, you're wrong all the time. And what they care about is the, the occasional time you're really right, and that, that's what moves a fund. And I, I think that's different than, than, than being a CEO. I think we gotta be mostly right most of the time. But if you're making a lot of decisions, you're still making a, a ton of mistakes.

    23. HS

      This is your fifth startup.

    24. AF

      Yeah.

    25. HS

      I mean, you are a sucker for punishment, aren't you?

    26. AF

      (laughs)

    27. HS

      Uh, I mean, really, like five times. Like Christ, Andrew, did you not get beaten alive enough? Uh, my question to you, though, is like, I believe in the value of serial entrepreneurship.

    28. AF

      Right.

    29. HS

      I've spoken to many who don't. Respectfully, how do you think about the inherent benefits that you have, having done it four times before?

    30. AF

      Okay. I think if you are in a business in which running a business is a benefit, then experience matters a great deal. Uh, I think if you are in a business in which you look like your customer... There was a reason why social, social networks were started by people right outta college or in college: be- because dating is top of their mind, right? I mean, that, that, that's... And they look like their customers. And that was more important than knowing anything about running a business. And so in that environment, it will certainly select for people who are of the demographic that, that their customers are. They know that backwards and forwards. But if you wanna have a business that has manufacturing in it, that has a supply chain, that, uh, uh, has you managing hundreds or thousands of engineers to a timeline, to a schedule, I, I, I don't think anybody would turn around your statement and with a straight face say, "You know, what I'm looking for is an engineering leader with no experience. No, I don't want somebody who's led a team of 400 or 500, who has experienced the challenges of growth. What I'm looking for is somebody with no experience."

Episode duration: 1:14:36
