The Twenty Minute VC
Jonathan Ross, Founder & CEO @ Groq: NVIDIA vs Groq - The Future of Training vs Inference | E1260
- 0:00 – 1:29
Intro
- JRJonathan Ross
We did not raise 1.5 billion, that's revenue. That's actually about 30% of the revenue of OpenAI. Your job is not to follow the wave, your job is to get positioned for the wave. You could almost say we're one of the best things that's ever happened to NVIDIA, because they can make every single GPU that they were gonna make, and they can sell it for training, high margin, gets amortized across deployment. You know, we'll take the low margin, high volume inference business off their hands, and they won't have to sell you the margin. We're growing faster than exponential. And when you are growing faster than exponential, there is no amount of profit that you can make that matters. What matters is getting a toehold in the market and becoming relevant.
- HSHarry Stebbings
Ready to go? Jonathan, thank you so much for agreeing to do this in Paris. You look fantastic, by the way. I feel so underdressed, but you look great.
- JRJonathan Ross
Thank you. I, I could take the tie off if you want, but I'll never be able to tie it again. I don't know how to tie a tie.
- HSHarry Stebbings
(laughs)
- JRJonathan Ross
No, literally, my Chief of Staff has to tie it for me, it's, and it's, like, a struggle 'cause, like, he's putting it on himself, he's tying it. I, I literally only bought this suit recently.
- HSHarry Stebbings
Well, I mean, you look fantastic. I don't think I have a suit, so you're one up on me. I wanna split the show into two parts today. I wanna talk about the landscape where we're at, and then I wanna dive specifically into Groq, where you're at. You've announced a massive new deal that I think everyone's slightly misunderstanding-
- JRJonathan Ross
(laughs)
- HSHarry Stebbings
... what we were just talking about. Um, I just want to start on
- 1:29 – 6:48
Scaling Laws and AI Model Training
- HSHarry Stebbings
where we're at. In terms of, like, scaling laws, everyone says we are at the limits of scaling laws, and then there seems to be exponential innovation happening with the likes of DeepSeek and others.
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
Where are we at in terms of the limits of scaling laws?
- JRJonathan Ross
So scaling laws come from a paper that was published by OpenAI, and what it effectively says is the more parameters your model has, basically the better it can absorb information. So you'll, you'll see these curves that they draw, and they're, they're amazing. You should show it if you can. But effectively, um, you have these sort of asymptotic drop-offs where y- you keep getting better and better, but you get a logarithmic improvement when you put a linear number of tokens in. This is why you see people doing 15 mil- uh, trillion tokens of training and whatnot. But they're misunderstood because, um, the assumption is that all of the, all of the data is the same quality. So, uh, you have a kid now, right?
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
So eventually, you're gonna be training your kid, and you're gonna say, and play along with me here, "What's one plus one?"
- HSHarry Stebbings
Two.
- JRJonathan Ross
What's two times three?
- HSHarry Stebbings
Six.
- JRJonathan Ross
What's the second derivative of the square of the hyperbolic tangent?
- HSHarry Stebbings
Yeah. Yeah, good question.
- JRJonathan Ross
But, but that's how we train these models. We give them really simple problems to solve, and then we give them these really hard ones. We, we don't really train them up. We don't do it smart. So what some people do is they will train on the dregs of the internet, and then they'll save some high-quality data for the end to make them better. But what you can do, and this is where I think everyone's getting confused, is it's sort of like with AlphaGo Zero, where it generated its own data and trained. You can have an LLM generate synthetic data, and when it generates the synthetic data, the data's better. You then train on that synthetic data. So what you do is, is you, you train, you train-
- HSHarry Stebbings
Why is synthetic data better than real data?
- JRJonathan Ross
Because the model is smarter. So, you know, Reddit is great, but not necessarily as high-quality as talking to someone with a PhD in a topic.
- HSHarry Stebbings
Sure.
- JRJonathan Ross
And so, just like with, um, more expert people who are more knowledgeable and more capable, if you have a better model, it generates better data. So you train the model, it gets better, you, you produce better data, and you produce a, a range of data here, and you get rid of all the parts that are wrong. So now it's the best part, so it's a little better than the model is, 'cause you're pruning it, 'cause you get to do this offline, right? And then you train the model, and the model comes up here, and then you do this again, and then you keep the better data. You train it again, you just keep moving up. So when you do that, the actual scaling laws don't look like these asymptotics. They actually improve-
- HSHarry Stebbings
But there has to be a ceiling on efficiency. No?
- JRJonathan Ross
Does there? So there's a mathematical limit. So if you-
- HSHarry Stebbings
(sighs)
- JRJonathan Ross
If you study computer science, you've probably heard of something called big O complexity.
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
Big O complexity is, um, you know, i- i- if I am solving a problem, and I look at how I solve it, I might need to take more steps if I solve it with one algorithm versus another. So for example, quicksort versus bubble sort. Quicksort, I need n log(n) steps. Bubble sort, I need n squared. What's the difference? If I'm sorting 1,000 numbers, n log(n), that's 10,000 steps. But with, um, n squared, that's a million steps, because it's either 10 times 1,000 or 1,000 times 1,000. One of the reasons that these LLMs struggle to multiply large numbers is because, um, multiplying is not linear. These LLMs can do anything linear without, you know, needing to think. But just like on a piece of paper how you need to write out all those intermediate steps, these LLMs need that intermediate space, and those steps, in order to compute these things. It's a mathematical requirement. There's nothing, you cannot train a model enough so that it'll see any arbitrarily large number and just be able to multiply it. But you can choose bigger and bigger groupings of numbers for it to memorize, in which case it can do it in fewer steps. And effectively, as you are training the model on more and more data, it's seeing more and more examples, so now it just has the answer for more specific situations, so it doesn't need to do as much reasoning. But it still needs to do reasoning for some of these problems.
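To make his quicksort-versus-bubble-sort arithmetic concrete, here is a minimal sketch in Python, taking the step counts as n·log2(n) and n², as he describes:

```python
import math

# Step counts for Ross's example: an O(n log n) sort versus an O(n^2) sort.
for n in (1_000, 10_000, 100_000):
    print(f"n = {n:>7,}: n*log2(n) ~ {n * math.log2(n):>13,.0f}   n^2 = {n * n:>15,}")

# For n = 1,000 this gives ~10,000 steps versus 1,000,000 steps,
# the "10 times 1,000 or 1,000 times 1,000" gap he cites.
```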
- HSHarry Stebbings
So what does that mean for the next step?
- JRJonathan Ross
In terms of?
- HSHarry Stebbings
What happens now? If we have no efficien- like, if we have no efficiency ceiling, what does that actually mean?
- JRJonathan Ross
But you need both. So the, the training of the model makes it more intuitive. It means that it can sort of just...... come up with the answer like that, more stream of consciousness. The reasoning part is different. The reasoning is the algorithm on top, right, the- the big O complexity portion. So it's system one, system two thinking, or thinking fast, thinking slow, like Daniel Kahneman's book. And so when you pair them together, when you make it more intuitive, it, you know, you get- you get better this way, right? But when you start adding in the system two portion, you start to get this, right? You- you hear the- the volume is very little, but when you do this, and so you get this, um, polylinear is the term, but you could think of it as geometrically increasing, improvement in the model when you combine it with that improved training, but also the improved what they call test time compute or runtime
- 6:48 – 9:00
Synthetic Data and Model Efficiency
- JRJonathan Ross
compute.
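As a rough illustration of the asymptotic scaling curves Ross described earlier: loss falling as a power law in training tokens means each additional order of magnitude of data buys only a fixed multiplicative improvement. A minimal sketch with made-up constants, not taken from any particular paper:

```python
# Toy scaling-law curve: loss falls as a power law in training tokens.
# The scale and exponent below are illustrative assumptions only.
def loss(tokens: float, scale: float = 1e12, alpha: float = 0.08) -> float:
    return (scale / tokens) ** alpha

for t in (1e9, 1e10, 1e11, 1e12, 1e13):
    print(f"{t:.0e} tokens -> loss {loss(t):.3f}")

# Every 10x increase in tokens multiplies the loss by the same factor
# (10**-alpha, ~0.83 here): linear gains require exponential data.
```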
- HSHarry Stebbings
Totally get that. So just so I understand. So when we think about bottlenecks, if we have synthetic data that powers the training...
- JRJonathan Ross
It gets more intuitive-
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
... because it gets to the answer more quickly, sort of like a grandmaster in chess just seeing the right moves.
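The generate-prune-retrain loop Ross sketched a moment ago can be caricatured in a few lines. This is a toy numeric stand-in, not anyone's actual training code: the "model" is just a skill score, and pruning to the best samples ratchets it upward each round:

```python
import random

# Toy synthetic-data loop: sample noisily around the current skill level,
# keep only the best fraction (the offline pruning step), and "train" on it.
def bootstrap(skill: float = 0.5, rounds: int = 5, n: int = 10_000,
              keep_frac: float = 0.1) -> float:
    for r in range(rounds):
        samples = [random.gauss(skill, 0.1) for _ in range(n)]
        best = sorted(samples, reverse=True)[: int(n * keep_frac)]
        skill = sum(best) / len(best)   # retrain on the pruned, better data
        print(f"round {r + 1}: skill ~ {skill:.3f}")
    return skill

bootstrap()
# Because the kept slice is better than the model's average output,
# the curve keeps climbing instead of flattening asymptotically.
```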
- HSHarry Stebbings
Sure, but synthetic data is not constrained in terms of its supply size. If we think-
- JRJonathan Ross
That's true.
- HSHarry Stebbings
... if we think about the other bottlenecks, there is hardware, there is energy efficiency, there's algorithmic limits. What is the bottleneck?
- JRJonathan Ross
But- but if- if your job is to get better at multiplying numbers-
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
... and I tell you that I want you to be able to do it with fewer steps, more intuitively, for you to be able to multiply three-digit numbers versus two-digit, you need 10X the data, and you need 10X the examples, right? And so as you get better on the intuitive part, you need more examples to train on. Make sense?
- HSHarry Stebbings
Sure. Totally. And so what is the bottleneck then there? Is it the hardware quality? Is it compute? Is it algorithms? 'Cause it's not data.
- JRJonathan Ross
It is the compute. It is the data. It is the algorithms. It's all three of them. But, uh, so people misunderstand the concept of a bottleneck. Compute has been less of a hard bottleneck and more of a, you know, soft bottleneck, right, where when you provide even more compute, you can sort of overpower the lack of data, the lack of improvement in algorithms. So it's not a hard bottleneck, it's a soft bottleneck. But ideally, you would improve all three. You would be getting better data. You would be getting better algorithms. And the algorithm improvements are gonna be there. The- the data improvements are gonna be there, but compute has always been the easiest lever because it's so fungible. If I just give you more compute, it works better.
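One way to sanity-check the "more digits needs far more examples" point from the multiplication exchange above is to count the distinct cases at each digit length. A small sketch, pure counting with no claims about any particular model:

```python
# There are 9 * 10**(d-1) d-digit numbers, so operands grow 10x per added
# digit, and distinct multiplication pairs grow 100x per digit added to both.
for d in (1, 2, 3):
    operands = 9 * 10 ** (d - 1)
    print(f"{d}-digit: {operands:,} operands, {operands ** 2:,} possible pairs")
```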
- HSHarry Stebbings
Has DeepSeek not shown you that actually we don't need the compute and you can do more with less?
- JRJonathan Ross
Not exactly. There was an algorithmic improvement on that, and the algorithmic improvement, as I explained, w- you know, is this seemingly silly thing where they just wrote the answer in a box and then they knew what to look for rather than ha- having to have a human being check it or something like that, right? It was very simple. Um, but that was an algorithmic improvement, and it made it easier to generate the data that was then
- 9:00 – 15:12
Inference vs. Training Costs: Why NVIDIA Loses Inference
- JRJonathan Ross
trained on.
- HSHarry Stebbings
Can I ask, I think there's misconceptions around compute, data, uh, especially kind of synthetic- synthetic data, as you said there, algorithms. When you think about the biggest misconceptions that people have around AI and s- specifically kind of inference, what do you think they are?
- JRJonathan Ross
When we started, the first misconception, which people don't hold anymore, is that training was more expensive than inference. Um, at- at Google, anytime we would train a new model, we would end up using 10 to 20 times as much compute on the inference as the training. So inference was always the- the critical infrastructure piece that we needed. Um, but then after getting, you know, past that, now everyone understands inference is important. Um, I think one of the-
- HSHarry Stebbings
Do you think they fully do? 'Cause when you look at NVIDIA's stock price post-DeepSeek, it was down 15%. If you understood the value of inference, it shouldn't be down 15%.
- JRJonathan Ross
Well, and Jevons paradox and all that and, yeah, I- I don't agree that NVIDIA stock should have gone down for that. I think that was a- a mis-, um, understanding on most people's part. But it also shows, I think that shows more, like, everyone keeps saying NVIDIA stock can't possibly go higher, right? And they were looking for an excuse for, "Oh, now that's it. That's why we were wrong and we need to sell now." But that has nothing to do with the, that- that's just a- a sort of popularity contest side of the market. That had nothing to do with the weighing machine of the market.
- HSHarry Stebbings
So should founders building today, should they build with the assumption that scaling laws will continue? Should they build with what we have today? How do you advise them on that?
- JRJonathan Ross
I would advise you to build based on things getting better, but I would also focus a little more on the- the sort of big quantum steps. So the analogy that I like is if you look at the information age, we went through, w- we had the printing press, we had the telephone, we had the telegraph, we had the, um, internet, and we had smartphones, right? And if you had built Uber back when we had, um, uh, internet, it wouldn't have worked because you'd book a ride. You'd go somewhere. How do you get home?
- HSHarry Stebbings
(laughs)
- JRJonathan Ross
Exactly, right? And, uh, we're in the same sort of space now. So w- we don't, the models hallucinate. So it would be hard to build a medical diagnosis company. It would be hard to build a legal company, right? However, if you were doing that and the algorithmic enhancements happen that get the hallucination rate down, you are perfectly positioned, just like Groq. We were around for seven years before we had product market fit, right? We were around. We, our bet was scaled inference, that inference was gonna be the bottleneck, that we were gonna need to run really big, heavy models. Like, everyone was assuming you would have a single PCIe card running inference 'cause training was the complicated part, right? The- the reality was we made the right bet ahead of time and then we were perfectly positioned. Your job is not to follow the wave. Your job is to get positioned for the wave. And that's the hardest thing to do because everyone is trying to talk you into coming on shore again... right? Almost everyone was telling us, "Don't do LLMs. They're gonna be terrible for you." And we're like, "This is literally what we built for."
- HSHarry Stebbings
Did you ever doubt yourself? Seven years is an incredibly long wait time.
- JRJonathan Ross
Well, it (sighs) ...
- HSHarry Stebbings
Just doubt th- that-
- JRJonathan Ross
There was doubt, but there was never a pause, and the reason was... So even back m- before starting the TPU, I was concerned that AI was going to be a technology that would allow some people to have o- outsized control, outsized influence. If you allow that to just happen in potentially not the best hands, it doesn't really matter how rich you are, it doesn't matter... Uh, nothing matters. It, it's the most important technology. So, it didn't matter how hard it got, there was no choice but to be successful, and our goal is to preserve human agency in the age of AI, right? If, if we don't do that, we have failed, and so it wouldn't matter whether there was doubt or not. And yes, there was plenty of doubt. There was a point where we were so close to running out of money, we did this thing that we called Groq bonds. So you know, um, uh, war bonds from World War II?
- HSHarry Stebbings
Of course. But for anyone that doesn't, what is a war bond?
- JRJonathan Ross
So, a war bond... World War II was funded with bonds. The US government, uh, uh, they had these posters. It was like, "Fund your troops," and, and whatever, and you'd buy them and, and they would pay you a return, and that funded the war effort. We were very close to running out of money at one point. Rather than trying to pretend to be strong, you know, we were vulnerable with our employees, and we said, "We're gonna run out of money. We need you to trade salary for equity." We literally took pictures of the, the war bonds and we put Groq bonds on it instead, and we had an all-hands where we, we said this. And we were worried everyone was gonna leave. Uh, instead of leaving, about 80% of the employees participated, 50%, I think, went to the statutory minimum salary by law. When we finally raised, um, the first bid of our $300 million round, we had, uh, so little money in the bank left that it was less money than we saved doing Groq bonds. So, had we not done that, we would've literally run out of money. So, there were (laughs) some really hard times, and I know every founder has these, and from the outside it's so hard to understand. It's like watching a TV show. You're not in it, but, y- you know, when you are there, everything is 10 to 100 times more intense, because people left their jobs, they left their careers, their families are banking on this, and you have to make decisions like go out there... And what would have happened if we went out there and asked everyone to do Groq bonds and everyone quit? Then the shareholders would've been ho- like, you have all of these people depending on you, but if you lean towards that vulnerability, people are often gonna go with you
- 15:12 – 16:35
The Future of AI Inference: Efficiency and Cost
- JRJonathan Ross
on it.
- HSHarry Stebbings
So, what is a world where inference is so crucial and 20 times more important than training? What does that world look like?
- JRJonathan Ross
Uh, I think the simplest way to understand it is equate an LPU or a GPU to an employee, right? If, if you have enough of them, uh, the LPUs or GPUs, you can do work just like with an employee, but it's, it's a little different in the sense that, uh, they can't quit and take another job. Uh, you don't have to retrain th- o- once you get a model to a certain capability, it'll always be at least that capability, right? It's not gonna regress. You, you, you know... um, so you get the consistency out of it. But now imagine that you're a startup and rather than having to go out and hire 100 people, you hire 10 and you buy the amount of compute equivalent to 90 employees' worth. That's a very different way of thinking about the world because now CapEx, or in, in some cases different types of OpEx, can, can be used instead of just employees. And so th- a- and in terms of inference, just to give you a sense of our scaling, we started 2024 with about 640 chips in production. We ended with over 40,000.
- HSHarry Stebbings
Wow.
- JRJonathan Ross
This year we wanna be at over two million, and next year the number is much, much, much larger.
- 16:35 – 19:40
Chip Supply and Scaling Concerns
- JRJonathan Ross
So-
- HSHarry Stebbings
Are we seeing constraints on chip supply? I mean, that is an unbelievable scaling story.
- JRJonathan Ross
Yeah. So, for us to hit our numbers next year, which I'm not sharing publicly, we're gonna need almost all of the capacity of the fab that we're, we're using. The, the biggest issue... So, 7 Powers. We love 7 Powers, right?
- HSHarry Stebbings
Uh, by far. Yeah, yeah.
- JRJonathan Ross
Hamilton Helmer? Okay. Um, you don't normally think of tech companies as having a cornered resource, but NVIDIA has a cornered resource. They're a monopsony, the opposite of a monopoly, a single buyer-
- HSHarry Stebbings
Mm.
- JRJonathan Ross
... for HBM, and, and the interposer, the CoWoS.
- HSHarry Stebbings
So, what is HBM?
- JRJonathan Ross
So, HBM is high-bandwidth memory.
- HSHarry Stebbings
Okay.
- JRJonathan Ross
And w- w- one... GPUs are, and-
- HSHarry Stebbings
And who produces HBM? I'm sorry for the dumb questions (laughs) .
- JRJonathan Ross
There's, there's three companies in the world that do this. Um, SK hynix, Samsung, and Micron.
- HSHarry Stebbings
Okay.
- JRJonathan Ross
And it's a specialty memory. It's only used in high-end servers, so there's a limited quantity that's built. It's very expensive to ramp up. It's a very technically challenging type of memory to build, more so than others, and so there's a very limited supply. And GPUs are f- so fast computationally that if you were using regular memory, it'd be like drinking out of a martini straw.
- HSHarry Stebbings
(laughs)
- JRJonathan Ross
It would just take forever. This is why you see people preferring to do, um, even inference, but especially training, on, um, GPUs rather than CPUs, because the memory bandwidth is too limited, and, uh, CPUs d- rarely use HBM. They're mostly regular memory.
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
Um, our architecture... And so y- it's, the, the observation that we had when we started Groq... Everyone knows Moore's Law. Every 18 to 24 months, like clockwork, double the transistors. It means double the compute. But we noticed that AI was getting better faster, and it, it clearly wasn't the algorithms, 'cause algorithms have sort of discontinuous jumps. It also, uh, didn't seem to be, um, the data, 'cause there wasn't that much more data, and the transistors were only doubling every 18 to 24 months. So where was all of this capability coming from? Turns out the number of chips was also doubling every 18 to 24 months. So rather than 2X, it was 4X. So the, the question we asked was, if you're effectively gonna have an unlimited number of chips, do you do something architecturally different? And the answer is absolutely. So rather than using external memory, we just use a large number of chips and keep all of the parameters of the model in the chips live, and then we just have this pipeline where the computation flows through it, sort of like an assembly line, right? So imagine if you were trying to build a factory, and the factory was only 1/100 of the size needed for the assembly line. So you'd run a bunch of cars through 1/100, tear it down, set up the next 1/100 assembly line. You'd just do this over and over again. That's the way a GPU works. LPUs, very different. We actually just have the, the computation flow through a whole bunch of chips. So rather than using, um, eight chips, we'll use 600 or 3,000 for a model.
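A back-of-envelope way to see why keeping the weights on-chip matters: if every generated token requires streaming all the weights from external memory, token rate is capped by memory bandwidth. The figures below are illustrative assumptions, not vendor specifications:

```python
# If weights must be re-read from HBM for each token (batch size 1), the
# token rate is bounded by bandwidth / model size. Numbers are assumptions.
model_bytes = 70e9 * 2        # a ~70B-parameter model at 2 bytes per weight
hbm_bandwidth = 3e12          # ~3 TB/s of HBM bandwidth on a high-end GPU

print(f"HBM-bound ceiling: ~{hbm_bandwidth / model_bytes:.0f} tokens/sec")

# With the parameters resident in on-chip memory across hundreds of chips,
# there is no per-token reload: tokens stream through the pipeline like
# cars down an assembly line, and throughput is set by compute instead.
```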
- 19:40 – 25:37
Energy Efficiency in AI Computation
- HSHarry Stebbings
How does that change energy efficiency?
- JRJonathan Ross
It, it improves it about 3X, and the reason is-
- HSHarry Stebbings
How does it improve it when you use more?
- JRJonathan Ross
So-
- HSHarry Stebbings
'Cause you use less for more?
- JRJonathan Ross
Per token. So the footprint is higher. Think of it as the difference between a factory or a backyard, um, sort of garage. The backyard garage is not gonna be as efficient. However, it has a lower energy footprint. Or another example would be if you were trying to transport a ton of coal from one side of the city to the other, and you did it on mopeds or you did it with freight trains, which one would be more efficient? The moped would use less energy per trip, but it would need more trips and therefore would use more energy overall. In fact, this is one of the things most people misunderstand. They think that edge computing is lower energy. Actually, edge computing is less energy efficient than computing in the data center.
- HSHarry Stebbings
Why is that?
- JRJonathan Ross
When you're computing in the data center, it's a little bit like that freight train. You're actually getting to do a whole bunch of jobs simultaneously. So the fact that we don't have to read from that external memory means that, um, we don't have to spend the energy doing that. Even with GPUs, you get to batch. But going back to why it's so energy efficient, the amount of energy used in a chip, there's, there's these physical wires, and the physical wires have a width, and when you look at the width and you look at the length, you charge that wire up to set it to a one, and then you discharge it to set it to a zero. Which means it's sort of like charging a capacitor and discharging a capacitor using energy. The longer that wire, the more charge. When you have HBM here and another chip here, you're actually having to charge a wire between the chips and then discharge it every time you send a bit. And so th- that's a long distance to travel, but also the wires are wider than the wires that are inside the chips. So you just use a lot more energy. When we keep that, um, memory in the chip, it's only traveling a little distance using much thinner wires, and therefore it uses a lot less energy.
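The wire-charging argument can be put in rough numbers. Switching a wire costs on the order of C·V² per full charge-discharge cycle, and the trace out to an HBM stack is both longer and fatter than an on-chip wire. Everything below is an order-of-magnitude assumption, not a measured figure:

```python
# Energy to toggle a wire ~ C * V^2, with C proportional to wire length.
# Capacitance-per-mm, lengths, and voltages are rough assumptions.
def toggle_energy_joules(cap_per_mm: float, length_mm: float, volts: float) -> float:
    return cap_per_mm * length_mm * volts ** 2

on_chip  = toggle_energy_joules(0.2e-12, 0.5, 0.8)   # ~0.5 mm on-chip wire
off_chip = toggle_energy_joules(0.5e-12, 20.0, 1.1)  # ~20 mm trace to HBM

print(f"on-chip : ~{on_chip * 1e15:,.0f} fJ per bit toggled")
print(f"off-chip: ~{off_chip * 1e15:,.0f} fJ per bit toggled")
# The off-chip hop costs orders of magnitude more energy per bit, which is
# the core of the keep-the-memory-on-the-chip efficiency argument.
```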
- HSHarry Stebbings
So do we see a world of LPU and GPU, GPU (object clatters) usage in combin- Like, how d- how does that-
- JRJonathan Ross
Yes.
- HSHarry Stebbings
... distribution look between LPU usage and-
- JRJonathan Ross
So-
- HSHarry Stebbings
... GPU usage?
- JRJonathan Ross
There's a couple of things. The first is, um, training should be done on GPUs. W- And actually, I think NVIDIA will sell every single GPU they make for training. Right now, about 40% of their m- you know, market is inference.
- HSHarry Stebbings
All right.
- JRJonathan Ross
Um, I think if we were to, to deploy a lot of much lower cost inference chips, um, what you would see is th- that same number of GPUs would be sold, but the demand for training would increase, because the more inference you have, the more training you need and vice versa. Um, the other use case is we're actually so crazy fast compared to GPUs that we've actually experimented a little bit with taking some portions of the model and running it on our LPUs and letting the rest run on GPU, and it actually speeds up and makes the, the GPU more economical. So since people already have a bunch of GPUs they've deployed, one use case we've, uh, contemplated is selling some of our LPUs to sort of nitro-boost those GPUs.
- HSHarry Stebbings
Well, this is my question, which is that, you know, people have bought GPUs so far ahead of time-
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
... that by the time you get them, they're deployed and installed, they're almost out of date.
- JRJonathan Ross
Actually, we've, we've spoken with some customers that put orders in over a year in advance. They paid a year in advance and still haven't gotten them. Uh, the, the recent deployment we did in, in Saudi Arabia, uh, 51 days from contract to the first tokens being served in production in country.
- HSHarry Stebbings
How are you able to do it so quickly? 51 days is astonishing.
- JRJonathan Ross
Yeah. Um, part of it is architecturally, things are much simpler for us. We don't have a bunch of other hardware components. We actually don't use switches to communicate between our chips. We just plug our chips into our chips. Our chips are the switch. And we don't have all of this network tuning. Think about it this way. When you're going across town in, uh, France, how long does it take to get from one side to the other?
- HSHarry Stebbings
A long time.
- JRJonathan Ross
A long time, but a variable amount of long time.
- HSHarry Stebbings
For sure.
- JRJonathan Ross
If you do it in the middle of the night, it might be fast. If you do it in, you know, middle of the day during an event like we've got going on with AI Summit, it could be particularly slow.
- HSHarry Stebbings
Rush hour, terrible.
- JRJonathan Ross
Yeah, exactly. But it's unpredictable.
- HSHarry Stebbings
Uh-huh.
- 25:37 – 31:41
Why Most Dollars Into Datacenters Will Be Lost
- JRJonathan Ross
because-
- HSHarry Stebbings
Specsmanship is what?
- JRJonathan Ross
It's, well, my specs are better than your specs.
- HSHarry Stebbings
Okay.
- JRJonathan Ross
My chip is faster than your chip. I get more teraflops per second than you do, right? But who cares, like, just tell me what the, the tokens per dollar is, and tell me what the tokens per watt is. Nothing else really matters. But people will find all of these other weird things to measure that they might be better on. Sort of like, I'll sell you a car with better RPMs. RPMs don't matter, right? What matters is miles per gallon and maybe the speed that you can drive at, although speed limits kind of render that, you know, moot, right? But in the case of, um, uh, enterprise sales, people often... Well, there, there was a time when the way that you would buy soap or you would market soap, the billboards would say, "Our soap has more bubbles than this other brand's soap." Who cares? And what they figured out was, let's put really happy people up on a billboard after they use the soap, and then maybe people will associate that happiness, right? Lifestyle marketing.
- HSHarry Stebbings
Sure.
- JRJonathan Ross
For some reason, enterprise still hasn't learned this lesson. It's still, "We have more bubbles. We have more teraops. We have more whatever." Things that people just literally don't care about.
- HSHarry Stebbings
So you think NVIDIA's, "Hey, we're 30 times faster," is not good marketing?
- JRJonathan Ross
I think it worked because it's what people are used to, but our counter was w- we did a press release to that, that said, um, "Groq, still faster." That was it. And, and people went gaga over it, right? Because it was just, w- we are, we're still faster, so who cares?
- HSHarry Stebbings
(laughs) Uh, I, I totally get it. Do you think Wall Street understand that though?
- JRJonathan Ross
I think they're starting to.
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
But again, I, I, I don't think there's real competition here. I think if you are competing, you have done something seriously wrong. If you're competing, it means that you haven't found an unsolved customer problem. Because if you're competing, someone else has already solved the problem, so why are you spending time on it?
- HSHarry Stebbings
So you don't view NVIDIA as a competitor?
- JRJonathan Ross
No, they, they don't offer fast tokens and they don't offer low-cost tokens. It's a very different product. But what they do very, very well is training. They do it better than anyone else and by such a wide degree, it's a solved problem. Why would we bother trying to solve a problem that's already been solved?
- HSHarry Stebbings
So you're like cede the training market to them, we'll own the inference market?
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
And they're saying, "Fuck that. We also want the inference market."
- JRJonathan Ross
Of course. It's the way it always works.
- HSHarry Stebbings
So what do we do now? So now we are competing in the inference market.
- JRJonathan Ross
But are we?
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
So w- we don't really have people saying, you know, "We're gonna buy GPUs instead of you." We do have people saying, "We're gonna buy both." That happens, but we don't care, because we al- Like, I showed a demo to someone, and he's like, "Should we just not buy any more GPUs?" I'm like, "No, you should buy every single GPU you can get your hands on." And he's looking at me very perplexed, and I'm like, "Well, how are you gonna do training? We don't do training. Buy the GPUs. Get every single one you can, because I want your models running on us to be really good."
- HSHarry Stebbings
Totally, but for inference, they don't need to buy NVIDIA anymore.
- JRJonathan Ross
They don't need to buy GPUs for inference, but if you can get them, I mean they're a little expensive, but if you're used to it, why not? Plenty of, l- people still sell mainframes. But if you want lower cost and faster, then you want an LPU.
- HSHarry Stebbings
How much lower cost is it?
- JRJonathan Ross
More than 5X lower.
- HSHarry Stebbings
More than 5X lower?
- JRJonathan Ross
The, just the memory alone in the latest GPUs costs more than our fully loaded CapEx per chip deployed. And, and on top of that, so w- we talked about the energy efficiency. So we use about a third of the energy per token. Over about a three-year period, one-third of our cost is the OpEx, which is mostly energy and data center rent, and two-thirds is the CapEx, which means that since we're one-third of the energy, the cost to run that GPU to produce the same number of tokens for inference is the same as our total cost. Just the OpEx for the GPU is the same as our CapEx plus our OpEx.
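Ross's cost comparison compresses several ratios into one sentence; restated as unit arithmetic, using only the ratios he states and no external data:

```python
# Take Groq's three-year total cost as 3 units: 1 OpEx (mostly energy and
# data center rent) + 2 CapEx. If a GPU uses ~3x the energy per token, its
# energy OpEx alone for the same tokens is ~3 units, equal to Groq's total.
groq_opex, groq_capex = 1.0, 2.0
groq_total = groq_opex + groq_capex          # 3 units over three years
gpu_energy_opex = 3 * groq_opex              # 3x the energy per token

print(gpu_energy_opex == groq_total)         # True, on his stated ratios
```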
- HSHarry Stebbings
Why is 40% of their revenue inference, then, and what, why have you not taken so much more of that?
- 31:41 – 43:24
Meta, Google, and Microsoft's Data Center Investments
- HSHarry Stebbings
What do you mean you get more on the later side?
- JRJonathan Ross
So the deals that we do, um, y- the partner will off... 'Cause we don't deploy our... We don't spend money for our own CapEx. The partner will put up the money for us to deploy. We pay back with a, you know, decent IRR, but we split, and most of it goes to the partner, and then once we hit the IRR, it flips the other way. So others are putting the CapEx up for us.
- HSHarry Stebbings
What does it look like at the end then?
- JRJonathan Ross
It's a little... It, it's not like other business models. So we, we didn't just innovate on the chip. We also innovated on the business model, and, um, we're limited in how much money we can make based on how much we can deploy, not how much money we have, because the partners are putting that money up.
- HSHarry Stebbings
Hm.
- JRJonathan Ross
So when I'm looking at what we can do, it's all about how much we can scale.
- HSHarry Stebbings
What are the limits to your deployment? Is it purely chip constraints?
- JRJonathan Ross
Mostly. So you're asking about misconceptions in AI. I think one of them is about power. So it is true that there is a mismatch in the market between people with chips and people with power, but that's partially because you need a data center in the middle and there aren't enough data centers. Those aren't the hardest thing in the world to build. They're not easy, but they're not the hardest thing. It's harder to build up the power. Um, however, because of that mismatch, you have big hyperscalers going around and saying, "I need a gigawatt of power," and they'll say this to, to 60 different potential data center builders. And then all of a sudden you hear this echo, "Well, I heard that, you know, there's a, there's a gigawatt here and a gigawatt here and a gigawatt here," and all of a sudden there's like 60 gigawatts of demand and it's this echo from that first gigawatt. The thing is, I am aware of about 20 gigawatts of power that people wanna make available for data centers now. Right now, there's about 15 gigawatts of data centers worldwide, so more than double the current capacity. The concern that I have is that people are now building up more power, and what's gonna happen in the next three to four years is people are gonna be like, "I built up all this power and no one's using it, and this was, like, a complete waste and we're never gonna do this again." Then what's gonna happen, remember that doubling of chips every 18 to 24 months? Well, three to four years, you double that 15 gigawatts twice, and now you're talking about 60 gigawatts. There isn't that much power available. And then another doubling after that, now you're at 120. And so what's gonna happen is we're gonna overbuild slightly right now just because of that mismatch and the, the miscommunication that's going on right now, and then we're gonna dampen our building and we're gonna, you know, close down on that, and then we're gonna have the real need for the power. That's my big concern right now, because that, that power will become a hard bottleneck in three to four years.
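Running his power projection forward, from his ~15 GW base with the 18-to-24-month chip doubling; purely illustrative arithmetic on his stated figures:

```python
# Double ~15 GW of data center demand every ~2 years, per Ross's framing.
gw = 15.0
for years in (2, 4, 6):
    gw *= 2
    print(f"after ~{years} years: ~{gw:.0f} GW of demand")
# Two doublings reach ~60 GW and three ~120 GW, dwarfing the ~20 GW of new
# power he sees being offered for data centers today.
```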
- HSHarry Stebbings
Okay, just so I understand, why will we have that data overs- or data center oversupply when we are moving into a world of inference which will be 20X larger than training?
- JRJonathan Ross
So the problem with data centers is everyone thinks that data centers are real estate, and a lot of people do real estate. Data centers are not real estate. Um, the, the common joke, y- in the industry now is someone says, "I'm gonna have, you know, 100 megawatts of capacity for you, and I'm gonna have it in three months. Are you willing to sign?" And then you ask a question like, "Well, um, what's your uptime?" And they're like, "I don't know. Whatever the, the, you know, power grid is." You're like, "Wait, what? Y- w- where are your generators?" "Oh, I haven't ordered those. I'll order them now." "You know that there's a 90-month lead time on generators right now?" "Oh, really?" And then the next-
- HSHarry Stebbings
A 90?
- JRJonathan Ross
90. Nine, zero. Mm-hmm. And then the next question is, "So where are you getting the water from?" "W- Wait, data centers need water? I thought it was a bunch of chips. W- What do you mean water?" So there's a bunch of people who have no idea what they're doing going into it because they think it's real estate, and so those people are now building an oversupply of data centers, but they're not really building them. So they're, they're fake data centers that people think are real.
- HSHarry Stebbings
What happens to those data centers? 'Cause they're not gonna be utilized, are they? Amazon is not gonna pay for a data center that doesn't have-
- JRJonathan Ross
Well, Amazon doesn't fall for this. Amazon has really good people.
- HSHarry Stebbings
Mm-hmm. Whoever the, the buyer is, is not gonna pay for a data center that's got no water or got no power.
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
And so is it just wasted from your-
- JRJonathan Ross
Most of these projects will never be developed.
- HSHarry Stebbings
Will we build them fast enough? You said about the data c- like, the oversupply.
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
It ta- it does take time to build a data center.
- JRJonathan Ross
Yeah, and, and it's, it's... So it's almost... Okay, if you train a model-
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
... you really wanna amortize it for about six months. If you, um, if you deploy chips, you really wanna amortize it for three to five years, all right? We're more on the three-year side, others are more on the five-year side. Uh, if you build a, um, data center, you're probably talking 10 to 15 years. In a power plant, you're talking like 15, 20 years. So, the- the problem we have in the industry is not on- on this... And- and there's this mismatch between the- the sort of financing and the- the needs here. So, you have someone who wants to train a model, they're gonna be doing this for six months and they don't understand why people want three to five-year commitments on the chips, and then the people deploying the chips don't understand why someone wants a 15 to 20 year (laughs) commitment on the data ce- right? It's at seven years now on the data centers. And then the people building the data centers then need a long-
- HSHarry Stebbings
So, a 70-year commitment?
- JRJonathan Ross
Yeah. That's the kind of thing they're asking for.
- HSHarry Stebbings
(exhales) .
- JRJonathan Ross
So, you've got this complete mismatch throughout the ecosystem, but the funny part about it is while they all want to take zero risk and have a committed, um, you know, sovereign wealth level, um, uh, sort of credit rating on the other side of it with long commits, the longer the payoff time, the more generic the infrastructure is. A model has a pretty specific use, but accelerators like LPUs and GPUs can be used for other things besides generative AI or LLMs. Um, the data center can be used for other things besides the accelerators. The power can be used for anything. So, while they're looking for the least risk over here, it's the place where there is the least risk because if- if we don't use it for- for AI, we'll use it to power all of the electric cars.
- HSHarry Stebbings
So, is this a case where incumbents win because they're one of the only ones who are able to match the durations required by data center providers?
- 43:24 – 44:17
Distribution of Value in the AI Economy
- HSHarry Stebbings
Do you think that value is distributed amongst many players or concentrated towards one or two? I completely agree with you in terms of the clear value when assigned, but is it e- distributed to some levels evenly or concentrated?
- JRJonathan Ross
It's a power law. And the more value there is in the economy, the more risk there is of a single entity being so far on one end that they just dominate, and you see this with the Mag Seven, right? And it's predict- just the bigger the economy gets, the more you will have, uh, you know, big swings in- in the economic outcomes. Right now, the hyperscalers are all sort of even in their market caps. It's strange. You would expect one of them to just be killing it and taking it much further, and so I don't understand why they're so closely
- 44:17 – 45:46
Stages of Startup Success
- JRJonathan Ross
grouped.
- HSHarry Stebbings
So, when we think about that distribution, like how do we think about changing that then? Like obviously with a growth, you know, you want to be one of the Mag Seven, you want to be one of the most important companies in the world. How do you see that?
- JRJonathan Ross
There's... So the way that you get there and the way that you stay there are two very different things.
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
And the- there's a l- there's sort of a circle of life that happens in startups. The first circle or the first stage is solve an unsolved problem, right? That's how you go viral. That's how you do well. The second stage is the marketing stage, which is now other people are trying to copy what you've done, right? 'Cause they can't think of something themselves, and now you have to fight it out in- in advertising and marketing and whatnot. And you see CPG companies often get stuck there, right? And it becomes more about where on the shelf they are than- than anything else. And then the final stage is the seven powers. It's once you've found some of those and you've really started improving it and you have, you know, sort of systemic advantages, and then what happens is someone solves an unsolved customer problem and the whole cycle of life continues, right? Now Google, uh, has to redo this because LLMs are better than search, right? So, the way that you start off to become a Mag Seven is you solve that unsolved problem. The way that you stay there is first you find one of those seven powers or- or multiple, but then you have to be ready for when you get disrupted to continue fighting back and- and solving customer problems.
- 45:46 – 47:45
The AI Investment Bubble
- HSHarry Stebbings
Okay, we mentioned the huge amounts of money that are being spent here. Is this a good bubble that bluntly lays the foundations for an incredible next 10 to 20 years, where bluntly the capital actually turns out to be productive but not seemingly so on paper? Or is it one where actually just a huge amount of money is incinerated on depreciating assets?
- JRJonathan Ross
I can guarantee you that a huge amount of money will be incinerated, but I also bet that in total more money will be made than will be put in. And so this is the problem. You- you have to look at it either in aggregate or individual bets, right? When everyone is making investments in the market, some people are gonna lose money 'cause not every company's gonna be successful. So, what you always see is when there is some real tech improvements or s- things coming, you've got the things that were early that people are investing in heavily that are super successful, and then everyone else wants to get in on it, and you know, it goes from you have AI chips and AI models to now you've got AI, you know, T-shirts, and next thing you know you've got AI thermal grease, right? It- it just, like people just start applying AI to everything. Next thing you know you'll have an AI condo, right?
- HSHarry Stebbings
Sure.
- JRJonathan Ross
Yeah, and so the- the trick is discerning what is real and what isn't. You're always gonna have all of these really obnoxious charlatans coming in whenever there's something real, and that's unfortunate, but eventually they get cleared away once people start to understand the technology and what's real and what isn't. And so the job is to start educating, and the more educated people are, the less they'll invest in AI thermal grease.
- HSHarry Stebbings
(laughs) What is the largest individual bet that will lead to the largest incineration of cash?
- JRJonathan Ross
I'm not gonna call anyone out in particular, but I actually think it will happen across every single discipline. The...
- 47:45 – 51:52
The Keynesian Beauty Contest in VC
- JRJonathan Ross
Are you aware of the Keynesian beauty contest?
- HSHarry Stebbings
No.
- JRJonathan Ross
Okay. So, um, John Maynard Keynes, the economist, he has this great, um... This will explain everything you need to know about VC.
- HSHarry Stebbings
(laughs)
- JRJonathan Ross
So-
- HSHarry Stebbings
I'm nervous, but keep going. (laughs)
- JRJonathan Ross
So, um, take a magazine full of models, human models, like, you know, good looking models, and have a whole bunch of VCs in the room, and they're allowed to make bets on who the- the most beautiful model is, and in the end, whoever has the most money on them is the winner, and based on the proportion that you put on that particular model's face, you get the share of all of the money. So if you, if you put money on one that isn't the most beautiful by dollars, then you lose your money to the- the people who bet on that one. And that was sort of the- the bet that SoftBank was making, which was they could win the Keynesian beauty contest. "I'm just gonna put more money in and I'm gonna win." That is problematic when you have true technological advantages as opposed to marketing. When you're solving customer problems, it's a weighing machine. Once the customer problem has been solved, you then get into this sort of popularity contest of marketing.Now, something unusual has happened this time around, which I don't think has ever happened in, in VC before, which is, y- you see people raising billions of dollars, who, who have competitors who've raised billions of dollars. It... Usually, there is a clear winner in the Keynesian beauty contest. You don't have this, like, fight where, you know, i- i- it's, it's sort of like, "Well, I gotta put a little more money in." "I gotta put a little more." "I gotta, you know, put 10 billion in." "No, I'm gonna put 20 billion." "I'm gonna put 500 billion in!" Right? Because the Keynesian beauty contest has gone completely amok (laughs) . And, um, this has never happened before, and so now people don't even understand how to react because it used to be if someone had raised a billion dollars, you're like, "Oh, they're the winner."
- HSHarry Stebbings
'Cause-
- JRJonathan Ross
Now, it's like there's three or four competitors who have a billion dollars.
- HSHarry Stebbings
So who wins and who loses? Like, is Meta gonna incinerate the largest amount of cash ever?
- JRJonathan Ross
I think the Keynesian beauty contest no longer applies here, because there's so much money available, being spread out, and, and I think you're... You're gonna see that the people who have the best products are actually gonna be the winners because everyone can be capitalized. But there will be problems for the winners because of this. The problems are gonna be of the sort... You, you had this employee that you were gonna hire, and someone offered them a ridiculous amount of money. Yeah, you see this all the time now.
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
And they could have gone and contributed to the winner, but now they're contributing to a competitor that shouldn't even exist, right, or is equally likely to win, and now you're splitting the talent.
- HSHarry Stebbings
What do you also do when you have such high salaries? We've seen a million, two million for kind of junior to mid-level in some of these companies, and they are living an amazing life, actually, in great places. You think they're living that amazing life in Guangdong when they're working for DeepSeek or any other Chinese alternative? I don't think so. I think they're actually getting paid much less, working their fucking ass off 20 hours a day, and not getting kombucha and being paid two million a year. Fair?
- JRJonathan Ross
Not only fair, we have a policy that we never offer the highest because we want people to choose us, not choose the salary. If we win in a bidding war, then that means the next time someone comes along with a higher salary, that's it. They're just gonna go take that other job. There's no loyalty. They don't believe in the mission. Instead, we focus on, "Look, we're gonna build this. This is your opportunity. You're gonna get to work with amazing people. Um, spend some time with the team. Are these the people you wanna be working with? Because frankly, you're gonna make so much cash, it doesn't matter." But bet on the equity, the outcome, right? "Help us make this thing valuable." And people who buy into that, they're so much easier to manage because they're mission-oriented. They all wanna do the same thing. They're not there because they want the kombucha, and they're not gonna complain because the, the cappuccino machine is broken. They'll just go and buy their coffee next
- 51:52 – 57:30
NVIDIA's Role in the AI Ecosystem
- JRJonathan Ross
door.
- HSHarry Stebbings
Will you and NVIDIA move into the model layer? Everyone talks about model providers becoming application providers. Will infrastructure providers become model providers?
- JRJonathan Ross
We have decided that we're not gonna train our own models. We'll do a little fine-tuning for specific cases or whatnot, but we don't wanna compete. And that's really important because, uh, people are putting their models with their weights on us, right? And, and, and they don't want us to, to learn from and take that stuff for our own benefit. This is the problem you have when you work with a hyperscaler because you know they're also doing everything that you are doing. So we've decided model providers, you make the model, we don't do that. I think there's also the data side of the users and the queries. So the other thing that we could do that we do not do is log the queries, and then we've got data if we wanted to train. We don't train. We have no reason to hold the data. So we, we only temporarily store things in the DRAM so there's no persistent storage. If the power went out, everything's gone. And DRAM is limited, so we can't hold things for a long time. So you know that we don't have your data. Now, people who are building businesses on top of us, you can obviously keep the data from your customers if you want. We have no control over that. That's fine. But we don't take any data.
- HSHarry Stebbings
Do you think NVIDIA move into the model providing?
- JRJonathan Ross
It's possible, but I think... I mean, if I was them, I would avoid it because I wouldn't wanna give the customers of mine... I mean, NVIDIA is great at training, right? It, it, it's crazy. It'd be like, you know (laughs) , um, being an, um, automotive, uh, a car company and then creating your own taxi service. You're now competing directly with your customer, right? And I think tech companies love to do this. We, we have a management philosophy and it's based on big O complexity, and we only do things that require a sublinear number of employees. So what I mean by that is if someone comes to me and says, "I need 10 people to go do this thing," a lot of people would say, "Well, why can't you do it with five?" I would say, "Okay, you're supporting customers. If we double the number of customers, do you need 20 or do you need 11?" Because I wanna know what's that growth rate. Are they automating everything, right? We completely automated our compiler. We completely automated everything that we... or, you know, large portions of our cloud, and that means that we can scale with a small team. We have 300 people. With 300 people, we built our own chip, we built our own networking hardware and software, we built our own runtime, we built our own orchestration layer, um, we built our own compiler, we built our own cloud. We built all this with 300 people. Now, we would only be able to do this with a small number of people because you don't have the communication overhead. But if we... If... You have to decide what your constants and variables are. What are the things that you wanna preserve? And one of the constants is talent density. We wanna stay small, we wanna stay nimble. And the, the other side of this is growth is a problem. So we measure our growth in what I call problem units. So a problem unit, every time you triple something, you have about the same number of problems as the last time you tripled... going from 100 employees to 300, 300 to 1,000, 1,000 to 3,000. Each one of those has the same number of problems. We scaled from 620, uh, or 640 LPUs last year at the beginning to 40,000. That's four problem units. That's four triplings of the number of chips. If we were also tripling the number of employees, that would be another problem unit. Management bandwidth is limited, you can only solve so many problems.
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
So you have to decide where you're gonna allocate them. If you build things really well from the beginning and you can scale up with the number of employees you have, then you can scale over here. If you wanna triple the number of customers, there's another problem unit that you have to solve.
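His "problem units" have a clean closed form: the number of triplings is a base-3 logarithm of the growth ratio. A quick check of the chip-scaling figure he gives (640 to 40,000):

```python
import math

# Problem units = how many times a quantity tripled = log base 3 of growth.
def problem_units(start: float, end: float) -> float:
    return math.log(end / start, 3)

print(f"{problem_units(640, 40_000):.2f}")   # ~3.77, i.e. the "four" units
# Tripling headcount over the same period would add one more unit on top.
```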
- HSHarry Stebbings
What's the biggest challenge when you are scaling at that rate, but then the team is not scaling in conjunction with it?
- JRJonathan Ross
There's this common belief that the people that you have early on are right for the job, and the people that you get later, maybe they're better in a s- sort of more corporate environment.
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
I don't think that's the case. I think you should always try and get generalists. Um, otherwise, you, you get stuck and ossified in a particular way of doing things because that's what that one person knew how to do. But there, there are people who burn out. In a s- being in a startup is hard. Like, there are people who just literally burn out. There's also people who were the best that you could get at the time, and then there are people who are just unmanageable wild children and they should go off and start another startup, and they shouldn't be, you know, scaling with you. That's, that happens, but it's the rarer of them. I think saying that you're, you're gonna hire B players because you've gotten large enough is, is laziness and an excuse, and it, it's a lack of creativity in your business model and how you're going... and the algorithm of how you're gonna scale. Think of it this way, Walmart versus Amazon. Walmart wants to double the number of customers, they have to double the number of stores and employees. Amazon does not need to double the number of websites. That's a fundamental advantage. But Amazon still has to double and improve the logistics, right? They don't have as many problems where they have to scale linearly, but they have some. You wanted to disrupt Amazon, what you would do is you'd build a completely robotic logistic system and bring the comple- the, the overhead and complexity of that down, and then you could outmaneuver them, right? That's how you need to impro- Don't just say, "I need more people." Focus on the algorithm of your business.
- 57:30 – 1:02:25
China's AI Strategy and Global Implications
- HSHarry Stebbings
And the last time we spoke, we discussed DeepSeek. I think more has come out over the last few weeks about, bluntly, their innovations, some of the distillation that they used. Where is China better than us today?
- JRJonathan Ross
Well, uh, as we discussed, they're more willing to use things that maybe they shouldn't be using. Um, you know, they, they distilled the OpenAI model. Uh, a lot of people have the opinion, "Well, OpenAI was scraping the internet, so you know, good one for DeepSeek." But whether that's right or wrong, most of the model providers had considered that a red line they didn't wanna cross. I don't know if that's gonna change, but, but it might. But the other thing-
- HSHarry Stebbings
Given the open source nature of DeepSeek, OpenAI now benefit from the innovations that they did also have.
- JRJonathan Ross
Well, and they also probably have all the data that DeepSeek paid them to generate.
- HSHarry Stebbings
So?
- JRJonathan Ross
Yeah, I... But, but they also were clever, they, they innovated. I think the biggest thing is this is a shot in the arm for morale in China, and it gives them a sense... But, but again, this, you know, as I said, is Sputnik 2.0. It's also woken up the US.
- HSHarry Stebbings
It totally has. How do you compare Stargate to the $128 billion that China have now committed?
- JRJonathan Ross
So China has a more complicated situation and a simpler one at the same time. The, the, the s- problem is they don't have the technology that we have in terms of the chip efficiency. On the other hand, they have scale. If they wanted to deploy 150 nuclear reactors, as I think the plan is, no big deal, they just do it. So if the chips aren't as efficient, they can just deploy more of them. On the other hand, if they wanna go out into the world and deploy chips like they did with Huawei and networking gear, that's gonna be complicated because people aren't gonna have the power around the world to run more expensive accelerators. The- At home, I don't think anything's a problem. I only think as they're trying to expand, it's gonna be an issue.
- HSHarry Stebbings
China is quite opaque in everything.
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
What do we not know about China that we would like to know?
- JRJonathan Ross
I think the most important thing to understand is where they're gonna end up on the, the censorship and privacy of these models. We come from democratic countries. We have an expectation that companies can build something that says anything. Are they going to be permissive and allow models to make mistakes and hallucinate, or are they gonna shut it down? Because I think if you know that, you know whether or not China has a shot. One of the biggest nightmares that they have is free speech. It's the exact opposite of that vulnerability we talked about earlier. Can you imagine Xi Jinping going out and saying, "Country, we've lost our advantage in AI. I need your help." Never ever. It's always gonna be, "We're the greatest, we're the best." Everyone's gonna know differently, but they're all gonna have to t- toe the party line. Right now, because of that, I think it's really hard for them to just allow these models to, to say anything, say, you know, "The US is great and better at AI." That, that's a bad thing for them, and so that's gonna really tell you a lot about the AI story in China.
- HSHarry Stebbings
And so if they aren't permissive of, uh, more open, truthful models, then they're inherently disadvantaged, you're saying?
- JRJonathan Ross
Well, look what, um... I, I forget if it was... Was it Jack Ma who got in trouble with the CCP?
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
Yeah. Um-... if they aren't more permissive, then if you're running a Chinese tech company, your fear is that you become Jack Ma. That's really gonna stifle innovation.
- HSHarry Stebbings
(exhales)
- JRJonathan Ross
If I was in China right now, I'd be looking for the exit for how I could do ... Like, if, if your craft is AI, I would wanna do that some place that's supportive.
- HSHarry Stebbings
Do you really buy that they don't have access to Blackwell? This is China. You think Xi Jinping's like, "No, sorry. No Blackwell." (laughs)
- JRJonathan Ross
Well, I, I don't, I don't think it matters whether or not they physically have it, because right now most of the cloud providers are happy to rent it to you if you swipe a credit card.
- HSHarry Stebbings
But there are limits to renting, no? I don't, I'm-
- JRJonathan Ross
Maybe.
- HSHarry Stebbings
... I'm naive.
- JRJonathan Ross
I, I think if you ... So, um, one of the concerns right now is about, um, uh, Malaysia or Singapore, that region over there being a place where people are deploying GPUs with the wink, wink, like, "We're not gonna rent it to China." Right?
- HSHarry Stebbings
(laughs)
- JRJonathan Ross
Um, but there's, there's a belief that a lot of people are doing that. Otherwise, that's a lot of GPUs for that region. That feels like it's even more of a safety net just in case the tap ever gets turned off at the hyperscalers, because right now, um, you could just write a check to any of the hyperscalers and say, "I need these chips." They'll deploy them, and you can run on them. Doesn't really matter where you're coming from. I mean, if you're a sanctioned country, no. China's not sanctioned.
- 1:02:25 – 1:17:13
Europe's Potential in the AI Revolution
- HSHarry Stebbings
Okay. So we have China, which is obviously innovating and actually proving that they, they are in the race, we have the U.S., and then we have Europe, which, which feels like it's languishing.
- JRJonathan Ross
Yeah.
- HSHarry Stebbings
Is this the ultimate nail in Europe's coffin?
- JRJonathan Ross
We talked about how Groq almost died.
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
But we had the right technology all along. We were just waiting for the thing, for, for the LLMs to arrive. And I think Europe's very similar. I think Europe has amazing talent, amazing talent. Um, but that talent leaves and goes to the U.S. or other places.
- HSHarry Stebbings
(exhales)
- JRJonathan Ross
So the question is, how do you have Europe's LLM moment? How, how do you position yourselves? And it's, it's not that complicated. The problem is when you surround yourself, y- y- you become the average of your five closest friends, right? If your five closest friends are like, "That'll never succeed. Ah, you should just keep your job. Ah, startups, they're terrible," then you're gonna be risk averse. But if your five closest friends say, "You should do it. That's great. I support you," then you're gonna be more likely to do a startup. And even in Silicon Valley, people make that transition from the big tech company to the startup. And it's hard, right? They're, you know, comfortable, right? They're making those crazy salaries.
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
The, the big companies take care of them. And they have a fiduciary obligation to their family. How do they make that, that leap? And it's because you've got tons of entrepreneurs trying to hire them, and they hear the pitch all the time, and they get used to it, right? They also see the success around them. And then VCs come in and, and try and close some of the candidates in the early stage too, right? Europe needs the same thing. You need a place where people are surrounded, just surrounded by entrepreneurial people who are risk on and who aren't gonna try and talk people out of joining a startup.
- HSHarry Stebbings
From a regulation perspective, Europe is, you know, unbelievably efficient and the masters of regulation. You know, I was speaking to someone the other day in the EU who supposedly hired 1,500 people for AI safety and policing. Um, what would you do if I put you in charge of European AI regulation?
- JRJonathan Ross
Well, I wouldn't waste my time regulating something that doesn't exist. Instead of regulating, what are you gonna promote? You wanna promote risk-taking. You wanna promote that enclave of people who are risk on. So, I was just visiting Station F yesterday. Amazing. Macron was there. It was, like, full of people, right? Vibrant, you feel it. I would ... And I was talking to the, the person who runs Station F, Roxanne, um, and Xavier Niel. And with Roxanne, we were talking about, what about a City F? What about, uh, a place where there's ... You start off with, like, 10,000 people in, in the center, right, a little radius. And then once that's full, you expand it. Once it's full, you expand it, and so on, to get to, like, a million people in Europe who are all risk on, a little Silicon Valley here. And I would give it special economic dispensations. I would allow everything that employers need. I would make it simple, and I would say, "You know what? If you don't wanna buy into that, that's fine. Go to other regions in France. Go to other regions in Europe. But if you want to participate in what is going to be the biggest technological revolution in human history, this is the city for you."
- HSHarry Stebbings
Are you not inherently punishing incumbents then? And what I mean by that is if we are talking about, uh, I'm just using this as an example, uh, AI insurance underwriting startups. Yeah. There's many companies that are going after insurance underwriting in AI, and you are giving them benefits like that. You are inherently punishing some of the biggest providers of insurance in your region. You're inherently punishing people who hire 200,000 people. That feels unfair.
- JRJonathan Ross
So, there is no right to be an incumbent, especially a slothful incumbent that is not reacting to disruption.
- HSHarry Stebbings
Fair.
- JRJonathan Ross
And you want to encourage disruption. And this is one of the things in Silicon Valley. You can move from one place to another. There, there are no ... I mean, we had non-solicits when I started, but even that's gone, right? So, that free movement of people is very important.
- HSHarry Stebbings
Are you allowed to start work straightaway?
- JRJonathan Ross
Straightaway, but not before. If you start before, that's bad.
- HSHarry Stebbings
Yeah, but we have months, months, like six months.
- JRJonathan Ross
There's no such thing. And so in that region, I would say you can immediately start, like literally the next day.
- HSHarry Stebbings
That is so good. We have to wait six months.
- JRJonathan Ross
It's not good. If you are a company right now, it, it feels like, "Okay, well, it's harder to poach." But what does that do? It suppresses wages. It's harder to hire someone, they're less likely to move. There's less competition, it suppresses wages.
- HSHarry Stebbings
And by the way, the company has to pay for the six months anyway. (laughs) It's ... It makes no sense at all. I, I, so I, I totally get you and understand there. Um, can I ask you, you know, you mentioned, like, what would you promote? Uh, a lot of people would promote ... I loved the way you said risk-on. Um, being a European, I actually thought first safety and regulation, uh, but specifically safety. So, uh, sticking with that, all that Dario will talk about these days is safety. Is he losing a step by being so focused on safety when, bluntly, his competitors are talking about product?
- JRJonathan Ross
So, safety matters in AI. It's a little bit like nuclear power. Lots of pros, lots of cons. I'm worried about different things than, I think, um, than Dario is, is worried about. I'm more worried about people voluntarily giving up their decision-making authority-
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
... because it's so easy, and this is what I mean by preserving human agency in the age of AI. Good analogy is, you, you probably know plenty of wealthy people and the struggles they have bringing up children with wealth. I refer to it as financial diabetes. Right? You, you have children who aren't incented to, to ... They're not, they're not gonna strive to succeed. I was very fortunate when I was growing up, and so I, I actually just told this story for the first time today, so n- no one's heard it, but, um, I was fortunate because my father lost all of his money multiple times. Just ... And I've heard you say the same thing.
- HSHarry Stebbings
Mm-hmm.
- JRJonathan Ross
Yeah. And he would sell a billion-dollar life insurance policy, and he would get all the commissions from that, and he would have tons of money, and then he would spend it all. And so there was one time we were living in a $20 million mansion, and there was a couple of times where we ordered food. He would talk to the delivery guy, and he would convince him to, to give us the food, and he would pay him back later 'cause we'd get money later. But, um, this time, he was, like, so despondent, he sort of locked himself in his office and wouldn't come out. My little brother came to me and said I had to go and talk to the Chinese food delivery guy and convince him to, uh, to give us the food. And I was, like, mentally preparing how to, like, convince him to do it, and I, I walk out, and I walk up to him, and I'm, like, getting ready to do my whole spiel, and he hands it to me, and I'm like, "Uh, I don't have the money right now." He's like, "Oh yeah, pay me later." I didn't have to ... I fortunately didn't have to do anything. He just trusted 'cause we were living in a $20 million mansion. But that happened multiple times, and when that happens multiple times ... Like, I have a friend who was homeless once for a couple of weeks, and he'd almost been homeless a couple of times, and he said the best thing that ever happened to him was that he was homeless for a couple of weeks because he survived it, and it's like, I've been through it.
- HSHarry Stebbings
(laughs)
- 1:17:13 – 1:18:53
Future Predictions and AI's Impact on Society
- HSHarry Stebbings
What's a crazy AI prediction you have that everyone else thinks is science fiction?
- JRJonathan Ross
I would assume that in the next 10 years, and I know this is gonna be crazy, but you, you, you saw that picture of me and my weight loss, right?
- HSHarry Stebbings
Unbelievable, dude. I mean, really amazing.
- JRJonathan Ross
70 pounds. 70 pounds, yeah. But I was on Mounjaro. So if you know anyone who's overweight and it's hurting their health, get them on Mounjaro as soon as you can. It works.
- HSHarry Stebbings
Sorry, what is Mounjaro?
- JRJonathan Ross
It's one of those, uh, GLP-1 agonists, one of the, the weight loss drugs-
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
... that have become popular recently.
- HSHarry Stebbings
Huh.
- JRJonathan Ross
It works. But my crazy AI belief is that if it is possible, if it is possible to significantly slow or stop aging, I think that you will have a Mounjaro moment in maybe the next 10 years. It... Because that came outta nowhere.
- HSHarry Stebbings
Yeah.
- JRJonathan Ross
All of a sudden, you know, you, you could just lose weight. Something finally worked. And it's worked for a bunch of people. You probably know a bunch of people who've lost weight. Yeah, exactly. And I don't know if it is possible to slow or stop aging, right? Some, some wear and tear is a real thing and it might just be impossible. But if it is possible to slow or stop aging, then I think in the next 10 years we will do it and it will be sudden. It'll be like the Mounjaro, um, um, and the other one as well. The other... It, it'll be like that moment.
- HSHarry Stebbings
I don't see how it is not possible. Like, when you look at the advances that will come in medical research, I don't see how it's not possible that we will at least extend, you know, longevity by 60 years. I mean, Dario has said we'll live to 150. I don't see why that's impossible.