No Priors

No Priors Ep. 127 | With SemiAnalysis Founder and CEO Dylan Patel

What would it take to challenge Nvidia? SemiAnalysis Founder and CEO Dylan Patel joins Sarah Guo to answer this and other topical questions around the current state of AI infrastructure. Together, they explore why Dylan loves Android products, predictions around OpenAI’s open source model, and what the landscape of neoclouds looks like. They also discuss Dylan’s thoughts on bottlenecks for expanding AI infrastructure and exporting American AI technologies. Plus, we find out what question Dylan would ask Mark Zuckerberg.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @dylan522p | @SemiAnalysis_

Chapters:
00:00 – Dylan Patel Introduction
00:31 – Dylan’s Love for Android Products
02:10 – Predictions About OpenAI’s Open Source Model
06:50 – Implications of an American Open Source Model for the Application Ecosystem
10:48 – Evolution of Neoclouds
17:26 – What It Would Take to Challenge Nvidia
27:43 – What Would an Nvidia Challenger Look Like?
28:18 – Understanding Operational and Power Constraints for Data Centers
34:48 – Dylan’s View on the American Stack
43:01 – What Dylan Would Ask Mark Zuckerberg
44:22 – Poker and AI Entrepreneurship
46:51 – Conclusion

Sarah Guo (host) · Dylan Patel (guest)
Aug 14, 2025 · 47m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–0:31

    Dylan Patel Introduction

    1. SG

      (instrumental music) Hi, listeners. Welcome back to No Priors. Today, I'm here with Dylan Patel, the chief analyst at SemiAnalysis, a leading source for anyone interested in chips and AI infrastructure. We talk about open source models, the bottlenecks to building a data center the size of Manhattan, geopolitics, and poker as a tell for entrepreneurship. Welcome, Dylan. Dylan, thank you so much for being here.

    2. DP

      Thank you for having me.

    3. SG

      I've been really looking forward to this conversation. Um, you're such a deep thinker about this

  2. 0:31–2:10

    Dylan’s Love for Android Products

    1. SG

      space. And then also, it's very odd, you clearly have the Samsung watch.

    2. DP

      Yeah. I- I got the Flip-

    3. SG

      The folding phone.

    4. DP

      ... I got the Flip, I got the-

    5. SG

      And the laptop.

    6. DP

      ... the Fold. Yeah, yeah.

    7. SG

      Tell me more.

    8. DP

      So part of the sto- origin story is that I was moderating forums when I was a child, and my dad's first Android phone was the Droid, right?

    9. SG

      Okay, yes.

    10. DP

      And for some reason, I was obsessed with, like, messing with it, like rooting it, like under-clocking it, improving the battery life, all these things, because when we were on a road trip, there's nothing to do besides, like, mess around on this phone. So I posted so much about Android that I became a moderator of slash r/Android on Reddit, and- and like many other subreddits related to hardware and NVIDIA, and Intel, and all this stuff. But because of that, I've just always had Android. Now, I've had work iPhones before, but I just really love Android, and then it's like, if you're gonna like technology, I'm not like someone who pushes it, but like get the best stuff. So I have like the Ultra Samsung watch, which I think looks cool, and the- the folding phone, right? It's fun. It's obviously different and weird. No- no iMessage is- is a travesty.

    11. SG

      What does it dominate at? What is it better at?

    12. DP

      Um-

    13. SG

      Besides the openness of, like, the hackability.

    14. DP

      I don't even hack that much stuff anymore, right? It's like, what do you use your phone for? I think- I think the main thing is, like, you can have, like, Slack and an email up on two different parts of your phone. I think that's probably the main thing. Or like, you can actually use, like, a spreadsheet on a folding phone. You cannot use a spreadsheet on a regular phone.

    15. SG

      Okay.

    16. DP

      And that's not even an Android thing.

    17. SG

      Yeah.

    18. DP

      Like, Apple's folding phone next year will be able to do that just fine, and I'll have no argument then.

    19. SG

      Yeah.

    20. DP

      But I just like it, you know? People- people have their preferences. People are creatures of habit.

    21. SG

      You got to look at the GPU purchasing forecast-

    22. DP

      Yes.

    23. SG

      ... on a sheet, on your phone, I think.

    24. DP

      Yes, I do. I do. No. Like, it's like someone's telling you numbers. You're like, "Wait, this is, like, slightly different than my number," right? Like...

  3. 2:10–6:50

    Predictions About OpenAI’s Open Source Model

    2. SG

      Okay, so we have a week of big rumored announcements coming up. Tell me your, like, reaction to the OpenAI open source model.

    3. DP

      In theory, it's gonna be amazing, right? Like, I- I assume this is releasing after it's released or...

    4. SG

      Yes.

    5. DP

      (laughs) So that's okay. The open source model is amazing, guys. Like... (laughs) I think the world is going to be really, like, shocked and excited. It's the first time America's had the best open source model in six months, nine months, a year. LLaMA 3.1 405B was the last time we had the best model. And then Mistral took over for a little bit, if I recall correctly, and then the Chinese labs have been dominating for the last, like, six, nine months.

    6. SG

      Mm-hmm.

    7. DP

      Right? So it'll be interesting. It'll also be funny because, like, the open source model probably won't be the best for just regular chat, because it is like more reasoning-focused and all these things. But it'll be really good at code, and that's... I'm excited for that. Yeah. Like tool use, although that's, like, going to be confusing. Like, how do you use the tools if you don't have access to OpenAI's tool use stuff, but the model is trained to do so? That'll be interesting for people to figure out. I think the last thing is, like, the way they're rolling it out is really interesting. They accidentally leaked all the weights, but no one in the open source community has figured out how to actually run inference on it, because there's just some weird stuff in the model with the architecture, like 4-bit and, like, the biases and all this other stuff. But what's interesting is, other companies drop the model weights and say, "Go, make your own inference implementation." But OpenAI is, like, actually, like, dropping the model weights and, like, all these custom kernels for people to implement in inference. So everyone has a very optimized inference stack day one.
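
The kernel point is concrete: 4-bit weights are packed two to a byte, and the inference stack has to unpack them on the fly inside every matmul. A minimal sketch of that unpacking step, assuming a simple signed-int4 nibble layout (illustrative only, not OpenAI's actual weight format):

```python
def unpack_int4(packed_bytes):
    """Unpack two 4-bit weights per byte, shifting to the signed range [-8, 7].

    Illustrative int4 layout: low nibble first, then high nibble.
    A real kernel would fuse this dequantization into the matmul itself.
    """
    out = []
    for b in packed_bytes:
        out.append((b & 0x0F) - 8)         # low nibble
        out.append(((b >> 4) & 0x0F) - 8)  # high nibble
    return out

print(unpack_int4(bytes([0x21, 0xF0])))  # -> [-7, -6, -8, 7]
```

Doing this naively in Python per weight is exactly why optimized custom kernels matter: without them, the unpacking dominates inference time.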

    8. SG

      And they work with partners on it too.

    9. DP

      Yeah, working with partners on this. But this is very interesting because, like, when DeepSeek drops, it's like... Well, Together and Fireworks are like, "Yeah, we're the best at inference because we have all these, like, people who are really good at low-level coding," whether it be, like, Fireworks with all their, like, former PyTorch Meta people, or Together with, like, you know, Tri Dao and all the, you know, Dan Fu and all these, like, super cracked, like, kernel people. They have, like, higher performance, right? Uh, but in this case, like, OpenAI is releasing a lot of this stuff. So it's, it's- it's interesting for the inference providers too, like, how do they differentiate now?

    10. SG

      Yeah, I- I mean, my premise on this is, um, in the end, a lot of the model optimization performance layer is open source, and it's a commodity. Um, a- and it will end up being like a fight at the infrastructure level, actually.

    11. DP

      Interesting.

    12. SG

      Um, and so, you know, all of these inference providers, like, as you mentioned, you know, Fireworks and Together and Baseten and such, they- they compete on both dimensions. And the question is, what's going to matter in the long term?

    13. DP

      Why would these model level software optimizations all be open? They haven't been open so far and the advancements are so fast, right? Like...

    14. SG

      Oh, well, I think they... Uh, um, a bunch of them have been partially open, and I think-

    15. DP

      Yeah.

    16. SG

      ... O- OpenAI is also pushing for them to be open as well, right? Um, and so I think there's a lot of force in the ecosystem to, um, open source from both, like, the NVIDIA level up-

    17. DP

      Yes, yes.

    18. SG

      ... and from the model providers down, right?

    19. DP

      Agreed. Yeah.

    20. SG

      And so, uh, I think today, these providers all fight on that dimension.

    21. DP

      Yeah.

    22. SG

      And they also fight on the infrastructure dimension, and I think infrastructure is going to end up being a bigger differentiator.

    23. DP

      That makes sense, yeah.

    24. SG

      You can't open source your actual infrastructure, right?

    25. DP

      Yeah, yeah.

    26. SG

      You just have to have the network and you have to run it, right?

    27. DP

      Yeah, yeah. That makes a lot of sense. Although, like, I see today, the inference, like, providers have such a wide variance, right? Like, the ones you mentioned are on the, like, the leading edge, especially, like, Together and Fireworks, I think are on the leading edge of, like, their own custom stacks, all the way down to, like, there's a lot of people who just take the out of box open source software.

    28. SG

      Yeah, I think there's no market for that.

    29. DP

      But those guys have just-

    30. SG

      Yeah.

  4. 6:50–10:48

    Implications of an American Open Source Model for the Application Ecosystem

    1. SG

      move, uh, uh, you know-

    2. DP

      (laughs)

    3. SG

      ... out and a layer down. Like, what does having access to an American open source model mean, or just more and more powerful, like, uh, open source AI models mean for the application ecosystem?

    4. DP

      I mean, I know, like, a lot of people in some enterprises are really iffy about, like, using, like, the best open source model. They're, like, worried. It's like, there's nothing wrong with them today. There's nothing in them today, right? You know, there's the worry that one day they will-

    5. SG

      How do you check?

    6. DP

      I mean-

    7. SG

      How do you know?

    8. DP

      ... you don't, but you can just vibes it out. Like, they're, like, competing with each other to just release as fast as possible, right? Like, like, DeepSeek and Moonshot and all these other la- you know, Alibaba, et cetera. Like, they're competing to release as fast as they can with each other. The Alibaba team's in Singapore. Like, I don't think that they're, like, putting Trojan horses in these models, right? And like, there's some interesting papers that Anthropic did on like, you know, trying to embed some stuff in models, and it ended up, like, being detectable pretty easily. Again, like, I don't know how to... You know, I'm not, I'm not too much into that space of interpretability and, like, evals, but I just don't think that they are, right? It's just a vibes thing. But some people are worried that they could be, or they're just, like, iffy. Like, "Oh, I don't wanna use a Chinese model." It's like, well, fine, but now you're gonna go use a service that is backed by a Chinese model, which is fine. Like, you know, like, uh, but they, you know, they're fine with that, they just don't wanna directly use the model. I don't know, I think, I think it's, it's interesting for some enterprises who are still stuck on LLaMA, but it's mostly just really interesting because it continues to move the commodity bar up. Now, with this tier being open source, and sure, like, probably won't be, like, drastically better than Kimi, but Kimi is so big, it's so difficult to run. Like, people aren't running it, whereas the OpenAI model's, like, relatively small, so you can run it without being, like, giga brain of infrastructure. You end up with that commoditizing so much more of the closed source API market. I don't know, I think that's just gonna be great for adoption, right? Like...
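
The "difficult to run" point is mostly memory arithmetic: weight footprint scales with parameter count times precision. A back-of-envelope sketch with hypothetical parameter counts and precisions (not the actual specs of Kimi or the OpenAI model):

```python
def weight_memory_gb(params_billions, bytes_per_param):
    """Memory (GB) needed just to hold the weights, ignoring KV cache and activations."""
    # (params_billions * 1e9 params) * bytes_per_param / (1e9 bytes per GB)
    return params_billions * bytes_per_param

# A ~1T-parameter MoE at 8 bits vs a ~120B model at 4 bits (illustrative sizes)
print(weight_memory_gb(1000, 1.0))  # 1000 GB: needs a multi-GPU, multi-node deployment
print(weight_memory_gb(120, 0.5))   # 60 GB: fits on a single 80 GB GPU
```

That order-of-magnitude gap is the difference between "any neocloud can serve it" and "you need giga-brain infrastructure".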

    9. SG

      Yeah. One of my, um, hopes is, uh, for our companies that are doing more with reasoning, it is, like, they're still blocked on cost and latency.

    10. DP

      So, this is something that I've found very interesting, is that w- we, we've been trying to build a lot of alternative data sources for token usage. Um, who's using what tokens, what models, where, et cetera, why? And it's very clear that people aren't actually using the reasoning models that much in API. Like, Anthropic has eclipsed OpenAI in API revenue, and their API revenue is primarily not thinking. It's Claude 4, but it's not in the thinking mode. You know, code is, code being the biggest, uh, use case that's skyrocketing. Um, and the same applies to, like, OpenAI and, and DeepMind, and, and from what we see, querying big users and other ways of, like, scraping alternative data, because the latency issues, because the cost issues especially, right? The cost is just ridiculous.

    11. SG

      You're... Exactly. So I guess my view is, um, you're not allowed to have a tech podcast without saying the words Jevons Paradox now. And I, I think, like-

    12. DP

      Oh, nice.

    13. SG

      ... I, I think the behavior is gonna be, like, we see a lot of people use reasoning because it's so much cheaper to run if you take out a big piece of the margin layer and you make it smaller. And so I think, like, we have a lot of companies that are at scale who are using it, but it's so expensive that they restrain themselves.

    14. DP

      For a long time, OpenAI was charging more per token for the reasoning model, right, o1 and o3, uh, than they were for GPT-4o even though the architecture's, like, basically the same. It's just the weights are different. And there's, like, some reason for it to be a little bit more expensive per token because the context length is on average longer. But in general, like, it made no sense for it to be, like, what was it, like, 4X the cost per token? That didn't make any sense. And then finally they, like, cut it. Uh, but for a long time, not only was it, like, way more tokens outputted, it was also a way higher price per token, even though... And they were just taking that as margin.
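
The pricing gap compounds: reasoning models both emitted far more tokens and charged more per token. A toy calculation with made-up numbers (illustrative prices and token counts, not actual OpenAI list prices):

```python
def query_cost(output_tokens, dollars_per_million_tokens):
    """Cost of one query's output at a given per-million-token price."""
    return output_tokens * dollars_per_million_tokens / 1e6

base = query_cost(500, 10)        # non-reasoning: 500 output tokens at $10/M
reasoning = query_cost(5000, 40)  # reasoning: 10x the tokens at 4x the per-token price
print(f"${base:.3f} vs ${reasoning:.3f} per query -> {reasoning / base:.0f}x")
```

Under those assumptions the same question costs 40x more in thinking mode, which is why heavy API users restrained their reasoning usage.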

    15. SG

      Because they could, right?

    16. DP

      Yeah.

    17. SG

      Because they had the only thing out there.

    18. DP

      Yeah.

    19. SG

      Yeah.

    20. DP

      And then, you know, DeepSeek dropped and, and Anthropic and Google and others started releasing models and it, like, you know, commoditized quite a bit. But this is gonna just, like, kneecap, like, cut everyone off at the hip, right? Uh, and bring margins down again. So that'll be fun.

    21. SG

      Oh, who has an API business, you mean?

    22. DP

      Yeah, yeah.

    23. SG

      Yeah, yeah.

    24. DP

      For API for models that aren't, like, like, super leading edge.

  5. 10:48–17:26

    Evolution of Neoclouds

    2. SG

      What do you think, uh, evolves in the sort of neocloud layer over time?

    3. DP

      It's funny, every day we still find a new neocloud. Like, we have, like, like, 200 now. And still every day we find new ones, right? Like-

    4. SG

      Should they all exist?

    5. DP

      (laughs) Obviously not, right? Like-

    6. SG

      Okay.

    7. DP

      So, so to, to some extent it depends on what the neocloud business is, right? Like today, it's, there is quite a bit of differentiation between the neoclouds. It's not just, like, buy a GPU, put it in a data center. Otherwise you wouldn't have some neoclouds with, like, horrible utilization rate, and you wouldn't have some neoclouds who are, like, completely sold out on four, five, six-year contracts, right? Like CoreWeave, for example, who doesn't even quote most startups, or they just give them a stupid quote 'cause they're just like, "I don't want your business." Or, like, they want a long-term contract, right? Which a lot of people don't wanna sign. And so, like, there's quite a bit of differentiation in, in financial performance of these neoclouds, time to deploy, reliability, the software they're putting on top, right? Like, many of them can't even install Slurm for you. It's like, what are you doing? Like, and you should have some sort of, like, Kubernetes-

    8. SG

      So, very low-level hardware management, yeah.

    9. DP

      Yeah, yeah. It's, like, very... And it's like, to some extent from the investor side we see a lot more debt and equity flowing in from the commercial real estate folks.

    10. SG

      Mm-hmm.

    11. DP

      As commercial real estate has been really poor over the last couple years, few years, they've been starting to pour money into cloud space. And obviously the return profile is quite different because it's, like, a short-lived asset versus, like, a longer-lived asset. But at the end of the day, like, these companies, they're okay with a 10, 15% return on equity, right? Uh, and, and over time that's falling. That is... not okay for venture capital, right? And yet, a lot of these neoclouds are backed by venture capital. So a lot of these companies will fail, either because it no longer makes sense for them to continue to get venture funding, or they end up getting out-competed because they just can't get their utilization up, unlike, you know, some other clouds, right? Like, like the, like the, uh, CoreWeaves and Crusoes and, and such of the world, right? So there, there's sort of, like, a rock and a hard place for a hundred of these neoclouds. And, and there's many of them who are, like, "Oh, no, I purchased these GPUs. I have a loan. It costs me this much, and because my utilization is here, I'm, like, burning cash."

    12. SG

      Mm-hmm.

    13. DP

      Right? And they, they should at the very least not be burning cash, right? And so some of them are, like, you know, they're desperate to sell the remaining GPUs, so they go out to, like, you know, companies and give them insanely low deals. There, there are some startups who I really commend because they, like, really figured out how to get the desperate neoclouds to give them GPUs. But those neoclouds are gonna go bankrupt at some point because their cash flow is worse than their debt payment. But at the end of the day, like, there's gonna be a lot of consolidation. There is gonna be differentiation, right? There's a lot of software today, uh, but we have this, like, thing called ClusterMax where we review all the neoclouds, um, and, and major clouds, and it's like, like actually, some of these neoclouds are better than Amazon and Google and Microsoft in terms of software.
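
The "cash flow worse than the debt payment" failure mode is simple unit economics: rental revenue scales with utilization, while debt service and opex are fixed. A sketch with entirely hypothetical numbers (not any particular neocloud's books):

```python
def monthly_cash_flow(gpus, hourly_rate, utilization, debt_per_gpu, opex_per_gpu):
    """Toy neocloud unit economics: rental revenue minus fixed debt service and opex.

    hourly_rate in $/GPU-hour; debt_per_gpu and opex_per_gpu in $/GPU-month.
    """
    revenue = gpus * hourly_rate * utilization * 730  # ~730 hours in a month
    return revenue - gpus * (debt_per_gpu + opex_per_gpu)

# Hypothetical: $2.50/hr rate, $1,100/mo debt service and $300/mo opex per GPU
print(monthly_cash_flow(1000, 2.50, 0.85, 1100, 300))  # well-utilized: positive
print(monthly_cash_flow(1000, 2.50, 0.40, 1100, 300))  # half-empty cluster: burning cash
```

Because the debt payment doesn't move with utilization, the break-even sits at a fixed utilization level; below it, every month digs the hole deeper, which is what drives the desperate fire-sale pricing.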

    14. SG

      In terms of, um, uptime and availability, or however you, uh, measure that.

    15. DP

      Yeah, uptime, availability-

    16. SG

      Yeah.

    17. DP

      ... um, reliability, network performance. Like, there's just a variety of things; they don't have all the old baggage. Uh, but the vast majority are worse, and we, we measure across, like, you know, a bunch of different metrics, including the ones I mentioned, and security, and so on and so forth. But our vision of, like, ClusterMax is that it starts at, like, a really low stage today, which is, like, does the cloud work? And how long does it take the user to, like, get a workload running? Because you have Slurm installed or you have K8s installed and, um, you know, your network performance is good and your reliability is good and it's secure, right?

    18. SG

      Mm-hmm.

    19. DP

      Like, these are, like, table stakes. Like, what we consider gold or platinum tier today will be just, like, table stakes in, like, you know, six months, a year, a couple of years. Uh, there'll be a whole layer of, like, software on top, and then it's, like, do neoclouds build this software, right?

    20. SG

      Mm-hmm.

    21. DP

      And some of them are, right? Like Together, Nebius, um, are offering inference services on top, right?

    22. SG

      Mm-hmm.

    23. DP

      So they are, they are saying, "Hey, we actually wanna provide an API endpoint, not just rent GPUs." And CoreWeave, r- rumored by The Information to be, uh, attempting to buy Fireworks for the same reason, right? Like, do you move up or do you just slide down into, like, "I'm making commercial real estate returns"? Or you have to go crazy, right? Like, Crusoe's like, "We're gonna build gigawatt data centers," right? Like, okay, there's no competition there. There's, like, a few companies doing that, right? So it's very different. So you either have to go, like, really, really big or you need to move into the software layer, or you just make commercial real estate returns, uh, or you go bankrupt, right?

    24. SG

      (laughs)

    25. DP

      Like, these are the paths for all neoclouds, I think.

    26. SG

      I really have to believe there's a, a reason for being for these companies. And my, like, simple framework for it is, I think the software layer is really hard for people coming from the operations side to, to try and build, right? There's actually a lot of very specialized software, so I think people will buy or partner into it.

    27. DP

      Yeah.

    28. SG

      But if you think about other inputs, it could be, like, "I'm very good at, like, finding and controlling power agreements," right?

    29. DP

      Yeah.

    30. SG

      Um, it could be like, "I build at a scale other people are incapable of doing so," as you mentioned.

  6. 17:26–27:43

    What It Would Take to Challenge Nvidia

    1. DP

      ... just be able to pay away.

    2. SG

      I mean, I, I feel like the multi-trillion dollar question that you, um, have thought about for perhaps longer than, um, almost anyone else is, like, what does it take to actually challenge NVIDIA? You know, asking for a friend, what would it take? (laughs)

    3. DP

      The, like, you know, simple way to put it is, like, it's a three-headed dragon, right? Like you have, you have ... They're actually just really, really good at, you know, engineering hardware and GPUs. Like, that is difficult. Um, they're really, really good at networking. And then they're really ... I would actually say they're, like, okay at software, but everyone else is just terrible. No one else is even close on software. But, you know, and, and I guess in that argument, you can say they're great at software. But, like, actually, like, you know, installing NVIDIA drivers is not, like, not always easy, right?

    4. SG

      Well, there's great, and there's also just, like, well, there's, like, 20 years plus of work in the ecosystem, right?

    5. DP

      Yeah, yeah. I- yeah. And I think-

    6. SG

      There's today's capability and, like, usability, and there's just, like, mass of, like, libraries. (laughs) Yeah.

    7. DP

      Yeah, so I think NVIDIA is really hard to take down because of those three reasons. And it's like, okay, as a hardware provider, can I do the same thing as NVIDIA and win? No, they're an execution machine, and they have these three different pillars, right? I'm sure they have a lot of margin, but, like, you have to do something different... right? Um, in the case of the hyper-scalers, right, Google with TPUs, Amazon with Trainium, Meta with MTIA, they are making a bet of, "I can actually do something pretty similar to NVIDIA." If you squint your eyes now, like Blackwell and TPU is starting... Like, the, the NVIDIA architecture with TPU architectures are actually converging, like same memory hierarchies and similar sizes of systolic arrays. Like, it's actually not that different anymore. It's still quite different, right? Uh, but hand-wavy, it's, like, pretty similar. And Trainium and TPUs are very similar. Architecturally, the hyper-scalers are not doing anything crazy, but that's okay because they can just, like, do the mass, the margin game. That's fine. But for a chip company to try and compete, they must do something very unique. Now, if you do something unique, it's like, okay, all your energy is focused on that one unique thing, but on every other vector, you're gonna be worse. Like, are you gonna be there at the latest process node as fast as NVIDIA? No? Okay, that's like 20, 30%, right, on cost/performance and power, right? Are you gonna be on the latest memory technology as fast as NVIDIA? No, you'll be, like, a year behind. Great. Same, same penalty. Are you gonna be the same on networking? No? Okay. You know, you just stack all these penalties up, it's like, "Oh, wait, your unique thing can't just be, like, two to 4X faster. It has to be, like, way faster." But then the problem is, if you really look at it simplistically, right? Like, a flop is a flop, right?
Uh, again, like, this is super simple, but, like, there is not 10X you can get out of doing a standard von Neumann architecture on efficiency of compute. Um, in which case, do all of these things that NVIDIA will engineer better than you because they have a team of 50 people working on, you know, just memory controllers and HBM, and just, like, a- and networking, or actually, like, thousands of people working on networking. But, like, each of these things, do they just cut you, death by a thousand cuts? And that's like, "Oh, actually, what would have been 5X faster is now only, like, 2X faster. Plus, if I, like, misstep, I'm, like, six months behind, and now the new chip is there." Right? And you're screwed. So, or, or supply chain, or, like, intrinsic, like, challenges with, oh, okay, getting other people to deploy it now, or rack deployments. There's all these supply chain challenges, right? Like, literally, in Amazon's most recent earnings, they said their, like, chip architecture is not aggressive. Their, their rack architecture is very simple. It's not that aggressive. They were like, "Yeah, we have rack integration yield issues, which is why we've had..." Uh, which is they, like, blamed their miss on AWS for their Trainium not coming online fast enough because of rack integration issues. And when you look at the architecture, like, we have an article on it, but it's like, it's not, like, that crazy. Like, it's like what Google was doing, like, four or five years ago, right? It's like, "Oh, wait, supply chain is hard." And Amazon couldn't get everything in supply chain to work, and so therefore they missed their AWS revenue by a few percent, right? Which caused the whole stock market to freak out. But it's like, there are so many things that could go wrong in hardware, and the timescales are so long. And then the last thing is that, like, model architecture is not stagnant. If it was, NVIDIA would s- optimize for it.
But model architecture and hardware, right, software-hardware co-design is the thing that matters. Right? And these two things, you can't just, like, look at one in individual, right? Like, there's a reason why Microsoft's hardware programs suck, right? Because they don't understand models at all, right? Meta, Meta, their chips actually work for recommendation systems and they're deployed for recommendation systems, because they can do hardware-software co-design. Google is awesome because they do hardware-software co-design. Uh, why is AMD not catching up despite being awesome at hardware engineering? Well, yeah, they're bad at networking, but also they suck at software and they can't do hardware-software co-design. You know, there's, like, much deeper reasons why you can get into this, but you have to understand the hardware and the software, and they move in lockstep. And whatever your optimization is doesn't end up working, right? So one example is all of the first-wave AI hardware companies, right? Cerebras, Groq, uh, SambaNova- Yup. ... uh, Graphcore. Graphcore, yup. All of them made a very similar bet. Now, they were very different, right? Well, some of these are architecturally pretty weird, really. Yeah. Right, they're architecturally pretty weird, but they made the same bet on memory versus compute, right? "We're gonna have more on-chip memory and lower bandwidth." Right? Off chip, right? Because tha- that was the trade-off they decided to make. So all of them had way more on-chip memory, uh, than NVIDIA, right? NVIDIA, their, their on-chip memory has not really grown much from A100, H100, Blackwell, right? It's up 30% in, like, three generations. Whereas these guys had, like, 10X the on-chip memory, right? All the way back in, like, A... when they were competing with A100 or even the generation before. But that ended up being a problem because they were like, "Oh, yeah, we could just run the model on the chip," right? Mm-hmm.
"If we put the whole weight, all the weights on there. And then, you know, we'll-" Totally. "... be so much more efficient." Yeah. And then th- the models just got way too big, right? Yeah. And Cerebras was like, "Oh, wait, but our chip is huge." Yeah. "Oh, wait, but still the model's way too big to fit on it." This is, like, very simple, right? You know, the same thing's happening in the other direction, right? Like, some companies are like, "Oh, we're going to make our s- like, systolic array, your compute unit, super, super, super large, because let's say LLaMA 70B is an 8K hidden dimension and your batch and all that, like, it's, it's a pretty large matmul. Oh, great. Okay, we'll make this chip," and then all of a sudden, all the models get super, super sparse MoEs. Mm-hmm. Right? Like, the hidden dimension of DeepSeek's models are, like, really tiny because they have a lot of experts, right? Instead of one large matmul, it's a bunch of small ones that you route to, right? Like, and all of a sudden, like, if I made a really, really large hardware unit, but I have all these small experts, how am I gonna run it efficiently? You know, I, I, no one pre- They didn't really predict that the models would go that way, but then they ended up going that way. And this is like... This is actually the case with at least two of the AI hardware companies today. I don't wanna, I don't wanna shit talk them just because, you know, it's, uh, let's be friendly. Uh, but like, this is, like, like, clearly, like, what's happening, right? So it's like, you can make a b- decision. It's a hardware bet that will actually be way better on today's architectures, but then architecture evolves, and the generality of, like, NVIDIA's GPUs or even, like, TPUs and Trainium is, like, more general as an architecture. But then it doesn't beat NVIDIA by that much, right?
In which case, they're just gonna destroy you with their six months or a year ahead on every technology because they have more people working on it and their supply chain is better, right? So you, you, you... It's kind of really tough to make that architecture bet, have the models not just go in a different direction that no one predicted because no one knows where models are headed, right? Even like, you know, you could get Greg Brockman and he might, like, have, like, a good idea, but, like, I'm sure he doesn't even know what our models will look like in two years. So there's got to be a level of generality. And it's hard to, like, hit that intersection properly. And so I'm very hopeful people compete with NVIDIA. Um, I think it would be a lot more fun. There'd be a lot less margin eaten up by the infra and there'd just be a lot more deployment of AI potentially... um, if someone was able to compete with NVIDIA effectively. But NVIDIA charges a lot of money because they're the best. And like, if there was something better, people would use it, but there isn't. And it's just really hard to be better than them.
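
Dylan's MoE example is easy to make concrete: a compute unit sized for one big dense matmul sits mostly idle on many small expert matmuls, because each small output tile still occupies the whole unit. A rough utilization sketch with hypothetical tile and hidden-dimension sizes (not any real chip's numbers):

```python
import math

def tile_utilization(m, n, tile):
    """Fraction of a tile x tile compute unit doing useful work on an m x n matmul output."""
    tiles_needed = math.ceil(m / tile) * math.ceil(n / tile)
    return (m * n) / (tiles_needed * tile * tile)

# A unit sized for a dense 8K-hidden-dim matmul vs one small MoE expert's matmul
print(tile_utilization(8192, 8192, 4096))  # dense model: fully utilized
print(tile_utilization(1536, 1536, 4096))  # small expert: ~14% utilized
```

Under these assumptions the same silicon that is 100% busy on a dense 8K-hidden-dimension layer wastes roughly six-sevenths of its area on a 1.5K-dimension expert, which is the "how am I gonna run it efficiently?" problem in one number.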

    8. SG

      I mean, you have to give the first gen AI hardware companies some credit because they, like, made a secular-

    9. DP

      Yes.

    10. SG

      ... correct decision about the workload. But then the architectural decisions, like, ended up being hard to predict correctly.

    11. DP

      Yeah.

    12. SG

      Right? Then you have the cycle of NVIDIA innovation, which is really hard to compete with.

    13. DP

      Yep.

    14. SG

      Um, both hardware and also, as you said, supply chain issues.

    15. DP

      Even just putting together servers is hard.

    16. SG

Yes. Um, I think the thing that you point out that, like, people oversimplified was, with maybe the current generation of AI chip startups, they're like, "We're betting on transformers," and it's a lot more complicated than that in terms of-

    17. DP

      Yeah.

    18. SG

... like, workload at scale and continued evolution in model architecture. And it's also not exposed, so if you're not working with the SOTA labs from the beginning, then you can't make predictions, because nobody can make a lot of predictions right now. It's very hard to say, "I'm going to be better at the workload two years from now," in a very comfortable way-

    19. DP

      Yeah.

    20. SG

      ... with no other changes happening. Like, I can't make that bet right now.

    21. DP

Yeah, and it's like, one of the interesting things about OpenAI's open source models is it's all their training pipelines, but on a quite boring architecture. Right? It's not the crazy, cool architecture advantages that they have in their closed source models, which make them better for long contexts or more efficient KV cache or all these other things, right? They're doing it on a standard model architecture that's publicly available. They intentionally made the decision to open source a model with a boring architecture that's pretty much open source already, right? People have already done all these things, and they kept all the secrets internal that they wanted to keep. And it's like, what's in there, right? Are they even doing standard scaled dot-product attention? Probably, but there's probably a lot of weird things they're doing which don't map directly to hardware, like you mentioned, right? A transformer chip architecture, there's a lot more complicated here than just, oh, it's optimized for transformers, because so is an NVIDIA chip and a TPU, and their next generation is more optimized for it. They take steps towards it, they don't leap, but as long as they're close enough to where you are architecturally optimized for the workload, they'll beat you because of all the other reasons.
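For reference, the "standard scaled dot-product attention" mentioned here is just a couple of matmuls and a softmax. A minimal single-head NumPy sketch (toy shapes; no KV cache, masking, or batching, all of which real implementations add):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V, the vanilla attention mechanism."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (seq, seq) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # each row sums to 1
    return w @ v                                  # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d = 8, 16
q, k, v = (rng.standard_normal((seq_len, d)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (8, 16)
```

Deviations from this baseline (custom KV-cache layouts, attention variants, and so on) are exactly the kind of closed-model detail that doesn't map cleanly onto hardware specialized for the textbook version.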

    22. SG

And I think your description of, like, how a chip startup or any vendor might win by specializing, that actually is really hard in this era. Like, generalization may-

    23. DP

      And- and-

    24. SG

      ... continue to win to a degree, yeah.

    25. DP

      And it happened with all the edge hardware companies too. You know, we talk about the first gen AI hardware companies for data center, there were a handful, but for the edge there were like 40, 50. And like, none of them are winning because it turns out the edge is just, take a Qualcomm chip or an Intel chip that's made for PC or smartphone and deploy it on the edge, right? Like, that ended up being way more meaningful. So- so it ends up being like, the incumbents, they can take steps towards what you're going for, and if you didn't execute perfectly or if the models didn't change their architecture away from what you thought it would

  7. 27:43–28:18

    What Would an Nvidia Challenger Look Like?

    1. DP

      be, you end up failing.

    2. SG

      If you had to make a bet that something becomes competitive, what is the configuration or company type that- that does that?

    3. DP

      I don't want to shill any company that I've invested in or anything like that, and so therefore I'm just-

    4. SG

      Not investment advice, okay, yeah.

    5. DP

No, no, no. But like, I would just say I probably think that AMD GPUs or Amazon's Trainium will be more likely to be the best second choice for people, or Google TPU of course, but I think Google's just more interested in it for internal workloads. I just think those will be much more likely options to succeed than a chip hardware startup, yeah. But I mean, I really hope they do, 'cause there's some really cool

  8. 28:18–34:48

    Understanding Operational and Power Constraints for Data Centers

    1. DP

      stuff they're doing.

    2. SG

      If we zoom out to, um, the macro and we think about just the scale of, um, hardware and data center deployment for these workloads, people talk a lot about the operational constraint on building data centers of this size, the, uh, power constraints. I think in particular on the power side, it's very interesting how that practically shows up. Is it generation, at scale, at cost? Is it grid issues? Is it real... Like, how- how should, you know, more people in technology understand this?

    3. DP

Yeah, so supply chain is always, like, fun, because people want to point at one thing as the issue, but it always ends up being that these things are so complicated. Like, if one thing was solved, you could increase production another 20%, and then something else would be the issue.

    4. SG

You think it's a multi-bottleneck issue?

    5. DP

      Yeah, or like-

    6. SG

      Yeah.

    7. DP

... "Hey, for company A, because their supply chain is this, this is the issue. And for company B, this is the issue." But, you know, that's in generalities. I think zooming out, right, Noah Smith, Noahpinion, he had a really fun blog about, like, is this AI hardware build-out going to cause a recession? I think it's actually funny, because you can flip the statement and be like, actually, the US economy would not be growing that much this year if it weren't for all the AI build-outs and, as a result, data center infrastructure. As a result, electricians' wages have soared. As a result, power deployments and other capital investments which have 15-, 30-year lifespans are being made, and all of this CAPEX is in turn actually growing the economy. And maybe the economy wouldn't even be growing much, or at all, if it weren't for all of these investments.

    8. SG

One thing that is perhaps overlooked from the, um, White House AI action plan was the view of, like, we're going to build these AI data centers in the United States, and we're actually going to need a lot of general investment beyond the GPUs and the power, which are everybody's-

    9. DP

      Yeah.

    10. SG

      ... first two items-

    11. DP

      Yeah.

    12. SG

... into, like, labor, for example. Right? So if you just, for simplicity's sake, say it's the size of Manhattan and we have to run it, and it's a new system with changing topology and a very high degree of relatively novel hardware with failure-

    13. DP

      Yeah.

    14. SG

      ... and like lots of networking, then I'm like, hmm, like kind of feels like we need to have a bunch of new capacity, like from a labor or robotics sort of view.

    15. DP

In, like, '23 it was very simple. It's like, NVIDIA can't make enough chips. Oh, okay. Why can't NVIDIA make enough chips? Oh, CoWoS, right? Chip-on-wafer-on-substrate packaging technology. And it was like, oh, HBM, right? Like, those were... It was very simple in '23-

    16. SG

      Bonders, yeah. Yeah.

    17. DP

... '24. Like, yeah, all these tools involved in that supply chain. It was great. But then it very quickly became much more murky, right? Then it was like, oh, data centers are the issue. Oh, okay. We'll just build a lot of data centers. Oh, wait, substation equipment and transformers are the issue. Oh, wait, power generation is the issue. It's not like the other issues went away, right? Like, actually, CoWoS is still a bottleneck, and HBM is still a bottleneck. Optical transceivers are still a bottleneck, but so is power generation and data center physical real estate, right? Like I mentioned, Meta is literally building these temporary tent structures to put GPUs in, because building the building takes too long.

    18. SG

      (laughs)

    19. DP

      And it takes too much labor.

    20. SG

      Yeah.

    21. DP

      Right? As you mentioned labor, right? That's, like, one way they were able to remove, uh, part of a constraint.

    22. SG

      Yeah.

    23. DP

They're still constrained on power, and they had to delay the bring-up of some GPUs in Ohio because AEP, the utility that runs the grid in Ohio, had some issues, right? They would, like, bring in a generator or something, right? Oh, okay, great. We'll buy our own generators and put them on site. Oh, wait, now there's an eight-year backlog, or a four-year backlog or whatever, for GE's turbines.

    24. SG

      Yeah.

    25. DP

Oh, okay, great. Um, I'm Elon, I'm gonna buy a power plant from overseas that already exists, and we're gonna move it in. Okay, great, now there's, like, permits and people protesting against me in Memphis. You know, there's a bajillion things that could go wrong, and labor is a huge one. I've literally had people in pitches be like, "No, no, no. We've already booked all the contractors, so no one else is going to be able to build a data center of this magnitude in this entire area besides us."

    26. SG

      Because we took all the people. (laughs)

    27. DP

      We took all the people. They're gonna have to fly them in. But it's like, okay, fine. Like, you can fly them in, but it's like, there's just, like, not that many electricians in America. And as a result, we've seen the wages rise a lot for people building data center infra. There's a group of, like, these Russian guys who used to work for Yandex, Russia's search engine, who, like, wire up data centers, who now live in America, and they get paid a ton. Like, and they get paid bonuses for being faster, and therefore, they do, like, certain drugs to be able to finish the build-out faster.

    28. SG

      What? (laughs)

    29. DP

Because they get bonuses based on how fast they build it, right? Like, there is crazy stuff going on to alleviate bottlenecks, but there's bottlenecks everywhere. And it really just takes a really, really hyper-competent organization tackling each of these things and creatively thinking about each of these things. Because if you do it the lame old way, you're gonna lose, you're gonna be too slow, right? Which is partially why Microsoft is not building Stargate for OpenAI, right? It would have just been too slow, doing it the lame old way. You have to go crazy. That's why Microsoft rents from CoreWeave a ton, right? Because, "Oh, wait, we need someone who can do things faster than us," and, "Oh, look, CoreWeave's doing it faster." And now OpenAI is going to Oracle and CoreWeave and others, right? Nscale in Finland and all these other companies all around the world. The Middle East, right? G42. Anywhere and everywhere they can get compute, because you put your eggs in many baskets, and whoever executes the best will win. And this infrastructure is very, very hard. Software has fast turnaround times. It's still hard, software is not easy, but the cycle time is very fast: try something, fail, try something else. It is not for infra, right? Like, what has xAI actually done to deserve their prior funding rounds? They haven't released a leading edge model, right? And yet their valuation is higher than Anthropic's today, right? At least, you know, Anthropic's raising, but whatever. It's Elon: they've tackled a problem creatively and done it way faster than anyone else, which is building Colossus. And that's commendable, because that is part of the equation of being the best at models, right?

    30. SG

      It's the input. Yeah, besides the talent, yeah.

  9. 34:48–43:01

    Dylan’s View on the American Stack

    1. DP

      and tackled creatively.

    2. SG

      Speaking of, like, the policy, uh, and, uh, geopolitics implication here, like, what do you think about the, you know, White House, um, implication that America needs to, like, export the AI stack or, like, needs to control important components of it? Like, it's better for us to be exporting NVIDIA chips than to foster a new industry. It's better for us to have, like, a globally leading open source model, et cetera. Like, what actually makes sense to you there?

    3. DP

      I want to tell a crazy story. I was in Lebanon-

    4. SG

      Okay.

    5. DP

      ... uh, for a week.

    6. SG

      It's a good start. Yeah, yeah.

    7. DP

      Yeah. (laughs) This is completely unrelated-

    8. SG

      Yeah.

    9. DP

      ... but it just popped into my head. I think it'll be entertaining.

    10. SG

      Yeah.

    11. DP

      I was in Lebanon. I was with a few of my friends. Uh, so it's, like, two Indian people, two Chinese people, then a Lebanese person, right? Um, and these, like, 12-year-old girls ran up to the Chinese woman that was with us, like, my friend, and they were like, "Oh, my God, your skin's so beautiful. Do you like sushi?" Right? It's like, fine, you're just ignorant. But what was really interesting is, like, when they asked where we're from, we're like, "San Francisco." They're like, "Do people get shot in the streets?" Because their entire worldview was built from TikTok-

    12. SG

      Okay, yeah.

    13. DP

      ... of politics. And it's like-

    14. SG

      Yeah.

    15. DP

      ... when you think about the global propaganda machine that is Hollywood, and it's not intentional, it's just American media is pervasive, it built such a positive image of America. Now, like, with monoculture broken and it's more social media based, a lot of the world thinks America is, like, people are getting shot all the time and it's, like, really bad and it's, like, bad lives and people are working all the time, it's unsafe and, like... You know, like, Europe has a certain view of America and, like, I don't think it's accurate. Like, random Lebanese 12-year-old had a really negative view of some... Like, they liked America, they loved Target for some reason-

    16. SG

      (laughs)

    17. DP

      ... because some influencers posted TikToks about Target-

    18. SG

      Yeah.

    19. DP

... but, like, they had negative views of America. And in that sense, what's important is that the world should still run on American technology, right? And it generally does still, in terms of the web, although ByteDance's TikTok has broken that to a large degree. But in this next age, do you want them to run on Chinese models, which have Chinese values, which then spread Chinese values to the world? Or do you want them to have American models, with American values? Like, you talk to Claude and it has a worldview, right?

    20. SG

      Yeah.

    21. DP

And it's like, I don't know if you want to call that propaganda or what. There's a worldview that you're pushing, right? And so I think it makes sense that we need that worldview espoused. Now, how do you do that, right? The prior administration and the current administration had different viewpoints on this, right? The prior administration said, "Yes, we would love for the whole world to use our chips, but it has to be run by American companies." And so it was like, "Microsoft, Oracle, we're cool with you building shitloads of capacity in Malaysia. We don't want random other companies doing it in Malaysia." And so the prior diffusion rule had a lot of technical mechanisms, like, you could have these licenses and all this. And it was very hard for random small companies to build large GPU clusters, right? But it was very easy for Microsoft and Oracle to do it in Malaysia. Of course, the current administration tore that up, and they have their own view on things. I mean, I think there were a lot of things wrong with the diffusion rules, right? They were just too complicated. They pissed a lot of people off, et cetera. Now they have a different view, which is, what did they do in the Middle East, right, with the deal they signed? Well, actually, most of those GPUs are being operated by American companies or rented to American companies, right? Either or, right? Like, G42 operating them but renting them mostly to OpenAI and such, for a large part. Or Amazon and Oracle and others operating the GPUs themselves in the Middle East. So it's like, okay, that's effectively the same thing, but done in a very different way. That is still, I think, a coherent view, right? Which is, we want America to be as high in the value stack as possible, right? If we could sell tokens, or if we could sell services, we should.

    22. SG

      Mm-hmm.

    23. DP

Okay, but if we can't sell the service, let's at least sell them tokens. Okay, if we can't sell them tokens, at least sell them, like, infra, right? Whether it be data centers or renting GPUs or just the GPUs physically. And it sort of makes sense, right, in the value chain: sell them the highest value, highest margin thing, where we capture most of the value, and squeeze down the stack to where, for the very bottom, like the tools to make chips, maybe you shouldn't sell. And so current export controls and policy dictate that, yes, it's better to sell them services, but sell them both, right? Give the option, let us compete, and don't let anyone else win. I think the challenge here is, how much are you enabling China by selling them GPUs? How much fearmongering around Huawei's production capacity is there? How realistic is it versus not, because of the bottlenecks, like the sanctions America's made Korea put on China for memory, or Taiwan on China for chips, or US equipment on China, right? There's a lot of different sanctions. Many of these are not well enforced or have holes, but it's a very difficult argument on how much GPU capacity should be sold to China. A lot of people in San Francisco, frankly, say don't sell China any GPUs. But then they cut off rare earth minerals, and, you know, ostensibly, most people think the deal was that you get GPUs and also EDA software, 'cause the administration banned EDA software for a little bit, just for a few weeks basically, until China was like, "Okay, we'll ship rare earth minerals." You can't just ban everything, because China can retaliate.
If they banned rare earth minerals and magnets and such, car factories in America would've shut down and the entire supply chain there would've had, like, hundreds of thousands of people not working, right? Like, you know, like there is like-

    24. SG

      Yeah.

    25. DP

      There is a push and pull here.

    26. SG

      There's a standoff here, yeah.

    27. DP

There is a push and pull here. So, like, do I think China should just have the best NVIDIA GPUs? No, that would suck. But can you give them no GPUs? No, they're gonna retaliate. There is a middle ground, and Huawei is eventually going to have a lot of production capacity, but there's ways to slow them down, right? Like, properly ban the equipment, because there's a lot of loopholes there. Properly ban the sub-components, like memory and wafers, 'cause Huawei is still getting wafers from TSMC in Taiwan through, like, shell companies, right? There's a lot of enforcement challenges, because parts of the government are not funded properly or not competent enough, and have never been, right? So how do you work within this framework? Well, okay, fine, we should sell them some GPUs, which kind of slows them down on the Huawei front, although not really, right? But it also gets us back the rare earth minerals. But don't sell them too many, right? How you find that massive gray area is what the administration's grappling with, in my view.

    28. SG

Implied in that opinion is your belief that they are going to be able to build NVIDIA-equivalent GPUs eventually.

    29. DP

      Um.

    30. SG

      If forced.

  10. 43:01–44:22

    What Dylan Would Ask Mark Zuckerberg

    1. SG

      is the game.

    2. DP

      (laughs)

    3. SG

      Um, I wanna ask you, like, a wild card, uh, question to, uh, finish out. Um, we're trying to get Mark to do the podcast.

    4. DP

      Zuck?

    5. SG

      Yes. Uh, you can ask him any question. What would you ask? Mark, you gotta do the podcast.

    6. DP

I thought, like... Did you read the doc, the page they put up? I thought it was very interesting that they were like, "We want AI to be your companion." So my question to him is not around his infra stuff, 'cause I feel like I know most everything there. You can figure that stuff out from supply chain and, like, satellites and all this stuff. But the interesting thing I'm curious about is, philosophically, what exactly does the world look like if everyone is talking to AIs more than other people, or if they're interacting socially with the AIs more than other people? Do we lose our human element? Do we lose our human connection? It's not the same thing as, hey, I'm posting on social media and we're interacting with our social media posts, which already breaks the brain of a lot of people. What happens when it's, like, always on your face? You know, his worldview is, like, Meta Reality Labs makes these devices that you wear, and they have all this AI on them, and you're talking to the AI companion all the time. How does that change the human psyche? This human-machine evolution, what are the negative ramifications of it? What are the positive ramifications? How are you going to make sure that there's more positive ramifications from this than, like, the sloppification and complete brain rot of our youth, right? Which, I love my brain rot, right? Like

  11. 44:22–46:51

    Poker and AI Entrepreneurship

    1. DP

      it's like, okay.

    2. SG

      Obviously, the coding wars continue to be like very central.

    3. DP

      Mm-hmm.

    4. SG

And we were talking about Cognition's relevance and, like, how to think about the strategy here. But I do think it's really funny what flipped your bit on Cognition. Can you tell the story?

    5. DP

I thought Cognition was NGMI, right? Like, you know, OpenAI, Anthropic, xAI, et cetera, they're just gonna make better code models. They just have way more resources; general models will win. You know, I hadn't really met too many people there. It was just a pure vibes-based thing. And I had used a little bit of Devin, but I was like, "Whatever," right? It was like, "Claude Code seems better," and we use that internally. But I went to Coatue's East Meets West event. It's an awesome event where there's people from Asia, like all these CFOs and CEOs of major Chinese companies; East Coast of the US, all these finance bros; also West Coast, a lot of tech people, right? So you and I were both there. There were people from governments and major companies. And Scott was there. I spoke with him very briefly, but then what was interesting is, they have a poker night one night and everyone gets plastered. The leader of Coatue is like-

    6. SG

Laffont, yeah.

    7. DP

      ... very good at poker. These hedge fund guys are just good at poker generally.

    8. SG

      And they love it.

    9. DP

Tech people, they like poker as well.

    10. SG

      Yeah.

    11. DP

You know, there's a big poker culture in the Bay. I was playing. I'm okay, right? But I look over at the super high stakes table, and Scott's just dominating everyone, right? I'm like, "What is going on?" Like, how are you taking chips from the CEO of a major Chinese company? I don't want to name people's names because I think there's-

    12. SG

      Yeah.

    13. DP

... like, some terms around naming who's there. But, you know, you're winning a lot of chips from a lot of big people. And all of a sudden my vibes were like, "I don't know, maybe he can win. Maybe he can take from the lion, you know?" So I was very excited about that. I thought it was funny. I still have zero... Like, I have not done much due diligence on their code product, nor on Claude Code, beyond the fact that we use it. But it's like, you know, cool.

    14. SG

      Well, I think Windsurf acquisition part two is like a- a pretty good hand to play here. Um, and, uh, you know, as somebody who invests a lot at a, you know, violently competitive application level-

    15. DP

      Yeah.

    16. SG

      ... poker game is live, man. Everybody, they're-

    17. DP

      (laughs) Exactly.

    18. SG

You just invest in live players.

    19. DP

Exactly. And so I just loved that, you know, that was how he dominated everyone. And it's such a stupid reason, 'cause I pride myself on being analytical and data-driven. And it's like, you know, vibes.

    20. SG

      Correct. For any entrepreneurs listening, I think like, you know, Dylan might angel invest or we might back you fully if you- if you win the cognition poker game.

    21. DP

      (laughs)

  12. 46:51–47:17

    Conclusion

    1. DP

    2. SG

      Uh, and we'll host at Conviction. Um, okay. Think we got it? Good. Awesome.

    3. DP

      Yeah. Thank you.

    4. NA

      (instrumental music)

    5. SG

      Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you wanna see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way, you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.

Episode duration: 47:17


Transcript of episode vGhlJqnECd0
