The Twenty Minute VC

Arthur Mensch: Open vs Closed - Who Wins and Mistral's Position | E1146

Arthur Mensch is the Co-Founder and CEO of Mistral AI. Since its inception in May 2023, Mistral has raised over $520M in funding from investors like Andreessen Horowitz, General Catalyst, Lightspeed Venture Partners, and Microsoft, with a current valuation of $2 billion. Before founding Mistral, Arthur was a research scientist at DeepMind, one of the leading AI institutions in the world.

-----------------------------------------------

Timestamps:
(00:00) Intro
(00:47) Background
(07:08) Efficiency vs. Scale in Model Development
(10:21) Challenges & Opportunities for Improving Model Quality
(24:53) The Decision to Close Some Models
(25:53) Balancing Research & Sales Teams
(30:06) The Readiness of Enterprises for AI Adoption
(34:57) European vs. US Investors
(40:18) Does the Source of Funding Matter for Scaling Constraints?
(46:45) Quick-Fire Round

-----------------------------------------------

In Today's Episode with Arthur Mensch We Discuss:

1. From Models to Team Building: Arthur's Greatest Lessons at DeepMind
What were Arthur's biggest lessons from his time at DeepMind? How did DeepMind shape how Arthur built Mistral? Why does Arthur believe smaller teams are better for AI? Why did Arthur decide to leave DeepMind and start Mistral?

2. Scaling Mistral to a $2 Billion Valuation Within a Year
What made Mistral 7B so successful? What did Arthur learn from the model release? What are the biggest barriers at Mistral today? How does Arthur balance the sales and research teams at Mistral? What does Arthur know now that he wishes he had known when he started Mistral?

3. How to Win in AI: Open Source, Cost & Adoption
Why did Arthur open-source some models? Why did he close some? How quickly will the cost of compute go down? Why does Arthur believe marginal costs will not go to zero? How will open-sourcing LLMs affect the marginal cost? Does Arthur think open source is ready for enterprise adoption? What questions should enterprises be asking about AI adoption today? What are the biggest challenges to AI adoption today?

4. The Future of LLMs
What does Arthur think are the largest bottlenecks of model quality today? Does Arthur think future models will be more generalized or vertical-focused? What does Arthur think about the future of commoditization in models? Why is Arthur optimistic about the profitability of the application layer of AI? How should models differentiate themselves today?

-----------------------------------------------

Subscribe on Spotify: https://open.spotify.com/show/3j2KMcZTtgTNBKwtZBMHvl?si=85bc9196860e4466
Subscribe on Apple Podcasts: https://podcasts.apple.com/us/podcast/the-twenty-minute-vc-20vc-venture-capital-startup/id958230465
Follow Harry Stebbings on Twitter: https://twitter.com/HarryStebbings
Follow Arthur Mensch on Twitter: https://twitter.com/arthurmensch
Follow 20VC on Instagram: https://www.instagram.com/20vchq
Follow 20VC on TikTok: https://www.tiktok.com/@20vc_tok
Visit our Website: https://www.20vc.com
Subscribe to our Newsletter: https://www.thetwentyminutevc.com/contact

Harry Stebbings (host), Arthur Mensch (guest)
Apr 29, 2024 · 50m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–0:47

    Intro

    1. HS

      Do you feel like you have enough cash now? (cash register dings)

    2. AM

      Uh, I guess a startup is always fundraising.

    3. HS

      (laughs) What are the biggest barriers to Mistral today?

    4. AM

      We are still bottlenecked by compute for sure, but that's because we don't have much of it. We have 1.5k H100s, which is a few percent of what our competitors have.

    5. HS

      Was it a mistake for you to not scale that quicker?

    6. AM

      I mean, you can't really scale that much quicker. You can't raise, like, two billion on the seed round. I mean, at least you couldn't in 2023.

    7. HS

      Ready to go? (instrumental music plays) (mouse clicking) Arthur, I am so excited for this. JC introduced us quite a long time ago now. I've known you for a while. I've been wanting to make this happen for a while. So thank you so much for joining me today.

    8. AM

      Thank you for having me. Um, it's a pleasure.

  2. 0:47–7:08

    Background


    2. HS

      Ah, the pleasure is mine, my friend. But I wanna start: how would your parents or teachers have described the young Arthur? I'm just always intrigued by the characteristics and traits of the best founders. How would they have described a nine, 10-year-old Arthur?

    3. AM

      I guess I was a bit curious and a bit stubborn, I should say. And not very nice (laughs) to my brothers, I think. But that improved over time. I was the eldest of them also. I don't know, you should ask them, but I think they have good memories, hopefully.

    4. HS

      Do you know what? Sadly, your mother wasn't in our reference list, so we missed that one out, but, uh- (laughs)

    5. AM

      Yes. Okay.

    6. HS

      ... you know, JC provided some great commentary. So I do wanna start though also, you know, growing up, what was your first exposure to AI? You're a kid in France; how did you first get exposed to AI and machine learning, and what was that first passion point?

    7. AM

      I think that was Andrew Ng flying a helicopter backward. It was flipped around, and it's a control problem which is not easy to solve, and I'm not sure it was really AI-related. I think he was saying he was using RL to control all of this. But that's the first memory I have of being shown what you could do with machine learning at the time. And that was in 2013, I think.

    8. HS

      (laughs) Most recently though, you spent two and a half, three years at DeepMind. Can I ask, what are the biggest takeaways for you from that experience, and how did they impact how you think about building Mistral?

    9. AM

      A team of five is faster than a team of 50, except if you organize the team of 50 as 10 teams of five that are sufficiently uncoupled. That's one finding that I learned the hard way at DeepMind, and the reason why we created the company in a slightly different way in terms of the organization of the science team. And also the reason why we knew we had a chance to do interesting things with a smaller team.

    10. HS

      Can I just ask, sufficiently uncoupled: do you not lose efficiency, or is there not a leakage between those silos? Does it actually create inefficiency by having such silos?

    11. AM

      You have to share some things. So you share the infrastructure, you share the code base, you share findings. But, you know, we're doing general-purpose models, and with general-purpose models, you need to evolve them in different directions. So you need to make them speak different languages, you need to make them able to code, able to do mathematics, able to reason, you need to add multimodality to them. All of these things are loosely coupled. It's useful if you use the same framework for optimization, for data, for training. But you don't want to have your teams spend their entire day in meetings for coordination, and it's actually pretty hard to figure out. I think so far we've managed to scale it relatively well, although the team is only 25 people, so that's actually not super challenging. It will become more and more of a challenge. But yeah, that's what I remember from DeepMind. It worked very well at the beginning. Gemini was a bit too slow, and I think they've recovered sufficiently well since. We have optimized the team to be as fast as possible and to ship as fast as possible.

    12. HS

      Was it an easy decision to leave to start Mistral? You know, you're at DeepMind, one of the best institutions in the world for AI with some incredible talent around you. Was it an easy decision? And just take me to that moment when you decided to leave to found or co-found Mistral.

    13. AM

      So it's not a zero-to-one decision. It's not a binary decision. You start to think, like, "I'm 10% leaning on leaving," and then it grows, and at some point, you cross the threshold, and you say, "Okay, well, now that I'm sufficiently decided, there's no way in which I stay more than a few days." Otherwise I wouldn't be candid with my colleagues. And so that's how you get started. You say, "There's no turning back, actually."

    14. HS

      What was that point for you?

    15. AM

      That point for me was probably around the end of March last year. I decided to leave on a Friday, and I resigned on the Monday. (laughs)

    16. HS

      (laughs) I love that.

    17. AM

      And so you do. You can't stay if you've decided to resign. Otherwise it's not very fair.

    18. HS

      No, I totally agree with you. Now, I do wanna run this with some chronology. I spoke to so many of your advisors, investors, even users, and I wanna start with actually the first model, you know, Mistral 7B being one of the most popular releases, you know, a while ago now. Why do you think it was so popular? What do you think you did so right? And what did you learn from that?

    19. AM

      I think it served two purposes. The first was to show that there was a lot of slack in compressing models. So from a scientific perspective, it was a good finding and a good learning for the community. It also filled a gap in the efficiency-to-performance 2D space of models, where there was definitely something missing. 7B is the size that allows you to run a model efficiently on your MacBook or on your smartphone. And we made it sufficiently smart so that it was still useful. There were already 7B models before, but they weren't good enough to do interesting applications. And so by targeting this specific space, we talked to developers immediately, because developers, like the casual developer, run on a gaming GPU or on their MacBook. So it created a lot of curiosity and adoption because it was a missing spot in the performance-to-efficiency space.

    20. HS

      I totally get you. When you look at, like, lessons from that and how it impacts future releases, any that really stand out for you?

    21. AM

      I guess it taught us that there was a lot of interest in efficiency rather than scale, and so that's why we continued targeting very efficient models with Mixtral 8x7B and, more recently, Mixtral 8x22B, ensuring that, for a certain cost and a certain size, we were reaching the top performance of the market. That has been our major motivation to target efficiency while simultaneously scaling to

  3. 7:08–10:21

    Efficiency vs. Scale in Model Development

    1. AM

      larger and larger models.

    2. HS

      I spoke to Sarah Guo before the show, and she raised the core question, which I think is, bluntly: with the focus on efficiency and the efficiency frontier, does scale matter?

    3. AM

      Well, scale matters in the sense that if you spend more training compute, you can make the models more compressed. So you do need some compute to compress models. Now, scale isn't the only ingredient in the recipe: you also need proper data, otherwise you hit some data-quality limit. You need proper training techniques. You need to figure out a few things... I mean, people call them compute multipliers, I guess: how do you actually make efficiency gains that don't cost you compute? Because compute is expensive. And so one of the things we do at Mistral is try to harvest these compute multipliers.

    4. HS

      Can I ask, in that space of efficiency gains that don't cost more compute, is there much more efficiency we can eke out? I'm totally naive here. Is there a lot for us to eke out, or are we already working at marginal improvements?

    5. AM

      I think it's an open question. I believe there is. I believe we can make models that are much better for a certain size, but it's as open a question as: can you make a much better model on the same kind of data by making it bigger and training it for longer? These are things you need to discover along the way. You can try to predict the kind of performance you will achieve, but at the end of the day, you need to try it out. So it's really very much a research field. You need to do the research and you need to try things. And that's what we've been doing.

    6. HS

      So I asked Sam Altman this question. What is the end state for the model landscape? Most people say, "Ah, it'll become commoditized, and actually there'll be 12 players and it'll be a race to the bottom." What is the end state for models in your mind and how do you think about the commoditization question?

    7. AM

      I think the end state is to have more features on developer platforms: features that allow you to do customization, to make low-latency models that serve a certain purpose, to evaluate them and to improve them over time. And so the model is only a tiny part. I mean, it's a central part, but it remains a tiny part of an application. And across time, when you deploy an application that you expose to users, you want to ensure that it works, ensure that its latency reduces over time, ensure that its quality increases over time. So I think the end state is that models are effectively going to be a starting point for any AI application developer. They need to be surrounded by tools, by a lifecycle-management platform, basically, and that's the one thing that we started to build. General-purpose models are a bit undifferentiated, but the differentiation that you need to create for your application comes from the data you put into it, the user feedback that you gather, and the intelligence that you have to figure out what the application should be doing. And that is not commoditized at all. There's no recipe that allows you to go from a general-purpose model to a model that is super good and better than all the others at your specific task. I think this is a missing piece in the puzzle, and that's one of the aspects where we're putting our strength

  4. 10:21–24:53

    Challenges & Opportunities for Improving Model Quality

    1. AM

      on the product side.

    2. HS

      Sam and Brad said the other day that models just aren't actually that good yet, and they need to improve a lot in quality. What are the largest constraints or bottlenecks on model quality today, and what needs to change for them to improve?

    3. AM

      I think data quality is a constraint. How do you leverage the entire world's knowledge and ensure that the model follows a certain path toward learning more and more complex things? That's a very important part, and I think it has been a neglected part. There's obviously compute, but given the amount of data we have at hand, compute is no longer the bottleneck. The bottleneck is more the data at that point, if you look at text-to-text models. And so the question is, how do you refine the data and feed very high-quality data to the model in order to improve it over time? In that setting, one bottleneck associated with bringing better model performance is the question of how you evaluate that performance. You need very good evaluations that target very specific topics. Like, you want the model to be good at helping diagnosis in a hospital, but in French. And oftentimes you're a bit out of domain compared to the data you have, and that's where you should identify a gap and try to fill it. So pushing the models' capabilities also becomes a question of mapping where they're failing and figuring out ways of improving them. For instance, if they are failing at mathematics, how do you improve their mathematical thinking? How do you improve the way they demonstrate theorems? And the answer to this is very different from the way you answer the question of how you improve a medical diagnosis in French, for instance.

    4. HS

      Will we see large-scale generalized models that are able to answer huge swathes of very complex problems, or will we see much more vertically specific, smaller models that are much more vertically aligned?

    5. AM

      Yeah, we believe that. And actually, these vertical models are not going to be out there. They're going to be built by the application makers, because the only way you can make a low-latency model that is super good at a specific task is to get rid of the general-purpose aspect, because a general-purpose model is a bit bloated. It can think about everything. But you want your model to think thoroughly about a specific topic, so that you can call it in your AI application while maintaining a good user experience with low latency.

    6. HS

      What role do you play in that world, then? I'm so sorry for asking blunt questions, but if it's actually the application layer where you have that specific model creation, where that kind of value accrues, where do you play in that?

    7. AM

      It's a very hard job to make a specialized model. It's actually very tied to the way you create a pretrained model. And so bringing the tools that allow you to do it in a foolproof way, allowing developers to create customized models that perform very well at their task but that don't require expert AI knowledge, which is hard to find, is definitely something we're insisting on.

    8. HS

      So (laughs) I'm an investor today, and I'm pleased that you just said that there will be value accrued at the application layer, 'cause I look and I worry that, bluntly, everything is going to get steamrolled by some of the players that we mentioned. How do you answer the question of: will value accrue at the application layer? And for me as an investor today, Arthur, you know me, how would you advise me, genuinely?

    9. AM

      There are two opposing directions. The first is that the models are getting better and better. That means creating a verticalized application, as long as you have the data for it and a good understanding of the use case you're facing, is going to be easier and easier if you have access to the tools that facilitate it. That's the first aspect, which would make me think that the application layer is going to grow thinner and thinner. But then there's also the fact that the models are getting cheaper and cheaper, because we manage to compress them and because we make a lot of improvements to their efficiency. And so this, plus the competitive pressure on the model layer, means that the price around the model, the dollar per intelligence unit, let's say, is definitely going to come down. So there are these two aspects, growing capability and compressed price, which on one side say that the application layer is going to grow thin and on the other side say that the model part is going to grow thin. For us, the approach we're taking is to assume that the model part is still going to be big enough and that we need to build this platform on top of it, because that's where we are going to enable all the vertical applications that will be interesting for humanity.

    10. HS

      How do you think about that positioning and brand? 'Cause there are other players who are much more direct in saying, "Hey, you know, we're gonna dominate a lot of different verticals," and, kind of, be afraid. How do you think about that, being an enabler to vertical applications or not, in that positioning?

    11. AM

      We are not a verticalized company. We started Mistral to bring value to developers and to bring freedom to developers. When we started, there was basically one API out there, soon two, and the field of generative AI was starting to look like it would be very centralized around a couple of players. And we took this platform approach where the models that we're making and the technology that we're making, we allow developers to own and to modify. Bringing freedom to developers and AI application makers is, I think, the best way of distributing generative AI as widely as possible, which is our objective as a company. Making AI ubiquitous, bringing frontier AI into everyone's hands, is the reason why we started. I think we did a good job at it. I mean, we still have a lot of things ahead, but this open source part was, I believe, a good enabler for the community and made people realize that they could build very interesting technology by modifying the models themselves instead of depending on the APIs of a couple of providers.

    12. HS

      Dude, what do AI developers care about? Everyone kind of gets on Twitter and goes, "Oh, did you see X's performance this week is better than Y's performance last week?" What do they care about? Efficiency, scale, cost? What drives their usage and decision-making?

    13. AM

      They care about cost, for sure. They care about customization, being able to modify the models at will, and on that aspect, I think we are only scratching the surface of what can be done. Like, the fine-tuning approach that has been the go-to solution is probably a little too low-level compared to what we should be doing. They care about being able to deploy anywhere. They operate in a certain space, in a certain cloud. They might be operating on-prem. They might have some edge devices to deploy to. And they want to be able to put that technology there. So they also care about portability, which in turn offers data control. Usually AI becomes very useful when you connect it to knowledge bases or to anything related to a certain business. In that respect, it becomes a very sensitive part of your application, because it sees everything. It sees all the data you have. And so enterprises, for instance, do care about ensuring that the proprietary data they have is accessed in something they can completely secure. And that's the reason why we deployed our platform on Azure and AWS, for instance: it brings the security layer that they need.

    14. HS

      We're gonna get to enterprise. Can I just ask you, does brand matter in this segment? You know, when we think about building brand, both in terms of developer adoption brand, corporate brand, is brand a large determinant of adoption in this segment?

    15. AM

      Brand seems to be critical, and this is something that we have learned along the way-

    16. HS

      (laughs)

    17. AM

      ... people use certain models because they are known to be good. You can't afford to evaluate everything out there. And so having some form of community vouching is super important. The approach we took with Apache-licensed, openly distributed models has contributed to what I think has become, well, at least a known brand. And we believe that it's definitely going to be important. Brand is important because trust is important in this domain, and open source brings trust, in that it provides a trusted brand.

    18. HS

      You mentioned the word open source there. I'm gonna get to that. But I do just wanna touch on cost; you mentioned cost also. Hard question: how, when, and who will make marginal revenue that exceeds marginal cost in LLM-based products?

    19. AM

      (sighs) You should be telling me. I mean, you're the investor, right? So I guess you have-

    20. HS

      (laughs)

    21. AM

      ... you have drawn your own conclusions. (laughs)

    22. HS

      Yeah. I- I'll... That means I know nothing, okay? And what, what-

    23. AM

      I can tell you who is making the most margin at the moment, but it's probably going to evolve over time. And-

    24. HS

      Who's making the most margin at the moment?

    25. AM

      NVIDIA, at this point. The cloud providers are pretty much at cost. LLM providers, we are not at cost, hopefully, but the margins are known to be lower than typical software margins. AI application makers, some of them, the ones that are most used, seem to be making a pretty good margin. I think it's going to be quite a moving space. As I've said, the capability of models makes the cost of making an application lower and lower. I don't think there's any way in which the marginal cost, and the margin of the most important part of that technology, which is really the foundational layer, becomes zero, 'cause otherwise there's definitely going to be, I guess, a fairness problem.

    26. HS

      What do you mean by the fairness problem? Talk to me about that.

    27. AM

      Usually, the value tends to accrue where most of the difficult part is and most of the defensibility is.

    28. HS

      Mm-hmm.

    29. AM

      For a while, it has been on foundational models. I think it's obviously evolving with time, and there's no moat that isn't disappearing or evolving with time. But that will remain the part where most of the innovation will be made and where at least a significant part of the value will accrue.

    30. HS

      Is there actually much of a barrier to creating a foundational model company today? I know that's a really broad, stupid question in many respects, but you have so many different players now, and new ones popping up every day. Is the barrier just reducing day by day?

  5. 24:53–25:53

    The Decision to Close Some Models


    2. HS

      You started off completely, like, very open source, very much open to the community. Now you have small models open and larger ones closed. Am I right?

    3. AM

      We also have large models that are open now, 'cause we released... I mean, it depends on your threshold for small and large, but the 8x22B is actually relatively large by any standard.

    4. HS

      What was behind the decision then to close some models? Is it just a business case where you need to make money?

    5. AM

      Opportunistically, there was an opportunity to grow the business using that asset as something that we were selling. It's still the case that we're growing our business on top of commercial models in particular. It's also a good way of cementing some strategic relationships with cloud providers. And it's going to continue to be the case. We still intend to be a leader on the open source side, to have some unique assets that we can license, and to have a unique platform that developers

  6. 25:53–30:06

    Balancing Research & Sales Teams

    1. AM

      can use.

    2. HS

      How do you think about that? It's hard when you suddenly have some models closed and you start building an enterprise team. For you as a founder now, how do you think about the balance between a research team and a sales team, and making sure that the two cultures come together well?

    3. AM

      I think one important thing is to create empathy: ensure that the science team also understands the problems that users are facing. It improves the science, because at the end of the day, the general-purpose technology we are making is only general-purpose if you identify the use cases. So that comes back to the earlier discussion we had. Ensuring that the science team has some relatively direct exposure to the product and to the business team is actually important to make them understand where the model is failing and how it could be improved significantly. On the other side, the go-to-market team has to understand that it's a very technical sales motion, because you're selling not the product but something that is going to power the product. So you need to tell the customer how these things should be used to actually make something that brings value to the business. And that only goes through strong enablement of the go-to-market team. So I think it's a challenge. They don't operate on the same timescale: the science team has cycles of several months; the go-to-market team goes faster, has shorter cycles, let's say. But I think so far we've managed to recruit go-to-market people who have some technical interest, and technical people who have some business interest, and I think that's how you ensure that you don't have silos at the end of the day.

    4. HS

      One of my worries with, bluntly, this space as we move into enterprise is that brand matters so much in enterprise, and they already have existing agreements with Microsoft, and I worry that actually product or model quality doesn't matter as much as distribution: Microsoft can just tack new products onto existing clients. How do you think about that as a core challenge, and am I wrong to be worried about it?

    5. AM

      So I think it's true. Distribution is very important. A shortcut to distribution is to create demand through open source models.

    6. HS

      Mm-hmm.

    7. AM

      And that's been part of our thinking around that strategy.

    8. HS

      Do you think open source is ready for enterprise? Or do you think enterprises are ready for open source? And do they care about it enough?

    9. AM

      It depends on the enterprise, but some have been early adopters and are using a lot of these top models in production. So for sure, they're ready enough. Now, in order to bring them to the next level of putting things into large-scale production, et cetera, I think they're still lacking some product around managing things correctly: load balancing, customizing the models. You can do it with DIY solutions, but if you want to make it robust enough and scalable enough, it's actually not easy. And if you want to actually increase the quality of the custom models, the recipes are a bit hard to get right. So the most technically savvy enterprises are definitely ready for it, and there are actually many use cases in production using open source models. Now, in order to widen adoption, there's definitely some tooling to be brought to the market.

    10. HS

      Obviously, every enterprise today is sitting in a boardroom going, "What's our AI strategy?" What do you advise them on? What questions should they be asking?

    11. AM

      Start thinking about how they're going to change all of their products using AI as a premise, using the existence of very clever agents, because you can build very clever agents today. Assume that presence, and work backward to understand the consequences in terms of organization. So don't think about generative AI just as a way of increasing productivity and word processing, but rather as a way to completely change the way you operate your core business, which usually involves taking models and customizing them pretty heavily to create the differentiation that you will need in five years' time, when everybody will have adopted the technology in its core

  7. 30:06–34:57

    The Readiness of Enterprises for AI Adoption

    1. AM

      business.

    2. HS

      So my question to you is this: you're in France, I'm in London. We both know that European enterprises do not move very fast; most don't even have Slack today. My concern is that we drastically overestimate adoption in the near future, and maybe underestimate it over a 10- or 20-year horizon. Do you think that's the case here? And do you worry about the lethargy of a lot of enterprises, especially in Europe, in adoption?

    3. AM

      I mean, it's a general phenomenon in tech that you always overestimate the speed but underestimate the impact. I think that's probably occurring today. It's slightly different this time in the sense that there's executive support for pushing generative AI solutions, even in Europe.

    4. HS

      Mm.

    5. AM

      So there's some delay compared to the US market, for sure... but I wouldn't say it's very significant, one year maximum in terms of delay. The challenge here is that it's a technology that can take many forms. And so trying to focus on some specific thing you can bring to the market that has AI in it is a prioritization challenge. You need to be very strategic about that, and I don't think this is super easy for enterprises generally. It will become easier once they try out off-the-shelf solutions a bit more, once they realize that there are developer platforms that allow them to do it without hiring very expensive and hard-to-find AI scientists in-house. And so we expect this is going to accelerate in the coming years.

    6. HS

      Do you think we're still just playing in the experimental-budget game, or do you think we're moving into core budgets as well?

    7. AM

      It depends on the enterprise. For customer support, for instance, and in areas where the application of AI is pretty obvious, it's definitely moving into core budgets. It's still at the experimental stage in many of the functions and for core applications in industry, in, I guess, the telecom industry and in healthcare. This is still in the playground, but I think it's going to evolve in the next year.

    8. HS

      Can I ask, as you build out enterprise, it's another expensive thing to build out on top of compute and talent. It costs real money. I spoke to Paul at Lightspeed before, and he mentioned to me bluntly how much less capital you've raised compared to a lot of your competitors, most obviously OpenAI and Anthropic. He said, "In a world where capital equals compute equals quality of model, how does Mistral keep up and stay relevant?"

    9. AM

      So the good thing is that capital is correlated with compute. Well, no, capital equals compute; then compute is correlated with quality, but quality is not completely dependent on it. And as I've said, there's a strong opportunity in providing models that are the best in their class, because they might be sufficient to actually solve certain use cases. So that's where we're playing, in addition to playing on the scaling path, because obviously, to stay relevant, you need to keep your technical team motivated. And to keep your technical team motivated, you need to give them the experimental bets they need to make new discoveries and to progress the science. That is where you need compute, in addition to growing the model over time. So, I mean, we're growing our compute, like every company. We are convinced that we don't need to grow at the same rate (laughs), because there are a lot of barriers that are not compute-related appearing along the way, and we're already seeing them. So we think we can scale, and we are also convinced that on the efficiency front we are already very well positioned, and we are strengthening that position.

    10. HS

      What are the biggest barriers to Mistral today?

    11. AM

      We've had a few delays with our compute providers, for sure; that has been a barrier. So my last answer to your question is to be taken with a grain of salt: we are still bottlenecked by compute, but that's because we don't have much of it. We have 1.5k H100s, which is a few percent, I think, of the capacity of our competitors. So that's definitely a bottleneck, and it's going to improve significantly in the coming months.

    12. HS

      Can I ask, finally, was it a mistake for you not to scale that quicker? Like-

    13. AM

      I think-

    14. HS

      ... with the benefit of hindsight now, do you wish you'd scaled quicker?

    15. AM

      Mm, you can't really scale that much quicker, because you can't raise, like, two billion on a seed round. At least you couldn't in 2023; maybe you can today. But you can only hire so fast, you can only scale your infrastructure to manage more GPUs so fast, and you can only raise capital so fast. So there are some acceleration constraints that are pretty hard to fight, and that are pretty much first principles of starting a

  8. 34:57–40:18

    European vs. US Investors

    1. AM

      business.

    2. HS

      I totally agree with you. You mentioned the scaling constraints and cash. Does it matter where your cash comes from? Does it matter if you're European-funded, Saudi-funded, US-funded? Do you think that matters?

    3. AM

      I guess governance matters. What is important for a young company like us is to be under the control of the founders, because there are a lot of things to be invented, and the vision can only be carried by them, by us. We have very good governance terms, a very simple and clean governance that makes us a for-profit company that is growing a business to actually push the science frontier. This is something we're very attached to: being able to control the company and leverage our funding partners appropriately to grow in different parts of the world. Being in the US and in the EU has been critical as well, and in the other parts of the world where there's a lot of interest in AI. So it does matter, in the sense that we want partners that are supportive and long-term, because we are in a field that is fast-moving, where we don't know yet exactly where the value will accrue. And so being flexible and being smart is definitely a requirement when you raise money.

    4. HS

      Would you take money from Saudi or China?

    5. AM

      Good question. It depends, it depends on the terms. China is a bit hard. For us, it's even hard to operate in China; I mean, we don't operate in China, because you can't really operate in both the US and China without being a very, very large corporation. And so you need to make some choices.

    6. HS

      What chance do you think Europe has in AI? I know it sounds deterministic and defeatist, and so you might be going, "Oh, fuck, Harry. Shut up." But what chance do you think Europe has in AI, and what does it take for us to stand up a serious AI industry within Europe?

    7. AM

      I guess the chance it has is that it's a revolution, and it's changing the way we do software. And like every revolution, it opens a lot of opportunity for new actors. There's no reason why there shouldn't be an actor that was created in Europe and that could grow pretty fast, and that's the mission we gave ourselves. We have the talent. Capital can cross oceans without too much problem. We have the market; the market is more fragmented than in the US, for sure. The digital-native ecosystem is definitely smaller, but it exists and it's growing, so there's local opportunity for business development. On the talent side, we can hire 23-, 24-year-olds that we can onboard in four months, and they operate as well as any software engineer in the Valley. People are quite talented here. And so if we manage to keep them and convince them not to go to the US, we have a lot of opportunities.

    8. HS

      When we look at computers, mobile, cloud, the core technology shifts, the way it's worked is that Europe has kind of ceded control to the US, and then just taxed US companies for access to our citizens, if one's being defeatist. Is it different now?

    9. AM

      I mean, Europe is paying the price of not setting up a VC system in the '60s, but setting it up 40 or even 50 years later. And so, obviously-

    10. HS

      And by the way, the dirty secret is that the VC ecosystem in Europe is US-funded.

    11. AM

      Yeah, it was. I think... is it still?

    12. HS

      Honestly, in large part, yes.

    13. AM

      Yeah.

    15. HS

      There are government institutions which are backfilling it, but largely backfilling it with bad players who aren't very good. The best funds in Europe are largely backed by top US institutions.

    16. AM

      Okay. As I've said, it takes time for an ecosystem to build; you have layers of entrepreneurs and investors that stack on top of each other. The US has 60 or 70 years of venture capital investment; Europe has only 20 years, maximum. I mean, it takes time, an incompressible amount of time, to build an ecosystem. It also takes some willpower, and I think now we are seeing that willpower. We are seeing entrepreneurs creating companies; we are seeing VCs like you not going to the US. Everything is positive. It just takes time, and I'm adamant that we'll manage to do something interesting.

    17. HS

      On the engineering side, do you feel like you have the depth of talent pool to hire from as you scale now?

    18. AM

      Yeah.

    19. HS

      You do?

    20. AM

      On the engineering side and on the AI side, we do. We have a team in the US, though, which is working on specific topics. Senior AI scientists you find more in the Valley than in France. For junior AI scientists, there's a wealth of talent in France, in Poland, in the UK, and that's, I think, one of the strengths of the area.

  9. 40:18–46:45

    Does the Source of Funding Matter for Scaling Constraints?

    2. HS

      When you were raising money, was it very different speaking to European investors versus US investors?

    3. AM

      I guess in the seed round, no, it wasn't that different, because it was a seed round. For the Series A, which was a bigger round, European funds weren't structured to do the kind of deal we were proposing. We didn't even have a lot of conversations, because they just couldn't get their heads around the investment that needed to be made, even though we were a proven new company. I think what is lacking in Europe, and it's related to the ecosystem point, are growth funds that are able to take huge bets with lots of conviction. That in turn should improve over time, especially if we manage to channel more European wealth into growth funds than is the case today.

    4. HS

      Yes. I think you have more hope than me.

    5. AM

      I guess you know it. Yeah.

    6. HS

      I think you have more hope than me on that one. That is not gonna happen. We are not gonna see many more European growth funds built in the next few years, for sure. Not in the next three to five.

    7. AM

      I think, yeah, it hinges on a few political decisions. And I-

    8. HS

      I think, I think it hinges on supply of capital and, and belief in a future European ecosystem that can contend with other large ecosystems.

    9. AM

      It's a chicken-and-egg problem, but at some point... I mean, this could be nudged in the right direction if politics wants to do it, and if a couple of companies show that you can actually have companies that grow fast in Europe, and that's what we're trying to do. I'm not too pessimistic. I mean, I find you too pessimistic.

    10. HS

      You find I'm-

    11. AM

      You should come to France. You should come to France; I think you would get more optimistic.

    12. HS

      (laughs) Do you know what? If a Parisian is telling me that I'm too pessimistic, then shit, I really need to be more optimistic.

    13. AM

      Yeah.

    14. HS

      My question to you is this: you just mentioned the speed of scaling. What was the hardest thing about scaling yourself as CEO at the same speed as the company?

    15. AM

      There are things to learn, and, I mean, we are learning on the job, with Guillaume and Timothée. Effectively, you have organizational challenges: how do you ensure that 45 people communicate well together? How do you manage your time in terms of representation time and business development time? Because we're still at the stage of the company where we get involved a lot in the deal-making aspect. And how do you ensure that you set proper directions and maintain the team in a state of tranquility, despite the amount of noise there is on the competitive side, and the fact that direction is obviously going to change over time, because there's a lot of uncertainty in that field? This is, I think, the hard part. I don't think I'm doing it properly, but we are actively trying to find sources of information to learn new things, let's say.

    16. HS

      If you could call yourself up the night before you became CEO and founder of Mistral and give yourself some advice, with the knowledge you have now, what would you say to yourself, Arthur?

    17. AM

      Maybe stage the product development and go-to-market development a bit more. We did start the go-to-market motion at a time when we had absolutely nothing to sell.

    18. HS

      (sighs)

    19. AM

      It did work out; it did create some brand awareness despite the absence of anything. But I think it might have been slightly simpler to stage things a little more, developing the product a little before developing the go-to-market. But it's such a fast-moving field that we started everything a bit together, with an organization that was a bit lacking, and now we are solidifying it on the fly. So it has worked out; it hasn't been optimal, for sure. In hindsight, I could give myself a few tactical pieces of advice on who to hire when. Generally, I think the strategy we had one year ago hasn't changed much. We did realize that we would need more capital. We did realize that we would need a strong product, and that we could not operate only from Europe and needed to go to the US very quickly. Those were findings we made along the way; I don't think it would have helped that much to know them a year ago.

    20. HS

      One, do you feel like you have enough cash now?

    21. AM

      I guess a startup is always fundraising.

    22. HS

      (laughs)

    23. AM

      No, I mean, it's a field where, for the years to come, investment is going to exceed revenue by design, because you do need to scale, and you do need to stay relevant as a company. So effectively, there needs to be some investment. The revenue is ramping up, so there will also be some revenue to reinvest. But I think today, and for the years to come, the speed of developing research should be faster than the speed at which you can develop your go-to-market.

    24. HS

      Before we do a quick fire, when you look at the landscape today, which competitors did you, do you most respect and admire?

    25. AM

      I mean, they've all delivered. We were surprised by Cohere recently; they came up with good new models, and I think that was a surprise for us. And obviously, OpenAI and Anthropic and my friends at Google are also doing a good job. So it's a competitive landscape, and we respect all of them. We all work in the same direction, and eventually toward the same higher goals. So it's great to have respect for one another.

    26. HS

      Is it too late to start one now? We see, like, Holistic starting now. Is it too late?

    27. AM

      Holistic, I know them well. Is it too late? I wouldn't recommend going into the foundational-layer business. I know Sam didn't recommend doing that one year ago, and we did, and it seems to have changed a few things so far. So I think it would be arrogant for me to say that there's no chance for a new competitor to arise and beat

  10. 46:45–50:59

    Quick-Fire Round

    1. AM

      us, for sure.

    2. HS

      Listen, my friend, I wanna move into a final thing, which is just a quick-fire round. So I say a short statement, you give me your immediate thoughts. Does that sound okay?

    3. AM

      Yeah, let's do that.

    4. HS

      So what worries you most in the world today?

    5. AM

      Global warming. There's a race between the planet heating up and us finding solutions for it. I think AI is part of the solution, because it brings more control, and potentially more efficiency in some processes. But there's effectively a race for survival, so I think this is something we should be a bit more aware of.

    6. HS

      What have you changed your mind on most in the last 12 months?

    7. AM

      I think I've changed my mind on a lot of management premises that I had and had never tested for real.

    8. HS

      What was the biggest one?

    9. AM

      Transparent feedback is actually super useful for a company. And so operating in an almost fully transparent manner has helped us grow without breaking.

    10. HS

      What element has been the most unexpectedly challenging in the scaling of Mistral?

    11. AM

      The amount of demand that we had to manage, which is too high for what we can handle. And I guess the brand success, the fact that people know us, was a bit unexpected. We knew that it would be noticed; we had no idea that people would start using us that fast.

    12. HS

      What do you do to calm down? You have a lot going on now, Arthur, and you have a lot of expectation and cash on your shoulders. What do you do to just ... (exhales)

    13. AM

      I run, I cycle. I think my partner will yell at me, but I try to take care of my daughter.

    14. HS

      (laughs) Okay, you've recently become a father. What do you know now that you wish you'd known when you first had your daughter?

    15. AM

      I had no idea that you needed so much energy to care for small children, for a small child, let's say.

    16. HS

      (laughs) Where do you think AI will take the world in the next 10 years? Like, what does the future of society look like in a world where AI is embedded into everything?

    17. AM

      Well, it's changing the way people work significantly, in the sense that it requires people to be more creative and to bring more value beyond what can be automated. So it's a very structural change in the job market, which means that some adaptations should be made pretty quickly in training and education, so that people can get a sense of what is going to be expected of them in their daily job, assuming that there's some AI out there.

    18. HS

      Do you think the fears of job replacement are grossly over-exaggerated?

    19. AM

      I think they are. I mean, it depends on who you're speaking to. I think jobs are going to be displaced, for sure; some will be replaced, some will open up, because we're just trying to move humanity to a higher level of abstraction. We can now talk to machines, and machines can understand and answer in a human-like fashion. This is not so much of a paradigmatic change compared to what we were doing with computers. I think what's happening right now is that the speed of our elevation toward higher abstraction levels is probably occurring at an unmatched rate in history. That means the societal adaptation is going to be more challenging and needs to be anticipated.

    20. HS

      Final one for you: we do a show in 2034, 10 years' time. If everything goes right, where's Mistral then?

    21. AM

      Mistral has some very relevant models, commercial and open source, and it has a very strong developer platform that lets you do everything you need to create your AI application. That would be a good achievement.

    22. HS

      Arthur, listen, I've so enjoyed doing this. Thank you for putting up with me going in many different, fast-moving directions. You've been incredibly patient and a brilliant guest. So thank you so much, my friend.

    23. AM

      Thank you for hosting me.

Episode duration: 50:59

Transcript of episode e7Y84vpWhkU
