The Twenty Minute VC
Cohere's Chief AI Officer, Joelle Pineau: Why Scaling Laws Will Continue & the Future of Synthetic Data
EVERY SPOKEN WORD
105 min read · 21,444 words
- 0:00 – 1:16
Intro
- JPJoelle Pineau
The scaling laws have been remarkably robust. There's a lot we don't know yet in terms of the vulnerability of these systems.
- HSHarry Stebbings
Today, we have one of the leading minds in AI, Joelle Pineau. Joelle is the Chief AI Officer at Cohere. If you don't need to buy the Galacticos, why do you have, like, an Andrew Tulloch, a Daniel Gross, an Alex Wang, and the Galacticos assembling?
- JPJoelle Pineau
I used to be quite skeptical that neural networks were necessarily the ultimate solution to machine learning. I seem to be quite wrong on this one.
- HSHarry Stebbings
Knowing what you know, what do you not let your children do?
- JPJoelle Pineau
Ah. (laughs) Eat too much sugar. I don't have a lot of patience as a scientist for people who are predicting the extremist scenarios, the catastrophic risks of AI. You know, AI becomes our overlord kind of scenario.
- HSHarry Stebbings
If I gave you $10 billion, what would you spend it on first? Ready to go? (instrumental music plays) Joelle, it is so great to have you in the studio. I've heard many great things from Nick, Aidan, Schrep. So, thank you so much for joining me.
- JPJoelle Pineau
Thank you. Happy to be here.
- 1:16 – 2:22
How Meta Shaped How I Think About AI Research
- HSHarry Stebbings
Now, you spent over six years at Meta, and I want to start there because it's a very transformative time and place. What are the biggest takeaways for you from that time? And how did that shape your mindset to how you think today?
- JPJoelle Pineau
Well, I was there from 2017 to 2025. And you have to see just how much AI changed over that period of time. And what we were really focused on is fundamental AI research. Um, and, you know, one thing that I've learned is just sometimes how long it takes to prove out a hypothesis. We feel like AI is moving at the speed of lightning. But in fact, there are some things that just take a few years to mature, to get the right optimizer, the right compute, the right data for that to really make a difference.
- HSHarry Stebbings
I look at where we are today, and everyone kind of goes, "It's here, it's here, it's here."
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
And then you actually look at what a lot of the leaders have been saying recently, where it's like, actually, you know, um, Andrej was saying, "It's not the year of the agents, it's the decade of agents."
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
Sam's kind of pulling back too.
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
Have, have we got over our skis and we're actually kind of all pulling back, realizing that time is the factor we need
- 2:22 – 8:33
Challenges in Reinforcement Learning
- HSHarry Stebbings
to rely on?
- JPJoelle Pineau
Well, I'll give you an example. You know, I've been in research for, for a couple decades now. I've been working on reinforcement learning for over 20 years. And suddenly, everyone's talking about reinforcement learning, you know? (laughs) Since the advent of reasoning models, agents and so on. So, you know, sometimes you have to be a little patient with these ideas. And the right algorithmic tweak, the right context, the right problem domains just opens up the magic.
- HSHarry Stebbings
I, I was listening to Andrej yesterday, and he said in his show that reinforcement learning is terrible. Uh ...
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
(laughs)
- JPJoelle Pineau
Less terrible than 20 years ago.
- HSHarry Stebbings
Have we overinvested in RL-based methods at the expense of maybe, like, more scalable alternatives?
- JPJoelle Pineau
Oh, I'm still super bullish on RL in that, like, the concept itself is so fundamental. You know, this idea of training through a system of rewards, of indicating what's valuable and what's not valuable through numerical values, like, that is so fundamental. It's not going away. Now, you know, where we're maybe getting a little bit ahead is thinking that just RL out of the box is going to give us AGI. That part, a lot less so. You know, if you look at the curve of progress, RL is terribly inefficient. And so the amount of signal you need to get in order to really shape the behavior of a model is far from where we are today. And so we'll need to figure out how to, to really deal with this, with this learning efficiency problem.
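The "training through a system of rewards" idea Pineau describes can be made concrete with a minimal tabular Q-learning sketch. The toy chain environment and every hyperparameter here are illustrative inventions for this note, not anything discussed in the conversation:

```python
import random

# Toy chain environment: states 0..4, actions 0 (left) / 1 (right).
# Reaching state 4 ends the episode with reward 1; every other step pays 0.
def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(4, state + 1)
    return nxt, (1.0 if nxt == 4 else 0.0), nxt == 4

random.seed(0)
Q = [[0.0, 0.0] for _ in range(5)]   # Q[state][action]
alpha, gamma, eps = 0.5, 0.9, 0.3    # learning rate, discount, exploration

for _ in range(500):                 # episodes
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the current estimates, sometimes explore.
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # The numerical reward alone shapes the policy through this update.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
        s = s2

# The learned greedy policy walks right, toward the reward.
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(4)]
print(policy)  # → [1, 1, 1, 1]
```

Note how much blind wandering the first episodes involve before the reward is ever seen, which is exactly the inefficiency discussed next.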
- HSHarry Stebbings
You're probably thinking, "What did I get myself in for?"
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
Um, and I don't blame you. Uh, I ask questions that I think everyone else thinks, but I'm not afraid to say, "I don't know." Why is RL so inefficient?
- JPJoelle Pineau
Um, there's a few reasons. You're going to get me on like a-
- HSHarry Stebbings
This is great.
- JPJoelle Pineau
... a deep dive-
- HSHarry Stebbings
I, I really love them.
- JPJoelle Pineau
There's a f- there's a few reasons. One is the fact that RL is about sequential decision-making. So, you know, think about you're starting at a point, you need to figure out what you're going to do next, and you might pick the right side of the branch or the wrong side of the branch. And then, like, the road keeps on splitting. So, every time you make a mistake, it sort of compounds ...
- HSHarry Stebbings
Mm.
- JPJoelle Pineau
... through the length of the series of actions you're making. So, that means, like, the amount of error you can make can be very, very large. And to get it right is quite difficult. Sometimes people compare finding the right solution in RL to, like, a needle in a haystack. So, there's that part. The other part that's hard is the fact that to train the system, to train the models, you have to essentially take actions to learn. You can't learn from static data. You can learn some things from static data, but actually, you know, to, to get the right policy, you need to test it out. And so that means you need a simulator. You need to get the synthetic data. All of that can be really expensive also. And so we have difficulty getting, like, just a variety of environments and simulations to test RL.
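The branching-and-compounding point has a simple quantitative reading: if each individual decision is correct with probability p, an entire T-step trajectory is error-free only with probability p^T. A quick illustration with made-up numbers:

```python
# If each action in a sequence is correct with probability p, a whole
# trajectory of T sequential decisions succeeds with probability p**T,
# so errors compound exponentially with the horizon length.
def trajectory_success(p, steps):
    return p ** steps

for steps in (1, 10, 50, 100):
    print(steps, round(trajectory_success(0.95, steps), 4))
# Even a 95%-accurate policy rarely survives a long horizon:
# 1 0.95, 10 0.5987, 50 0.0769, 100 0.0059
```

This is the "needle in a haystack": the fraction of clean long trajectories shrinks exponentially, so the learner sees very little useful reward signal.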
- HSHarry Stebbings
When we look at the cost curve for RL, you said you've been working on it for 20 years.
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
Have we seen that dramatically come down? Will we see it continue to dramatically come down? Or is it a case of it is just a fundamentally expensive method of training?
- JPJoelle Pineau
Hmm. Um, it's come down, especially in domains where we have good reward functions. So, the place where most people started hearing about RL is around the AlphaGo time. You know, the game of Go, which was sort of one of the goals for AI. Many people thought, at that time, we were still a decade away from being able to have machines play Go at the level of humans. And out comes a team from DeepMind, you know, uh, goes off, plays against, uh, (laughs) the world champion and shows that, that RL can basically do it. Um, and so I would say, you know, in cases where we clearly know what's the goal, we can write down precisely the reward function, we're good. We can make a ton of progress. So, that's why you're seeing progress in mathematics, um, very well-defined reasoning tasks, games, these kinds of things. RL to shape the behavior of models to get them to be social creatures, that we have no idea how to do. I mean, I don't know if you have children, but, like, shaping their behaviors, you know, the number of times you can repeat the same thing and still they do something else. And so, there's something there, you don't know how to write that out mathematically. And that's where I think we're still in for, for some hard work.
- HSHarry Stebbings
(laughs) Okay. So, w- we still have some hard work. When we look at the training versus the inference market today, we've had so much weighed on training so far, and it's been incredibly costly and expensive. And then I hear everyone say, "Well, actually, inference is 95% of the market and that's where it's all going. That's where NVIDIA will make most of their money." How do we think about the cost curve applied to training versus inference and where it sits today?
- JPJoelle Pineau
Hmm. I think there's a lot of different variants. And, you know, if you'll allow, maybe I'll pivot to, like, you know-
- HSHarry Stebbings
Pivot anywhere.
- JPJoelle Pineau
... wh- where I'm, where I'm going with Cohere now.
- HSHarry Stebbings
Yeah.
- JPJoelle Pineau
Just, you know, I joined Cohere less than a month ago, super exciting company. I think one of the things that Cohere is doing is actually to develop AI models that run on premise, so that means enterprises bring it in, they run it locally. So, the company has to worry about the training of the models. Obviously, we want world-class models for the needs of enterprise. Doesn't have to worry about the inference. Doesn't have to worry about the inference cost. The, you know, the clients, the customers have to figure out what's the right way for them to digest the AI. That means, like, there's a lot of motivation to have very efficient models, uh, so that they can run really efficiently on premise. So, you know, we get caught up into one, one paradigm-
- HSHarry Stebbings
Respectfully, if, if, if they're-
- JPJoelle Pineau
... but there are other paradigms as well.
- 8:33 – 13:47
Is It Possible To Be Capital Efficient in AI
- HSHarry Stebbings
totally get you. Um, what's the biggest challenge about capital-efficient AI today? I know that sounds strange, but when you look at the e- economics, so to speak, what's the biggest challenge?
- JPJoelle Pineau
Um, there are a lot of challenges today. I think, uh, in terms of the economics of AI, I think one of the biggest challenges is the fact that it's very hard to have predictability, right? Everyone wants to know, when are we going to hit the breakthrough? Everyone wants to know, how many GPUs do I actually need? Everyone wants to know, like, what's the return I can expect? There's just a lot of uncertainty built into the system. A lot of that is because there's a lot we don't know about this technology. And so, that means we have to take on quite a bit of risk when you're building out. Whether you're building out your data center, whether you're building out your workforce, whether you're trying to figure out, you know, how much data to, to curate. And so, that makes it difficult for a lot of people. People want answers, and this is a world where we don't have that level of predictability compared to other industries.
- HSHarry Stebbings
Does progression happen in kind of a linear fashion, or does it happen in step functions like AlphaGo, like, uh, DeepSeek, which, uh, depending on kind of what you believe-
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
... suggests a lot of efficiency in terms of model improvement. D- is it step function or is it linear?
- JPJoelle Pineau
I tend to decompose different ingredients that lead to progress. You know, people often talk about, like, the algorithms, the data, the compute. I think, in general, compute and data have a more linear effect on progress. You build more compute, you run bigger models, you can typically get better performance, you feed in more data. It's not just quantity. You need to worry about quality and diversity as well. But roughly, it's more linear-ish with respect to the data. The algorithms are the ones that have the non-linear effect. And so, you can explore lots of ideas, and then something like the transformer comes along and just changes the paradigm. And it's not just the transformer, you know. On the optimization side, suddenly we hit upon Adam, which is a technique to, to do the optimization of your model, and that changed the paradigm. Reasoning, suddenly, we start thinking about how to put reasoning in the loop, and it changed the paradigm. So, those ideas tend to have a non-linear effect. The challenge with these algorithmic ideas, though, is that actually it may take a long time to prove themselves out. So, like, the paper can be sitting out there, there's thousands of papers coming out, the idea's sitting out there, and we may not think to try it with the right data, at the right scale, with the right combination of hyperparameters. And so, you don't notice that effect for a while. So, it's hard to predict, and it's non-linear more on the algorithmic side than I think on whether it's data, compute, even talent or other things.
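Adam, mentioned above, is a concrete example of such an algorithmic step change: it keeps running averages of the gradient and of its square to give each parameter its own step size. Below is a bare-bones sketch of the textbook update rule on a 1-D quadratic; the toy objective and all hyperparameters are illustrative, not any framework's implementation:

```python
import math

# Minimize f(x) = (x - 3)^2 using the standard Adam update rule.
def adam_minimize(grad, x, steps=2000, lr=0.05,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias-corrected estimates
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

x_min = adam_minimize(lambda x: 2 * (x - 3), x=0.0)
print(round(x_min, 2))  # settles near 3.0
```

The per-parameter normalization by sqrt(v_hat) is the design point: the first step has magnitude roughly lr no matter how large the raw gradient is, which is part of why it changed the training paradigm.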
- HSHarry Stebbings
With respect to Google, I mean, Transformer's obviously birthed in Google and sat as papers for many-
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
... well, a couple of years.
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
Um, when we loo- you mention there, you know, compute, algorithms, data, if we should just kind of go through them to understand, everyone suggests that it's weird there's two different worlds. It's like scaling laws exist, just throw more compute at it. When you look at data center investment, when you look at all the kind of desirability of compute, and then again you have GPT-5 seemingly focusing on efficiency and other signals, do scaling laws play out from here?... and if so, for how long?
- JPJoelle Pineau
They've been... The scaling laws have been remarkably robust. Um, they don't play out exactly as we expect, but still, they've been remarkably robust. Lots of people have bet against scaling laws in the past. And I would say overall, you know, we've seen a pretty robust effect. Um, they don't work alone. We also need these algorithmic innovations. Um, but, but most of the time, you know, I wouldn't bet against it.
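The "remarkably robust" behavior is usually stated as a power law: loss falls as scale raised to a small negative exponent, which plots as a straight line on log-log axes. A sketch with invented coefficients (the constant and exponent here are purely illustrative, not a published fit):

```python
# Illustrative scaling law: loss(N) = a * N**(-alpha), where N is scale
# (parameters, data, or compute). Coefficients are made up for illustration;
# real fits estimate them empirically from training runs.
a, alpha = 10.0, 0.076

def loss(scale):
    return a * scale ** (-alpha)

# log L = log a - alpha * log N: a straight line on log-log axes, so every
# 10x of scale divides the loss by the same constant factor.
factor = loss(1e9) / loss(1e10)
print(round(factor, 3))  # → 1.191, i.e. each 10x of scale cuts loss ~16%
```

The robustness claim is exactly that this constant-factor-per-decade pattern has kept holding across many orders of magnitude of scale.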
- HSHarry Stebbings
O- o- on the algorithm side, is that the hardest to innovate on? You could think about, "Well, you can buy more compute. It might be hard, but you can buy more compute and data. There are different ways, whether it's synthetic or human." Is algorithms the hardest to innovate on?
- JPJoelle Pineau
It's certainly the most creative work to be done. And, you know, the space of ideas is so wide, um, that I would say it's hardest in the sense that, like, you can move in... You know, I'm a researcher at heart. You can move in so many different directions, and picking the right one, you don't know till you get there whether it was the right one or not. It's a little bit like reinforcement learning. (laughs) So, in that sense, I think it's, it's the most interesting one, it's the most frustrating one, and it's the most difficult one (laughs), certainly from an investor's point of view, to know where to put your chips.
- HSHarry Stebbings
Is... M- Speaking of, kind of knowing where to put your chips, and moving from purely a research lens with, with Meta-
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
... to now also building product-
- JPJoelle Pineau
Yes.
- HSHarry Stebbings
... is there ever this, like, inherent conflict between intellectually interesting research with the need to productize and monetize, and how do you think about
- 13:47 – 21:51
AI in Enterprise: Efficiency and Adoption
- HSHarry Stebbings
that?
- JPJoelle Pineau
I mean, one of the reasons I w- I'm really excited to be joining Cohere, actually, is because, like, we're at a stage where AI is really starting to be useful. Maybe not as useful as people think it is, but we are there. Um, and by working on AI that's going into enterprise, I feel we're gonna get such an interesting signal of what works and what doesn't work. You know, we keep on talking about, you know, AGI and AI for the masses and so on. But, actually, like, when you need to sell AI to a business, you get a real signal of what works, what doesn't work, um, and that's what I'm most curious to see. Um, and, uh, you know, we've been using these academic benchmarks for many years, you get some signal. But it's not the same as, as getting this to do productive work. Um, so, I'm curious to, I'm curious to learn out of that. You know, we're gonna get new types of data. We're gonna get, I think, a lot of insights that are then going to drive the research ideas. Um, I think that's the, that's the other thing to think through. When you have a large space of ideas to explore, getting that feedback signal from the real world is super useful to guide you through that search of ideas.
- HSHarry Stebbings
I just had a great chat to David Cahn at Sequoia, who said that he thinks a good barometer for utility value within enterprises is, like, does it have the ability to replace the work of your bottom 5% in any category?
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
He says, like, "We overestimate a lot."
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
"Can it replace the bottom 5%-"
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
"... in any function? And if it can, that's a very meaningful improvement." Do you think that's a good barometer, and how would you s- advise an en- enterprise on whether something's useful or not as a yardstick?
- JPJoelle Pineau
Mm-hmm. You know, I prefer, in terms of a barometer of, um, productivity, something a little bit different, which is to say, can most of your employees do 10X the amount of work with AI versus on their own? That, to me, is actually a better barometer. I think humans and AI have very complementary abilities, so to just, like, flat out replace a portion of your workforce is actually pretty unrealistic. Um, some may, some may try and some may be slowing down their hiring, but I actually think-
- HSHarry Stebbings
Respectfully, I think 10Xing your work feels more unreal- is that, is that not a bigger ask? I'm almost more intimidated by 10Xing my work.
- JPJoelle Pineau
Oh, I don't think that's unrealistic at all.
- HSHarry Stebbings
Wow.
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
I- in a timeline that is next couple of years?
- JPJoelle Pineau
Yes. Yeah.
- HSHarry Stebbings
I'm sorry, how d- how does that actually shape out, then?
- JPJoelle Pineau
I think you have to identify very concretely the types of work that you are delivering. But I think we're starting to see, like, you know, Hollywood-quality productions being made in a matter of hours. We're seeing, you know, to take a super concrete case, like machine translation. If humans are doing the translation compared to machines doing it, you'll go from hours to seconds on long-form text, multi-page documents. And so, for a lot of work, it's not like AI can do all of the work. Humans still need to ask the right question, they need to verify the information, they need to shape the tasks. But once the task is well-defined, the product is clear, like, all the design considerations are fed into the prompt, you press the button and you've got an answer in seconds for something that used to take sometimes weeks and months.
- HSHarry Stebbings
So, completely hear you and understand that. I'm just kind of re- trying to re-evaluate a belief that I had for the last few months, which is, like, fo- I- I'm a venture investor first, and-
- JPJoelle Pineau
Mm.
- HSHarry Stebbings
... for all of us to make money-
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
... we need to see the transition from kind of human labor budgets to AI spend.
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
And it's with that transition where we obviously see the TAM massively increase-
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
... and we make a lot of money. But when I hear you say that, I suddenly question that assumption as, like, the barometer for whether we make money, because you're suggesting that actually we don't replace the human labor budget, it just makes us 10X more efficient.
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
Is that correct?
- JPJoelle Pineau
Yes. And I think, you know, there's a lot of nuance to all of that. And some work will be harder to get that same level of efficiency gain, whereas other work, you'll see 100X in terms of efficiency gain. But I, I do think that for a lot of the work that's happening right now, that's absolutely feasible.
- 21:51 – 28:06
Security Concerns with AI Agents
- HSHarry Stebbings
Security is a topic that we quite often glaze over, especially when in- uh, investing in kind of application layer AI tools. What does no one know about AI security that people should know?
- JPJoelle Pineau
With respect to AI security, I think there's a new front that's opening up with, um, the development of agents. And frankly, there's a lot we don't know yet in terms of the vulnerability of these systems. With LLMs, we're starting to get a better understanding. We've had, you know, quite a bit of red teaming and jailbreaking exercises and so on, and so people have identified different risk vectors, prompt injections, things like that, which are vectors for malicious actors to interfere with the system. With AI agents, we haven't seen that. And one of the features of computer security in general is, often, you know, it's a bit of a cat-and-mouse game, quite frankly. Like, there's a lot of ingenuity in terms of breaking into systems, and then you need a lot of ingenuity in terms of building defenses. And so, we just have to stay very active in that sense.
- HSHarry Stebbings
What are the potential vulnerabilities though in an agent world?
- JPJoelle Pineau
In terms of agents, you know, we, we worry a lot about, um, hallucinations in LLMs. The parallel in agents is impersonation, so agents that come along and are essentially impersonating entities which they don't legitimately represent. And in doing so, taking actions on behalf of these entities, whether it's, uh, infiltrating, you know, banking systems and, and s- and so on. And so, I, I do think we have to be quite lucid about this, develop standards towards it, develop ways to, to test for that in a very rigorous way. There are ways to reduce that risk drastically. You run your agent, you know, completely cut off from the web, you're reducing your risk exposure significantly. Um, but then you lose access to some information. So depending on your use case, depending on what you actually need, there are different solutions that may be appropriate.
- HSHarry Stebbings
Totally get you. That's a really hard one, because like then verification becomes the most important thing.
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
But then it's like, who's the arbiter of verification? Is it governments? Is it, is it companies?
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
How, how does one think about that? Who says you're a valid agent versus an invalid agent? (laughs)
- JPJoelle Pineau
Um, governments can be good for defining standards on which we all agree. Companies are much better at building the solutions at scale and deploying them.
- HSHarry Stebbings
Do you think governments are good at setting the standards? When you look at AI and where we're at, and then when you look at the sophistication levels of government, um, programs or decision-makers, with respect, they're just a little bit behind. Do you think they are actually equipped?
- JPJoelle Pineau
I don't think you should look at where governments are in terms of, necessarily, AI regulation. Like, AI as a field is so incredibly young and fast-moving. And by nature, and there's some good in this, governments are moving a little bit more cautiously and, and usually need to benefit from our knowledge to make good policies. Um, and so, I, I do think you, you can look at other fields in terms of, of regulation. You know, you look at, um, uh, aviation. The security record for aviation today compared to where we were 50 years ago is just incredible. And governments have played a role in defining that, in terms of standards and in terms of what are the, the norms and so on. So, I'm quite hopeful, uh, uh, I'm an optimist about this, maybe it's my Canadian side, but governments can play a useful role, um, in many cases. You know, clear standards actually mean reducing uncertainty for a lot of companies in this space. But we shouldn't expect that to be ahead of the technology. I think that would be the wrong order of things. In some sense, we need to develop that technology with enough of a creative space, and we need to learn fast, and then develop the right, the right guardrails for that technology from, from the real learnings we have.
- HSHarry Stebbings
Uh, we, we mentioned that kind of, uh, governments and their role. Uh, when I had Nick on the show, he was saying, actually, the benefits of not being an American company, um, given some geopolitical challenges sometimes. (laughs) Um, I'm just intrigued. Do you think we will have these government sovereign models for each geo? You know, we have Mistral in France, we have Cohere in Canada, founded in Canada, but then you've got global kind of HQs.
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
Do you think we will have these sovereign models and regionalized winners?
- JPJoelle Pineau
I do think it's healthy that there are models that are getting built in different places around the world, not just in, in the US and, and China right now. I think this is healthy in terms of diversity of, of thought. I think it's healthy in terms of having a greater amount of people with access to technology. I do think for Cohere, you know, the vision isn't to be a Canadian company. Like, the vision is to be a global AI company, and I think, yes, you know, we have a headquarters in, in Toronto. We have teams that are distributed around the world. We have a great team here in London, as well as in the US, uh, and, and France and other places. And so, you know, having that ability to deploy models that operate across the world, I think, is gonna be absolutely an important part of the, the strategy for Cohere. I think there's a great opportunity. Um, what it gives us to be headquartered in Canada is, like, a sensitivity to the fact that it's not always a one-size-fits-all solution. You know, I go back to the research we've done, um, we've done leading work in terms of multilingual models. Um, and it turns out it matters. You go to Japan, you go to Korea, and they do want models that work well in their language. People in the workforce are still operating in, in the language of the country. So, having a company that, that is attuned to that, that values that, that internationalization of models, is actually important on the
- 28:06 – 32:11
Can Zuck Win By Buying The Superstars of AI
- JPJoelle Pineau
global market.
- HSHarry Stebbings
Totally get that. On the team building side of it, Canada has great talent. You mentioned, obviously, some in London as well. What have been your biggest lessons/observations on team building in this, like, talent frenzy that we're in also? How, how will you analyze that?
- JPJoelle Pineau
One of the things that's, that's important when you're, you're building a, a team for AI, I do think you need people who have v- vision, who have, like, a sense of, like, "What can we create?" Just because we're in a space where there's so much innovation that is still needed. So, you need an ingredient of vision that can be one, two, three people who bring that, that ingredient of, of vision. You need people who have amazing execution muscle. Like, they don't care that it's their idea. They care that if the team agrees on an idea, they are just gonna push this and get it done. They're gonna build a system. They're gonna run the experiments. They just have that technical rigor to execute. And then you need people who kind of like keep the team together, who have, like, the sense of, like, who needs what to operate well and who are that social glue. You know, humans are still social beings, and that social glue in a team matters a lot. Where I've seen it fail is to have just sort of one type of person inside the team. Um, I don't think it, it becomes that productive to, to put a bunch of AI superstars all together in a room without the execution machine, without the social glue. I don't think you get necessarily the same results. So, I'm, I'm a big believer in building teams with, with diverse complementary, uh-
- HSHarry Stebbings
So, you can't just buy the Galacticos?
- JPJoelle Pineau
I don't think you need to. I think you, you really have to be thoughtful about putting people in a group. The other thing that helps a lot is for the team to have focus. You know, if it goes in all sorts of different directions, you'll lose that, that power that you get from, from people working together, so having a lot of clarity. What's the North Star? What's the goal? Where are we going? Even if over time that needs to change, that level of clarity is required for everyone to be working in the same direction.
- HSHarry Stebbings
Can I be so blunt? If, if-
- JPJoelle Pineau
Please.
- HSHarry Stebbings
... you don't need to buy the Galacticos-... why, why do you have like an Andrew Tulloch, a Daniel Gross, an Alex Wang, and the Galacticos assembling? Like-
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
... i- is, is that wrong?
- JPJoelle Pineau
Y- you do, you do need a few of these, like, uber talents in the team. There's a relatively, you know, small number of people who just understand this technology very deeply. You do need some of this talent, and if you can afford it, you should, you should get some of that talent. But you don't need that for all of your team. You need, like, a, a team with complementary skills as well.
- HSHarry Stebbings
D- does that create a good team? Like, if I gave you, you know, $10 billion to go build a team, and you could buy a couple of these luxury star players, I feel like it's Top Trumps cards for, like, sports teams, but you can buy a couple, does that create a good team when one is a $3 billion person, and then the rest are, are just average $50 million people?
- JPJoelle Pineau
Yeah. Uh, I wouldn't say no if someone offers me the (laughs) opportunity to, to hire. There are definitely some really talented people in the field. And they deserve to be fairly compensated. This technology is going to make a lot of people very rich and have major effects in terms of society. And so, you know, we should be rewarding the, the talent. But I'd be very thoughtful about what are the teams that I put together, and how do they work together, rather than just, like, you know, hiring a roster of superstars without being thoughtful of how they're gonna work together.
- HSHarry Stebbings
So, you, uh ... It's so funny. So, because of the impact that you can have in these teams, actually, the multi-billion dollar price tags that you see can even be justified.
- JPJoelle Pineau
Time will tell. I don't think it's necessarily needed to go at that scale, but
- 32:11 – 36:38
The Rising Cost of Data
- JPJoelle Pineau
time will tell.
- HSHarry Stebbings
If I gave you $10 billion, what would you spend it on first?
- JPJoelle Pineau
One of the things you need is a balance between talent and compute. I think, you know, if you have too much talent and not enough compute, you're, you're, you're wasting your time. So, usually like an equilibrium between those, those two pieces. I think we often underestimate the importance of data. And, uh, data is getting more and more expensive. And so, I would certainly spend a good chunk of it on, uh, on data as well.
- HSHarry Stebbings
So many things to unpack there. Um, do you feel you have sufficient compute today?
- JPJoelle Pineau
I think we are reasonably well-resourced in terms of compute, uh, in, in, in building the models that we want to build. Yeah.
- HSHarry Stebbings
So, access is not a massive problem?
- JPJoelle Pineau
No.
- HSHarry Stebbings
Okay, great. Why is data becoming more expensive?
- JPJoelle Pineau
Data comes in different forms. On the one hand, you know, the days of having data labelers who can say, "This is a cat," and, "This is a dog," are somewhat over. The easy tasks, the AI can do. So, we're getting into a space where we need more specialized tasks. So, imagine you're building AI for enterprise: there's a particular business logic, you need to make sure that you're catching the errors, and you're gonna need someone with a deeper understanding of the tools. So, that's more expensive talent to come in and actually prepare the data. There's also a lot of data that's synthetic. When you're building agents, you need to build environments, and to build environments, you need some pretty creative folks who are going to build you synthetic simulators. We've seen this on the robot side for many years, people building robot simulators. Now, you're building AI for enterprise, so you need to think of how you're gonna simulate these work processes in a reasonably realistic way so that the AI can train on them. And so, that generation of environments and benchmarks and dynamic domains can be pretty expensive too.
- HSHarry Stebbings
When you loo- ... Again, many things to unpack from that.
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
When you look at, um, the expanse of data, and then you said, "Oh, you know, cat, dog, lamppost," we've got these captchas, you know? You know, where you click the ones which have, like, a z-
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
I get them wrong.
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
I legitimately get them wrong. I'm like, "Jesus-"
- JPJoelle Pineau
They're getting harder.
- HSHarry Stebbings
They're getting so hard.
- JPJoelle Pineau
They are, they are. It's not just you.
- HSHarry Stebbings
I, uh ... The other day, I called up my CFO, like, "I failed the Revolut, uh-"
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
"... I'm so sorry. I'll try again in a half an hour (laughs) ."
- JPJoelle Pineau
(laughs) "Please let my A- AI agent answer that one for me."
- HSHarry Stebbings
Honestly, it was embarrassing. Um, but, um, the question that I have is, you know, when you look at Core, when you look at Surge, when you look at Turing, how do you evaluate that market which is providing a lot of that talent?
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
Is that an ongoing, enduring market? Or is that just a, "Hey, for the next three to five years, we'll need it in the training phase of these models, but I don't know what it looks like beyond that"?
- JPJoelle Pineau
I don't think it's a phase, in the sense that I do think this partnership, we'll call it, between humans and machines, where humans provide guidance to machines, like, we are in this for a long time. What will change is the nature of the information that the AI provides versus the information that the humans must provide as a complement. And so, some of these firms may not be around in five years, but this notion of having humans guide and train the behavior of the AI system, that is here to stay.
- HSHarry Stebbings
It's super interesting. As an investor in, um, one of them, I see all of them converge around needing to do three things now. They used to just kind of be talent acquisition: "Oh, we'll get you these people." And now they're like, "We'll get you these people, and we'll get you high-quality data that you can really use." And now, it's like, "Oh, shit, we need this third pillar," which is, "We'll also help you implement that data into your models, do training, and help you with benchmarking and proving that it's actually valuable."
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
And now they need all three.
- 36:38 – 38:42
Synthetic Data and Model Degradation
- JPJoelle Pineau
- HSHarry Stebbings
You said synthetic data is also a very important segment to consider. Do you get model degradation with this kind of reinforcing loop, of models learning on synthetic data, which creates more synthetic data? Does it actually degrade, or does it improve?
- JPJoelle Pineau
It really depends how you're generating your synthetic data. In some domains, if you think of images, or language, like LLMs talking to each other, at some point you definitely get degradation. And that degradation is due to, essentially, a loss of diversity in your data. You can make an analogy: you take a bunch of people, put them on an island, and let them reproduce. At some point, the genetic diversity is gonna keep shrinking. And so you get a reasonably similar phenomenon with models, because you're not injecting diversity into the data. So there are domains where lack of diversity means you get a collapse of the distribution. There are other domains where you don't need diversity. If you think of, like, playing chess, playing Go, these kinds of games, we know exactly how to generate board configurations, and so we can generate tons of synthetic data. Not endless, 'cause it's a closed world, but still tons of synthetic data, and through that, learn for a long time. Then there are domains that are sort of in between. If I think of coding, we can generate synthetic code. You would take normal code, and we know how to inject diversity into the code. Like, I can take a couple of repositories, mix and match, and apply an LLM to transform them. And so there's a way to generate synthetic data. The language is predictable enough, and there's enough structure, that I also know how to inject diversity so that you don't get that collapse. So the hope is that, especially in these domains, we can use a lot more synthetic data without suffering the degradation of performance.
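[Editor's note: the island analogy above can be sketched as a toy numerical experiment. This is not anything from the episode, just an illustration under simple assumptions: a "model" is a Gaussian fit to its data, and each round it is retrained only on samples from the previous round's fit. Because every finite-sample fit loses a little of the tails, the spread of the data, a crude stand-in for diversity, decays across generations, which is the collapse being described.]

```python
import random
import statistics

def next_generation(data, n):
    """Fit a Gaussian to `data` (a stand-in for training a generative model),
    then sample n fresh points from the fit (the model's synthetic output)."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # maximum-likelihood estimate, slightly biased low
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(20)]  # the "real" data, spread ~1.0

spreads = []
for _ in range(200):  # 200 rounds of training on the previous round's output
    data = next_generation(data, 20)
    spreads.append(statistics.pstdev(data))

# Each refit loses a little of the tails, and the loss compounds:
# the spread of the synthetic data decays toward zero over the generations.
print(f"spread after round 1: {spreads[0]:.3f}, after round 200: {spreads[-1]:.3f}")
```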
- 38:42 – 51:16
Why AI Coding is Akin to Image Generation in 2015
- JPJoelle Pineau
- HSHarry Stebbings
Do you worry that we are creating a world with just much worse code? Uh, a lot of people are concerned by the quality of code that's being outputted, and actually how we're just relying on it pretty haphazardly. Do you worry about that?
- JPJoelle Pineau
Let me make an analogy in terms of the quality of generation. You ask about code generation, but let me take you back to 2015 and image generation. I don't know if you have it in your mind, but the quality of the images that were generated ... We had image generation models in 2015. They were really bad. The resolution was bad, the composition was bad, and so on. And from 2015 to about 2022 or so, we saw huge progress in the quality of image generation. So, with code generation, right now we're in the phase we were in for images ten years ago. Yes, there's a lot of bad code getting generated. There's a lot of code that will get thrown away. But wait another ten years, and I think the quality of the code that's produced is gonna be excellent.
- HSHarry Stebbings
What will the developer world look like in ten years, when that is the case?
- JPJoelle Pineau
(laughs) Well, if I carry my analogy further, I don't know if it's a reassuring scenario, 'cause if we look at where we are today in terms of image generation, the volume of images getting generated is huge. What matters now is picking the quality out of the volume. And so, if I fast-forward ten years on code generation, when we have the ability to generate a ton of code to do a ton of different things, we're going to need some selection mechanism to decide what code we actually want, where there's actually value. And so that's gonna come. There's still gonna be some sort of editorial, design choice. Someone needs to decide: of all the code we can generate, what's the code we want to generate? What do we need to be running in terms of our digital world?
- HSHarry Stebbings
So, it's like a chief curation artist-
- JPJoelle Pineau
Yes.
- HSHarry Stebbings
... within each company.
- JPJoelle Pineau
Curation doesn't go away. Curation, verification: this is work that doesn't go away.
- HSHarry Stebbings
Does the structure of teams fundamentally change, then? It's funny, kind of playing that back to you, and then also playing back to you what you said earlier about the human and-
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
If that is the case, there's not much of a partnership, is there, between human and AI? It's a chief curation person sitting on top of a huge amount of artificially created code.
- JPJoelle Pineau
Well, that's your 10X productivity improvement there.
- HSHarry Stebbings
It is.
- JPJoelle Pineau
(laughs) Uh, you tick that box. It removes the human from the ... Um, you still need people with intent. That's one thing: you need to decide what you want to build and what purpose it serves. So that intent is still there. That role of critique is still there. But the team composition does change significantly once you suddenly have, you know, designers who have, in their hands, amazing tools to go directly from the ideas in their head to the digital world, maybe eventually to the physical world. That equation definitely changes.
- HSHarry Stebbings
Do you think prompts and the way that we interact today with prompts from a chat largely, is like the enduring interface for human engagement with AI?
- JPJoelle Pineau
It's awfully limited. And, you know, prompts can mean a few different things, but the idea of typing in a box, that to me is very limited. And we're gonna break out of that box. Already, we're seeing a lot of cases where voice is a lot more natural as an interface. I do expect we'll see gesture, eye gaze, these kinds of much more multimodal ways to interact with the AI, rather than just sticking with that prompt box. But language is incredibly powerful. So, if you think of the prompt as being, more broadly, language, as a way to express ideas and communicate with a machine, that's a powerful paradigm. I mean, as humans, so much of our communication is based on language, I don't think we're going to move away from that, because it encodes information. You know, words are symbols that encode so much information so efficiently. And so, I don't think we're close to getting away from that.
- HSHarry Stebbings
It's funny, this conversation has changed a lot of, kind of previously held assumptions for me. When you think about what you did believe, that you now have changed your mind on, what's most prescient?
- JPJoelle Pineau
I'm a scientist who is happy to be proven wrong any time, as long as there's new evidence. I'm genuinely curious to know. Other scientists are much more, like, holding on to very, very strong convictions. I have weak convictions, but very strong respect for the scientific method and for rigor: experimental rigor, theoretical rigor as well. Um, so there's a ton of things. I mean ... (laughs) I used to be quite skeptical that neural networks were necessarily the ultimate solution to machine learning. I'd seen enough cycles of neural networks kind of peaking and then becoming less useful. And I used to think that every time you change the scale of the data, you know, you go from hundreds of examples to thousands, to hundreds of thousands, to millions of examples, every time you change the size paradigm, neural networks were the first thing we tried, 'cause they're a universal function approximator, and then something else would come out that was better. And that was true for the previous generations. You know, you may remember SVMs as, like-
- HSHarry Stebbings
Yeah.
- JPJoelle Pineau
... being better than the neural networks in the early 2000s. And I seem to be quite wrong on this one. Like, neural nets seem to be here to stay (laughs) , and the ability to do backpropagation and gradient descent and all that seems to be a really powerful way to learn.
- HSHarry Stebbings
What does everyone else believe quite strongly that you think they are quite wrong on?
- JPJoelle Pineau
I don't have a lot of patience as a scientist for people who are predicting the extremist scenarios, whether it's the catastrophic risks of AI or the, you know, winner-takes-all, AI-becomes-our-overlord kind of scenario. I don't have a lot of patience for that. And I wouldn't say it's necessarily widespread, but I just think you lack scientific rigor to analyze these kinds of scenarios. I'm much more pragmatic, grounded. I'm pro-innovation, I'm excited to see where AI is going and the problems it can solve, but I'm not so interested in just going around and, you know, making up science fiction scenarios.
- HSHarry Stebbings
You've been on the most incredible ... You said there about kind of image generation in 2015 and, uh, funny how much it's improved. Um, we're seeing this kind of unbelievable capital supply go into-
- JPJoelle Pineau
Mm-hmm.
- HSHarry Stebbings
... the space.
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
In a way that we haven't seen obviously for many, many years. Is it a good bubble, where we are getting incredible improvements and it's fundamentally advancing technology? Or is it a bad bubble, where costs are becoming too exorbitant, teams are too impossible to build, compute is too difficult? Is it a good bubble or a bad bubble?
- JPJoelle Pineau
I think about it as a bubble with bigger variance. It's like ... (laughs) You know, the upswing is gonna be bigger, and there are going to be big downswings as well. So there's a lot of variance in the system right now. As long as people have a tolerance for risk, then I think AI is a great investment, and, you know, we should continue to be supporting risk-taking, new enterprises, new ideas. There's a ton of exciting new startups being created; we should continue to support them. You just have to be tolerant of risk.
- HSHarry Stebbings
I've had some people on the show suggest that, uh, evals are, to put it delicately, bullshit-
- 51:16 – 51:50
If Joelle Was a VC Where Would She Invest?
- JPJoelle Pineau
- HSHarry Stebbings
If you were investing today and you were joining my team, which category would you most like to invest in? Be it security, be it generative AI, be it compliance. You name it.
- JPJoelle Pineau
Yeah. There's a lot of verticals, whether healthcare or scientific discovery, that I think have incredible promise, where we're gonna see real, tangible progress within five years that's going to completely change the face of what we can do. So that's probably where I'd push.
- HSHarry Stebbings
That's very exciting on the healthcare front in particular.
- JPJoelle Pineau
Mm.
- HSHarry Stebbings
When you think about that timeline as well.
- 51:50 – 58:52
Quick-Fire Round: Lessons from Zuck, Biggest Mindset Shift
- HSHarry Stebbings
Um, I'd love to do a quick fire round with you, if that's okay.
- JPJoelle Pineau
Yeah, sure.
- HSHarry Stebbings
So I'll say a short statement. What would you most like to do, but because of technical or financial limitations, you're not able to?
- JPJoelle Pineau
I'm super keen to figure out how we build societies of AI agents. We're doing it implicitly, but how do we look at populations of AI agents interacting together, and have, like, a sandbox for doing that? Um, so maybe something I'll get to do. Is it lack of time, resources, something else? There's just, like, a ton of different things to do. But keen to see what happens there.
- HSHarry Stebbings
When you think about that ecosystem of agents, uh, you have children.
- JPJoelle Pineau
Yes. (laughs)
- HSHarry Stebbings
And AI changes our relationship with other humans-
- JPJoelle Pineau
Yes.
- HSHarry Stebbings
... and friendship and social.
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
How does AI impact social friendship connection, do you think?
- JPJoelle Pineau
Hmm. It definitely does. And, you know, there's a sense that we spend a lot of our time in the digital world. And for some folks ... You know, if I look at two of my children, they spend a lot of time in the digital world, playing online games with their friends. It's still very social. There must be some AI in there, there's the digital platform, but it's still a very social experience. Others have a more individual experience. There's definitely a shift of the time we spend towards those platforms, where we go look for that social element.
- HSHarry Stebbings
Knowing what you know, what do you not let your children do?
- JPJoelle Pineau
Ah. (laughs) Eat too much sugar. (laughs)
- HSHarry Stebbings
Totally. So that's the, like, physical diet.
- JPJoelle Pineau
(laughs)
- HSHarry Stebbings
Completely agree with that.
- JPJoelle Pineau
Yeah.
- HSHarry Stebbings
Is there a technical diet? I don't-
- JPJoelle Pineau
Um, I spend some time discussing, like, settings. I mean, you get an Instagram account, great, you can have an Instagram account. But what are the settings on that account? Making sure they understand. I mean, they'll go and change them if they want.
- HSHarry Stebbings
That's a fun conversation with mum, isn't it, "We're gonna discuss settings." Oh, God.
- JPJoelle Pineau
(laughs) Uh, I know. That was not a popular one.
- HSHarry Stebbings
Do they listen?
- JPJoelle Pineau
Um, the thing with children is you don't know till later.
- HSHarry Stebbings
Do you limit screen time?
- JPJoelle Pineau
Um, I spent a lot of energy, especially in their younger years, limiting screen time. My kids did not have a phone till they were 14, 15.
- HSHarry Stebbings
Did you see Adolescence?
- JPJoelle Pineau
I have not.
- HSHarry Stebbings
Okay. Watch it. It's, it's fascinating.
- JPJoelle Pineau
Yeah, yeah.
Episode duration: 59:02
Transcript of episode 51y4KatMBFI