The Twenty Minute VC | Mike Krieger, Instagram Co-Founder & Anthropic CPO: Where Will Value Be Created in an AI World? | E1265
EVERY SPOKEN WORD
135 min read · 27,033 words
- 0:00 – 0:50
Intro
- Mike Krieger
I think models over time get more different, rather than more similar. I still think we are in like day one around, is AI an indispensable part of most people's work? And I think the answer is no. I think the DeepSeek piece, people seem surprised that there were cutting edge research teams there. And if you were paying attention, that part should not have been the surprising piece. I think we've, if anything, underinvested a bit in two things. One is just having a faster iteration speed on first product ... products. And then on second part, on the API side...
- Harry Stebbings
Ready to go? (instrumental music) Mike, dude, I am so excited for this. I've literally just been out for a walk and I've been listening to like every show that you've done in the last year. And
- 0:50 – 1:40
Where Will Value Be Created and Sustained in a World of AI?
- Harry Stebbings
so, I told you before, I don't wanna start with the, "Oh, how did you get into tech?" And all the normal rubbish. I wanna start with a very challenging first question, which is, I, as a venture investor today, have to determine where value is in the future. And I look at the world today, and I don't know. And so my question to you is, when we look forward, where will value be generated in an AI-driven decade that we have ahead of us?
- Mike Krieger
I think it's an awesome question. I get a version of this question often from entrepreneurs, you know. I went from, uh, purely building startups myself to now running a company that is, uh, partly enabling these startups to get created or helping boost their fortunes. And the question I get often is like, "Well, what can I build that is not gonna be in the lane of an Anthropic or, you know, another one of these labs?" And, um, I don't have a perfect answer, 'cause it's hard. I don't have the crystal ball,
- 1:40 – 4:31
Are Foundation Models Commoditised Today?
- Mike Krieger
but like my sense of where it ends up being most valuable to exist is places where you have some differentiated go-to-market, some differentiated knowledge of some particular industry, or some special data that only you have access to. Ideally two, or even three, of those as well. So, companies that are, you know, within a financial sector, within a legal sector, within healthcare. I mean, healthcare I've gotten exposed to, and it is, you know, a tremendously complex, uh, sort of ball of yarn. And the work upfront, it's not the sexy work; it's actually not the work that you're gonna be able to really do in a, you know, accelerator or in a short amount of time. But it is the work and the legwork that you've put in; I think those are durable places to generate value. And then, you know, you can sit in a place where you can pull on what's great from the foundation models. You can do your own fine-tuning if you need it. You can do your own AI specialization, if needed. But the thing that's gonna give you legs and, like, be durable over the long run is being able to sell into those places, have something that you understand about those places uniquely, and then get better for being deployed there over time.
- Harry Stebbings
When you talk about the legwork there, and you mentioned differentiated GTM and differentiated data pools, or data sources: does this next-generation wave of AI benefit existing vertical SaaS companies who have those already and can implement AI? Or does it benefit bottoms-up, net newly-created companies in those spaces? Which one more so?
- Mike Krieger
That's a great question. I think it can be both. At the highest level, the way I think about AI and product design is, uh, you have to dance this very delicate dance of showing the future and dreaming up what the models are currently capable of at their edges, you know? 'Cause you wanna design for where they'll be, gosh, three months from now, which is how quickly things are moving. But, um, not overpromise and underdeliver, because that's a very trust-breaking piece. Now, if you're a startup, you can do a little bit more of the overpromising, because people are kicking the tires. With early adopters, they have a little bit more of that, uh, sort of willingness to engage. It's much harder if you're an existing vertical SaaS company and you say, "We've added AI," and then people try it, and it's like, "It's not that good." Or like, "Oh, I thought it was gonna do all these things," or, "You said it could do these 30 things. It does like two of them well." Um, I think each of those two groups has a very different challenge. On the former, you have established products, you have established behaviors, and you wanna skate to where the puck is going without alienating your existing customers. Um, and I think we can dive in; I think there's some good patterns for doing that. And on the startup front, you probably don't yet have the data, and it's like landing the initial sort of lighthouse customers, or you don't have the relationship, but you have some hypothesis about where AI will have an impact on a given industry or given vertical. Um, and then your differentiation is not the established relationships; it's painting the future and, uh, finding ways of delivering that value quickly within a company that might be willing to take that bet on you.
- 4:31 – 6:55
Should Founders Build for the Models of Today or the Models of the Future?
- Harry Stebbings
You mentioned there, about kind of startups building for where models will be. It's a very challenging time, where startup products are so determined, quality-wise, by the quality of the models. And a change in model can seismically change a startup's output, be it coding software or a legal platform, whatever that is. Should startups build for what we have today? Or should we build for what we can project forward in time?
- Mike Krieger
A really good question. I've heard from multiple people that say, like, "My startup was not a startup until Claude 3.5 Sonnet, or the second Claude 3.5 Sonnet." Uh, I hear that from entrepreneurs: "This company was not a company until this model breakthrough, where now, you know, the accuracy went up, I don't know, from 95 to 99, and now that's close enough for this industry." Or sometimes it's, like, from 70 to 90. Sometimes you get those kind of generational leaps as well.
- Harry Stebbings
Mm-hmm.
- Mike Krieger
So, um, how do you figure out where that is? There's times where entrepreneurs have been knocking their heads against the wall within a particular space, whether it's helping people code, whether it's helping with legal analysis, whether it's, um, you know, I mentioned healthcare, something in that space. And the cobbled-together version of what they did, and "cobbled together" probably undersells it, the lovingly assembled version, which probably involved multiple tools and was maybe price-uncompetitive because it required, you know, an Opus-class model that was not gonna be supported by the underlying business, is still worth doing. Because when the model arrives, you're not starting from square zero. And so often the companies that do benefit from those model-generation shifts are not the ones that suddenly start that day, like, "Gosh, you know, it sounds like Claude 3.5 Sonnet can do that." It's the ones that have been beating against that wall. I take Cursor as an example. Somebody showed me a, um, list of Hacker News front-page submissions from the Cursor founders over time, and it finally broke through. But that was not their first product or their first iteration on it. They'd been trying and going for, I don't know exactly how long it was, but it was, you know, it was not just quickly enabled by the model. It came from that sort of, um, building context, building knowledge, building experience about what has gone wrong or gone well in that space, so that the model can unlock you. So, I guess to be more succinct, don't wait around for the models to be perfect. Be exploring in this space, be frustrated by the current generation of the models, and then be very aggressively trying the next one, so that you can feel like you can now finally deliver on the thing that you saw in your head, if only the models were just a bit more capable.
- 6:55 – 12:59
Why Will Models Become More Different Rather Than More Similar?
- Harry Stebbings
Dude, I have to ask, when you said about differentiated GTM, differentiated data, and then you said, "Wow, you know, there's so many different releases and they come so thick and fast," I don't know how to say this. Is there value in the model layer if it's not a differentiated data game? Is it a differentiated GTM game? How do you think about that?
- Mike Krieger
I think it's a couple of different pieces. On the model layer, and the foundation model layer especially, I think about, like, three places where, uh, it's worth investing for sort of a long-term place in the market. One is talent, and I know it's hard to quantify exactly what talent means, what talent density means, but talent begets talent, right? It becomes an attractor, and especially talent around sort of a cohesive mission, or a story about why you're building what you're building. I've absolutely seen that at Anthropic, where I love our research team, and it feels like monthly we get some new significant hire that has come from, you know, potentially another lab, potentially academia, and has joined. And so, you know, that's an advantage you have to cultivate and also maintain, because people are obviously free agents and they can do what they want to do. So, you have to maintain whatever was attractive in the first place. But that is important because, um, to stay at the frontier, it requires more than just more of the same. It requires also figuring out what the right breakthroughs are. So, that's one. The second one is, I think models over time get more different rather than more similar. Of course there's a lot of similar benchmarks that people are looking towards, but there is something Claude-y about Claude, and I think there is something GPT about GPT, and they have their pros and cons, and that's both from a, like, character and tone side of things. But then there's also sort of the places where those models really excel. And, um, for us, clearly coding has been one really big vertical, right? That we've gone after. And it wasn't an accident, and it's also not a thing where we just say, "Great, it's good at code. Let's just continue to be kind of good at code."
It's, you know, seeing that traction, and seeing how many companies are now relying on Claude models for code, for example, or for agentic planning, that inspires the next generation of what you want to do from a reinforcement learning perspective. So, the first one's talent; the second one is, um, sort of focus and, uh, model characteristics that you develop deeper over time. And then the third one is... I got this question a bunch when DeepSeek came out, like, "All right, what does DeepSeek mean for you?" And I think there are things that we learned from on the tech side, just looking at what they were doing. But from a go-to-market and place-in-the-market perspective, it has almost no impact. And that's because the relationships we end up having with companies are not, you know... they sign up for the API, they want to just exchange their input tokens for output tokens at some rate. It's actually, "Hey, I want to be your long-term AI partner. I want to help co-design products with your applied AI team. I want to dream big with you. I want to think about not just, um, your API, but also Claude for Work." And so that looks more like being a company where, which I know sounds trite, what you're providing people is AI partnership, not just AI models. Maybe it's good to invert all that and see what the failure mode looks like. I think it is resting on your laurels, or not retaining your best people, um, just believing that making the models incrementally better on every benchmark is enough, and then treating the API as just a way of exchanging money for intelligence without figuring out how to be more of that AI partnership. If you can't do all three of those, I think you're in trouble.
- Harry Stebbings
I do want to go into the coding element in a minute. I do just have to ask, when we look at kind of, um, blockers or barriers to progression, when you look today, what do you think the biggest blockers are? 'Cause this is one where I have completely disparate opinions from different people, whether it's Alex Wang or whether it's, you know, Jonathan Ross at Groq. What is the blocker, say, compute, data, algorithms?
- Mike Krieger
It's getting the environments in which the models get trained to better and better match real-world challenges that aren't sort of single-shot. Um, I know Alex has been thinking about this problem as well, because we talked about evals for agentic behavior as one very specific version of the broader thing that I'm talking about, which is: even within the realm of software engineering, the work of a software engineer is not just to produce code. It's to understand what needs to get produced, to work out the timelines with their product management counterparts, um, to deeply understand the requirements and deeply understand the user and the use case that they're building for. And then also delivering whatever they built in a way that then can be tested and iterated on, and then has user feedback at the other end if they're building some kind of public-facing product. That's a hard... well, there's no eval for that, right? It's interesting that we call the, sort of, most common software engineering benchmark SWE-bench, right? Like, to actually be a SWE is a lot more than just, you know, "I looked at a pull request, I produced this diff, and then you're gonna accept it or not." So, building environments and evaluations that better mirror that. We think a lot about office professionals, um, at Anthropic, in terms of one of the use cases that, you know, is going to be, uh, potentially really multiplied by these models in the future. Nobody's really evaluating that well. There's something around research where we're starting to get a bit better around evaluations. Um, there are extremely convoluted, I mean that in the best way, uh, evals like Humanity's Last Exam, which is very much, like, okay, multi-step reasoning. But there's yet to be the, sort of, I show up to a new job, I quickly understand what my role is,
who is who in the organization, what are the relationships being mapped, where to go find extra information if I need it, and then be in the sort of run loop of the, you know, the functioning of the business. That's a hard environment to capture. And so that, to me, is figuring out how we better either break that down into component parts, which is probably part of the story, but also think about it holistically, as the biggest blocker to at least one slice of progress, which is: how do models go from being extremely good at extreme slices of things to being more generally helpful collaborators?
- 12:59 – 18:02
Will Human or Synthetic Data Be More Prominent in the Future?
- Harry Stebbings
Before we dive into these kind of specialized products, on the data side, you know, I had Adarsh on from, uh, Mercor recently, who obviously raised a big new round. Um, but I asked him the question, and I'd love your thoughts, which is: when we look at the future of data within models, will there be more synthetic data that compounds on top of each other, or will human data continue to be the predominant data source that drives model progression? How do you think about that?
- Mike Krieger
I think for the models to improve, you do need a story around how you perhaps seed it with original human data, but then can generate all these synthetic environments by which it can sort of path-find and explore. Um, Claude's been having fun playing Pokemon this week, which has been a good but kind of funny distraction for our own research and engineering teams. I'm like, "What is everybody doing?" They're like, "Oh, we're watching the Claude Plays Pokemon, uh, live stream." But I think games are an interesting example, where, you know, you can imagine a lot of different runs through the same game within some constraints and rules. That gets a lot harder when the problem space is less well-defined than, you know, did you make it out of the Viridian Forest? I never played Pokemon; I'm learning just watching this live stream. Um, but it's still important to be able to take sort of golden paths, but also synthesize a variety of approaches through it, so that you can still think about how the model can progress in the face of uncertainty. So I think it absolutely has to be a mix, and I think the best models will come from that combination. Like, for code, it's having a good foundational understanding of code and good examples, but then also being able to explore a really wide variety of paths through that. The other part that is still, I think, underappreciated is how do you measure and evaluate and get data in for character and, I'm going to use a very loose word, vibes, right? Like, what is exactly the feel of using a model? We don't really know until we actually sit down and play with it, um, which is, in some ways, kind of a nice property, because it means there's this very qualitative, human-like aspect to it. But it also means that you don't have good regression testing on it.
Like, sometimes we'll, you know, go from Claude 3.5 to 3.7, and people will say, "Oh, Claude seems friendlier but more terse," or, "Claude seems, you know, more willing to, like, answer my questions, but I wish it was better at creative writing." These things are not easily evaluable. This goes to the data question. And so I think it is important to both be able to have the data in there around these softer skills, but then also have the evaluations for them.
- Harry Stebbings
You know what I find bizarre? I find it bizarre that we're able to choose models. And you may go, "Well, duh, you will do because there's specializations within them." But I think when you project yourself forward three to five years, you will not be selecting which model you use. That's like selecting which Google you use. Am I completely wrong, or do I completely miss the point?
- Mike Krieger
No, there's a concept that I love from... my background was in human-computer interaction, and, um, you might have heard this term of leaky abstractions, right? Which is: as software builders, we try to do a perfect job of encapsulating all the complexity under some, you know, little shell, and then the user should not have to think about any of these things. And the reality is, the current state of most AI product design is an extraordinarily leaky abstraction. Take having to choose the model: why? Why should you choose between Opus, Haiku, or Sonnet? Most people don't understand the difference, right? Or, you know, if you go to the OpenAI drop-down selector, there's a lot of models in there. And every single one of them has a good reason for being there, and yet the overall experience is one of, "Why would I choose one over the other? Oh, this capability is available here, but not there?" And we suffer from this problem as well. So that's model selection. The second one is once you understand how these models are built. You know, they build up context. They have turns. Every turn actually has the full context replayed to it; that's how it's able to make the next inference. What that leads to is this experience where every chat is different, which I always think of like talking to a coworker: yeah, you might have different email threads, but it's still one coworker behind all of that. And if you reference, you know, their favorite sports team, or you reference a project you worked on together, it's not like, "Oh, I don't know what you're talking about," or, "I'm gonna have to go retrieve my memory." There's sort of a shared underlying piece. That's another leak: it's forcing people into an understanding of the models that I don't feel like we should be requiring of them.
And the last one is prompting. As much as things have evolved, and we've done a bunch of work around, like, how do we take simple human prompts and translate them into ones that are very model-optimal, I want to make that absolutely transparent to people, where, if the model has a lack of clarity on the problem or needs help understanding better, it engages in conversation, rather than, you know, you seeing the difference between somebody who's an extremely good prompter versus not. Now, that gap closes generation to generation, but we need to collapse it even further.
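The "full context replayed to it" mechanic Krieger describes can be sketched in a few lines of Python. This is a minimal illustration, not any lab's actual API: `complete` is a placeholder standing in for a real chat-completion call, and the message dicts simply mirror the common `role`/`content` convention.

```python
# Sketch of the stateless chat loop Krieger describes: on every turn, the
# ENTIRE conversation so far is resent to the model. Nothing persists
# server-side between calls; "memory" is just the growing message list.

def complete(messages):
    # Placeholder model: reports how much context it was handed.
    return f"(reply after seeing {len(messages)} messages)"

def chat_turn(history, user_message):
    """Append the user's message, replay the full history, append the reply."""
    history = history + [{"role": "user", "content": user_message}]
    reply = complete(history)  # full context goes over the wire every time
    return history + [{"role": "assistant", "content": reply}]

history = []
history = chat_turn(history, "My favorite team is Arsenal.")
history = chat_turn(history, "Which team did I mention?")
# The second turn can only "remember" Arsenal because turn one's messages
# were replayed verbatim, not because the model has persistent memory.
```

This is why each chat thread feels like a different "coworker": a new thread starts with an empty `history`, and everything the model appears to remember must be replayed into it.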
- 18:02 – 20:12
Model Quality vs. Product UX
- Harry Stebbings
How do you think about model quality versus product and UX, and how to prioritize those two and the relationship between the two?
- Mike Krieger
You can't separate the two anymore, and I think to be a UX designer... I was just in a product review right before our call, and, you know, I was thinking about Instagram product design sessions. It was, you know, pixels: some synthetic data or maybe real data; we took my feed and then we formatted it to this, you know, UX that we're proposing. Um, but there's not a lot of non-determinism there. You're going to put it out to the world and maybe, like, people will use it in some ways. But designers and product managers and definitely engineers today need to think, "All right. What I'm actually doing is designing a scaffold and, like, a product around a fundamentally non-deterministic system." Which means the evaluation, the model quality, the prompting, you know, on the back end, all of it is part of the product design, and it's going to have direct implications. So one example is, you can prompt Claude to ask follow-up questions or not, and that might be what you want in one part of the product but not another part of the product, right? Um, you might prompt Claude to, you know, want to go and think longer about a problem and do more reasoning, or not. And again, these are all decisions that, uh, up front you are making in product design, and they're gonna have this manifestation, uh, in the actual product. And then the other piece: we talked a little bit earlier about how, as a startup founder, or as somebody who's doing maybe classic B2B SaaS, you need to triangulate where the models are, where they're going, and what the user needs are, together. Um, that's gonna be the case in your product design as well, where you're doing the evaluations, hopefully up front, to see if what you're doing is even possible with the current models, or at least having, like, an eye out for when it might be. But models change over time. Products change over time.
If you don't have a good framework around evaluation, even regression-testing those evaluations, you might end up launching a product that three months later people are like, "Oh, the product used to be good, but something else has happened where it's no longer, you know, serving that purpose." And you're like, "But I'm not sure which of these things changed. Is it the model? Is it the product design? Is it the introduction of a different feature? The system prompt got longer." It's in many ways the most complex product development work I'll ever do.
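The regression-testing idea Krieger is describing can be sketched as a tiny eval harness: pin a golden set of prompts with graders, run them through whatever the current model-plus-scaffold pipeline produces, and gate releases on the pass rate. Everything here is illustrative; `run_product`, the prompts, and the baseline number are hypothetical stand-ins, not Anthropic's actual tooling.

```python
# Minimal regression-eval sketch: a fixed golden set of (prompt, grader)
# pairs, run against the current pipeline, compared to a pinned baseline.
# A drop below baseline flags that *something* (model, system prompt,
# product scaffold) regressed, even before you know which one.

def run_product(prompt):
    # Stand-in for the real pipeline (model call + product scaffolding).
    canned = {"summarize": "A short summary.", "extract_date": "2025-02-24"}
    return canned.get(prompt, "")

GOLDEN_SET = [
    ("summarize", lambda out: 0 < len(out.split()) <= 10),  # terse enough?
    ("extract_date", lambda out: out == "2025-02-24"),      # exact match
    ("cite_sources", lambda out: "http" in out),            # fails here
]

def pass_rate(cases):
    passed = sum(1 for prompt, grade in cases if grade(run_product(prompt)))
    return passed / len(cases)

BASELINE = 0.6  # pass rate pinned from the previously shipped version

rate = pass_rate(GOLDEN_SET)
regressed = rate < BASELINE
```

Run on every model swap, system-prompt edit, or scaffold change, this is the cheapest way to answer "which of these things changed?" before users do.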
- 20:12 – 31:49
The Competitive Landscape of AI
- Harry Stebbings
I interviewed Sam in London from OpenAI, and he said one of the joys that they have as a startup is that they can just release things much quicker, and it doesn't have to be perfect, and actually, the challenge is as they've got bigger, you have more and more weight and pressure placed on every release. Um, how do you think about that? Release and it doesn't have to be perfect, let's get it in the hands of users, versus now Anthropic is a massive company with millions-
- Mike Krieger
Yeah.
- Harry Stebbings
... of users. It does. How do you think about that as the product leader?
- Mike Krieger
I think about this a lot, especially because you have different surfaces and different audiences that both have different, uh, expectations of stability and a sort of desire to be on the cutting edge. In an API product, what people value is predictability and stability, and the opt-in of something that's more future-facing, right? So it can be a very opt-in thing. I remember we launched prompt caching, which was a big cost savings for people. Initially, we did that via, like, a beta header that you had to opt in to. And a lot of what we do on the API is in that form. If you do that for our customer-facing, like, our more consumer stuff, that's really lame, to have people opt in; really, you wanna be able to sort of iteratively release and be experimental with folks. And, you know, you don't want to totally break their experience, but you've got a little bit more of that permission. And then we have all these enterprise customers that are using Claude for Work in an enterprise. Now, I think AI adoption in the enterprise is still an early-adopter product. So you can get away with more than, you know... I don't know how many releases Salesforce does a year, but I know a lot of these companies do, like, two, right, or three. And it's usually oriented around some big event, and we're really far from that. We're still launching pretty quickly, but honestly we're still finding the balance there, where, you know, is it a monthly drop? Is it, you know, you ship as often as you can, but there's an admin opt-in on each kind of thing? That adds complexity as well. Um, so it's a great question.
I would say it's an active topic of conversation: how raw or how quickly we can ship, knowing that we wanna bring things out to the world, and you don't know how they're gonna be received, and you want to learn. But as you accumulate sort of, uh, notoriety, or, you know, people start depending on you for workflows, you can't treat that completely wantonly.
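The beta-header opt-in pattern Krieger mentions for prompt caching can be sketched as a header builder: the API stays stable by default, and experimental features activate only for callers who explicitly send a beta flag. The `anthropic-beta: prompt-caching-2024-07-31` value shown matches what Anthropic publicly documented for that beta around this period, but treat the specific header names and values as assumptions and verify against current documentation before relying on them.

```python
# Sketch of the opt-in beta pattern for an HTTP API: stable behavior by
# default, experimental features gated behind an explicit beta header.
# Header names/values mirror Anthropic's documented convention at the
# time but should be checked against current docs (assumption).

def build_headers(api_key, betas=()):
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    if betas:
        # Multiple betas go comma-separated in a single header value.
        headers["anthropic-beta"] = ",".join(betas)
    return headers

stable = build_headers("sk-example")  # no beta features enabled
opted_in = build_headers("sk-example", betas=("prompt-caching-2024-07-31",))
```

The design point is the one Krieger makes: API consumers who never send the header see zero behavior change, while early adopters self-select into the experiment, which is exactly the permission structure a consumer surface lacks.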
- Harry Stebbings
Are we in a product marketing nightmare? And what I mean by that: we have DeepSeek release something this week. We have OpenAI release something this week. We have Anthropic release something this week. We have, uh, Mistral release something, you know, ten days ago. When every single day there's a new release, the world maybe gets apathetic. How do you think about that, and how does that inform how you think about product launches and messaging?
- Mike Krieger
Yeah, I mean, it is, uh, much more complicated. So at Instagram, you know, the things that you had to watch out for, the big rocks, were very known in advance. It's like, don't launch anything WWDC week; that's going to be a flurry of announcements. Same with the September, like, iOS event. You know, there might be some other big rock, like the holidays. It was so much easier from a product marketing perspective, where here, um, it reminds me a little bit of Crossy Road, where you're like, "Okay, the car's going by. All right. There's a gap in the cars. Like, launch tomorrow," or like, "Now it's good, but oh, now we hear there's a rumor." It's so much harder, and I've heard from, uh, folks at other labs as well that everybody's kind of trying to read the tea leaves and be like, "All right. Is anybody... Is it quiet? All right. Is it okay to launch now?" Or like, "I think we can do it next Tuesday." So it's much harder.
- Harry Stebbings
Go! Go! Go! (laughs)
- Mike Krieger
Go! Go! You should, you know? It requires a completely different approach, and I give credit to, literally, our product marketing team, because they've had to orient from a point where, you know... Claude 3.7 Sonnet we launched on Monday, and we locked the blog post for that Sunday night at 9:00 PM, which is not best practice from a marketing perspective, you know? We were briefing press that day, on Sunday. Thank you to folks that hopped on the phone with us on Sunday. But that's the point where everything is done and ready and locked, and we can go. And so it does involve that sort of, uh, ability to react quickly and be nimble. I mean, even things like, you know, when we release a model, there's a model card, and there's evaluations and a comparison table. There are things in that comparison table that were released the week before, right? Like, uh, Grok-3 was, you know, just a week prior. So it involves-
- Harry Stebbings
What happens-
- Mike Krieger
... a completely different-
- Harry Stebbings
... when those are released, when Grok-3 releases there? Like, jokes aside, does everyone at Anthropic and OpenAI go, "Oh shit, they beat us again," or like, "Oh shit, we won"?
- Mike Krieger
I think it requires... One of the things I try to do, you know, to support the team there is remind them that model releases are going to happen, and at any given point you are going to be in the, you know, "It's so over; we're so back" cycle. You have to live that in AI, and you can't get too down about one release, because, yeah, for sure, it is inevitable. Sometimes you're lucky and there's, like, two or three months where the model that you launched, or the product that you launched, is still state of the art across all the things you really care about. Sometimes it lasts a week, and you can't over-rotate on either of those. You can't rest on your laurels. I think the thing that's really useful, too, and it's a chart I show on almost every sales call, is just mapping from Anthropic's founding to where we are today and the milestones. At any given point you could say, "Wow, Claude 2, that's, like, pretty far behind. Oh, Claude 3's state of the art," and then no, it's not. You gotta look at the trajectory and, like, trust that you are going to continue to make improvements; that's number one. And then number two, remind yourself that if everybody switched every single day purely due to, like, an eval changing: one, that would be an insane thing to do to your user base as a provider of software, but two, that would make for an even crazier industry.
Over time you start learning that people don't just deploy models. They're doing fine-tunes, or they've done a lot of really bespoke work on top of a model to make it great for their use case, so it's not a thing that's going to switch overnight. Or you're one of three or four options within a model selector, for example in a coding environment, so you're still in the mix and you still have a chance. But it does require... I'm not sure if it's finding the meditative zoom-out angle of it, or just getting used to the bumpy ride, or some combination of the two. But for sure, every time there's a model launch, I assume every one of those labs is watching the launch stream and realizing it's going to be either, "Woo," or, "All right, now we've got work to do."
- HSHarry Stebbings
I would argue that brand is the most important thing. To your point, people aren't switching every day. They're kind of like, "Oh, I'm a Claude person," or, "Oh, I'm a ChatGPT person," and they kind of identify already with their models. Do you agree with that statement or do you think that's too glib?
- MKMike Krieger
I think that is right, especially on the consumer front. You know, I was just reading Ben Thompson; he has Nat Friedman and Daniel Gross on pretty often, and they were talking about some people being Claude people and some people being ChatGPT people. I think that definitely happens: you like the personality, you like the interface design, you like the vibe. It actually reminds me a lot of Instagram. We had this interesting back and forth with Snapchat over the years, and even before that, people would launch a new product that's like Instagram but just for super high-end photographers, or with one additional twist, or just one photo a day, let's be real. And I had this fake formula, I'm clearly not the mathematician at Anthropic, but it was: social networks are made of formats, audience, and vibes. Format: for Instagram we had stories, we had feed, and eventually we had video. Audience: initially sort of hipstery photographers, eventually anybody really interested in visual storytelling or visual media. But the vibes of Instagram, even when we had more product similarities to a Snapchat or even a Facebook, the vibes were very different. I don't know what that fake formula is for AI products yet, but I think it's some version of that: model personality is probably one of the terms, there's likely something around the scaffolding and prescriptiveness of the product you build around the model, and then there's vibes, which, again, are hard to measure but absolutely there.
- HSHarry Stebbings
When we have so many different models and so many different providers, open source is a very viable route, and distillation is looked at in a shady way. Is distillation really wrong if it ultimately propels the space forward?
- MKMike Krieger
Let's take it within the labs: I assume every single one of the labs is distilling internally, because it's very valuable to be able to take the knowledge of your highest-end model and make it lower latency, more affordable, et cetera. I think the places where this gets interesting are, one, do we want any nation to be able to distill models from any other nation's? My very personal answer is no; I think there's value, as AI gains in capabilities, in being really thoughtful about that from a national security perspective. And then the other piece is, to have the advancements happen at the rate they're happening and be sustainable long term, you do need the labs to be able to commercialize all of that training and innovation, and I think finding the right models for that long term is important. The open-source models, take LLaMA for example, have been able to do that from their own research and data ingestion and training. So I guess I would say distillation does not feel essential in order to unlock those things, and it poses other issues, even just from a terms-of-service perspective.
- HSHarry Stebbings
Does LLaMA show that there is no value in the model and all the value is in the data? If Facebook are willing to give it away for free because they know that no one can copy the data that they have, is that what that shows?
- MKMike Krieger
An interesting question is whether the quality of LLaMA is due to the fact that they can, and I don't know if they've said that they do, but they clearly can, train on Instagram and Facebook data, or whether Gemini is better for being able to train on YouTube. It's actually clearer to me that Gemini benefits from that: whenever they have a good video understanding demo, for example, I'm like, "Well, somebody has probably the largest repository of video in the world and can likely train on a lot of those pieces." Less clear on the Facebook front. I've never heard from people, "Gosh, you know what LLaMA does extremely well is generate content that would work well on social media." It just seems like a good general-purpose model. So I'd actually go back to our conversation earlier: the value is in how good your team is, whether you have the underlying data you need, but then also how useful your model is in actual use cases. And that is the highest-order bit. I almost wish I'd started with that, because, evals aside, evals are really useful for hill climbing and for internal research, but they don't tell the story of whether the model is going to be excellent at what it's deployed for. Or even if it is excellent at that thing, is it only excellent in very narrow situations, or is it something that, as an entrepreneur outside the labs, you can rely on to be your representative, I guess, in that product? So yeah, for the labs, the value is in the team. It's in the model's ability to actually perform the right actions in the real world without so much non-determinism that it becomes unreliable.
- 31:49 – 33:31
Do We Underestimate China's AI Capabilities
- HSHarry Stebbings
I'm gonna ask one question on this. It's not a trap to go down, but I've spoken to Alex Wang about it on the show, and Eiso, uh, at Poolside on the show, and they said, "We deeply underestimate China's ability in AI." Do you agree that we underestimate it?
- MKMike Krieger
Yeah. With the DeepSeek piece, people seemed surprised that there were cutting-edge research teams there, and if you were paying attention, that should not have been the surprising part. We saw it when Instagram was blocked in China fairly early on: the emergence of a parallel world of startups. If you take out Facebook and Instagram, what happens and what emerges? And those products were often very high quality. They demonstrated a lot of creative thinking, and they were built at scale too; they were solving problems. People love talking about the super app and WeChat, and there were technical challenges solved by those products at scale that were of the same magnitude as the challenges Facebook was solving. So it would absolutely be a mistake to have underestimated, or to continue to underestimate, China's ability to both train at the frontier, especially if they get access to compute, and continue to innovate there too. I think it's a pretty Western-centric view, one I've definitely seen in more traditional software, maybe caught in this '90s, early-2000s mindset of, "All they're doing is replicating what's already been working elsewhere." There have been products that take a differentiated view, grow internal to the Chinese market, and then sometimes make their way externally, TikTok being an interesting example of that on the other side.
- 33:31 – 34:44
What Did Anthropic Learn from DeepSeek
- HSHarry Stebbings
Final one before we move into verticalized products: did DeepSeek cause you to rethink anything, or change anything about the way that you progress?
- MKMike Krieger
There were some architectural pieces, and I won't speak for the research team because they're definitely the deep experts, but they were like, "Oh, interesting, that's worth us considering," or there were ideas that had been considered before and were maybe worth reevaluating. So there's that piece. It's interesting, our plan was already to show the chain of thought when we launched our reasoning model, so that was not a reconsideration, but it was interesting to see somebody else do it, and there are some user interface details in there, and I think Grok does as well now on theirs. So it'll be curious to see how that evolves. To your distillation question, that might be a reason why more labs choose to not show, or otherwise obscure, the chain of thought down the line. The other piece, from a product perspective, there were two things. I think this is the under-talked-about piece of DeepSeek: they were able to go from nobody knowing about them to being, frankly, in many circles, better known than Claude. Like, my great-aunt was calling me about DeepSeek. I'm not even joking, the cliché was actually happening. I got a message like, "What do you think about DeepSeek?" and I'm like, "Great. It's broken through."
- 34:44 – 38:09
Is DeepSeek a Sustaining and Credible Threat?
- HSHarry Stebbings
What do you think they did to break through that maybe Claude hadn't?
- MKMike Krieger
I think there's a lot of interest, of course, in world politics right now, and the narrative was, you know, this was much cheaper, and whether or not that was exactly true, the story was, "Oh, they were able to figure something out." And frankly, I've had this conversation with our marketing team as well: I don't think we tell the Claude story well enough externally yet, around what is different or notable about the fact that, take Claude 3, we were training a model at the frontier that was state of the art with a team that was much, much smaller than any other lab's, and I think we've always been very efficient with our compute as we train. So whether that was a story they told, or one that was just told for them by the media because it legitimately was a really compelling story, the uniqueness of the moment was a big piece there. And especially, it's January, new presidency, China relations; it fed into the moment really, really well. And the second part, on the product: they went from not having a product to having an iOS app that actually had a lot of good details. For me it was a good... I was going to say nudge, but it was stronger than that, a shove, around: we need to be getting some of these ideas out to market quicker, without, to your earlier question, focusing as much on exactly the polish it needs in every situation, and instead being willing to put it out there and learn, because sometimes the novelty of an experience is itself valuable, right? It was the first time most people experienced a live chain of thought.
That's interesting and, like, I wish we had done that sooner because it would have been novel for people to experience that.
- HSHarry Stebbings
When you look at usage, you see emerging markets' usage retaining, and Western markets' not really retaining at all. How do you think about them as a sustaining, credible threat?
- MKMike Krieger
They're already known at a level that has some ability to generate ongoing staying power. On the retention front, I think if all we're doing in these AI-first, lab-generated products, even six months or a year from now, is asking questions, maybe sometimes with slight proactivity, I don't think that's differentiated or interesting in the long run. It should be: wow, I can now do something uniquely because I am using Claude, or DeepSeek, or any one of these products, and it unlocked hours of work for me, and it made me smarter, and it made me a better partner to whoever the important people in my life are. It has to transcend the surface-level utility. Some people find the deeper level, don't get me wrong, and those are the people who are your DAUs right now. But a lot of people will try it and generate a poem with it, or write a letter to their son. There's all this stuff they can do that provides some value in the moment, but I still think we are in day one around: is AI an indispensable part of most people's work? And I think the answer is no for most of them. So I think DeepSeek's staying power, and honestly all of our products' staying power, will come from who can get there, do that sustainably over time, and have the right product design, the right integrations, and the right deployment to actually succeed.
- HSHarry Stebbings
And who can build those products? Which, as an investor, is often my big question: when does a model provider
- 38:09 – 43:44
Transitioning from Model Provider to Application Provider
- HSHarry Stebbings
move into an application provider?
- MKMike Krieger
Yes.
- HSHarry Stebbings
I'm just fascinated to hear your thoughts on what is attractive enough that you dedicate the resources to become an application provider, not just a model provider enabling others.
- MKMike Krieger
There are two main criteria that I look at. Because our team, for all of Anthropic being big, you know, I think we're about a thousand people, our product team is maybe a tenth of that. By Instagram-year-two standards, very large; by large-SaaS-company standards, very small. We're somewhere in between, and we're supporting a lot of different surfaces: we have Claude Code now, we have the API, we have Claude.ai, we have Claude for Work. So I think generalizability is really important. Even if we pick a persona or a vertical to go after, we are going to be building things that are general purpose as a rule, with maybe some specialization at the user level. I don't anticipate us building a lot of verticalized experiences that are fairly bespoke to a given workflow or use case. So I think that's-
- HSHarry Stebbings
But I think about, like, translation, transcription, customer service, quite horizontal kind of homogenous things.
- MKMike Krieger
Yeah.
- HSHarry Stebbings
That seems, like, right in the pathway.
- MKMike Krieger
I think it does, except that there's a lot of valuable workflow knowledge that means you can retain a differentiated product over time. Like, a good example is-
- HSHarry Stebbings
Yes, if you're- if you're a power user. If you're a power user, yes.
- MKMike Krieger
Perhaps, yeah.
- HSHarry Stebbings
But if you're not a translator and you're, say, your mom, who maybe uses it once a month for that odd thing she needs...
- MKMike Krieger
Yes. Yeah, I think the role of, "Great, we can help you translate this," where from an individual user we'll get you to pay a $10 monthly subscription, feels iffy, because I think the models are quite good at that already, right? And you're right, there isn't the... You know, if you play with ElevenLabs' console and workbench, a lot of the features they've built are very clearly for people who are translating or voicing hours of content with a reliable voice across a whole work stream. And Descript, I think Descript has some of the best product design in AI, and they've clearly put so much time into the workflow of it. I used it once for a personal podcast and I was like, oh, this has clearly been built by people who are day in, day out sitting in this workflow and understanding it. So yeah, I think maybe we've come to some synthesis of our views, which is: there's value in the more professional use cases and the workflows unlocked by them, and on the consumer, maybe even prosumer, side, a basic AI product gets good enough.
- HSHarry Stebbings
You know, when you look at what you're brilliant at today, you do so well, as we said, on the code front. Is there a roadmap here to ship your own IDE, your own code agent? How do you think about that?
- MKMike Krieger
You know, again, with the product focus lens, I think we have to pick our bets carefully. Even with Claude Code, which we just released as a command-line agentic coding tool, we built it internally first because we just wanted to accelerate our own team. And after seeing it play out for a couple of months, we were like, this is good. It's not a solution to all coding problems, and it doesn't obviate the IDE, but it's useful enough to us in enough cases that we want to see people use it out in the real world. And shipping is never free, right? You've got to name it something externally, find the right packaging around it, and there's the go-to-market piece. So we do it carefully. My view of where the models are today is, you still need hands on keyboard, and you still need that exchange of, "Hey, I did this. Is this right?" "Yes, this is great, let's put up a pull request," or, "Yeah, we went down a false trail, let's unwind the stack, metaphorically and maybe in actual usage, and keep going." That's why I think there's a role for this in-between, between the IDE and the full-on, Cognition Devin-style delegation of tasks, which works for a certain category of tasks. Our product engineers love Claude Code because a lot of product engineering is: all right, we've got to update the backend, create the frontend, submit these things for translation, "Oh, this still doesn't work, let me do this." It's that build-the-product-end-to-end workflow that does well with a thing that can work agentically across a lot of different pieces. I did two pull requests last week. I hadn't coded since joining Anthropic, which made me sad.
And so I got to finally use Claude Code. I had not opened our code base before, so I don't really know how it's even structured. But Claude Code is very good at finding the file that has the right piece and then going on and making edits. Obviously not everybody is in the same situation I'm in, but it is really valuable for those use cases. So when I think about the coding space and where we can play and add value, it really is on the agentic side. It's not on the IDE side. There are other companies that wake up and go to bed every night thinking about how to make a great IDE, and that involves things like low-latency autocomplete, the right integrations, figuring out how you play with the VS Code plugin ecosystem, and all of that complexity. There's a bunch of work there that is valuable and different from what we're doing. I think we can really play in: let's be talking to these models and doing real work with them in that agentic loop, while recognizing that they're not yet at the place where, for many use cases, you can let them run free for hours. You need that more human-in-the-loop piece.
- 43:44 – 48:31
What is the Role of a Software Developer in the Future
- HSHarry Stebbings
You power and you work with Cursor, Codeium, StackBlitz. My question to you is, when you look at, bluntly, as you said there, the first time you've coded since joining Anthropic, and the changes we see in developer behavior, what will the role of a software developer be in three to five years' time, do you think?
- MKMike Krieger
Yeah. I mean, it starts to look different already. I was a huge early proponent of GitHub Copilot; I think my quote was on the homepage for a while, I don't know if it still is, because I saw the potential. And then when GPT-4 came out, before it had multimodal, I was trying to do Swift with it: I would draw ASCII art of the screens I was trying to build for Artifact, then go make coffee, because it was quite slow at the time, and come back and it had an 80% version. Obviously, now it would be a 95 to 99% version with something like 3.7 Sonnet. The skills that become important: one, I think it becomes multi... what word am I looking for? Multidisciplinary, where knowing what to build matters as much as knowing the exact implementation you want. I love that about our engineers; many, maybe even most, of our good product ideas come from our engineers and from them prototyping, and I think that's what the role ends up looking like for a lot of them. The second piece is, code review really changes when all of a sudden you're mostly evaluating AI-generated code. I even experienced this. I put up a pull request, and some of the comments that came back were, "Yeah, Claude Code does this sometimes. We don't actually use default arguments in this case." And I was like, "Oh, well, damn it," you know, sheepish; if I had been writing the code myself, I would probably have noticed those patterns. So there are two sides that need to happen. One, models and the infrastructure around them need to learn from code bases and code reviews better, so they can produce code that feels idiomatic to that company. But also, how do we evolve from being mostly code writers to being mostly delegators to the models, and code reviewers? That's what I think the work looks like three years from now.
It's coming up with the right ideas, doing the right user interaction design, figuring out how to delegate work correctly, and then figuring out how to review things at scale, and that's probably some combination of, maybe, a comeback of static analysis, or AI-driven analysis tools for what was actually produced. Is there a security vulnerability? Is there some other flaw? Is there a bug? Computer use plays a part. Sorry, you can tell I get very excited about this space. Automated testing of UI, so that... What would be great is you delegate the task, say, a year from now. Three years is crazy, so let's take a year from now. You delegate a task to it, and when you come back it says, "I evaluated these three approaches. I have tested them all out. I had a different agent actually try them out in a browser. This one worked best. I've run it through an additional agent that did a vulnerability test, and it all looks good. All we need is for you to resolve this one question: let's review this particular critical section of code to make sure it's what you really wanted." That feels like you're suddenly empowered to be more of a manager and delegator to these things, rather than just a partner in the loop.
- HSHarry Stebbings
You said three years sounds ridiculous, and a year would be much more realistic. I agree, and I get you when we look at the speed of scaling. Do we hit a plateau or an asymptote in product releases, in the speed of development? Because it feels so fast now, to our point earlier. Do we hit that plateau, or do we continue in this exponential progression?
- MKMike Krieger
It's a question I think a lot about. I started the year by looking at our product development process and at where we are "Claudified", where we're using Claude and where we're not. And you look at it and say, okay, Claude can be useful in taking an initial idea and creating a PRD out of it, Claude can obviously be useful on the coding side, and Claude can be useful for synthesizing a lot of the conversations people are having about a product and finding the thorny issues of disagreement. But driving alignment and actually figuring out what to build is still the hardest part, right? That is the one thing still best resolved by just getting together in a room and talking through the pros and cons, or going off and exploring it in Figma and coming back. And like any dynamic system, if you optimize one piece, all of a sudden something else becomes the blocker, the critical path. I think alignment, deciding what to build, solving real user problems, and figuring out a cohesive product strategy are still very hard, and the models are probably more than a year away from solving that. That is the constraint. It's why I'm really bullish on startups being able to explore the space, because, and I remember this from both my Instagram and Artifact days, when it's just a couple of you, alignment is a coffee conversation in an afternoon, rather than steering the ship of a large company that has commitments to customers and all of those things. That's still a very human problem that I think we're at least three years away from the models solving at that level
- 48:31 – 52:25
Balancing API and Consumer Products
- MKMike Krieger
of abstraction.
- HSHarry Stebbings
Final one I just have to ask before we do a quick-fire. We mentioned some end products there and building them. When you think about building end products for consumers versus building the API side of the company, which is very significant, how do you think about the balance and the trade-offs between building an API business and building an end-user consumer business?
- MKMike Krieger
I think about what we get out of each, and about that trade-off. I think we learn a lot more quickly with first-party products. As a really specific example, with Claude Code, within a week of it being deployed internally we had found a way in which the model wasn't using one of the tools it has access to as well as it could have, and that made its way directly into 3.7 Sonnet. That's a way in which internal dogfooding of a first-party tool directly led to a model improvement in the next generation. There are a few other places where we've hit that building first-party products. It's much harder with a third-party product, right? They'll tell you if something's wrong, but it's a bit more arm's length, and even though we work really closely with some of those coding startups you mentioned, it's still not the same. So there's a lot of value in what we learn there. Then there's stickiness, and we've talked about brand and loyalty; I think it's easier to build a brand around a consumer product than around an API. The fact that we power a lot of these coding products is visible to people, it's often the default in the dropdown selector, and if you're in the know, you know. But not everybody does, and it's still not the thing they download and install and tell their folks about. But it's also a place where we've gotten tremendous distribution, and we're not going to invent every company. This way we get to play... it reminds me of my investing days, where you get to see a lot more, and there's more than one shot on goal. So from a resource allocation perspective, it's actually been a fairly even split.
I think we've, if anything, under-invested a bit in two things. One is just having a faster iteration speed on first-party products; that's my current obsession. And the second, on the API side, is how do we build abstractions beyond, you know, tokens in and tokens out? Every time we do that, we get great feedback from people. Whether that's helping the model plan and work agentically; having the model build knowledge graphs and repositories of how companies operate internally, if you're using the API to build more of an internal knowledge product; perfecting tool use; or understanding very large bits of context and having memory that transcends conversations. Those are problems I think are worth us solving on the API side, because we can take what we learned on the training side, map it directly to the API, and build good products around it. So that's how I think about those two. But it's a new problem. At Instagram it was easy: it was 95% product, 5% API, and that's all we really needed to do.
- HSHarry Stebbings
What can and will you do to increase product speed on the first-party consumer side?
- MKMike Krieger
I think there are two things. One is recognizing that we were running a larger-company playbook for what is actually still startup mode. Even if the company has good traction, the API business is doing really well, and people are using Claude.ai and upgrading to Claude Pro, it's still early days and it's still make-it-or-break-it, so we need to operate that way. That means getting the right people together sooner, faster, and ignoring organizational boundaries. We got too calcified, I think, into, "Well, this is on this team's plate versus this team's plate," and, "Oh, you can't get this done this quarter because it's not on this team's..." I get why organizations evolve that way, and some of it is natural, but we can't afford it right now. So it's been a lot more of: who are the right people? Let's get them together, clear away all the other distractions, and clear out my calendar so that I spend more of my time in product review and design review than in administration.
- 52:25 – 52:59
Is Europe Stronger or Weaker in a World of AI
- HSHarry Stebbings
DeepSeek showed the benefits of constraints. Do Western companies, with respect to you and OpenAI, have too much money?
- MKMike Krieger
I think it's, um, the way I would put it is, the adoption that we've gotten of our products is ahead of their actual, like, true product-market fit, because they are still the best ways of getting the models, and I don't think that's durable over time, so I- I think that's, like, not a thing to rest on. Um, and two, I just think we're under-serving people, 'cause I don't think we've gotten the right products yet. So, it's... I don't know. It's what I wake up stressed out about every morning, or inspired by, depending on the day. It's, like, I think we've got so much work to do on that side.
- 52:59 – 1:02:40
Quick-Fire Round
- HSHarry Stebbings
I- I love it. Listen, I want to do a quick fire round. So, I say a short statement, you give me your immediate thoughts. Does that sound okay?
- MKMike Krieger
That sounds great.
- HSHarry Stebbings
What's OpenAI done better than you on?
- MKMike Krieger
They've, um, moved faster at shipping V1s, even ahead of where the model is sometimes.
- HSHarry Stebbings
What've they done worse than you on?
- MKMike Krieger
Probably personality and having the features they build be cohesive.
- HSHarry Stebbings
Which alternate model provider do you most respect?
- MKMike Krieger
OpenAI. Um, I think that they've balanced first-party product development and- and an API that like, people, uh, people use at scale as well. And I think that they, um... Well, we had an Instagram principle that was, "Do the simple thing first," and I think they often do the simple thing first.
- HSHarry Stebbings
If you could rebuild the Anthropic product and stack from scratch, what would you do differently?
- MKMike Krieger
Oh, I love this question. Um-
- HSHarry Stebbings
I do too. It's a good one, isn't it? Yeah, I'm with you.
- MKMike Krieger
It's a really, really good one. Um, I think, uh, the things that we built that were actually very valuable last year are now... This is a long answer, rather than a quick fire, I'm sorry. Uh, they have some costs to the information architecture, which I know sounds like a very nerdy way of describing it. But basically, like, people should not have to think about, like, projects versus artifacts versus chats and how they all relate. And I think, like, tearing it all down and being like, "What actually matters is, do you have the right context in the right conversations? Do you feel like you can always know where to go next in the product?" And having Anthropic and Claude itself be a helpful sort of guide to what work is most important to do next is a different paradigm than, like, "I know to create a project and then..." And, like, if you get good at that, it's an amazing product, but it is a lot of steps along the way. Uh, so there was that on the- on the product side. I think that's the fundamental thing. On the stack, I mean, Claude AI and probably ChatGPT.com were, like, very much initially just built to be, uh, sort of showcases of the models, and not really built in a lot of ways to be the right, like, sort of foundation for a much more complex, multi-product sort of thing. And I think, uh, we have an active effort right now around tearing down some of that and rebuilding the core UX to just feel good. It doesn't feel great right now. It feels a little bit like it's been an evolution of a product that served a purpose at the time but is now being asked to do way more things, such that the incremental thing is now both harder to add and getting slow.
- HSHarry Stebbings
What have you changed your mind on in the last 12 months?
- MKMike Krieger
How much first-party stuff is important. I think, uh, I saw the growth in the API and thought, like, this is where we should just invest a lot more of our time. And I think that there's... You'll, you'll miss out and not have enough of a, a durable moat if you're not, if you're not equally investing, um, or maybe even investing more on the first-party side of things.
- HSHarry Stebbings
How much did it hurt you being late to that?
- MKMike Krieger
I think significantly. If you take the DeepSeek moment, right? Like, ideally the, the story of, "Oh, there's more than one, uh, sort of frontier or leading-edge AI product to be used," is a narrative that we should have captured. I think it hurt us there.
- HSHarry Stebbings
What's a major technical or product challenge on the horizon in AI that no one's talking about that you think is critical?
- MKMike Krieger
The models as they get more... Or really the headline, which is basically, like, discernment and privacy. So as the models get more capable, they'll also become more knowledgeable, right? You'll be in conversations with them about everything from something that might be quite intimate to something that's quite sensitive from a company perspective, or they'll have access to all of your, you know, particular company's things. And then everybody loves to talk about agent-to-agent interaction, right? The intersection of those two, not enough people think or talk about, I think, which is, do you trust your Mike agent or your Harry agent to be out in the world and not be jailbreakable, or reveal something that it knows that is, like, quite personal or sensitive, right? I think, like, my metaphor is my 5-year-old. It's great watching her with, uh, you know, somebody that she's just met, because she hasn't quite differentiated between, like, stuff that's secret and private to our family and stuff that is okay to talk about with, you know, a new friend or, you know, somebody at the checkout aisle. So that, uh, discernment is something people acquire over time. And I, I think for models, this is very underappreciated and probably under-researched as well from, like, a model-capabilities perspective, because models fundamentally want to be helpful. And that is not always what you want them to be, um, and there's a safety case for that. But then I think there's also a privacy and data-security case for it too.
- HSHarry Stebbings
Do you worry about your 5-year-old becoming more comfortable talking to models and agents than they are humans?
- MKMike Krieger
I've had so many conversations with Alex Wang about this, because he has this whole thing about how in the future most friends will be AI friends, and, um, you know, uh, I don't think he's wrong. Um, and I think there's, uh, there's ways in which, uh, that's already starting to be the case, with, you know, people having lots of online game experience, and some of those are NPCs, and you might just have, like, more of a comfortable sort of existence in there as well, even if you're not breaking through. So I do, I worry... She is so gregarious that, like, I'm not actually worried in her particular case, but, like, let's abstract to the broader sense. There is a lot you can learn, uh, from, you know, what it feels like. Like, here's the bull case. I was a fairly awkward, you know, teenager, and I probably could have benefited from some practice-mode, like, AI interactions around some of these things to build it up. And at the same time, that's, like, not the real... It doesn't feel like it's totally closing the loop around, like, the consequences of real interaction. Like, it's the difference between reading about what it's like to have your first, like, really hard argument with your high school girlfriend and then actually having it. And, like, when you're in that moment... It's like now the classic, like, the Chinese room experiment, where it's like you're... It's not the Chinese room experiment. It's a different, uh, thought experiment where, like, somebody's in, you know, a black-and-white room only reading about red, and then they go out of the room and they see red. And, like, is there something qualitatively different about that? Absolutely. And is there something different between talking to a model and engaging a model, even in emotional role play, and then having that same interaction with a real human? Like, absolutely.
And so it is a, uh, probably a helpful piece of future human interaction and absolutely insufficient as, like, the whole.
- HSHarry Stebbings
Does Europe become more or less relevant in an AI-driven decade?
- MKMike Krieger
I want them to do well because I, I love a lot of Europe, and, um, you know, I lived in Portugal growing up as well. Um, I saw a funny, maybe somewhat defeatist argument that if real-world experiences and human interaction become more valued, Europe becomes more valuable, itself as, like, perhaps the world capital of sensory, you know, uh, experiences. If that's all you're resting on, that, that feels a little, uh, limited as well. What I think will be really interesting from a Europe or European perspective is, what are the things... Like, the thing I really respect about Europe is it's often been the case that there are, um, things about the lifestyle or the society that they hold very, very strongly, that they then, not always elegantly, but at least attempt to enshrine in either, like, best practices or even laws. And so, even as we think about doing our product design and data privacy and, you know, selling to German users or German companies, there's a different set of questions that get asked that are often very helpful questions. And so maybe the, the bull case there is that those are actually questions that are relevant to everybody, and they will just be at the leading edge of asking some of those questions. I think from a labs perspective, it's a lot harder question to answer. I think there's maybe some combination of, like, access to compute. Maybe they move further up the value chain. And if it is the case that building applications on top of these models becomes a lot easier, and you can go from zero to one, and you can be more, um, uh, nimble than even these labs that are going to all have, like, you know, tens or hundreds of millions of users and have to move slowly at that scale, can innovation happen there? Probably, but it probably involves a different regulatory and startup ecosystem environment to really make that actually the case.
- HSHarry Stebbings
Final one. Dario has said that this could be the generation that lives to 150. I'm slightly, like, butchering and summarizing his quote, obviously. Uh, but, like, this could be the generation. I'm very optimistic. My mother has multiple sclerosis, so I hope we'll find cures for diseases like MS with AI. Do you agree with his optimism? And how do you think about AI increasing longevity and human lifespan?
- MKMike Krieger
I think the potential is huge. I think there's everything from... Today, where AI is helping is in, um, closing the loop on drug discovery and closing the loop on clinical trials, right? Novo Nordisk, uh, used to take, I think, something like 15 weeks to do their clinical trial reports, and now they use Claude and get it done in 20 minutes. And, like, that's a step change. Now, there is years of research that preceded that, so I'm not saying that we've cut years to weeks, you know, or years to minutes, but that's a point, you know, in the process that we can make faster, and that's, like, with the models today. Then you see, um, Arc, which is this, um, science and research institute that, um, Patrick Collison and some others ha- have started and funded. They're working on foundation models for cells, right? Where you have, all of a sudden, uh, a real cell model that you can run experiments on, and that kind of thing should also accelerate drug discovery, um, and, and, and experimentation there tremendously, 'cause all of a sudden you're, you're cutting the loop there. So I'm very optimistic. There's a lot of places where AI is, I think, under-utilized relative to its potential. And there's that quote that, like, the smartest minds of my generation were working on, like, serving more targeted ads. Maybe that was true at one point. I think a lot of them today are working on, how do you make models that are tremendously useful and valuable and intelligent across a lot of domains?
- HSHarry Stebbings
Mike, you've been fantastic. Thank you so much for letting me just completely unpack all of my questions on you without warning. Um, but you've been amazing.
- MKMike Krieger
My pleasure. Really fun to do this.
Episode duration: 1:02:41
Transcript of episode GqDZfcx1kRg