EVERY SPOKEN WORD
40 min read · 8,388 words
- 0:00 – 2:39
Introduction to Steve
- SKSteve Kaliski
At Stripe, we're landing about thirteen hundred PRs that have no human assistance besides review per week. A lot of where our work begins is it could be in a Google Doc as we're planning a new feature, or maybe a Jira ticket comes in, or we're talking about something in Slack. I can click an emoji, and then the Minion will sort of attempt to one-shot resolving that prompt using all the tools that are available at Stripe.
- CVClaire Vo
When you're in larger organizations, there's so much friction that can come between a good idea and getting it into the world.
- SKSteve Kaliski
Not only can I have one of these, but I could have many, many of these running in parallel in isolated environments, making isolated changes all at the same time.
- CVClaire Vo
How are you getting all this code review done?
- SKSteve Kaliski
Whether the text has been written by Steve or the text has been written by Steve's robot, you still want that CI environment that's providing confidence that the code that's being changed is safe, and that as it rolls out, we're having blue/green deployments, so you can roll back, too. All that is super critical, independent of the nature of the authoring of it.
- CVClaire Vo
No matter how juiced these laptops are, you get three or four worktrees in, and, like, it starts to sound like an airplane taking off. It's no good. And so I do think on this multi-threading agentic engineering work, cloud environments and virtual environments are so important to unlock velocity. [upbeat music] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today we have Steve Kaliski, a software engineer at Stripe, and he's gonna show us how the Stripe team deploys a bunch of Minions to do their engineering work. We'll also watch an agent spend a little bit over five dollars to plan a birthday party, all in Claude Code. Let's get to it. This episode is brought to you by Optimizely. Most marketing teams aren't short on ideas, but what they are short on is time, and that's exactly what Optimizely Opal gives you back. With AI agents that handle real marketing workflows, you know, like creating content and checking compliance, generating experiment variations, personalizing user experiences, analyzing pages for GEO, even tasks like approvals and reporting. It's your AI agent orchestration platform for marketing and digital teams, plugging seamlessly into the tools you already use, handling the boring busy work, and keeping everything on brand. That leaves marketers with more time to do your actual job. See what Opal can automate for your team by signing up for a free enterprise agentic AI workshop with Optimizely. Find out more at optimizely.com/howiai. Attend live, and you'll get a free pair of Ray-Ban
- 2:39 – 4:42
Stripe’s minions and their effect on Stripe as a whole
- CVClaire Vo
Meta AI glasses. Steve, I'm so excited to have you on How I AI because I saw the Stripe Minions on the timeline, and one, exceptional branding, don't sue us, and two, I just love the idea that you and your colleagues in the team at Stripe have created not just one agent, but minions all across the company that can help with development work. And I'm so excited for you to show us how that helps you in your day-to-day here. So welcome to How I AI.
- SKSteve Kaliski
Thank you for having me.
- CVClaire Vo
So tell me, what has been the effect that Minions have had on you personally at Stripe and at the Stripe team as a whole?
- SKSteve Kaliski
Sure. So, you know, for me personally, um, I, I think sort of anecdotally, I don't remember the last time I s- I started work in the text editor, right? So I do end up there often, um, but, you know, what I found is that, you know, a lot of where our work begins is, you know, it, it could be in a, in a Google Doc as we're planning a new feature, or maybe a Jira ticket comes in, or we're talking about something in Slack. And those are sort of, like, the more natural entry points to starting work, right? And then you end up in a text editor when it's time to, you know, actually do the work or make the final tweak. Um, and it's just felt very natural. Um, and, and I think i-in particular, the sort of, like, activation energy of starting work feels a lot lower, right? So if, you know, you're in a Slack thread and maybe there's a, a piece of user feedback and it's something simple like a, you know, we have to update the docs, or, or maybe it's something more consequential and we just wanna build a prototype, I can click an emoji and, like, the work begins. And often the work finishes, too. You know, we... at, at Stripe we're, uh, landing about thirteen hundred PRs that are, uh, have no human assistance besides review per week. Um, but at the minimum, the activation energy of, like, starting to write code, seeing tests pass, maybe a test fails, occurs without me even, you know, participating. And then I can jump in and I can tweak and I can kind of like have the mom- that momentum sort of, uh, s-sort of like generative momentum, you know, that I c- I can hop in halfway
- 4:42 – 5:44
Why activation energy matters more than execution
- SKSteve Kaliski
through.
- CVClaire Vo
What I think is magical about this, and I, I won't call Stripe a big company, but you do have a decent amount of employees and very, very large business, is I love that concept of activation energy going lower because when you're in larger organizations, there's so much friction that can come between a good idea and getting it into the world. And it's not malintent, right? It's-
- SKSteve Kaliski
Yeah.
- CVClaire Vo
Nobody is like, "Oh man, I really wanna slow this process down."
- SKSteve Kaliski
Yeah.
- CVClaire Vo
It's either, you know, functional, "I don't have access to a technical area of expertise to actually get from here to there." It's operational, "I don't know how to organize people and communicate effectively to get the next step done."
- SKSteve Kaliski
Yeah.
- CVClaire Vo
Or it's just kind of like people get siloed in their day-to-day and don't think of new ways to get work done. And one of the things that has been so revelatory about AI for me personally is, like, all that just kind of goes to zero because coordination cost can go down, execution cost can go down, communication cost can go down. You just get closer to the work, which I think is the fun part we all really
- 5:44 – 6:52
What is a minion? The technical architecture
- CVClaire Vo
care about. So show me how you actually activate a Minion and, you know, we skipped this a little bit, what a Minion is.
- SKSteve Kaliski
The quick spiel of a Minion. Um, when I, as an engineer, sort of in pre-AI time, uh, you know, want to make a modification to Stripe, um, well, Stripe is a huge code base with tons of services, um, it, it can't run on my computer alone. So Stripe already has a long history of, of, you know, investing in great developer tooling. Having hosted development environments that I can spin up, that, you know, have all the code already there and services running, and I can SSH in and, and make modifications, um, and we have a, a ton of great CI tooling around that. So that's the context, um, you know, we have all that. The idea with the minion is that I can provision one of those environments, seed it with a prompt, and then the minion will sort of, you know, attempt to one-shot resolving that prompt using all the tools that are available at Stripe. Right? So, uh, all of our internal documentation, our internal CI, our, um, you know, test data, so on and so forth, and it will loop through that in an attempt to, you know, solve
- 6:52 – 9:04
Demo: Activating a minion from Slack with an emoji
- SKSteve Kaliski
that prompt. So, um, let's go ahead and, and jump in, see what sort of a prototypical experience might look like. So I'm in a Slack channel. Uh, it's called Steve Kaliski Robots, uh, dash Claire. I actually have a Steve Kaliski Robots channel that has seventy-six humans in it, um, but I do have every, uh... It, it started off as just me and my robots, and now there's some sort of, you know, like, audience o-observing. Um, but let's imagine that, you know, maybe I'm, I, uh, am thinking of a new feature idea, or I wanna improve documentation that we have. So we have a, a launch coming up soon, um, and I, I wanna sort of embellish the documentation. So I'll say, um, "I have this cool idea for docs.stripe.com/payment/machine." This is our new machine-to-machine payment work, which we'll look at later in our call. Um, and I want to, you know, make sure the landing page really s-sticks and gives a good code example of how to get started quickly. Right. So maybe someone posted a message like that, or it came in through a ticket or, or whatever the origin may be. All I have to do now is, you know, add a reaction, which is create minion pay server. So this is a particular, uh, repository within Stripe. We get the one-sec cooking from the dev box agent, and then we get a reply in here saying, "Your minion for pay server," it's the repository, "for a new branch that's created, landing page code example, has been created," and it's gonna kick off our docs service so I can eventually preview it. Now, I'm gonna click follow along. So right now what it's doing is it's provisioning that development environment I was talking about earlier. Right. So this is... this part isn't new. It is excellent, but it is not new. And basically it's gonna spin up a, uh, instance in the cloud. It's gonna apply all the configuration, um, that's required for both me and the agent to do coding within Stripe. So this will just take a few seconds. 
It's gonna check out that repository f-with a new branch, configure the local database, apply my Git config. It's gonna set up a VS Code server so I could connect to it just through the web or, or locally. Install some extensions.
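A minimal sketch of the flow described so far: provision an environment, seed it with a prompt, and loop until the checks pass. Everything here is a hypothetical stand-in rather than Stripe's tooling. `DevEnvironment`, `provision`, `run_minion`, the toy agent and check functions, and the spellings of the repository and branch names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DevEnvironment:
    """Hypothetical stand-in for a hosted development environment."""
    repo: str
    branch: str
    setup_log: list = field(default_factory=list)


def provision(repo: str, branch: str) -> DevEnvironment:
    """Record the setup steps named in the demo, in the order described."""
    env = DevEnvironment(repo=repo, branch=branch)
    env.setup_log.extend([
        f"check out {repo} on new branch {branch}",
        "configure local database",
        "apply git config",
        "start VS Code server",
        "install extensions",
    ])
    return env


def run_minion(env, prompt, attempt_change, run_checks, max_iterations=5):
    """Seed the environment with one prompt, then loop until checks pass."""
    feedback = ""
    for iteration in range(1, max_iterations + 1):
        change = attempt_change(prompt, feedback)  # agent edits the code
        passed, feedback = run_checks(change)      # stand-in for tests and CI
        if passed:
            return {"status": "ready_for_review", "branch": env.branch,
                    "iterations": iteration, "change": change}
    # Give up after max_iterations and hand the branch to a human.
    return {"status": "needs_human", "branch": env.branch,
            "iterations": max_iterations, "change": None}


# Toy stand-ins: the first attempt fails a check, the second one passes.
def fake_agent(prompt: str, feedback: str) -> str:
    return "fixed change" if feedback else "first draft"


def fake_checks(change: str):
    if change == "fixed change":
        return True, ""
    return False, "test_landing_page failed"


# Repository and branch spellings are assumed, not taken from Stripe.
env = provision("pay-server", "landing-page-code-example")
result = run_minion(env, "Improve the landing page code example",
                    fake_agent, fake_checks)
```

The loop hands failed check output back to the agent as feedback, which is one plausible reading of "it will loop through that in an attempt to solve that prompt." The iteration cap is an added assumption so the sketch always terminates.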
- 9:04 – 11:22
Why good developer experience benefits both humans and agents
- SKSteve Kaliski
So what's really, like, great about minions is, you know, obviously there's the, the agent loop that's, you know, uh, making the code modifications, but it's built on top of, like, a ton of incredible work that our developer productivity's done around just making it easy to get, like, a, a perfectly operating Stripe development environment for coding, which means that, you know, not only can I have one of these, but I could have, you know, many, many of these running in parallel in isolated, you know, environments, making isolated changes all at the same time. So in that little one-click emoji, I could have done that with a few messages at the same time, um, which is really great. So-
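The parallelism point can be sketched with Python's `asyncio`. This is only an illustration of fanning out several prompts so that each one works in its own private workspace. The function names, the workspace dictionary, and the file name are hypothetical, and the remote work is simulated with a zero-length sleep.

```python
import asyncio


async def run_isolated(task_id: int, prompt: str) -> dict:
    """Run one simulated agent in its own workspace (nothing is shared)."""
    workspace = {"id": task_id, "files": {}}  # private to this task
    await asyncio.sleep(0)  # yield control, standing in for remote work
    workspace["files"]["CHANGE.md"] = f"change for: {prompt}"
    return workspace


async def fan_out(prompts):
    """Start every prompt at once and wait for all of them to finish."""
    tasks = [run_isolated(i, p) for i, p in enumerate(prompts)]
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*tasks)


results = asyncio.run(fan_out(["update docs", "add API field", "fix flaky test"]))
```

Because each task builds and returns its own workspace, one agent's changes cannot leak into another's, which is the property the isolated cloud environments provide in the demo.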
- CVClaire Vo
Yeah. One thing I wanna call out here is we had my friend Zach from LaunchDarkly on, and-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... one thing he said was, "Look, what's good for the developer is good for the agent." So-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... there's this virtuous loop of if you have or do invest in developer experience for your human engineers, your agents will benefit off of that. And in turn, if you invest in developer experience or agent experience for your agent engineers, your development team benefits from that. And so I always tell people, you know, engineering team, you always ask, like, "Can we just give a little bit more time on the roadmap to DX? Like, pretty please, can we invest here?" And I think if you attach it to an AI initiative, that's, like, the secret way to get some of that, that good stuff done.
- SKSteve Kaliski
Oh, oh, totally. Yeah, I mean, imagine you're, you know... Some code bases are small, but, you know, Stripe's is huge. You know, imagine you show up day one and there's, there's no documentation and there's no tools and there's, and they just say, "Good luck." Like, anyone would have trouble, and even if you threw the agent at it, it's v-very likely that the, the context window would be blown by the whole code base. Just scanning through to understand all the intricacies would be, like, impossible or extremely expensive. So, you know, if there's a very blessed path for ninety percent of the common activities in being a, you know, engineer at Stripe, that makes it, you know, that makes the propensity that the agent succeeds really high too, right? So, you know, imagine we wanted to make an API change, which we do, you know, hundreds or, or thousands of times a year. We have really good documentation on, you know, how to add a new field or a new method or a new resource that the minion would read and would execute against, and then, you know, the, the propensity that it would one shot is very high. So good docs for developers are, uh, equally
- 11:22 – 13:42
Walking through the agent loop and system prompts
- SKSteve Kaliski
important for the agent, to your point. So we've now transitioned from, you know, booting up the development environment to now we're in the, the first agent run. So we have that prompt that I posted in Slack here, and now what it's gonna do is boot up an instance of Goose that's basically the harness that's, you know, gonna run through all this.
- CVClaire Vo
We did have an episode with the Block team about Goose, this, this open source argent, um, agent harness that got set up. And I wanna call out one thing for folks that are not watching and are listening, which is I love your system prompt, so sophisticated. It says, "Implement this task completely," colon, and then just whatever you put-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... put in in quite loose-
- SKSteve Kaliski
Just no mistakes.
- CVClaire Vo
No mistakes.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
You forgot no mistakes. But, you know, I, I think people really think they have to over-architect their initial prompt, and I think if you have a great harness, it can go a long way to extracting out, um, a successful outcome from a pretty loose prompt.
- SKSteve Kaliski
Totally. And, and we w- A lot of this is an experiment in some way, right? You know, as new models come out and, you know, w-we build new tools, like, th-there is this sort of dynamic nature to it. And we, we've built a lot of interesting, you know, bots that help write the prompt, right? So, you know, maybe first it will do the task of searching through the code base or looking at all the pull requests or Google Docs or whatever it may be. Yeah, I, I think now at Stripe, m-most things that could have an MCP server have an MCP server, so we're able to interact with a lot of the internal data we have. And then it can make a prompt, uh, that I could then paste in here or I could assign, um, to, to, to the agent. So that's sort of i- you know, part of why I wanted that public channel we were looking at is like, you know, where can I see if that we don't pair program anymore, but we, you know, pair prompt, right? And that, that activity could be with other engineers or other data sources or, or other agents too, right? To figure out if we can, you know, properly explain, uh, to the agent, you know, how to do it correctly. In any case, you know, what it's doing now is it's taking the link I gave it, which is to public documentation. It's gonna search through the code base and, and use, uh, some of our, you know, uh, code searching tools to, to locate where that change in particular should go. It's gonna execute a v- a whole sequence of tools, and over time, as it figures out where in the code base it should work, what the modification should be, it'll ultimately commit those and make those available, um, in a pull request that, that me and, uh, my fellow colleagues can
- 13:42 – 16:00
Why Stripe chose Goose as their agent harness
- SKSteve Kaliski
review.
- CVClaire Vo
Yeah. I have a couple qu- cou-couple questions on this because we've seen-
- SKSteve Kaliski
Sure
- CVClaire Vo
... a few examples of folks building their own cloud agents, and, and-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... kind of... And I'm curious, you know, why, why Goose, uh, you know, versus doing something on your own or doing sort of a c- a more commercial solution. I'm curious if there was an internal discussion or how this or did this happen organically 'cause it worked for one engineer. Curious how you kind of seeded the idea of, of Minions on top of your development tools.
- SKSteve Kaliski
Yeah, sure. So, you know, we also make Claude Code and, and Cursor and tools like that widely available to engineers at Stripe. So, you know, I think our general sentiment is like, we want to accelerate development, uh, so we can build new features for our users, and, uh, there are gonna be new models coming out, new tools, and we wanna build, proliferate those as much as we can. In the particular case of Minion, um, the... It's very, I don't wanna say very specific, but it's very specific to like the Stripe developer experience and the Stripe developer environment. And we have been, uh, experimenting with Goose early on, and I think in this partic-particular case, we'd forked it to make some modifications as well. And really what we were looking is like sort of a base harness and loop, um, to apply all of our own tools and software to. Um, so we, we spent a lot, we spent a lot of time on like making good tools available and making sure that the sort of routes that the Minions go through, you know, work closely with like the most common Stripe developer workflows. So it's, you know, the sort of like commercial versus, you know, custom things. Like, there are things that are very specific about Stripe's code base and being a developer at Stripe and the way we build things, um, that it, it was just sort of easier for us to, to build and deploy that. Um, but the commercial solutions are great, and we use those extensively, and e-even later on this demo, I can s-sort of show like, I can, you know, for example, I can pop into VS Code Web, where I could manually edit some of the code that's going on here as well. But I can also boot up Claude, and I can have sort of the typical Claude experience with, um, all the Stripe, um, MCP tool, internal Stripe MCP tools available as well. So, you know, there's, there's no singular tool to rule them all, but I think the, like, overall end-to-end development story at Stripe, uh, is built on Minions. 
So you can see I'm in that dev box in Claude now. Yeah.
- 16:00 – 17:15
The role of Stripe’s developer productivity team
- CVClaire Vo
Cool. I have one other question and then, and then an observation I wanna make sure that, that the listeners don't miss. So my first question is-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... you know, Stripe is a very well-resourced, I would say, engineering organization, so I'm presuming you have a team dedicated to working on not just your dev tools, but as well as, as Minions and managing that as an internal product itself. Has that team been sort of built, um, as a standalone team that's focused specifically on internal developer experience? Is that how it works?
- SKSteve Kaliski
Yeah. We, we've had a developer productivity team for as long as I can remember. I think I've been, I've been at Stripe six and a half years now. And, you know, that team's focused on all the tools that I engage with, um, and, and making them more useful, right? So that's all the way from, you know, how, how we interact with, with, uh, Git and version management to our text editors and our configurations there to our, uh, development environment and, and how that whole story pieces together. And, you know, we, we, you know, j-just as, you know, as a product engineer on Stripe, I, I, I care deeply about our external users and, and them being successful at Stripe. That team cares e-equivalently about engineers at Stripe being successful and, and being able to build things quickly, and I think that's been, uh, even more accelerated by, by AI in the last couple years.
- 17:15 – 21:14
Why cloud environments unlock multi-threaded AI engineering
- CVClaire Vo
Yeah. And then one other observation I wanna make, 'cause I think you glossed over it a little bit at the beginning, but it's so important for folks that really wanna go HAM on coding with AI-
- SKSteve Kaliski
Sure
- CVClaire Vo
... which is, look, all of us engineers have a MacBook Pro that weighs eight million pounds.
- SKSteve Kaliski
Yeah. Sure.
- CVClaire Vo
That can, can do some damage. Mine, for anybody who wants to know, its nickname is Big Boy.
- SKSteve Kaliski
Oh, wow.
- CVClaire Vo
So whenever I need my kids-
- SKSteve Kaliski
Okay
- CVClaire Vo
... to get my partu- my coding laptop, I say, "Can you bring me Big Boy?" Um, 'cause it's f- I call it San Francisco rucking when I carry two of them in my backpack.
- SKSteve Kaliski
Oh, my God.
- CVClaire Vo
But, you know, no matter how juiced these, these laptops are, you get like three or four worktrees in all running, and like it starts to sound like an airplane taking off.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
It's no good.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
And so I do think on this sort of like multi-threading agentic engineering work, cloud environments and virtual environments are so important to unlock velocity, and that's one place where I haven't seen enough large engineering teams invest in those environments to really unleash the power of, um, either AI-assisted coding for their software engineers or agents in general. So if there are any CTOs, VPs of engineering listening- If you were to invest in something to really unlock growth in the next year, getting that situation locked up would be really good. Because again, I hear so many people being like, "Oh, I can Claude Code everything. I can spin... You know, I can Codex anything. I can spin up all these worktrees, I'm fine." And I'm like, "Are you running all these local... Like, what are you, what are you doing?" [chuckles] And so that's one thing I just want people to not miss, is the limitations of your actual machine on how multi-threaded you could be, especially in a complex code base like Stripe's.
- SKSteve Kaliski
Huh, totally. And, you know, I have Slack on my phone, right? So I can even kick off one of these minions on the way to work, right, as I'm sort of going through Slack on the subway, and then, you know, by the time I'm there, I, I, I can jump in halfway through. And I think, like, the, maybe, like, the hyperbolic thing here is, like, imagine if all engineers at a company could only, like, work on... Ha- didn't have Git, we all had to, like, coordinate working on the one code base together. Like, that, that would be crazy.
- CVClaire Vo
Mm-hmm.
- SKSteve Kaliski
Um, and, you know, the equivalency here is, like, i- imagine if I'm bounded by, you know, my agents are bounded by just what's available and can work on my computer.
- CVClaire Vo
Yeah.
- SKSteve Kaliski
Um, if the, the, the, the, the 10X thing to do is, um, you know, be able to have 10 of them run in parallel, but also not be contingent on my, like... It's like everyone's buying a Mac, a Mac Mini, right, so it doesn't fall asleep, right? Uh, it's like there's a, there's a whole business around just the computer not falling asleep.
- CVClaire Vo
I legitimately... First of all, I have, like, four Mac Minis upstairs, and one of them is just basically a laptop that doesn't close. Like, I use it-
- SKSteve Kaliski
Sure
- CVClaire Vo
... as a laptop that does not shut. Um, and it's really unlocked my, my velocity. So, okay, we thank you for going on this side quest about-
- SKSteve Kaliski
Of course
- CVClaire Vo
... virtual environments and local hosts-
- SKSteve Kaliski
Of course
- CVClaire Vo
... and all those things. I'm a founder, so I know most people don't start companies because they love running payroll or managing compliance. But somewhere between hiring your first employee and raising your next round, you end up in the weeds with HR, IT, and all that other stuff. That's what Rippling was built to solve. Rippling is a unified platform that lets startups run HR, payroll, IT, and finance in one system from day one. The Rippling startup stack replaces disconnected tools that don't sync with a fully connected platform. Over 15,000 startups, including Cursor, Clay, and Sierra, trust Rippling to scale fast without adding additional ops and HR headcount, so founders like you can keep building. Right now, venture-backed startups can get six months of Rippling startup stack for free. Head to rippling.com/howiai and sign up today. That's R-I-P-P-L-I-N-G.com/howiai to sign up for six months free today. Focus on what you're building and leave the
- 21:14 – 22:04
One-shot prompting: from Slack to shipped PR
- CVClaire Vo
rest to Rippling. Okay, so you are now running this. You're going to... I- it's, it's... You said one shot at the beginning. Really, you're trying to take one, one prompt-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... and not a single reply gets you, gets you what you want, but it goes into the harness, it goes through its own loop, hits the tools it needs, and ultimately, you as the end user get one response back, which is, "Here's the successful implementation."
- SKSteve Kaliski
E- exactly right. So we can already see that it's identifying the relevant files. It's keeping track of its own to-dos. Like, that- that's something that we've codified-
- CVClaire Vo
Yeah
- SKSteve Kaliski
... in it to, to focus on. It's making changes. It's, you know, preparing the, the commit, and so on and so forth. And, and ultimately, and sort of like taking it out of the oven, we'll see a response at the end of just, like, it finished. You know, like, you can go ahead and look at the pull request, and the, the r- the sort of normal
- 22:04 – 23:44
How Stripe handles code review for 1,300 AI-written PRs weekly
- SKSteve Kaliski
human review part continues. So yeah.
- CVClaire Vo
Let's talk about that really quickly. You said 1,300 code or agent-initiated PRs per week, something like that, and then humans are involved in code review. How are you getting all this code review done?
- SKSteve Kaliski
Well, y- you can make the argument that, you know, if I'm spending less time actively writing code, I can, you know, s- uh, you know, recenter my time on reviewing the code that's being written or working with users and so on and so forth. So I, I think that's a big part of it. I, I think the, the other side of it, like, it comes back to that CI environment, right? So having really good test coverage-
- CVClaire Vo
Mm-hmm
- SKSteve Kaliski
... having synthetics that run to, you know, simulate end-to-end interactions with your product, um, those h- all help inspire confidence in the code you're reviewing, right? So absent those, like, it'd be really difficult to look at code, es- especially in a huge code base, and have high confidence that it works. So, you know, again, whether the, the text has been written by Steve or the text has been written by Steve's robot, you still want that CI environment that's, um, you know, providing confidence that the, the code that's being changed is safe, and that as it rolls out, y- you know, you're having sort of blue/green deployment, so you can roll back too. Like, all that is super critical, um, independent of, of, of the au- you know, the nature of the authoring of it. I, I do, I do believe, like, if coding becomes easier, and coding historically has been the bottleneck in product development, it's just gonna shift to other areas, right? So, um, if, like, coding in effect becomes free, the review's gonna be really challenging, right? Or getting enough ideas in the first place could be-
- CVClaire Vo
Yeah
- SKSteve Kaliski
... a big problem, or distributing them, right? So I think, uh, the attention is just gonna, you know, uh, move around to,
- 23:44 – 24:53
Non-engineers using minions across the company
- SKSteve Kaliski
to other areas.
- CVClaire Vo
Great. And then, um, one other question before we go on to your next workflow, which I am so excited about-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... spoiler alert, is, are more than engineers using Minions? Are you seeing product managers, designers come in? How is this going across the company and across functions?
- SKSteve Kaliski
Yeah, I, I think, you know, part of why I like the Slack example is the entire company's in Slack, right? Um, and-
- CVClaire Vo
Yeah
- SKSteve Kaliski
... you know, to that point of activation energy, you know, e- even if, like, you had the text editor on your computer, um, and I, and I gave you the docs and, and, and whatever it may be, you know, to, to s- someone who's not an engineer, it could be really challenging or intimidating or, or whatever it may be. And, you know, for w- whether you just want, like, a proof of concept or you're gonna make a docs change or, or whatever it may be, like, you can- You can probably write out in plain text the, the thing you want to occur, right? You, you might be writing the product brief, or you might be, uh, giving design feedback. Like, you're, you're in effect just writing a prompt at some point. Um, so being able to just click an emoji or, or, or, you know, tag the, the robot to send a minion. We're, we're starting to see, um, more non-engineer usage there. Yeah.
- CVClaire Vo
Amazing. Okay.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
So
- 24:53 – 32:15
Demo: Planning a birthday party with Claude and machine payments
- CVClaire Vo
let's go to our next workflow, which I am-
- SKSteve Kaliski
Yes
- CVClaire Vo
... psyched. As somebody with a stack of Mac Minis downstairs, I am excited about.
- SKSteve Kaliski
I... So, you know, at, at Stripe we're, you know, we're thinking about AI in a f- few ways, right? So we're... The demo we just showed is how we're thinking about using AI internally to accelerate our product development and, and engineering. The second way is, you know, thinking about how we're supporting all these businesses that are, um, you know, leveraging AI in their own products, and how we can support their business models. And, you know, that's with things like usage-based billing. And, you know, we, we just announced our, our beta of our, um, LLM, uh, token billing product. Um, but there's a third side, which is, like, this sort of idea of agents as economic actors, or a- agents that can spend money as, you know, as part of their a- a- attempt to solve a prompt. And before we jump into the demo, like, just the thing I'll illustrate is, like, you know, often you give a prompt to Claude or, or some other agent, and it will use its own model to generate text and response, right? Or maybe it will do a web search, or call an MCP tool, or whatever it may be, to gather information or to, to effect change as part of that response. And, you know, of course, there's the shopping cases, but, uh, w- we imagine a future where, like, third-party services are gonna want to sell into these kinds of experiences, and that those interactions will cost money. Um, so we have to equip, uh, our agents with the capacity to spend, so that they can not only consume tokens, but so that they can also pay services as part of achieving, uh, the prompt. So I'm gonna give a, uh, an example. Uh, Jen, who's a product manager I work with, is awesome. I think her birthday is coming up soon. If not, the demo is, it's her birthday party, and we're gonna ask Claude, uh, to help plan it. And along the way, it's gonna, uh, interact with a bunch of different, uh, real third-party services that are really gonna accept money over a, a, a payment protocol.
We're calling them machine payment protocol, which we've, uh, co-designed with Tempo, and, um, we'll see some real transactions along the way. So I have a, you know, sort of pre-baked, uh, prompt we'll paste in just to skip that part, and I will go ahead and give it. So I told it to research Jen Lee, who's my product manager.
- CVClaire Vo
Mm-hmm.
- SKSteve Kaliski
Figure out what would be a c- a good idea for her birthday, find a place to have the birthday, send invites to the birthday, and then, you know, we've burned all these tokens along the way, so we should probably donate to Stripe Climate at the end to make up for all the energy consumption. So right now, we're just still getting the environment set up, just setting up our, uh, ability to pay, uh, Tempo. The first thing we're gonna do, we can see right here, is that we've actually paid Browserbase to create a new browser session. So I didn't sign up for Browserbase beforehand. I'm just paying for this one session. Um, it's gonna do that. I gave it her website somewhere up here, um, so it's gonna go ahead and spin up that environment. So you can see right now it's writing some Playwright code locally, which will connect to that Browserbase session. It got to her website, right? So Jen likes, uh, she... I think she bakes and she cooks. So it actually found out by running that browser session that she's a matcha-obsessed baker working on a cookbook. We're gonna go ahead and turn off that browser session, and we can see the net cost, uh, is just a fraction of a cent. And again, like, we, like, really paid that business just now. The next thing it's gonna do is, using its knowledge of Jen and her interest in matcha, it's gonna, uh, search online using, uh, Parallel AI to find relevant venues in New York where we could host this party, you know, something that matches, matches her matcha interest.
- CVClaire Vo
I'm gonna just do, again, a side quest, a callback to-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... um, our episode with Andrew and Nabil, who, who used AI to set up a, um, tabletop gaming business they were building in the East Bay. And my friend texted me, and she said, "This is the most San Francisco thing I've ever seen."
- SKSteve Kaliski
[laughs]
- CVClaire Vo
Which is-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... two dudes that need AI to help plan their game night.
- SKSteve Kaliski
I know. I know.
- CVClaire Vo
And I was looking up at your original, original prompt, and I was like, this is such an engineer's prompt for how to plan a birthday party.
- SKSteve Kaliski
I know, right?
- CVClaire Vo
It's like source env and then insert Jen's name.
- SKSteve Kaliski
Yeah. You know you're doing something wrong if I have to load environmental variables to celebrate someone's birthday.
- CVClaire Vo
Exact- It's just, like, so funny.
- SKSteve Kaliski
Yeah. So it found this, uh, matcha cafe in New York on Bowery that's, uh, it thinks is a perfect fit for her matcha interest, which is great. Now we should, uh, you know, send an invite in the mail. Um, you know, we're, we're taking it offline. Um, so now we're interacting with this service called PostalForm. Uh, PostalForm will take a, a PDF and, uh, s- actually send it in the mail. So again, right now what we're doing is we're, the LLM is writing code locally to generate a PDF image of the invite. So there's this sort of interesting balance of, like, what can the LLM do itself, right, with its own tools in my local machine, versus what it needs a third-party, uh, service for? Like, obviously the robot can't send mail, and I think if the robot could send mail, that would be, uh, kind of concerning. So, you know, now it's trying to fix a couple things with the PDF. I'm sure the invite, it looks... It'll be very interesting to see what the invite looks like.
- CVClaire Vo
Machine... It looks machine-generated?
- SKSteve Kaliski
Um, it'll look... Yeah. It's just a bunch of binary. No one's gonna come to the party.
- CVClaire Vo
How do you... I, I mean, I, I know this is a, a little bit of a demo you're giving us here, but-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... I think so many of these- Even consumer, you know, facing products, like, I've never heard of PostalForm, it sounds amazing.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
Where it solves like a very, you know, individual user problem of like how do I get-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... mail out the door. So many of them are gonna be interacting with agents and like the API as, as the interface. And you and I were talking about that a little bit before the show, and you were saying you were getting user feedback recently that sort of spoke to that.
- SKSteve Kaliski
Yeah, we've been talking to, you know, b- b- you know, I think maybe including PostalForm. We, we've been talking to a lot of users as we've been integrating this machine payment stuff, and, you know, it's very, very normal for Stripe to ask for feedback, and, you know, typically they go, "Oh, I'll get back to you and write up some notes." And I would get these like in, in 30 seconds I'd get two pages back, and the, the engineer over there had used, you know, Claude or Codex to, you know, read the Stripe docs and implement the feature, and then figured since like they hadn't really written it themselves, that they'd ask Claude or Codex to send feedback back to me. And like it, it, it happened once, I thought, okay, like that's, that's funny. And it happened like four or five times that week, and it was just extremely jarring. And, and it, it, it added this sort of physicality to who the new user is here, right? That like the... we'd have to hear from the agents directly. Um, all right, we're just gonna check in quickly. We, we sent it in the mail, and then, you know, we, we burned, we burned some tokens along the way, so we actually made a dollar and 65 cent, uh, donation or a contribution to Stripe Climate to erase, uh, 4.4 kilograms of carbon based off of our 70K token usage. Um, and you can kind of see here an, a, a agent receipt of, uh, the services it interacted with and, and the cost of each. So at some point I'm gonna get an invite
- 32:15 – 35:08
Quick recap
- SKSteve Kaliski
to a party in the mail.
- CVClaire Vo
I want to just recap this for folks that are not watching.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
So we started with a prompt in Claude Code that said, "Plan my friend Jen a birthday party. This is what we know about her."
- SKSteve Kaliski
Yeah.
- CVClaire Vo
It, it pre-seeded... There was some like movie magic-
- SKSteve Kaliski
A little, yeah
- CVClaire Vo
... here where it pre-seeded-
- SKSteve Kaliski
A little bit, yeah
- CVClaire Vo
... here are some tools I know can take agent payments, um, that might be useful in the pursuit of this.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
And instead of a human having to go into those tools, log in, drop a credit card, buy a plan, there was a machine-to-machine transaction that happened that gave micro access to the to- the tool for the capacity the agent needed to do-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... the job at hand, and we see it use Browserbase and Parallel and PostalForm, and it, it issued those char- those, those, um, payments programmatically, accessed just what it needed, did a little offset, a Stripe Climate purchase, and then got your party planned. And what I like about this is, what's really interesting about this particular example is it makes it very clear the economics of doing something agentically. I like this little... You, you know, we got a little Stripe Climate shout-out here.
- SKSteve Kaliski
Of course.
- CVClaire Vo
But it also just calls out, like this actually does cost you in tokens whether or not your agent is doing outside transactions. So we're already operating in an economic framework, right?
- SKSteve Kaliski
Yeah. I think I'm on a Stripe plan here, but i- you know, in general, like, you know, people have a subscription relationship to, you know, these providers and, and that costs money, and we get a certain number of tokens, and any prompt I give, even though I'm not like seeing the penny count move by, has a, a ultimate dollar cost to it, right? And, you know, maybe in the typical coding example and, you know, consuming tens of thousands, hundreds of millions of tokens, we've sort of justified the value of that, right? Because the, the code has business value in the sense of monetary value. But, like the, the sort of like token and the currency that backs it, like they feel closer than ever, and y- you know, whether I'm spending a penny or a dollar on a third-party service or I'm spending, uh, you know, tens or hundreds of thousands of tokens with an LLM, we're sort of doing a, a similar activity, right? Which is that w- we need intelligence or we need data or we need operations or we need a service to execute on that prompt and, you know, a- a- achieve some outcome. And I think it's like it... even just this view feels very provocative and it, it feels early, but I, I think it's gonna feel very natural over time to see the, the token and, and the dollar side by side. And, you know, for me it's like, uh, you know, I planned a birthday party for... I mean, it probably... I don't know if the, it's any good, but I planned a birthday party for $5.47. That doesn't seem too bad.
- CVClaire Vo
Again,
- 35:08 – 36:36
The future of ephemeral, API-first businesses for agents
- CVClaire Vo
we're, we're doing this episode in the year of our Claude, 2026. Like we're gonna show the terminal, [chuckles] the terminal example.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
And most people watching this, and again, How I AI is for everybody, super technical and not, they're gonna look at this and be like, "Okay, I'm... But yeah, like I'm not going to plan my birthday party in the terminal." But let's just pull that thread six months-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... in the future or 12 months in the future. There's gonna be a bunch of builders out there that are gonna wrap this in a much more consumer-friendly user, user experience, and then you're gonna be able to build such interesting products that can interact and transact in just a much more human way, which again can just solve problems in a different, in a different mindset.
- SKSteve Kaliski
Yeah, and I think it'll be really interesting to build a business where your primary consumer sort of wants an ephemeral interaction with you, and it doesn't necessarily require you having a dashboard or an admin panel or a landing page or, you know, all the, all the other typical things that are really useful, um, you know, when a human or a business is interacting with you. And instead you could focus on like just a hyper useful single API, um, and, and monetize that directly and, and, and make your, you know, audience primarily agents. I, I think a lot of just like really interesting businesses can emerge out of, out of that opportunity.
- CVClaire Vo
I, I completely agree, and then we're gonna have agents identify what those businesses are, build them, transact with other agent customers. Agents all
- 36:36 – 41:54
Lightning round and final thoughts
- CVClaire Vo
the way down. Well, Steve, this-
- SKSteve Kaliski
All the way down
- CVClaire Vo
Awesome. Just to recap for folks, we saw Minions and how to kick off, um, development work from Slack and the benefits of investing in developer experience. Again, VPs of engineering, just like carve off a, a DevX team and give it some love, and product managers get out of the way. You'll get more product at the end of the day [laughs] if you just, uh, give, give some time and effort towards developer experience. And then we got to saw- see these machine-to-machine payments, which I think by the time the episode is live, we should be able to maybe talk about or, or see. So fingers crossed-
- SKSteve Kaliski
Absolutely. Yeah
- CVClaire Vo
... um, this will be live by the time our episode goes live. And we showed you how to plan a... I guess, gotta zoom in, a matcha cheesecake birthday party-
- SKSteve Kaliski
Yes
- CVClaire Vo
... in New York City.
- SKSteve Kaliski
Jen Lee's matcha party, April 19th, apparently.
- CVClaire Vo
All things matcha.
- SKSteve Kaliski
Um, yeah, I guess I didn't pick the date, so the r- the robot has decided that will be a good birthday. So...
- CVClaire Vo
Saturday, April 19th-
- SKSteve Kaliski
Yeah
- CVClaire Vo
... 3:00 to 6:00 PM. Sounds perfect.
- SKSteve Kaliski
Yeah.
- CVClaire Vo
We planned a birthday party for $6-
- SKSteve Kaliski
Perfect
- CVClaire Vo
... carbon neutral.
- SKSteve Kaliski
[laughs]
- CVClaire Vo
Uh, Steve, this is awesome. Before I send you off, couple lightning round questions.
- SKSteve Kaliski
Sure.
- CVClaire Vo
One, you know, we showed kind of a, a, a contrived personal use case, but what are your personal workflows for AI?
- SKSteve Kaliski
The thing I've been really interested in is the sort of like disposability of software, and I, I have a, a four-month-old now and a, a almost two-and-a-half-year-old now, and the two-and-a-half-year-old keeps grabbing my phone to, like, try to change music. So I've, I've toyed around with, like, music apps that are extremely controlled to just six songs. I have no idea how to build iOS apps, but the robot does, so I, I've been toying around with, like, little, little engagements like that. And then I, I use, you know, the, all the AI apps sort of in the normal way, I, I guess, in addition.
- CVClaire Vo
Yeah. Well, if folks want to create an app like that, we just did an episode with Jesse Genet, who built a, like, minimalist YouTube for kids, where it can on-
- SKSteve Kaliski
Oh, cool
- CVClaire Vo
... like, her kids can only watch the videos that she pre-approves.
- SKSteve Kaliski
Oh, cool.
- CVClaire Vo
And you can-
- SKSteve Kaliski
Cool
- CVClaire Vo
... only swipe back and forth. You can't do any... Like, no other buttons.
- SKSteve Kaliski
Oh, okay.
Episode duration: 41:55
Transcript of episode o5Mi5SYSDnY