How to Build a Self-Improving Company with AI

In this recent batch talk, YC General Partner Tom Blomfield breaks down how to build a self-improving company using AI. He'll cover how to create a series of recursive, self-improving AI loops and explain why the founders who get this right will run companies that can improve while they sleep.

00:00 - Companies Are Roman Legions
00:54 - Copilots Are the Wrong Mental Model
01:55 - Extract the Domain Knowledge
02:24 - The Recursive Self-Improving Loop
04:12 - The Holy Shit Moment at YC
05:50 - Self-Optimizing Product and Support Loops
06:29 - Burn Tokens, Not Headcount
07:23 - Middle Management Is Over
08:05 - Make Everything Legible to AI
09:40 - Regenerating the YC User Manual
11:19 - Software Is Ephemeral, Context Is Valuable
12:18 - Where Humans Still Matter

Tom Blomfield, host

May 19, 2026 · 13m

EVERY SPOKEN WORD

0:00 – 0:54
Companies Are Roman Legions
1. Tom Blomfield
  This is based a little bit off a talk Diana gave. There's a video up over the weekend, which is super cool. Um, Jack Dorsey was tweeting some stuff like two or three weeks ago that I thought was super cool, and I've kind of, um, stolen a bunch of those ideas and shoved them into here. This talk is, like, pretty conceptual and high level about thinking about how to build companies. So the Roman legions were designed to project power over two continents or something, from Rome at the center to, like, these people on Hadrian's Wall up in Scotland. And the idea was, um, this nested hierarchies with consistent spans of control, and you had, like, named individual with spans of control to pass orders down and send information back up the hierarchy. And if you think about most companies today, they are organized like a Roman legion, where human beings are the conduit for information flowing up and
0:54 – 1:55
Copilots Are the Wrong Mental Model
1. Tom Blomfield
  down. And so Jack Dorsey's tweet, which I thought was great, was this, like, this underlying assumption that hierarchically organized companies are the, are the way that we should be organizing, like, our economic units of value. And I think AI basically breaks that. If you talked to people a year ago about how AI was useful, they talked about productivity, like copilots, making engineers 20% more productive, adding copilots to workflows, shipping more software. But I think that is actually a broken way of thinking about AI. That's like, he had a great blog post where basically you're just, like, taking the old way of working and adding, like, a more powerful engine onto it. And instead of that, I think you can reimagine, like, what a company is and how it acts. And so as Garry's talking, like, he, I genuinely believe, can produce more code than an entire engineering team. The thing that's really stuck with me is this idea of, like, extracting the domain knowledge from your company and defining it as a s- as, like, context or a set of skills or whatever you wanna call it.
1:55 – 2:24
Extract the Domain Knowledge
1. Tom Blomfield
  But, like, this idea that there's domain knowledge or business knowledge or there's, like, some know-how that's inside the heads of people and in Slack messages and in emails and in Notion, all of this, like, information together defines how your company works. And if you can make that legible, you suddenly can, can move from this hierarchical organization to a sort of intelligent AI-powered organization with AI-native software. AI isn't the some...
2:24 – 4:12
The Recursive Self-Improving Loop
1. Tom Blomfield
  It, it's not something you bolt onto the side of the company. It's not, like, a tool you give to engineers to make them more productive. But I think you can reimagine what a company is as a set of recursive self-improving AI loops. I think this is really, really, really important because when it gets there, I think the company starts to self-improve even when you're sleeping. So let me give you an example. Diana's talk talks about this as well. This AI loop, you start with, like, a sensor layer, which is like, that's a fancy word, but really it might be like emails from your customers. It might be support tickets, code changes, people canceling their subscription, um, product telemetry. It's like sensor data to get information from the outside world. And then a, a policy layer, decision layer, like rules about what you can do, what it has to ask a human permission for, what it must log. A tool layer, that's kinda Garry's skills and code. Like, the tool layer is Garry's code. It's basically deterministic APIs, things like query my database or look at my calendar. Um, a set of tools that the, the AI can call. A quality gate, like that might be evals, deterministic checks, safety filters, human review for high-risk stuff, and then a learning mechanism. It's like your system interacts with the real world, picks up where it doesn't work, and loops back into the top again. And if you can run every single step of that without human intervention or without, with minimal human intervention, your system gets better and better and better while you're sleeping. And I can give you actual examples of this that are live right now. We started with an agent that you can ask, and it, it has deterministic tools to query our database. Pretty simple, like when did I last have office hours with this company? Then it got a little bit smarter, which was like, for this company I'm doing office hours with right now, they need introductions for anyone in petrochemicals or
4:12 – 5:50
The Holy Shit Moment at YC
1. Tom Blomfield
  something. And it could query the database in different ways and use RAG and all sorts of stuff to, like, come up with five relevant founders for you to meet. But again, this is like, this is a sidekick, right? This is an agent. This is like the old... This is last year's version of how AI, how AI is making me better as a group partner. It's making me 20 or 30% more effective. The aha moment for me came when we put a monitoring agent on top of that, which looked at every single query every single YC employee was doing and saw when it worked and when it did not work. And when it did not work, it's like, "Oh, why not? What would've made this query work? Do we need different deterministic tools? Do we need to update the skills file? Do we need a different database view? Do we need a new index?" And this happen, this literally happens overnight now. Let's write the code, put in a merge request to the YC code base, have an agent review it, and merge it and deploy it, so when a human comes the next day to ask the same query, it will now succeed. For me, that was like the holy fucking shit, right? That's not just AI making you 20 or 30% more valuable. It is the AI going through this loop to figure out how to self-improve. And I think basically, if you can identify parts of your company that work like this and eliminate as mu- have the human in kind of a monitoring or supervisory capacity, you can just throw tokens at this problem, and your company will get better. And so other examples might be if you have product analytics, having an agent go through your product analytics to s- to figure out what part of your sales funnel is presenting the highest amount of friction, researching best practices, putting in place an A/B test, running it for a week, picking the best version, and deploying it. And just doing that again and again and again for your
5:50 – 6:29
Self-Optimizing Product and Support Loops
1. Tom Blomfield
  product. Just have a self-optimizing, like, product loop. Or you do it with customer service queries. You have customer suggestions coming in and in and in. You triage it with a kind of... You have to have an agent which is like your chief product officer and your chief technology officer who make kind of judgment calls about, "Okay, this is a suggestion we just don't wanna do. We'll discard it." But no, "This is a suggestion which is now in line with our roadmap. Um, we can do it overnight. Let's write the code. Let's deploy it. Let's ship it to the customer without a human being involved." So I think if you can think about each part of your company as a self-improving, like recursive AI loop, it becomes very, very different to this like hierarchically organized Roman legion or company. So what? So like if you want to do this, what
6:29 – 7:23
Burn Tokens, Not Headcount
1. Tom Blomfield
  are the implications? One is like burn tokens, not headcount. We are seeing companies get to demo day with about 5X more revenue per employee than they did 18 months ago, and I think that's going to continue to series A and series B. And so I think you're going to be constrained on token usage, not on headcount really, really soon. The blunt measure now is just like measuring everyone's token usage, which is obviously like dumb and gameable at the extreme, but directionally, I think is correct. We're in the phase of like what is possible right now, and so everyone should be experimenting to the max to figure out what we can even do with this crazy new intelligence we have. As soon as you turn it into a leaderboard and people get promoted or fired based on it, obviously it gets gamed. Obviously, that's dumb. But I think directionally figuring out who in your organization is token maxing, who is not, is like a good way to think about which employees you should be spending your time with.
7:23 – 8:05
Middle Management Is Over
1. Tom Blomfield
  I think middle management is done. I just don't think you need middle management for this coordination problem. I think AI should be doing it. And for me, there are two roles. Jack Dorsey has three. I actually don't like the third one, so I deleted it. But there are two roles that really, really matter for me. I think everyone just has to be an IC now, a builder, an operator, and I think crucially, having directly responsible individuals to get anything done, I think you need a named human, not a committee, not a group of people, just a single person. And I think you can build companies based on ICs effectively. I th- I think just middle management is, is over. So building this self-improving company, that's a dream. And by the way, I think like people are at the bleeding edge of this right now. I'd be interested to see where you all are, but it feels like people are like exploring the boundaries
8:05 – 9:40
Make Everything Legible to AI
1. Tom Blomfield
  here. I'm not sure anyone has a truly self-improving company in every function. I might be wrong. You might prove me wrong. What would I do? First of all, this is really, really important. I would make the entire organization legible to AI. What does that mean? It means you've got to record everything. Simplistically, all of our, um, partner emails now, if you email a YC partner, that email is in the YC database. Every Slack message, every DM, every office hour we've started recording for the last three or four months. Every single thing that happens, if it is recorded, it happened to the AI. If it did not get recorded, it is, it did not happen to your intelligence. You know what I mean? And so I was talking with some founders over here, um, just now, and we're having like really good conversations about their company. But, like every conversation I had, I was like, "Fuck, I need to be recording this conversation," because some guy wanted an introduction to... I can't even remember who the introduction was now. Who was that? I was talking to someone about, and I promised you an introduction. I said yes. And I said, "Email me afterwards," because I would f- I j- I'm going to forget this. I'm going to talk to 20 people. Yeah, so it needs to be my phone or a clip or, or smart glasses, or we deck out every room with like microphones. But basically, everything needs to be recorded so that it can be legible to the AI. And then as Garry talked about, like diarization, you cannot pump in 100,000 hours worth of recordings into a context window, so you have to diarize it. You have to basically aggregate it down, synthesize it into the important parts, and then give the AI breadcrumbs. So like, okay, so here's an example. Who's read the user manual, the YC user manual? Hopefully everyone in this room has at least opened the user manual at one point in time, right? Like,
9:40 – 11:19
Regenerating the YC User Manual
1. Tom Blomfield
  it's fine. It was written five to 10 years ago, most of it. It's kind of out of date. So Harj thought, uh, last weekend, since now we've got about 2,000 hours of recorded office hours from the last three months, why don't we regenerate the user manual? And so you can click, like you give it a set of instructions, you basically diarize it down, synthes- like categorize it into certain areas like fundraising, hiring, co-founder disputes, whatever, and then write me a new user manual. And by the end of the weekend, he had 150-page user manual, which is dramatically better than the existing user manual. And now we can also update it every single month. So our user manual becomes self-improving. Every new piece of advice we give, it's compared with the existing user manual and either incorporates it or thrown away. So the user manual becomes this up-to-date living brain of the advice we give to founders. And obviously it doesn't stop as a user manual. You then pump it in as context to an AI agent, and suddenly you can ask a super intelligent AI and get the combined wisdom of 16 YC partners in one, but only if it's legible. So you have to record everything. The second point is kind of the same, right? Like ev- if it creates an artifact that can self-improve, it's legible. If it doesn't, you throw it away. The third point then is that every function can generate... This used to say dashboards. It's not just dashboards, it's on-demand software. Codex 5.5 is now good enough. You can one-shot most simple inter- like m- most internal software dashboards, you can one-shot to a pretty high level of quality. I tried it over the weekend on a bunch of our stuff. It's just unreal. So all of your internal operations teams should be sitting on this layer of like kind of intelligence understanding and then creating their own dashboards and their own workflows,
11:19 – 12:18
Software Is Ephemeral, Context Is Valuable
1. Tom Blomfield
  and I would see that those as entirely disposable. I would very preciously store all the data. So as Garry said, he puts it all, all of his emails in Markdown, never throw anything away, but then treat these, the software as ephemeral. You can, you can generate it, you can regenerate it. The valuable part is like the comprehension inside people's heads of like, this is how the function works, this is how we run a YC event, whatever. The software to actually run the event, you can generate for the event, you can throw it away. The mo- the models get smarter in a month or two, throw the software away, give it your original set of instructions, and regenerate the software. So I think the business context and, and skills are the valuable part. I think the software on top of it is ephemeral. So what, what are humans for in this world? I think basically we're talking about a company brain, and I know a bunch of people in this room are building this. But the bit in the middle, like all of your data, all of your emails, your DMs, the skills, the know-how, that is like the company brain,
12:18 – 13:27
Where Humans Still Matter
1. Tom Blomfield
  and I think the humans sit around the edge of this interfacing with the real world. So it's where this intelligence makes contact with reality. Human beings reach into places the models can't go yet. That might be like a conference, it might be a... I'm trying to think of examples. I w- I would say a phone call, but I think the AI can reach into phone calls pretty easily now. Um, I think it's like novel situations, ethical considerations, high-stakes moments. You know, it's like, it's where the founder comes to us and is like thinking about breaking up with their co-founder, right? It's like those real high-stakes, high-emotion moments where you really want a human being. I think that's where the human fits. For all of you, like sales conversations, I think that's a human being in the room for the next 20 years. So the humans live, I think, around the edge. And I'm over time, and, uh, Kulveer should bullhorn me. I will leave you this one question. If you were building your company today, would you start it in this shape? For most of you, you are small enough to build it right, and so I don't think you have any excuse. And I know there are a few of you who are in the process of ripping up and rebuilding your company. So with that, I will stop, um, and we'll hand over to Pete. Thank you for listening.
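
The recursive loop described at 2:24 (sensor layer, policy layer, tool layer, quality gate, learning mechanism) and the overnight fix-up described at 4:12 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not YC's system: `Event`, `SelfImprovingLoop`, `handle`, `improve`, and `propose_tool` are hypothetical names, the quality gate is reduced to "the tool returned a non-empty answer", and `propose_tool` stands in for whatever agent writes and reviews new code.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Tool layer: a tool is assumed to be a deterministic function from text to text.
Tool = Callable[[str], str]


@dataclass
class Event:
    """One item from the sensor layer, e.g. a support ticket or an employee query."""
    kind: str
    payload: str


@dataclass
class SelfImprovingLoop:
    tools: dict[str, Tool]                                # tool layer
    needs_human: set[str] = field(default_factory=set)    # policy: kinds that need approval
    audit_log: list[str] = field(default_factory=list)    # policy: what must be logged
    failures: list[Event] = field(default_factory=list)   # input to the learning step

    def handle(self, event: Event) -> Optional[str]:
        """Run one event through policy, tools, and the quality gate."""
        self.audit_log.append(f"received:{event.kind}")
        if event.kind in self.needs_human:
            self.audit_log.append(f"escalated:{event.kind}")
            return None
        tool = self.tools.get(event.kind)
        answer = tool(event.payload) if tool else ""
        if not answer:  # quality gate (hypothetical check: empty answer means failure)
            self.failures.append(event)
            self.audit_log.append(f"failed:{event.kind}")
            return None
        return answer

    def improve(self, propose_tool: Callable[[Event], Optional[Tool]]) -> int:
        """Learning step: retry each failure with a proposed tool, keep what passes."""
        resolved = 0
        still_failing: list[Event] = []
        for event in self.failures:
            candidate = propose_tool(event)
            if candidate is not None and candidate(event.payload):
                self.tools[event.kind] = candidate
                resolved += 1
            else:
                still_failing.append(event)
        self.failures = still_failing
        return resolved


# Usage: a query fails during the day, the learning step adds a tool,
# and the same query succeeds the next time it is asked.
loop = SelfImprovingLoop(
    tools={"last_office_hours": lambda company: f"record for {company}"},
    needs_human={"refund"},
)
loop.handle(Event("intro_search", "petrochemicals"))            # no tool yet, so it is recorded as a failure
loop.improve(lambda event: (lambda q: f"founders matching {q}"))  # stand-in for the overnight agent
print(loop.handle(Event("intro_search", "petrochemicals")))
```

The design choice worth noting is that failures are stored as data rather than discarded: the learning step only has something to work on if every miss is recorded, which mirrors the point made at 8:05 about making everything legible to the AI.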

Episode duration: 13:28

Transcript of episode t-G67yKAHBQ