EVERY SPOKEN WORD
Speaker
Hello everyone. My name is Ayush. I'm the founder of AnswerThis. We build AI agents for evidence-based scientific workflows, and today I'm going to share how we're using AI agents internally and how you can replicate our setup.
We've been able to do over $2 million in ARR with largely two full-time employees, myself and my co-founder. We do have two or three contractors for things like design and outbound. But a large reason we've been able to do this is that we have an internal AI ops agent that handles a lot of the work that would normally consume founder time. This agent processes more than 100 emails a day for us and has closed over 400 customer support tickets. It handles CRM updates after meetings, collects user feedback across channels, and helps with things like customer support. But more importantly, as Pete and Tom and Garry mentioned, it also lets us ask questions about our business at any time. Things like, "What's the status of a lead?" or, "What are the open issues for a customer?" This has become instantly queryable for us, whereas previously we had to go through a bunch of different apps to get answers to our questions.
Now, the most important part of this is not that the agent can do a fixed set of tasks, it's that the agent is self-extending. When it runs into a repeated task it cannot do yet, it asks a coding sub-agent to build a tool for it, and this tool becomes permanent and is available in future sessions.
So let me talk through the architecture for how you can set this up. The first thing you should do is have a Claude Code CLI wrapped in Python, where new messages from Slack, email, and other channels go into a task queue. The agent picks up these tasks and then works through them iteratively. Now, Pete, Garry, and Tom have already mentioned why you need a thin harness, so I'm not gonna repeat it.
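
The thin harness described above (messages go into a task queue, the agent works through them one by one) could look roughly like the sketch below. This is not the speaker's code. The `Task` type, the `drain` function, and the prompt format are hypothetical names chosen for illustration, and `run_claude` assumes the `claude` CLI is installed and supports non-interactive print mode via `-p`. The runner is passed in as a parameter so the loop can be exercised without calling any real agent.

```python
import queue
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    channel: str  # where the message came from, e.g. "slack" or "email"
    text: str     # the raw message body


def run_claude(prompt: str) -> str:
    """Run one non-interactive agent turn and return its stdout.

    Assumption: the `claude` CLI is on PATH and accepts `-p <prompt>`.
    Swap in whichever coding-capable CLI you actually use.
    """
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def drain(tasks: "queue.Queue[Task]",
          runner: Callable[[str], str] = run_claude) -> list[str]:
    """Process queued tasks in arrival order until the queue is empty."""
    replies = []
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            break
        # Tag the prompt with its channel so the agent knows the source.
        prompt = f"[{task.channel}] {task.text}"
        replies.append(runner(prompt))
        tasks.task_done()
    return replies
```

In a real deployment the Slack and email listeners would call `tasks.put(Task(...))` and `drain` would run in a loop or on a schedule; that wiring is left out here.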
But Claude Code works exceptionally well because it already knows how to inspect files, run commands, and use CLIs. Which brings me to my second point: how do you tell it business logic that only you know? A great way we've found to do this is by giving the agent a read-only copy of both our database and our code base, with a cron job that gives it an updated version every time we do a release. Whenever a customer support query comes in and asks some question about the business, the agent can just read our code base to figure that out, including things like what our subscription logic is and where things inside our app are located.
Now, what actually makes it self-evolve? There are two very important components here. First, all the tools that we use as a startup, things like Intercom, Fathom, Stripe: we have given all of those as CLIs to our main agent. Second, it has a general coding agent, also as a CLI, which can edit the agent code itself. What this means is that whenever we ask it to do a task it cannot do yet, it is able to just code that tool into existence and handle the task. To us, it's magical because we only ask it to do things, but it's able to self-author tools, and it has gone from being just a skeleton to being this full-blown tool with over forty-five CLIs that it has made itself. A cool example of this is when we wanted to monitor our landing pages to make sure they're always up for ads. We just told it, and it created a cron job that does this.
Now, moving slightly further, there's one more very important thing that this agent needs to have, which is an editable personality or memory. We do this with an instructions.md file that is loaded on every agent turn, and the agent is able to edit this.
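
As a rough illustration of an instruction file that is loaded on every turn and can be edited between turns, here is a minimal sketch. The function names `build_prompt` and `record_feedback`, the `# Task` separator, and the bullet format are all assumptions made for this example. In the talk the agent edits the file itself; `record_feedback` only stands in for that step.

```python
from pathlib import Path


def build_prompt(instructions_path: Path, task_text: str) -> str:
    """Re-read the instruction file on every turn and place it ahead of the task."""
    if instructions_path.exists():
        instructions = instructions_path.read_text(encoding="utf-8")
    else:
        instructions = ""  # first run: no behavioral memory yet
    return f"{instructions.strip()}\n\n# Task\n{task_text}".strip()


def record_feedback(instructions_path: Path, feedback: str) -> None:
    """Append one piece of feedback as a bullet so every later turn sees it."""
    with instructions_path.open("a", encoding="utf-8") as f:
        f.write(f"- {feedback.strip()}\n")
```

Because the file is read fresh each time, a correction written once applies to every subsequent turn without redeploying anything.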
This is how we are able to self-evolve the agent, because we can give it feedback like we would give feedback to an employee, and it just updates this instructions.md file, which then gets appended to the next run. The best example of this is customer support. When we had just rolled out this agent to start doing customer support, my co-founder, Ryan, who's non-technical, noticed a class of support mistakes. Instead of opening the code base or telling me or filing a ticket, he just messaged the agent in Slack and told it what was wrong. The agent updated its own instruction set and tooling, and then that entire class of mistakes stopped happening.
So the broader lesson here is that an internal agent needs three sorts of memory. It needs factual memory, which is your code base and your database, how things work operationally within your startup. It needs behavioral memory. This is what you teach the agent, things like instructions and feedback. That is what we have the instructions.md file for. And it needs procedural memory, the regular tasks that you're doing. This is what we encode into the tools that the agent is able to create itself.
So if you want to copy this entire agent, just take a picture of this and give it to your Claude Code. Use Claude Code or any other coding-capable CLI as the main agent harness. Give the agent read-only access to your code base, give it some basic CLIs, and then give it a coding agent as a CLI as well. Finally, load an instruction file on every turn, one that the agent can edit. You can just connect it to Slack or email through SSH, and you will have this agent ready for your business as well.
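
The procedural-memory step from the recap (hand the agent a coding agent so that a missing tool gets built once and then persists) might be sketched as follows. Everything here is hypothetical: `ensure_tool`, the `tools_dir` layout, and the `build` callback are illustrative names, and `build` merely stands in for whatever coding sub-agent generates the script source.

```python
import stat
from pathlib import Path
from typing import Callable


def ensure_tool(tools_dir: Path, name: str, spec: str,
                build: Callable[[str, str], str]) -> Path:
    """Return the path to tool `name`, building it once if it is missing.

    `build(name, spec)` is a stand-in for the coding sub-agent: it takes
    a tool name and a plain-language spec and returns script source.
    """
    tools_dir.mkdir(parents=True, exist_ok=True)
    path = tools_dir / name
    if not path.exists():
        # Only reached the first time; later sessions reuse the saved file.
        path.write_text(build(name, spec), encoding="utf-8")
        # Mark it executable so it can be invoked like any other CLI.
        path.chmod(path.stat().st_mode | stat.S_IXUSR)
    return path
```

Writing the tool to disk rather than keeping it in memory is what makes it available in future sessions; a real setup would probably also want review or tests before a generated tool is trusted.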
Episode duration: 5:33
Transcript of episode DGD9b8K42lk