How to Build an Internal AI Agent That Evolves Itself

AnswerThis builds AI agents for evidence-based scientific workflows and has scaled past $2 million in ARR with just two full-time employees — largely because they built an internal AI ops agent that processes over 100 emails a day, closes support tickets, and updates their CRM automatically. In this recent batch talk, founder Ayush Garg breaks down the architecture of a self-extending agent that builds its own tools when it encounters tasks it can't handle yet, how his non-technical co-founder trains the agent by giving it feedback in Slack, and the three types of memory — factual, behavioral, and procedural — that any founder can copy to build an internal agent for their own business.

May 19, 2026 · 5m

EVERY SPOKEN WORD

0:01 – 0:31
Internal AI ops agent: outcomes and why it matters
Speaker
  Hello everyone. My name is Ayush. I'm the founder of AnswerThis. We build AI agents for evidence-based scientific workflows, and today I'm going to be sharing about how we're using AI agents internally and how you can replicate our setup. So we've been able to do over $2 million in ARR, largely being two full-time employees, which is myself and my co-founder. Do have two or three contractors for things like design and outbound. But a large reason for why we've been able to do this is because we have an internal
0:31 – 1:01
What the agent actually does day-to-day (email, support, CRM, feedback)
Speaker
  AI ops agent that handles a lot of the work that would normally consume founder time. So this AI agent is processing more than 100 emails a day for us, has closed over 400 customer support tickets. It handles CRM updates after meetings. It collects user feedback across channels, helps with things like customer support. But more importantly, as, uh, Pete and Tom and Garry mentioned, it also lets us ask questions about our business at any time. Things like, "What's
1:01 – 1:16
Instantly queryable business context (asking the agent business questions)
Speaker
  the status of a lead?" Or, "What are the open issues for a customer?" This has become instantly queryable for us, whereas, uh, previously, you know, we had to operate a bunch of different apps to get answers to our questions. Now, the most important part of this
1:16 – 1:31
Core differentiator: a self-extending agent that writes new tools
Speaker
  is not that the agent can do a fixed set of tasks, it's that the agent is self-extending. When it runs into a repeated task it cannot do yet, it asks a coding sub-agent to build a tool for it, and
1:31 – 2:01
System architecture: thin harness + task queue for inbound channels
Speaker
  this tool becomes permanent and is available in future sessions. So let me talk through the architecture for how you can set this up. The first thing you should do is have a Claude Code CLI wrapped in Python, where new messages from Slack, email, and other channels go into a task queue. The agent picks up these tasks and then works through them iteratively. Now, Pete, Garry, and Tom have already mentioned why you need a thin harness, so I'm not gonna repeat it. But Claude Code works exceptionally well
2:01 – 2:32
Injecting company-specific logic via read-only codebase + database access
Speaker
  because it already knows how to inspect files, run commands, and use CLIs. Which brings me to my second point: how do you tell it business logic that only you know of? A great way that we've found, uh, to be able to do this is by giving the agent a read-only copy of both our database as well as our code base, where it has a cron job that basically gives it an updated version every time we do a release. And whenever a customer support query comes now, and they ask some question about the business,
2:32 – 3:02
Tooling layer: startup service CLIs + a coding CLI that can modify the agent
Speaker
  the agent can just read our code base to be able to figure that out, including things like what our subscription logic is and where things inside of our app are located. Now, what actually makes it self-evolve, there's two very important components here. First is all the tools that we use as a startup, things like, you know, Intercom, Fathom, Stripe. We have given all of those as CLIs to our main agent, but at the same time, it has a general coding agent, also as a CLI, which can edit the agent code itself.
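One way to picture the tooling layer described here is a directory of small CLIs that the agent checks before each task, plus a builder that writes a new script when nothing matches. The following is a minimal sketch under stated assumptions, not the speaker's implementation: the directory layout, `list_tools`, `ensure_tool`, and the `ToolBuilder` callback are hypothetical names, and `stub_builder` returns canned source in place of a real coding agent.

```python
import stat
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical signature: (tool name, plain-language spec) -> source code.
# In the talk this role is played by a coding agent exposed as a CLI; here it
# is an injected callback so the sketch runs offline.
ToolBuilder = Callable[[str, str], str]


def list_tools(tools_dir: Path) -> Dict[str, Path]:
    """Return {tool name: path} for every file currently in tools_dir."""
    return {p.stem: p for p in sorted(tools_dir.iterdir()) if p.is_file()}


def ensure_tool(name: str, spec: str, tools_dir: Path, build_tool: ToolBuilder) -> Path:
    """Reuse an existing tool, or have the builder write it and persist it."""
    existing = list_tools(tools_dir)
    if name in existing:
        return existing[name]
    path = tools_dir / f"{name}.py"
    path.write_text(build_tool(name, spec))
    # Mark the new script executable so later sessions can run it directly.
    path.chmod(path.stat().st_mode | stat.S_IXUSR)
    return path


def stub_builder(name: str, spec: str) -> str:
    """Stand-in for the coding agent: returns a trivial script (assumption)."""
    return f"#!/usr/bin/env python3\n# spec: {spec}\nprint({name!r} + ': ok')\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tools_dir = Path(tmp)
        first = ensure_tool("check_landing_pages", "ping landing pages", tools_dir, stub_builder)
        second = ensure_tool("check_landing_pages", "ping landing pages", tools_dir, stub_builder)
        # The second call finds the persisted script instead of rebuilding it.
        print(first == second, sorted(list_tools(tools_dir)))
```

Because the script is written to disk, a later session that calls `ensure_tool` with the same name gets the existing file back, which is the "permanent tool" behavior the talk describes.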
3:02 – 3:33
From skeleton to full toolkit: examples of autonomous tool creation
Speaker
  And what this means is that whenever we ask it to do a task it cannot do yet, it is simply able to just code that tool into existence and handle the task. To us, it's magical because we only ask it to do things, but it's able to self-author tools and has, uh, gone from just being a skeleton to being this full-blown tool with over forty-five CLIs that it has made itself. A cool example for this is also when we just wanted to monitor our landing pages to make sure they're always up for ads,
3:33 – 4:03
Editable personality and memory via instructions.md (feedback loop)
Speaker
  and we just sort of told it, and it created a cron job into existence that does this. Now, moving slightly further, there's one very important thing that also this agent needs to have, which is an editable personality or memory. We do this by an instructions.md file that is loaded on every agent turn, and the agent is able to edit this. This is how we are able to self-evolve the agent, because we can give it feedback like we would give feedback to an employee, and it just updates this instructions.md file, which then gets
4:03 – 4:34
Support-quality story: non-technical feedback that permanently fixes errors
Speaker
  appended to the next run. The best example for this is customer support. When we had just rolled out this agent to start doing customer support, my co-founder, Ryan, who's non-technical, noticed a class of support mistakes. Instead of opening the code base or telling me or filing a ticket, he just messaged the agent in Slack and told it what was wrong. The agent updated its own instruction set and tooling, and then that entire class of mistakes stopped happening. So the broader lesson here is that, uh,
4:34 – 5:04
The three memories an internal agent needs: factual, behavioral, procedural
Speaker
  an internal agent needs sort of three sorts of memories. It needs factual memories, which is your code base and your database, things, how they work operationally within your startup. It needs behavioral memories. This is what you teach the agent. It's things like instruction and feedback. That is what we have the instructions.md file for. And it needs procedural memory, regular tasks that you're doing. This is what we encode into the tools that the agent is able to create itself. So if you want to copy, uh, this entire agent,
5:04 – 5:33
Copy-the-stack checklist: minimal steps to replicate the setup
Speaker
  just take a picture of this and give it to your Claude Code. Um, just use Claude Code or any other coding-capable CLI as the main agent harness. Give the agent read-only access to your code base, give it some basic CLIs, and then give it a coding agent as a CLI as well. Finally, uh, load an instruction file on every turn that the agent can edit. You know, you can just connect it to Slack or email through SSH, um, and you will have this agent ready for your business as well.
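The harness loop in this checklist (a task queue fed by inbound channels, with an instruction file re-read on every turn) can be sketched in a few lines of Python. This is an illustration under assumptions rather than the setup from the talk: `Task`, `build_prompt`, `record_feedback`, and `drain` are hypothetical names, and `run_agent` is an injected callable standing in for a call to the coding CLI, so the example runs without one installed.

```python
import queue
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Task:
    channel: str  # e.g. "slack" or "email"
    text: str


def build_prompt(instructions_path: Path, task: Task) -> str:
    """Re-read the instruction file on every turn so edits take effect at once."""
    instructions = instructions_path.read_text() if instructions_path.exists() else ""
    return f"{instructions}\n\n[{task.channel}] {task.text}"


def record_feedback(instructions_path: Path, note: str) -> None:
    """Append a feedback note; in the talk the agent makes this edit itself."""
    with instructions_path.open("a") as f:
        f.write(f"- {note}\n")


def drain(tasks: queue.Queue, instructions_path: Path,
          run_agent: Callable[[str], str]) -> List[str]:
    """Work through queued tasks one at a time and collect the agent's replies."""
    replies = []
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return replies
        replies.append(run_agent(build_prompt(instructions_path, task)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "instructions.md"
        path.write_text("You are the internal ops agent.\n")
        record_feedback(path, "Check the subscription logic before answering billing questions.")
        inbox = queue.Queue()
        inbox.put(Task("slack", "What is the status of this lead?"))
        # Echo runner: a placeholder for invoking the real coding CLI.
        for reply in drain(inbox, path, run_agent=lambda prompt: prompt):
            print(reply)
```

Reading the file inside `build_prompt` rather than once at startup is the design choice that lets feedback given in one turn shape the very next one.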