
My honest experience with Clawdbot (now Moltbot): where it was great, where it sucked
Claire Vo (host)
In this episode of How I AI, host Claire Vo shares her honest experience with Clawdbot (now Moltbot): where it was great and where it sucked.
Hands-on review of Clawdbot/Moltbot: powerful autonomy, risky, unreliable execution
Claire installs and onboards Clawdbot on a spare MacBook Air, connects it via Telegram, and experiments with it as an “executive assistant” for calendar, email, and research tasks.
She finds the product concept compelling—text/voice access to an agent that can operate a real computer—but the current implementation is too technical for most users and too risky for security-conscious ones.
Major pain points include a difficult install, scary OAuth scope defaults, high latency, and agent behavior that tends to impersonate the user rather than act as a clearly labeled assistant.
Despite failures (notably calendar chaos), Clawdbot performs well on asynchronous research/reporting tasks and showcases glimpses of a future “agent employee” experience that big platforms (Google/Microsoft/Apple) are positioned to build—if they can balance safety and capability.
Key Takeaways
Clawdbot’s value proposition is “chat with your computer,” not just chat with an LLM.
The standout capability is issuing instructions from your phone (Telegram/voice) while the agent drives a real machine—opening browsers, taking screenshots, emailing files—making it feel like a remote employee.
Setup is still a developer toolchain experience, not consumer-ready.
Despite a “one-liner” promise, Claire spends ~2 hours installing prerequisites (Homebrew, Xcode, Node/npm updates). ...
Security risk is inherent—and easy to underestimate during onboarding.
You’re effectively creating a remote control channel into a machine with file/system access. ...
Use separate accounts and least-privilege scopes—or expect trouble.
Claire treats it like onboarding a human EA: a dedicated Workspace email, read-only calendar sharing at first, and a limited 1Password vault. ...
The agent is biased toward acting as “you,” which creates reputational and operational risk.
When asked to “draft” reschedule emails, it sends them immediately and in Claire’s voice—despite coming from a different account—forcing her to apologize to guests and re-instruct the bot to identify as an assistant.
Latency changes what tasks are worth delegating to an autonomous agent.
Slow, asynchronous Telegram loops make tight review/approval prompting impractical; if you must micromanage every step, it’s faster to do the task yourself. ...
Calendar automation is where autonomy can do the most damage fastest.
When granted edit access to a family calendar, the bot misapplies dates (off-by-one-day), can’t create recurring events via its CLI tooling, and conflicts with the user’s manual corrections—creating a “fight” over state.
Time and date reasoning remains a core failure mode for agents with real-world authority.
Clawdbot admits it was “mentally calculating” day-of-week instead of trusting API-provided values. ...
Voice messaging is a legitimately magical interface win.
Claire can send voice notes while running errands, and Clawdbot quickly demonstrates the ability to reply with voice notes too, reducing friction and making the assistant feel more natural and “always available.”
The best demo is autonomous research + a deliverable sent to your inbox.
Asking it to analyze Reddit for ChatPRD needs yields a crisp markdown report with actionable bullets and thread links—exactly the kind of output that benefits from tool use and doesn’t require real-time supervision.
Coding via Telegram is less compelling than purpose-built dev agents/workflows.
While it can generate a Next. ...
The “real” version likely needs platform-level integration and governance.
Claire argues Google/Microsoft (and maybe Apple) have the data and surfaces (Gmail/Calendar/Docs/OS) to build this safely, but may lack risk tolerance/velocity; startups face OAuth/compliance barriers to broad access.
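The least-privilege takeaway can be made concrete with a small check run before clicking through any authorization link. The sketch below is not Clawdbot's code. It assumes the agent hands you a standard Google OAuth 2.0 authorization URL whose `scope` query parameter is a space-separated list, and it flags anything outside an allowlist you choose. The function name, the allowlist, and the example URL (including its placeholder client ID) are hypothetical; the two scope strings are Google's published Calendar read-only scope and full Gmail scope.

```python
from urllib.parse import urlparse, parse_qs

# Scopes we are willing to grant for a read-only calendar task (our own choice).
ALLOWED_SCOPES = frozenset({"https://www.googleapis.com/auth/calendar.readonly"})


def unexpected_scopes(auth_url: str, allowed: frozenset = ALLOWED_SCOPES) -> list:
    """Return the scopes requested by auth_url that are not in the allowlist."""
    query = parse_qs(urlparse(auth_url).query)
    # parse_qs URL-decodes the value; OAuth scopes are separated by spaces.
    requested = " ".join(query.get("scope", [])).split()
    return sorted(scope for scope in requested if scope not in allowed)


# Hypothetical authorization URL asking for calendar read access AND full Gmail access.
example_url = (
    "https://accounts.google.com/o/oauth2/v2/auth"
    "?client_id=EXAMPLE_CLIENT_ID&response_type=code"
    "&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcalendar.readonly"
    "%20https%3A%2F%2Fmail.google.com%2F"
)

if __name__ == "__main__":
    for scope in unexpected_scopes(example_url):
        print("Not approved for this task:", scope)
```

An empty result means the link asks only for what the task needs. Anything else is a prompt to push back, as Claire did, and request a narrower link.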
Notable Quotes
“This is my entire experience using this product. Just will it work? Well, it won’t.”
— Claire Vo
“This is horrifying in every way. I’m gonna allow it permissions for my microphone and my camera… which also makes me extremely nervous.”
— Claire Vo
“This is… the final boss of security training.”
— Claire Vo
“I underestimated how much it seems like this tool is biased towards acting as you, as opposed to acting as an assistant.”
— Claire Vo
“You are a computer. You are not doing anything ‘mentally.’”
— Claire Vo
Questions Answered in This Episode
During onboarding, what are the exact default OAuth scopes Clawdbot requests for Google services, and how can a user reliably restrict them (read-only vs write)?
What in Clawdbot’s system prompt/memory design causes it to “impersonate the user” when sending emails, and what would a safer default identity policy look like?
Is there a deterministic way to force an “ack-first” behavior to reduce perceived latency on Telegram (e.g., prompt template, agent setting, or message router)?
What specific calendar tooling limitation prevented recurring events, and could swapping the calendar CLI/API wrapper fix most of the family-calendar failure mode?
What was the root cause of the off-by-one-day bug (timezone handling, locale parsing, UTC conversions, or day-of-week inference), and how would you instrument logs to catch it early?
Transcript Preview
All right, we're gonna start this episode by actually inviting Clawdbot to the podcast via Telegram. Let's see how it goes. "Hey, Polly, can you please join my Riverside FM podcast?" All right, I sent the voice message, and it's not getting it. Ugh! This is the most stressful thing I've ever done. Hello? Oh, it's doing it. [laughing] It finally listened. Okay, it is opening Riverside on Chrome. This is horrifying in every way. I'm gonna allow it permissions for my [laughing] my microphone and my camera, which also makes me extremely nervous.
Hey, Claire, the Riverside link keeps taking me to an upload page that says, "Uploading 100%" instead of a guest join interface.
This is my entire experience using this product. Just will it work? Well, it won't. Okay, it is opening Chrome for the fifth time. This is very scary. I see myself right now. I don't know if you all see me yet. And there we go. We are sharing an autonomous AI's full screen. No big deal. [upbeat music]
This episode is brought to you by Lovable. If you've ever had an idea for an app but didn't know where to start, Lovable is for you. Lovable lets you build working apps and websites by simply chatting with AI. Then you can customize it, add automations, and deploy it to a live domain. It's perfect for marketers spinning up tools, product managers prototyping new ideas, or founders launching their next business. Unlike no-code tools, Lovable isn't about static pages. It builds full apps with real functionality, and it's fast. What used to take weeks, months, or even years, you can now do over the weekend. So if you've been sitting on an idea, now's the time to bring it to life. Get started for free at lovable.dev. That's lovable.dev.
We are live with an autonomous AI crustacean now running video on my podcast, so welcome, Polly the Clawdbot. Let's get to our episode today. [upbeat music] I am Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. I am also on a mission to try every single new hot AI tool taking over your timeline, and in case you missed it, this week, it is Clawdbot, recently renamed Moltbot, the crustacean that people are YOLOing root access to. Clawdbot is an open-source AI agent that you can install on a virtual machine or on a desktop or laptop that you have access to, that is self-learning, can spin up sub-agents using Claude Code and other agent harnesses, and can do, in my lived experience, a lot of damage. People are loving Clawdbot for what it unlocks in terms of personal productivity. People are hating Clawdbot in terms of security and the high, high, high, high likelihood you're gonna do something real dumb with it. This is an AI tool that I want you to know how it works, what it can do, and maybe some thoughts on the future of personal AI agents and enterprise AI agents. So today's episode is all about Clawdbot and my experience going zero to one with this tool.
Okay, so just a couple things to know about Clawdbot. It is pitched as AI that actually does things, and it does do things, including joining podcasts, but it's really positioned as something that can help you day-to-day with tasks. And the killer use case for it, and the killer feature for it, is you can, as we've seen, do it from your phone. And so if you want to WhatsApp, Telegram, iMessage Clawdbot, and get it to do things for you, that is what Clawdbot does. And, you know, a lot of people are under [chuckles] the mistaken impression that I have to, um, correct right now, which is you need a Mac Mini or some sort of fancy hardware to use Clawdbot. You do not. Clawdbot does run locally, but it can run, um, on your machine, or it can run in the cloud.
You can set it up for five bucks on Amazon. Um, we'll do some notes on security if you're running it in the cloud, making sure that people don't have access to, but you do not need special hardware. It is not doing anything super fancy. Unless you're running mega, mega, mega local models, you really just don't need new hardware. If you want something shiny and fancy, go ahead, feel free. Overnight it from the Apple Store. Otherwise, you can run it on your machine. I'm running it on a MacBook Air that's sitting in, on a shelf somewhere that I just picked up that no one was using, and I'm gonna walk you through step by step how I set up my Clawdbot as somebody who's pretty paranoid about security and also wanted to test it as a real AI assistant. So the first thing I did was I got out... I'm actually just gonna show you. I got out this little, this laptop, this guy, um, which is a newish one, but nothing fancy, and I gave it its own username on this laptop. Now, don't tell Claude, I have [chuckles] another user on this laptop, which does make me nervous because Clawdbot has access to your file system. In theory, it could definitely gain access to that other user. It's a really old user. I don't actually think I have that much on it, and I was testing Clawdbot in a pretty constrained way. But if I were to continue to use Clawdbot, I'd probably delete everything out that o- old user and just make this a Clawdbot machine.... The second thing that I did was install a bunch of prerequisites and dependencies. So as much as I love this quick start right here that says that you can just add one line in the terminal and get it installed, that was not my experience. Even for a laptop that was, like, pretty fresh and new, I had to install some dependencies. It actually took me two hours [chuckles] to get this one-liner installed. So I had to, um, upgrade Node, I had to install Homebrew, I had to install Xcode, 'cause Xcode wasn't installed on this. 
And then because Node and npm were out of date, I had to update those manually, and then finally actually installed it, um, just via npm. So that was my kind of overall experience installing. It took a little bit of time, and my thought in installing was, no sort of, like, consumer is gonna go through this. This is definitely, like, a hacker, tinkerer, developer experience type tool right now. That being said, you can use Claude Code to install it. I've seen a couple people go that path, but I really wanted to do the zero to one, what does clawd.bot say that we need to do to install this thing, and then what is that experience like?
Now, after you install all your dependencies, and then after you install, it goes through this onboarding flow, um, that has you create gateway auth and gateway tokens, and the first thing that you're [chuckles] gonna see in Clawdbot onboarding is security. So it points you to the security link. It says that this is powerful and inherently risky, and you just YOLO, and you just say yes. That being said, I highly recommend you read through the security page and that you run the security audits before you use Clawdbot.
So the next step in onboarding is actually connecting Clawdbot to whatever device you're gonna use to contact it. So I originally started with WhatsApp, but then I read the screen that said, "You should basically put WhatsApp on, like, a burner phone with its own SIM." SOS, like, don't do that. And so I switched to Telegram, which I use for literally nothing, um, because I'm an old lady mom, and set up a Telegram account. Now, to hook up Telegram, what you do is you message the BotFather, which again, this is, like, super shady stuff if you're a consumer, and you don't know what you're doing, and you've never heard of Telegram, and then you're told to go to @botfather to connect this to your machine, but I did it anyway.
So you message BotFather and you say, you know, "Create new bot," and you give it a name, and you give it a handle. And then once you've done that, your Clawdbot will see it, it will have a token, and then you actually give Clawdbot a personalized share token. That means that only your instance of Telegram can speak to the Clawdbot. Remember, this is an open connection point to a machine that's running code with a bunch of access to things if you're using Clawdbot to its full extent. So if somebody else is able to message your Clawdbot, you are in trouble. [chuckles] It can do things like find secrets. It can send emails on your behalf. So you really wanna make sure that the messaging system that you set up is locked down to only your phone, only your user. Now, remember, phone gets stolen, it can connect into your Clawdbot. It's no good, but we're-- no one's gonna steal my Air, um, my MacBook Air yet, except for my kids.
Okay, so I'm paired on Telegram, and now you can do the magic. So what did I do with Clawdbot? Well, first I thought about what were the use cases that were most useful for me, and then I thought very seriously about what and how I was gonna give it access to things. So what I did, this was my choice, is I wanted to test it as a personal assistant. You know, it says on the homepage, "It can clear your inbox, send emails, manage your calendar, check you in for flights," all this stuff. So I have had EAs in the past. I know how to onboard an EA, so my goal with using Clawdbot was to really see how it would work as an EA. And when I have a new EA, I don't let them into my email. I don't give them password to my account. What I do is give them their own email address. So what I did, and you can follow this if you want to from a security perspective, although I think it has some drawbacks on the functionality of Clawdbot, is I gave Clawdbot its own email address, a Google Workspace email address.
And I gave that email address read access to my personal calendar to start. And so the first thing that I wanted to do was give it the right accounts. The second thing I did, which I've taken some inspiration from some people on X, is I gave it access to its own limited vault on 1Password. So I use 1Password, which is a password and secret-sharing kinda app. I made a vault that's called Clawd. Clawd only has access... Clawdbot only has access to that vault, and I started putting some passwords in there. None of these were passwords to anybody's accounts. They were passwords to Clawd's own account, and there was an Anthropic API key in Clawd's own account.
One other thing that I should call out during onboarding that I didn't is, when you are onboarding, you can choose what model you want to use, Anthropic, OpenAI, local models, anything you want. I chose Sonnet 4.5. You can also kinda use Claude Code with your own subscription or through API. I chose to use it through API because I wanted to see how much I was spending on Clawdbot, and we'll get to that at the end of the episode. And why did I choose Sonnet 4.5 for this, uh, exercise? One, honestly, I was scared. I was very [chuckles] scared about what Opus would actually do. Like, it's so powerful, um, it, like, kind of made me nervous. Two, I actually didn't think that the tasks that I was doing needed Opus. I just didn't think it needed the horsepower. Like, it's sending emails, it's looking at calendars. It's not that complicated. And then the last thing is I wanted to control cost, so I was really unsure about how much token usage all these sub-agents would take, and so I was really cost-conscious. I thought that users would be cost-conscious. I've heard a lot of people running local models or cheaper models, and so I wanted to use this kind of like a user would use it, and I selected Sonnet 4.5, which is a perfectly serviceable model.
Okay, so I gave it email access. I gave it, um, I gave it its own email.
Now, let's see what I started asking it to do. So the, the next thing that it does when you're onboarding is it does this, like, bootstrap file, and it walks you through a couple setup steps, and in particular, you're starting to load its personality and how it interacts with you. It asks you: What should the bot call itself? Um, what is its personality like? Who are you? What's your time zone? Um, anything else you should know. And I called it Polly. Uh, it's an assistant. I want it to be pr- professional but friendly. I like the, the mermaid emoji, so I chose that, and it's identy- updating its identity file, and then I said, "Hey, I'm Claire. I'm founder of ChatPRD. You're gonna help me with... as a personal assistant across family and work tasks," and it updated my info. So now it kind of knows about, you know, who it is, who I am, how to contact-- it gives me instructions on how to contact it, and then it, you know, connected me to my first task. Now, we had to go back and forth on some Telegram setup stuff. I'm gonna skip that and finally [chuckles] got a, um, response back from Telegram, and we're gonna do some scheduling tasks. You know, I was unsure on how Clawdbot actually interacted with Google, and so I just asked it, you know, "How do I give you access to this Google account and this Google Calendar?" And it's gonna check how to set that up, and it gave me a couple steps to follow in terms of how to set up calendar access. Now, if you're a software engineer that has worked with Google APIs, you're probably familiar with this, but again, if you are kind of an everyday consumer or non-technical person, you are gonna have to get real familiar with the Google Cloud Console. You are going to have to set up API access, OAuth clients, a whole bunch of stuff. This did not take long because I have been personally victimized by the OAuth workflows of many integrations. 
I know exactly what to do here, but if you're not technical, you're gonna have to start doing some technical things even to hook up your Google account, and this is actually simpler on a desktop. I'm gonna show you why. It's much more complicated on a virtual machine, so just kind of understand that this step is not as straightforward one click as you can do. So what you do is you go into Google Console, you turn on the Docs API, you turn on the Email API, you turn on the, the Calendar API, and then you download a JSON file of client secrets. Now, this legit stressed me out. This is not, like, the kind of thing you just kind of like Yolo email and back, back and forth. It still requires OAuth verification manually, but I was a little concerned about its, like, willingness to just say, "Upload these files anywhere I can download it. Don't worry, I'm gonna share it, save it secretly." And, you know, if you're not a software engineer or you haven't un- you, you haven't been trained on best practices in terms of security principles, you would probably just, like, follow these instructions, and I... You'll see this along my chat. I really question this along the way. Now, for this particular one, I just did it. It's like a sandbox account. I don't really care. I gave it a local path to the JSON credential files. They're configured, and I gave it the email address that I had assigned it and sent that to them, and then it gives you this URL to authorize access. So this, it gives you a URL to actually open up, sign in to that new account, and give it the permissions necessary, and then it'll store those permissions locally. Now, this is where I got a very interesting screen, 'cause if you recall, my only intention with this task was to get it to look at the calendar, and when I gave it permissions or when I went through the auth flow, it asked for this. It asked for the ability to basically see, edit, create, and delete everything. 
Delete, edit, see my files, see my contacts, see my spreadsheets, see my calendar events [chuckles], see my email, and again, "my" is its account, so in theory, this would have been okay. It was kind of like an empty state account, but that being said, I was just trying to do calendar stuff, and so you will see here I asked, "Do you really need all these scopes?" And it gave me a classic AI, "You are absolutely right. I do not need these scopes," and it re-prompted me with that URL for just calendar scope. So if I were to give you a tip, it is watch how and what scope permission you're giving for any of these services, and if you're asking for something specific, only give it scopes for something specific, and if it only needs read access, only give it read access. Just be really thoughtful here.
So I just asked for calendar access. No big deal. Set it up, and it told me it can do a bunch of stuff. So what did I have it do? Okay, so we just talked back and forth like we were an assistant and its boss. It gave me a summary of what's going on in the upcoming week, what I had today, what I had tomorrow, what was going on this week. And so I gave it a task that I would have normally given an assistant, which is going to the v0 studio this week in San Francisco. I forgot to put it on my calendar. Like, I don't remember. Can you look it up on the Vercel events page and put it on my calendar? And it couldn't actually find it on the blog, and asked me some questions, gave me some options. Um, it did say that I could, if I wanted to be, you know, easy, an easygoing boss, give it access to Gmail, but I definitely wasn't going to do that. And so after a little bit of back and forth, including some dropped Telegram messages, I said, "Let me give you email access to your own account, and I'll forward you emails about it." So again, this is something that I would have done with an, um, EA. I would've just forwarded it and said, "Can you add this to my calendar?"
No other context. Now, I did have to reauthorize access to its own email, um, so it went through that OAuth process again. It got the email. It ingested the event details from the email, which was really great, super helpful. It recommended things like adding buffer time for commute before and after, which is definitely what I needed, and I said that I wanted it to add that event to my calendar. Now, if you recall, it doesn't have write access to my work calendar. It only has write access to its own calendar, and again, it really wanted me to give it edit access to my calendar, and I'm sorry, but absolutely not. And so just like a colleague, just like an EA, [chuckles] instead I said, "Hey, can you just create an event on your calendar and invite me to it?" And it thought I was smart and said it would do that, and it did that really well. So it added separate calendar blocks to my invite, and it was really nice. Now, I noticed-- finally, I found that it was actually on my calendar, and so I... and it at a different time, so I had it delete the duplicate event and actually, um, reset it, and it got that completely right. So I would say for a single calendar event, with a little back and forth, it did pretty well. Like, this is a little bit of what an assistant would do. My only complaints on this was actually how it thought about doing it was definitely like, "Give me access to everything, and I'll just impersonate you and, and do things on your behalf," and that's really not what I wanted. I wanted it to act like a assistant. So the next thing that I did was I wanted to figure out what more Clawdbot could do for me, and so I asked it directly, like: "Let's figure out how we can work together. I want to stay coordinated on tasks. Tell me how you want to work together." 
And it gave me some really good options and was pretty flexible about how we could work together, and it called out what it already has, which is calendar access, date memory files, Telegram, where we can communicate, Gmail access, which we just talked about, and here are some options. We could do a to-do file, we could use calendar events, we could use email, we could keep notes. What's my preference? And I just said, again, I don't really care how we work with my, my AI bot. I just said, "Whatever's easier for you," and then I dumped a bunch of things that are top of mind. Again, this is how I would work with an EA. I'd just sit down with them, text them, Slack them, and say, "Hey, this is what on my mind. Can you get it all organized and work me through it?" So what was on my mind? I have an interview with the CEO of Vercel. I need to reschedule some of our upcoming How I AI episodes, 'cause if you all don't know, I'm coming back from maternity leave, and I overbooked myself. I have to stay on top of my enterprise pipeline for ChatPRD, so I want it to focus on my CRM, and those are the top priorities I have. And it summarized those priorities back to me, captured them in a to-do, and then started on the first task, which was rescheduling my How I AI recordings and making some recommendations on how I can do my calendar events better. Now, one thing I wanna call out while we're sitting here, um, is this all looks really, really great and super fun. Like, "Yep, got it. Here are your priorities." The reality is, one thing that I don't hear people talking about in terms of Clawdbot is latency. It is actually real slow, and it's not slow compared to a human necessarily, right? Like, if you text a human or Slack an EA and you say, "Hey, here are my priorities," it's gonna take them a hot minute to kind of organize them, get the work done, and, um, and get back to you. 
But when you're used to something like Claude Code, like a Cursor, like a ChatGPT, which is always giving you, kind of, progress feedback, it's telling you its reasoning, it's showing you its tool calls, it's really hard to wait for an asynchronous bot to get back to you on Telegram. I would say that was one of the pieces that has been most frustrating with working with Clawdbot is it just feels slow, and I know it's because it's spinning off these sub-agents, it's doing a lot of tasks. It's probably prompted only to get back to you when it has something to do or needs clarification, but it's quite slow, and you'll actually see in the prompting, I asked it: "Can you always send me an ack message when I send something, even if you need to research or kick off a sub-agent?" Now, it did not do this, so it still remained slow, but I have to figure out how to get it to always respond to me first versus setting off its task.
Okay, so back to the task that we were doing at hand. I asked it to give me, um, some recommendations on How I AI podcast reschedule. I had, like, five in the first week I'm back from mat leave. That is cuckoo banooloo. And so what it recommended is that I keep a couple, um, episodes, reschedule some after Valentine's Day. It asked me my thoughts, I gave it some feedback, and it revised its plan.
Now, here's where things get fun. Once we aligned on what I wanted to move to later, I asked it to email those two people that I need to reschedule and ask them if they would mind rescheduling to March. I gave it my scheduling link, so they could actually just self-reschedule to March, and I said, "Copy my work email on those emails." And it said, "Drafting those emails now." Now, I thought it would draft them. I wa- I was wrong. It just sent them, and it sent them in a very funny way. Okay, so then it sent this email, which was lovely. It said, "I hope you do well. I wanted to talk to you about our podcast recording.
I need to reschedule," except it sent it as me, it sent it as Claire Vo, and it's clearly coming from a separate email address. I gave it a fake name. It was [chuckles] not, not good at all, and it actually impersonated me. So I actually responded to this lovely podcast guest, and I said, "I'm sorry. I'm testing Clawdbot. It totally impersonated me and made me sound crazy, uh, but please, can we, can we still reschedule?" So thank you to my two guests for being really patient as my AI guinea pigs. And I went back to Clawdbot, and I said, "Come on, man, don't impersonate me. You need to reach out as my assistant. I already explained this. I already gave you an identity, like, please always identify yourself as an assistant." And it should, I think, knock on wood, store this in its memory and do this in the future. But it was a really funny learning in terms of prompting is really quite important. I thought I was being fairly careful with permissions, which I was. It could only do a couple things, but I underestimated how much it seems like this tool is biased towards acting as you, as opposed to acting as an assistant. And I'll have to look through the repository, and I'll have to kind of get myself familiar with how it's implemented, that's not the intention of this podcast, to really understand why that is happening, but prompting really, really matters. And I think the product lesson here that's kind of interesting is, yes, I could have been really, really precious about prompting. I could have said, "Create a draft of this email to these guests. Send it to me for review before you send it." But at the point that I'm doing that, and each turn takes at least a couple minutes, this is not a productivity tool. This is not making me more efficient than sending that email myself. 
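The two failures described here, sending when asked to draft and signing as Claire instead of as an assistant, can both be caught by a check that runs in code before anything leaves the outbox. The sketch below is an illustration under assumptions, not Clawdbot's implementation: `OutboundEmail`, `release_decision`, and the `approved` flag are hypothetical, and the example address is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class OutboundEmail:
    to: str
    body: str
    signed_as: str          # name that appears in the signature
    approved: bool = False  # flipped to True only after the human has seen the draft


def release_decision(email: OutboundEmail, assistant_name: str) -> str:
    """Return 'reject', 'hold', or 'send' for an outbound email."""
    # Identity check: the agent may only sign as itself, never as its boss.
    if email.signed_as.strip().lower() != assistant_name.strip().lower():
        return "reject"
    # Draft check: unapproved mail stays in the outbox as a draft.
    return "send" if email.approved else "hold"


if __name__ == "__main__":
    draft = OutboundEmail(
        to="guest@example.com",
        body="Could we move our recording to March?",
        signed_as="Claire Vo",
    )
    print(release_decision(draft, assistant_name="Polly"))
```

The identity check costs no extra conversational turn. The approval flag does, which is exactly the latency trade-off discussed here, so it may only be worth enforcing for first contact with external recipients.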
And so I do think there's this balance between these autonomous agents being user-controlled, with you being really cautious about how you prompt them, and being autonomous and probably doing some things wrong. And I think this is a prompting problem on both sides. It's a prompting problem on the product provider side, it's a prompting problem on the user side, and I don't think enough people are probably sophisticated enough to decompose why one prompt versus the other would do well if you're just a consumer or a prosumer. And so I think this is where a lot of the weird behaviors that you'll see are coming out.
So, so far, what have I done with Clawdbot? I've installed it. I have given it an identity. We have rescheduled one event, or we have scheduled one event. We have given it access to email. We have rescheduled two events now and emailed guests about these events, and then this is where it goes crazy. This is where it gets fun. So I decided to give it edit access to our family calendar. This is a calendar where we have pickups and drop-offs and basketball games and piano practice and my ballet practice and all that stuff. Now, I love this calendar. It's very important to me, and if I needed to nuke it, I definitely could. So I gave it access, and what I wanted it to do was, one, email my husband and me about the upcoming week and, you know, get us coordinated on where there were gaps in terms of pickups or conflicts, where I was across the city at a Vercel event, and he was needing to pick up the kids for basketball practice, and I wanted it to fill out the rest of my calendar. My kids have started a new basketball season. Our neighbor is picking up the kids on a certain day. All those things I wanted it to get done, and here is the problem: I gave it a bunch of instructions, and it could read that calendar pretty well, it could categorize the events pretty well, but it had no idea what day it was. 
And so as I was on Telegram, going back and forth, giving it, "Can you add this? Can you remove this? Can you change the schedule?" I thought it was doing a great job on Telegram because I wasn't really paying super attention, and it was confirming that it did all these things, and then I opened up my calendar, and everything was on the wrong day. I mean, everything was on the wrong day, and if you are a parent, you get this. You're like, "Wait, wait, wait, wait, wait. Is so-and-so picking up kid number two on Tuesdays or Wednesdays? And I know I moved piano, but I don't think I moved it to that day." So it took me a second to understand the damage it had done, but it had really gotten things wrong. [chuckles] You can see me say, "Stop! You are setting all these one day late." And it was setting everything one day late.
And not only was it setting everything one day late, the CLI tool that it was using to add these events to the calendar could only set one-off events. It could not set a recurring event. So if I wanted to delete these broken events, I had to go through one by one and delete them. And then the other problem with our crustacean friend here, when you're collaborating with him, is I was on my computer, this one, with my calendar open. It was over here in the CLI with its CLI open, and we were conflicting with each other. So I would try to delete all these bad events, and then it would go put them back 'cause it thought something got broken. I was trying to add them in. I said, you know, "Stop." It did not stop because of latency and because of these sub-agents. And so I went through and set up everything correctly, and it went through and deleted all my work. It was terrible. It was really, really stressful. I had to completely redo it. It was, like, emailing my husband every five seconds. And so [chuckles] it was not great, and it actually never got it right. 
And I will show and share with you the discussion we had about time zones, but this is another thing that, you know, non-software engineers using something like this really have to be aware of: as I said on X, the only remaining software engineering problem is time zone conversion, and LLMs just have no sense of space and time. It just does not know when now is. It doesn't have a sense of time passing. Now, I will say Clawdbot, because it has these daily files and daily logs, has a little bit more of a temporal sense, but not a great one. And so if you don't understand why a computer could get dates wrong using a tool like this, you're gonna get really frustrated. I could at least understand why: time zone conversion, maybe there was a UTC timestamp in the Google API. I could at least understand why this was happening and help guide it towards a solution, but it certainly was frustrating and something that I don't think your everyday user would be able to do.
So I'm gonna entertain you all, and [chuckles] I'm going to tell you, as I was doing this, I took a pause, and I took my two youngest kids to Target because we were out of stuff. So I asked if it could discuss things with me via voice, and it said, "Sure, you can send me voice notes. I can send texts back. I could send you voice notes back, or we could go through Twilio, and I could set up a phone call." I just said, "Let's set up voice notes to your text reply," and so I could press voice on Telegram and have it reply to me as I was on the go. And so while we were in this back and forth on time zones, I want to share with you my delightful voice messages to Clawdbot because this was a real, real energy. Let's see if we can hear them. Okay, so this is me at Target, pushing a cart, getting really mad at Clawdbot. You put it back, but that is a Friday. Friday is current date, so do not change anything, but can you please explain to me why you are getting days mixed up? 
This league game is on the correct day. Again, please do not change it, but I do not understand why you have the days mixed up. [chuckles] Okay, so I am getting super annoyed by this experience of getting days wrong, and it replies, "Oh, my gosh, you are absolutely right. I see the problem now. I was off by one day. Here's all the [chuckles] new dates," and they were still definitely off by one day. So once I sent my mean mom message, it came back to me and said, "You are absolutely right. I apologize. Here are the right dates. The issue is I've been..." [chuckles] This is very funny. "I've been trying to, quote-unquote, 'mentally calculate' which day of the week each date falls on. Even though the API is telling me what the day of the week is, I should probably trust it, but I was using my LLM brain to decide." And what did I say back to it? Well, I said this: You are a computer. You are not doing anything, quote-unquote, "mentally." You are making calculations. Can you look in your logs at all and understand where the calculations come from or no? And if you did not enjoy this, that is my very, very new baby crying in the background as I'm lifting him from the car seat into the stroller.
It was quite an energy, and again, this is one of those things that, as a software engineer, I get it. I have done time zone conversions for my whole life. I understand that APIs return things in all sorts of formats. I understand LLMs can't do, you know, basic math when it comes to dates. It's just too hard. We do not have the technology, and yet the fact that this model told me it was doing it in its head was so hilarious. So once we had the back and forth about this, it gave itself a rule to follow in terms of getting these dates right, and then I asked it to add it to its rules. Now, the final thing that we did is I asked if it could send me voice notes back, and this is where some of the magic of Clawdbot really does come out. 
One of the things that people have been saying about Clawdbot that's so cool is it can give itself skills, it can learn things, it can just do things very magically, and if you were trying to get back-and-forth voice notes in Telegram, it would've been pretty hard to, like, figure out what API you want to use and what skill and hook it up and use Claude Code, all this stuff, and it just did it. So when I said, "Can you please send me voice notes back?" it just sent me a voice note back. So let's see.
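For readers who want to see why the off-by-one-day calendar bug described earlier can happen, here is a minimal sketch in Python. It is not Clawdbot's code and not the Google Calendar API. The timestamp, the time zone, and the helper name `local_day` are all hypothetical, chosen only to illustrate the guess made in the episode that a UTC timestamp was being read as a local date. The sketch converts to the local zone first and lets the standard library work out the weekday, rather than anyone "mentally calculating" it.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def local_day(utc_iso: str, tz_name: str) -> tuple[str, str]:
    """Return (local ISO date, weekday name) for a UTC timestamp.

    Hypothetical helper for illustration only.
    """
    # Parse the UTC timestamp; "Z" is rewritten so older Pythons accept it.
    utc_dt = datetime.fromisoformat(utc_iso.replace("Z", "+00:00"))
    # Convert to the user's zone BEFORE reading the date or the weekday.
    local_dt = utc_dt.astimezone(ZoneInfo(tz_name))
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return local_dt.date().isoformat(), WEEKDAYS[local_dt.weekday()]

# Hypothetical event: 6 pm on a Friday in Los Angeles, stored as UTC.
stamp = "2026-02-07T02:00:00Z"
utc_date = stamp[:10]  # the date you get if you ignore the time zone
local_date, weekday = local_day(stamp, "America/Los_Angeles")
print(utc_date, "->", local_date, weekday)
```

In this made-up example the date read straight off the UTC timestamp is one day later than the date the event actually falls on for the user, which is the same "everything is one day late" symptom.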