How I AI
How a visually impaired engineer builds personal software with Claude Code + Wispr Flow
EVERY SPOKEN WORD
50 min read · 9,983 words
- 0:00 – 2:34
Introduction to Joe and his background
- JMJoe McCormick
Right before I started college, I ended up losing most of my central vision due to a rare genetic disorder called Leber's hereditary optic neuropathy. I was talking with someone who was losing their sight recently from the same disease, and they were asking about different things, and I was like, "Oh, you can just do all of that now with Gemini or ChatGPT." The world is a whole lot easier.
- CVClaire Vo
So you're gonna show us some of the things that you've built for yourself.
- JMJoe McCormick
So when someone sends me an image, I use this tool to be able to get the gist of an image without needing to ask somebody to explain it to me. If I hit Control + Shift + D on any message, it's gonna pop up and go off and describe that image for me. And the cool thing is, I can go ask some follow-ups. What age child is this for? And it will head off to ChatGPT and get the response for this as well.
- CVClaire Vo
I'm curious for you, what are you most excited about in the multimodal world of AI?
- JMJoe McCormick
One thing that I was always afraid of: Can I read stories? I can memorize stories, I can tell stories, but your son being like, "I want to read this book," and you having to be like, "Sorry, I can't." And now that, "Sorry, I can't," becomes, "Sorry, I can," with the assistance of so many different tools now.
- CVClaire Vo
[upbeat music] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, we have Joe McCormick, principal software engineer at Babylist, who has a vision impairment, and he's gonna show us how he uses AI to build micro Chrome apps to make his everyday life and work a lot more accessible. You're gonna learn how to use Claude Code to write Chrome apps, and you're gonna be inspired at the little things you can do to make your own Slack a little bit more efficient. Let's get to it. This episode is brought to you by Tines, the intelligent workflow platform powering the world's most important work. Business moves faster than the systems meant to support it. Teams are stuck with repetitive tasks, scattered tools, and hard-to-reach data. AI has huge promise, but struggles when everything underneath is fragmented. Tines fixes that. It unifies your tools, data, and processes in one secure, flexible platform, blending agentic AI, automation, and human-led intervention. Teams get their time back, workflows run smarter, and AI actually delivers real value. Customers now automate over one point five billion actions every week. Tines is trusted by companies like Canva, Coinbase, Databricks, GitLab, Mars, and Reddit. Try Tines at tines.com/howiai.
- 2:34 – 4:50
Joe’s journey into computer science after vision loss
- CVClaire Vo
Joe, thanks for joining How I AI, and I want you to spend a little bit of time introducing yourself and your story, and how AI has impacted your ability to do work and build interesting things, and engage in lots of awesome projects, and what's different about your life now with AI versus before?
- JMJoe McCormick
So yeah, my name is Joe McCormick. I'm a principal software engineer at Babylist, and I think I took maybe a little bit more interesting journey than most into the, uh, computer science world. Um, so right before I started college, I ended up losing most of my central vision due to a rare genetic disorder called Leber's hereditary optic neuropathy. And so before starting, uh, at Harvard, I was more interested into the mechanical world and kind of robotics and everything in that space, and then, uh, found, uh, that that was a lot harder, and doing things with my hands was becoming a lot harder, um, month after month. And so I took the intro to computer science course at Harvard and immediately fell in love, um, and found that I got the same feeling of creativity and being able to come up with an idea and make it happen. Uh, but now I was on, uh, maybe not a full equal plane to my competitors at the time, or my, like, uh, uh, other students, but then, um, obviously, as AI took off, became even more, um, equivalent, and the gap between, um, I think, software engineer for a sighted person and a visually impaired person is, is closing, um, day by day. Um, and also in my personal life, I think it's even been extra impactful. I was talking with someone who was losing their sight recently from the same disease, and they were asking about different things, and I was like: "Oh, you can just do all of that now with sharing your, your screen with Gemini or ChatGPT." Whereas, uh, when I was, uh, first losing my sight, it was using different magnification tools or, or even, like, glasses and things, and it's like now the world is a whole lot easier. Uh, I'm an avid Meta Glass user, um, and, and different things make my personal life a lot easier as well. 
Um, but yeah, I, I do lots of AI product engineering now, and I, at Babylist, lead the, uh, AI enablement and trying to make sure all of our software engineers can build with AI, uh, as productively as possible at all different parts of the software development
- 4:50 – 6:09
The concept of personal software for accessibility
- JMJoe McCormick
lifecycle.
- CVClaire Vo
So you figured out a way, one, to adjust your interest in engineering to something that's a little bit more accessible for you, and then two, lean into how these AI tools can really increase the accessibility and user experience of supportive technology that you've maybe used in the past, but that you've been able to make better yourself. And what I love about this personal software moment that we're in right now, which is, unfortunately, accessibility software and custom software that meets the needs of a lot of people, is simply, in some instances, not an economically viable business, for example, to build. And so in the kind of broader economic world, there's not a lot of incentive to build a full set of robust tools that can meet the needs of everybody who deserves to have their needs meeting and, and needs met. And what I love about what we're able to do now with AI is not only are more interesting sort of accessibility tools and, um, and platforms being able to be built, but people can build these solutions for themselves, and they can be very customized to your experience, your needs, your strengths, and I think that's a really underappreciated benefit of
- 6:09 – 10:40
Demo of image description Chrome extension for Slack
- CVClaire Vo
AI. And so you're gonna show us some of the things that you've built for yourself, and you're actually gonna walk us through your coding flow, which I think is really awesome, on how to build one of these tools, so we can follow it step by step.
- JMJoe McCormick
For sure. So yeah, we can jump, uh, right in. I'll show off, uh, two that I built myself ahead of time, and we're gonna do one on the fly. And I do think personal software is going to be huge. Um, one reason why I like, um, building some of these... So I'm gonna show off a couple Chrome extensions that I've worked on. And one thing I like about building some of these, as compared to maybe some of the offerings we have today from the AI-native browsers, is AI-native browsers are, are great. Uh, like, I do use Comet, um, but it's, it's using the, uh, the Swiss Army knife, and it does everything. Um, but in order to do that, it- some of its processes are definitely slower, which there are certain things that's totally fine for, but there are other steps where you want it to be really quick, and you want to use the drill instead of the little tiny screwdriver that came with the Swiss Army knife. And so I'll show off a couple that I've already built, and then we'll jump into one that's not, uh, necessarily even accessibility only, that I think everyone could benefit from. So I'm gonna hop into Slack now. Um, uh, Babylist, where I work, is a big Slack company, so lots of my stuff is very Slack-based, um, because it's where I spend a lot of my non-coding time. So I'm gonna hop in here, and I'll share and show a couple ones I've built so far. There's a little temporary Slack channel that we have for this, um, with some example messages that were actual messages sent by my colleagues. I just had them resend it, um, into here. Um, so first one I'm gonna demo is a image description tool. So when someone sends me an image, I, uh, use a screen magnifier, so I typically am looking at my screen at about 10x zoom, but it's not the easiest to, uh, to, to do, and I prefer to not have to always be paying attention to that, if possible. 
So I use this tool I'm about to show off to be able to get, um, the gist of an image without needing to ask somebody to explain it to me or me having to actually use my eyes to do it. So I have a shortcut in Slack. Uh, if I hit, uh, I'm on Windows, so Control-Shift-D on any message, it's gonna pop up and go off and describe that image for me and, uh, tell me a little description of it. So I can see, hey, it shows a modern infant baby stroller with a car seat. Um, it's got a canopy, it's got all details about this, and the cool thing is I can go ask some follow-ups. So I can say, um: "What age child is this for?" And it will h- head off to ChatGPT, um, and get the response for this as well. And so we can get our answers there, and it's just a nice way for me to not have to actually push back and work with people, and get some answers to my questions, um, as, as I go on this, too.
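A minimal sketch of how a shortcut like the Control-Shift-D trigger described above could be detected. This is not Joe's code; the `KeyPress` and `Shortcut` shapes and the `DESCRIBE_IMAGE` constant are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: every name here is illustrative, not taken from the extension in the demo.

// The subset of a browser KeyboardEvent that the check needs.
interface KeyPress {
  key: string;
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
  metaKey: boolean;
}

// A shortcut such as Control+Shift+D.
interface Shortcut {
  key: string;
  ctrl: boolean;
  shift: boolean;
}

const DESCRIBE_IMAGE: Shortcut = { key: "d", ctrl: true, shift: true };

// True only when the pressed keys match the shortcut exactly (no extra Alt or Meta).
function matchesShortcut(press: KeyPress, shortcut: Shortcut): boolean {
  return (
    press.key.toLowerCase() === shortcut.key.toLowerCase() &&
    press.ctrlKey === shortcut.ctrl &&
    press.shiftKey === shortcut.shift &&
    !press.altKey &&
    !press.metaKey
  );
}
```

In a content script, a `keydown` listener on `document` could pass each event to `matchesShortcut` and only then look up the focused message and request a description.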
- CVClaire Vo
And I think this is something that folks don't really appreciate, which is for you, you know, you have the ability to zoom in, look at this, but it's, it's just from a time perspective, probably a lot more tedious for you to do. And so folks have thought a lot about image to description in terms of generating metadata for e-commerce sites, which I'm sure you all think about a lot, or, um, we had an episode with a documentary producer that highlighted using image-to-text descriptions plus metadata to organize archival footage a lot easier. But this is a great example of image to text being just a much more efficient information transfer method for someone like you, who might need to parse this, this information differently. And then what I love about this is you could have just done the image description, right? Just, "What, what is this image? And tell me." But you actually were able to go that next step and say, "Great, if I need to query more information about this to understand more context," you make that really, really easy. So I love- I just, I love this example, and you, you built this all yourself?
- JMJoe McCormick
Yeah, this was, uh, probably 25 minutes of a Claude Code session. It was, it was pretty straightforward. Um-
- CVClaire Vo
Awesome.
- JMJoe McCormick
A, a flavor of this one I'm working on right now is a version that works in Figma directly as well, that, given any Figma node-
- CVClaire Vo
Mm-hmm
- JMJoe McCormick
... will explain it to me with a, a much different prompt, right? In the Figma case, I want to hear about the colors of the CTAs. I want to hear about all this stuff because I am a full-stack engineer, and so obviously, you can get all that out of Figma, but it's lots of clicks and lots of, uh, different steps, and being able to just hit, uh, one keyboard shortcut and find out what this design is really accomplishing is gonna be a nice, easy, uh, win for me. And that one is just about done as well. Um, it's got a little bit of bugs, so it's not ready to fully demo, but that's one that I'm excited about for as well.
- 10:40 – 13:12
Demo of AI-powered spell checker extension
- CVClaire Vo
So before we go into building one of these, are there a couple other extensions that you've built just as inspiration for folks watching or listening that you think are really interesting to show?
- JMJoe McCormick
Yeah. Another one that's not necessarily accessibility-focused, um, but it's, I think it's a cool one. Um, so I, uh, am not the best typer in the world. I, I don't even think that's my vision's fault. Um, I just... Uh, I, I think- I'd like to think I'm a touch typer, but I think my brain goes faster than my fingers sometimes. So I have one that I built that is just like a really easy spell checker. There's lots of tools that do this, Grammarly and all this, but simply, they're not all screen reader accessible. They're multiple clicks away sometimes, um, and so I built one out that, uh, works in any input field in, uh, on the web. I'll demo it here. I'm gonna say, uh, "Test- testing typos in the message." And then if I hit, uh, Control-Shift-S here on this one, this is gonna go off, send that off to OpenAI, and come back with, with that. And while it was doing that for me on my screen reader, it said like, "Processing spell check. Spell check complete." And so I know when I'm writing a message, I don't need to necessarily worry about the, all the polish on it. Uh, I can just do that. I hear, "Spell check complete," I'm ready, I go off, hit that, and send it off to people. And I have a prompt there that's basically like, "Do not change any of the words, just fix typos." Like, really hyper-focused to make sure that it's the content that I wrote, but just with the typos corrected in it.
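A rough sketch of the "just fix typos" request described above, assuming the extension talks to OpenAI's Chat Completions endpoint. The prompt wording, the `buildSpellCheckRequest` name, and the model choice are assumptions, not what Joe actually uses.

```typescript
// Hypothetical sketch: the prompt text and model name are assumptions for illustration.

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  model: string;
  temperature: number;
  messages: ChatMessage[];
}

// A narrowly scoped instruction: correct typos, leave wording alone.
const TYPO_ONLY_PROMPT =
  "Fix spelling mistakes and typos only. Do not change any of the words, " +
  "the tone, or the meaning. Return only the corrected text.";

// Builds the JSON body for a chat completion request; it does not send anything.
function buildSpellCheckRequest(draft: string, model: string = "gpt-4o-mini"): ChatRequest {
  return {
    model,
    temperature: 0, // keep the output as close to the input as possible
    messages: [
      { role: "system", content: TYPO_ONLY_PROMPT },
      { role: "user", content: draft },
    ],
  };
}
```

The spoken "Processing spell check" and "Spell check complete" cues would sit around the network call, for example by updating the text of an ARIA live region before and after the request.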
- CVClaire Vo
So I'm leaning in and smiling a lot, one, because if you've been watching How I AI, you have heard about my fancy nails. I'm a fancy nail girl- [chuckles]
- JMJoe McCormick
Yeah.
- CVClaire Vo
-these days. And with these fancy nails, I cannot type anything. It is all typos. And so this is such a great little workflow that you built for yourself that, um, I'm, I'm gonna steal. The two things I wanna call out for people who are watching or not watching the details is you are actually running Slack right now in Chrome. So at first I was like: Wait, how are all these apps interacting with the Slack desktop app? But you're running Slack in Chrome, which means that these are all extensions that are available to you to interact with content in Slack and make modifications via, um, a Chrome extension. So I think that's a really interesting hack for folks that are like, "Okay, I can't hack my way into the desktop app, but I can load, um, Slack in the browser, and then on top of that, add a, a browser extension that can do these interesting
- 13:12 – 14:37
The efficiency of keyboard shortcuts for accessibility
- CVClaire Vo
things for me." The second thing I love is how you're using so many keyboard shortcuts to trigger these micro apps. And again, this is about efficiency. I always say in these AI products that latency is the killer feature. And so anything you can do from a UX perspective or from a, a performance perspective to make these little apps more efficient, the better they're going to feel. And so I love that you, you know, type a couple keys, and you get a fully corrected sentence here, right in your browser. It's a great idea.
- JMJoe McCormick
Yeah, the first version I had of this was just using the, like, ChatGPT, uh, Alt+Space shortcut that would open up that little, like, mini ChatGPT window, and I had, like, a saved, um, uh, just custom GPT to do it. And then I was like: Well, why am I jumping out of where am I actually working? Uh, I can, I can save, uh, two steps and, like, three, three clicks here. And so I think it's almost like piloting it first without doing this, and then realizing like, "Oh, yeah, there's a better way." And I think one thing on personal software is the return on investment, uh, became so much faster. Like, before, you'd have an idea like that, and you'd be like: This is gonna save me, like, three minutes a day, but it's gonna take me three days to build, so the payback period is just not totally there. And now it's like, it saves me three minutes a day, and it takes me thirty minutes to build. Like, the payback period has just become insane for a lot of this tooling.
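The payback arithmetic above can be worked through directly. This sketch assumes an 8-hour working day when converting "three days to build" into minutes; that conversion is an assumption, not something stated in the conversation.

```typescript
// Assumption: one build day is 8 working hours.
const WORKDAY_MINUTES = 8 * 60;

// Number of days of use before the time saved equals the time spent building.
function paybackDays(buildMinutes: number, minutesSavedPerDay: number): number {
  if (minutesSavedPerDay <= 0) {
    throw new Error("minutesSavedPerDay must be positive");
  }
  return buildMinutes / minutesSavedPerDay;
}

// Before: three days to build, three minutes saved per day.
const paybackBefore = paybackDays(3 * WORKDAY_MINUTES, 3); // 1440 / 3 = 480 days

// Now: thirty minutes to build, three minutes saved per day.
const paybackNow = paybackDays(30, 3); // 30 / 3 = 10 days
```

Under that assumption the break-even point drops from 480 days of use to 10.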
- CVClaire Vo
I, I love it. So let's build one!
- JMJoe McCormick
Yeah.
- CVClaire Vo
I wanna see what your flow is for actually building one of these
- 14:37 – 20:28
Live building a link summarization extension
- CVClaire Vo
things.
- JMJoe McCormick
Yeah, so before we build it, I'm gonna talk about what I want to build. Um, so one thing that comes up a lot in, uh, in the Babylist Slack world, and probably in many other company Slacks, is, uh, people send links all the time. Um, and for me, uh, I often just hit, like, the Save for Later button, and then maybe at the end of the week, I, like, decide if I wanna read them. But I've realized that, like, uh, maybe it's the, the do the thing that takes you one minute, uh, in the moment, instead of keep deferring it. I think it would be great if there was an easy shortcut where I could have an AI go off and fetch this article, give me the key takeaways, and then I'll decide, do I wanna actually do a full read and save it for later, or do I just skip it in the moment, and have that all work in under five to ten seconds? I think it's much more powerful than, uh, deferring and having this big to-do list at the end of the week when I wanna catch up on all these messages.
- CVClaire Vo
So you're gonna show us how, how you built this. And what I have to say is all of us are so overwhelmed with so much context and links and docs, and yet, you know, I would see something like this, whether it's, um, a partner or a competitor or just something somebody found that was interesting, and you wanna go, "Oh, yeah, I should definitely read that." But should you? Should you definitely [chuckles] read it? So this quick summarization, um, is, is a great idea. And so you're gonna walk us through how to actually code this up using Claude Code, I believe.
- JMJoe McCormick
Yeah, and I'll jump in along the way with a couple of, like, Claude Code tweaks that I've, uh, made or at least, uh, lean into to try and make it a little more screen reader accessible. But again, I similarly think that lots of things that are, uh, good for screen readers probably are, are gonna be good tools for everybody across the board.
- CVClaire Vo
Great.
- JMJoe McCormick
Um, so let's, let's jump right in. So I'm gonna switch over, uh, to, uh, my terminal for just a second, so I can initialize our project. So I'm gonna run a mkdir command to get our repo set up. [keyboard clacking] So we have our Slack summary extension. I'm making that directory, and I'm just gonna open up Claude Code quick on this. Or should I open up VS Code on this? So VS Code opening and initializing here. I'm gonna open it up as big as I can, and we are going to jump in as this finishes loading. So I'm gonna start here by making a PRD, like every good How I AI podcast goes.
- CVClaire Vo
That's exactly right.
- JMJoe McCormick
And I'm gonna do this with audio. Um, so sometimes I'll use Wispr Flow for what I'm doing, but in this case, I actually find the VS Code Copilot audio to be pretty good. And so if I'm just doing something kind of quick like this, I'll just end up using the Copilot integration. So you'll see I can do, uh, Control I, and then when I hit Control I again, it's gonna dictate, so I'm not gonna- I'm gonna pause my talking and switch in that mode, and I'm just gonna dictate out for us a little bit of this PRD and then see what it comes up with for us. We want to build a simple PRD for a locally run Chrome extension, whose job it is to exist in Slack alone, and when focused on a Slack message, you can hit the keyboard shortcut Control+Shift+1, and it will search that message to find any external links. If there are external links found, it should open them up in hidden tabs, extract their content, and send it off to OpenAI to summarize. And we'll see here, I just finished that. It's gonna go off and quickly generate a small PRD for that. Um, this doesn't take very long at all. I'm just gonna accept all the changes, because I find reading in that diff view to be particularly painful with a screen reader. Um, it's not, like, terrible, but it's much easier just to read it in the document. And since there's a new document being made, I'll now look at it here, and we'll see how this looks. So we have our goals. We want privacy, security. Makes sense. We have some user stories here. We have some functional requirements. It's got a parse, so all is making sense so far. Some non-functional and some out of scope. We want images. Yep, I already demoed my image processing, and some open questions. Ah, where and how should we summarize? Makes sense. We don't need success metrics for this, it's internal. So let's answer some of these questions. So let's just select it all and add it. I'm gonna dictate again, so I can hit Control + I, and I'll start. 
We want to build a very simple PRD here for a locally run Chrome extension, where the job is, when in Slack, in Chrome, and hovering over a message that has focus, you can run the keyboard shortcut Control + Shift + One, and that will look for any external links in the message. If any are found, open them up in hidden tabs, and extract that content and send it over to OpenAI. When in OpenAI, we should summarize them and extract three to five key takeaways from the article, and return those to the user in a fully screen reader-accessible modal, which includes the article's title and a link out in a new tab to view the article. Okay, we now have this generating the PRD. So we can see it talked about our keyboard shortcut, it's got our goals, minimize user effort, extreme readability is key for this, some basic user stories, some functional requirements. Cool, this looks good.
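One step in the dictated PRD, finding external links in a message, could be sketched as below. This assumes "external" means any http(s) link whose host is not `slack.com` or a subdomain of it; the function names and that rule are illustrative assumptions, and the generated extension may well do it differently.

```typescript
// Hypothetical sketch: names and the "external" rule are assumptions for illustration.

// Matches http(s) URLs up to whitespace, angle brackets, quotes, or a closing parenthesis.
const URL_PATTERN = /https?:\/\/[^\s<>"')]+/g;

// Extracts the lowercase hostname from an http(s) URL, or "" if none is found.
function hostnameOf(url: string): string {
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(url);
  return match ? match[1].toLowerCase() : "";
}

// True for slack.com and any of its subdomains.
function isSlackHost(host: string): boolean {
  const suffix = ".slack.com";
  return host === "slack.com" || host.slice(-suffix.length) === suffix;
}

// Returns the unique non-Slack links in a message, in order of first appearance.
function findExternalLinks(messageText: string): string[] {
  const found = messageText.match(URL_PATTERN) || [];
  const external: string[] = [];
  found.forEach((raw) => {
    const url = raw.replace(/[.,;:!?]+$/, ""); // drop trailing sentence punctuation
    if (!isSlackHost(hostnameOf(url)) && external.indexOf(url) === -1) {
      external.push(url);
    }
  });
  return external;
}
```

Each returned URL would then be opened in a hidden tab, its content extracted, and the text sent on for the three to five key takeaways the PRD asks for.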
- CVClaire Vo
And one thing I, I have to call out here is, you're an engineer, and that was a pretty good product description, and that resulted in a pretty good PRD. So one of the things I like about AI is, um, as I say, there are no lanes. If you are an engineer and you have an idea, you can write a very good, uh, PRD using a little bit of AI assistance. If you need a tip, I know one or two tools that can help you with it. [chuckles]
- JMJoe McCormick
Yeah, and this is no, yeah, no custom PRD, uh, prompts or anything like that.
- CVClaire Vo
Yeah.
- JMJoe McCormick
This is just, uh, mostly the foundational models
- 20:28 – 25:30
Using Claude Skills to extract common patterns
- JMJoe McCormick
work here.
- CVClaire Vo
Yep.
- JMJoe McCormick
Cool. So now we're gonna hop into a new tab, spin up Claude Code in here. And ahead of this, as I mentioned, I'd, I'd built a couple of these Chrome extensions, so because I've done a few of them, I ended up building out a Claude Skill to help me build more Chrome extensions. Um, so after I built this, the first two, I had Claude look at both and figure out what was the patterns that were common across them, and work on a skill, so I could just build the third, the fourth, the fifth, in a much simpler version, um, and, and extract out that common piece. I have found Claude Skills to be a mixed bag in terms of actually being picked up automatically. Um, so I am gonna explicitly say, like, "Use the skill," but technically, you're not supposed to need to do that, but I've definitely found that that's, uh, varied in its approach. So I'm gonna make a prompt here that's gonna more request the skill.
- CVClaire Vo
And for folks that are wondering how to set up their own, um, skills, we do have a How I AI episode. It is Introduction to Claude Skills, where I explain that Claude Skills are files in a, in a folder. Sometimes they're zipped files in a folder. So if they seem mysterious to you, go check out that mini episode from, um, I think it was October or November, and learn how to make your own Claude Skills. They're pretty useful.
- JMJoe McCormick
Yeah. So in here, we're-- the prompt's just gonna be, um... So first, at-mention the PRD, so we have that. [keyboard clacking] And we'll say, "Use the Claude Skill for creating Chrome extensions to build out this PRD." And one thing, um, Cru- uh, Claude Code has added this feature where you can edit a prompt, instead of just in the terminal, in a code file. And so in Claude Code, if you hit Control plus G, it will open that prompt in a text editor. So especially for me, where navigating that terminal is not super screen reader-able friendly, I now am at, uh, navigating it in the same place that I write code on a day-to-day basis, which is very screen reader accessible. And so again, other people may find this view useful. You can craft deeper prompts. You can, like, Control + F in here. You can do whatever, it's just a file, and so I think it's a really useful tool they added a few versions ago to make it a little bit easier to work with. And I'm just gonna put a note here, um, "Use my OpenAI key from my shared Chrome extension config." So I've had some... I don't want to have to keep pasting my OpenAI keys and stuff like that, so I end up pulling out some shared config to share across all my Chrome extensions. That way, I don't need to rinse and repeat that step over and over again. So I save this file now, and now whenever I close this, it's gonna replace my prompt in Claude Code with that completed prompt.
- CVClaire Vo
Oh, interesting!
- JMJoe McCormick
Yeah, so I think it's super effective, especially if you wanna do a deeper prompt, to use this-
- CVClaire Vo
Yep
- JMJoe McCormick
... and not have to worry about the, the whole terminal side of things. Cool, now we're gonna kick this off. And so I do have this Claude Code session right now in planning mode. Um, you'll see it's requesting, "Use the skill," which is great. We wanna use our skill. I'm gonna shrink this terminal, so we can see more of Claude. We don't really need this as much. Okay, it's gonna run some commands, um, that actually pull in that Claude, uh, that Chrome extension config that I talked about. Another thing I have to make Claude Code a little more, uh, accessible is, again, I'm not necessarily seeing everything that pops in as it's going, and so I set up a Claude hook that whenever Claude needs user input, it will, um, basically, like, ding a bell on my computer, so I could hear, uh, a sound that is like, "Oh, Joe, you need to do something right now to work on this." Um, I actually don't want Claude Code to read this file, so I'm gonna say, "No," because this is, is some secret configuration. So I'm gonna say, "No, don't read that file," uh, "No, that has an API key in it. Don't read it. If you need to, then use jq to extract the keys." So again, luckily, I am an engineer, so it's not like fully vibe coding this, uh, from scratch. I know that there's a utility that will extract just the keys out of a JSON object, and Claude won't be able to see the values, which is the actual, uh, secrets there. So now, here we are. It's, um, just gonna make a symlink to this extension for us, uh, so we don't need to worry about the configuration. And if I ever do change it, it will automatically update in all my extensions. So instead of having each one have their own API key, and my key becomes invalidated for some reason, um, having to update all of them, I use this, uh, concept called a symbolic link. So just link the same config file into all my extensions, so one change fixes everywhere.
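The "keys without values" idea above is what jq's built-in `keys` filter does for a JSON object (for example `jq 'keys' config.json`, where the file name is a placeholder). A small TypeScript equivalent, with a hypothetical function name and made-up config fields, might look like this:

```typescript
// Hypothetical sketch: the function name and the sample field names are illustrative only.

// Returns the sorted top-level key names of a JSON object, never the values.
// This mirrors jq's `keys` filter, which also returns the names in sorted order.
function listKeyNames(configJson: string): string[] {
  const parsed: unknown = JSON.parse(configJson);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object at the top level");
  }
  return Object.keys(parsed).sort();
}
```

Because only the names come back, a tool that sees this output learns which settings exist without ever seeing the secret values behind them.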
- CVClaire Vo
Yeah, and this is one of those things that's just easy to do when you're running these things locally, or just building stuff for yourself, is you just make the maintenance and, um, uh, the maintenance and deploy of these really easy for yourself. And make [chuckles] it as simple as possible for you to repeat, um, building things and using the same, for example, API keys. And, you know, when you wanna share all these publicly and publish them to the Chrome Extension Marketplace, we can, we can do a little cleanup.
- 25:30 – 27:45
Reviewing and modifying the development plan
- JMJoe McCormick
Exactly. Um, so here's our plan, and just like before, Control G also works to edit plans in the editor. So similarly, this is, uh, gonna be a pain to read in this terminal for a screen reader, but also, if I wanna make a tiny tweak to one thing here, I don't need to worry about telling Claude to update it or write it to a file, I just hit Control G. Opens it up in this file, and now we have our full plan here, and you can just tweak different parts of it if you want, and, uh, and modify it. So it's another, another great usage of the Control G shortcut.
- CVClaire Vo
Yeah, I wanna call this out for people, because so many folks would get something like this, and then if it was wrong, kind of say: "No, this is wrong. Please update X, Y, Z or A, B, C." And, you know, you're calling out, not only is that a pretty inefficient way for you to interact with this file in terms of accessibility and your need for a screen reader, but it's also just not the fastest way to give it feedback. And so your ability to just take this, uh, move it into code, use this Control G, I believe, edit it, close it out, run it, is just, again, a lot more efficient.
- JMJoe McCormick
Yeah, and so I'll do a quick run through of this plan. It's, uh... I'm using now, again, some more keyboard shortcuts just to break it down and, and, uh, fold the markdown headings, so I can-- from my perspective, it's hard for me to visually... or I don't visually scan the page. So reading through a big file, I typically, in code or markdown, rely on folding, so I can collapse different sections and read them, and then expand only the sections I care about. So I don't care about certain aspects here, but, like, maybe I wanna get deep into the error handling piece, so I'll expand that section and just, uh, read this part. So we got some just logging, some key patterns, but this plan generally looks good to me. So now I will just save this, and again, close the file, and that is what is over here in the prompt. 'Cause I didn't modify it, it, it didn't take any time, it was just ready. But if I modified it, it would take a split second while it loads that new plan in, and then it moves forward.
- CVClaire Vo
Yeah, and I just have to call this again, out again, for people who are maybe listening to the podcast or, again, are not paying attention to what it means to use a screen reader here, which is you got your little- your little headphone plugged in right now. And,
- 27:45 – 31:40
Removing cognitive friction for users through repeating patterns
- CVClaire Vo
um, I am so impressed that you're using the screen reader while walking us through this demo. And what I think is so, so fascinating about watching your workflow here is, it's super efficient and very fast, even if you don't take into [chuckles] account you're using a screen reader. So the fact that you've been able to build these shortcuts, these tools, use Claude Code in a more effective way, um, and then you add on this layer of, and it makes using this, this kind of screen reader a lot more accessible to you, is just very im- impressive. And I don't want people to miss that there's a, this invisible layer that we don't get to see or hear right now, that you're also putting in between this, which adds a little bit of micro friction.
- JMJoe McCormick
Yeah, and I think, uh, one thing that's great about Claude Code, uh... So, like, right now, um, visually, one of the options is selected in blue. I don't necessarily know which one it is, and using the arrow keys, it does not tell me, with the screen reader, what's selected. But Claude Code has done a great job of standardizing, where one means yes, um, uh, two means often yes, but, like, manually, uh, with a, with a variation, and then three is, is like, no or, or type something extra. And so I can, uh, I can basically, instead of using the arrow keys and Enter, I can just be like, "Yeah, I want, I wanna just move forward here," and so I hit the, the number one.
- CVClaire Vo
Yep.
- JMJoe McCormick
And so they've done a good job of using, I think, lots of different inputs and lots of different ways to make this a little more accessible as it goes, as well.
- CVClaire Vo
Yep, and so that consistency, again, maybe this is for folks that are building AI products and trying to reinforce workflows, especially for folks that are building maybe these terminal UIs, that I think are really lovely and interesting to build, is you want... You know, people love the terminal because it's so fast, and you want it to both be performance fast, but you also want it to be UI/UX fast. Which is, if your user always knows one is X, two is Y, three is Z, then they can consistently use these keyboard patterns of one key or two keys to efficiently get through your UI. And I think, you know, taking that mental friction, that cognitive friction, off a user by driving consistency and patterns that people can either explicitly or implicitly learn, is a really useful tool when you're using UI, um, that is more constrained, uh, because, again, a terminal UI is naturally constrained to basically text.
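The "one is X, two is Y, three is Z" pattern can be sketched as a fixed key-to-choice table. This is an illustration only and not Claude Code's actual implementation: the `Choice` names and `resolveChoice` function are hypothetical, and the three meanings follow Joe's description (yes, yes with a variation, no or type something else).

```typescript
// Hypothetical menu model: every prompt uses the same number keys,
// so the user never needs to know which option is visually highlighted.
type Choice = "yes" | "yes-with-changes" | "no";

const KEY_TO_CHOICE: Readonly<Record<string, Choice>> = {
  "1": "yes",
  "2": "yes-with-changes",
  "3": "no",
};

// Returns the choice for a key press, or undefined for unmapped keys.
function resolveChoice(key: string): Choice | undefined {
  return Object.prototype.hasOwnProperty.call(KEY_TO_CHOICE, key)
    ? KEY_TO_CHOICE[key]
    : undefined;
}
```

Because the mapping never changes between prompts, a single key press replaces arrow-key navigation that a screen reader may not announce.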
- JMJoe McCormick
Yeah, and Claude Code is, has been working on and, and has released a, uh, VS Code extension that is more of a, a GUI... I've just found that it's a little bit lagging behind some of the, the latest and greatest features, and I'm like, "I want everything immediately." Uh, so I'm a little spoiled, but I, I think that's gonna catch up as we go, too, and, and maybe a potentially more screen reader-accessible option for some of these things, too.
- CVClaire Vo
Yeah, and again, we love a, um, beep boop at the end of an agent completion. So, uh, I love, I love the cursor sound. I love, um-
- JMJoe McCormick
Mm-hmm
- CVClaire Vo
... that you're using one here for, for Claude Code, because, again, I'm presuming with your screen reader, you're not gonna read this whole stream of what-
- JMJoe McCormick
Yeah
- CVClaire Vo
... Claude Code's-
- JMJoe McCormick
Too much going on.
- CVClaire Vo
No, that's too much for you.
- JMJoe McCormick
And I think it's a big difference, too, between, like, vibe coding things and, um, like, production-quality code, right? The final output here is just for me. It's gonna run in my Chrome. I don't really care what the code looks like at all. Um, versus when I am building software for my, my, uh, full-time day job, uh, and actually building stuff that's going to be in the hands of millions of users and many developers, I do, uh, do things a lot differently. The plan, I'm gonna read very detailed, what it's gonna actually do. Um, the, the code I'm gonna be reading, I'm gonna be doing smaller commits and reviewing it, uh, kind of chunk by chunk and, and getting much more detailed.
- CVClaire Vo
Mm-hmm.
- JMJoe McCormick
In this case, uh, yeah, when the final output is, is just a user of one, the, the code quality, uh, is, is a lot less important.
- CVClaire Vo
Yeah, what matters is, does it, does it work?
- JMJoe McCormick
Yes, exactly.
- CVClaire Vo
And, uh, uh, does it work, and a little bit of like, did you leak your API key?
- JMJoe McCormick
Yeah.
- CVClaire Vo
Those are the two things [chuckles] we wanna, we wanna worry about. But other than that, we are, we are
- 31:40 – 34:55
How to get fluent with AI tools
- CVClaire Vo
on our, on our way. And, you know, the, the other thing that I think is really fun here is because you've built... I think the idea that you pick a platform or a framework for a set of your personal software and then establish best practices through a skill, and then just rinse and repeat for other use cases, is a really good way to get super fluent with some of these AI tools. And so I've seen a lot of people say, "All I do is build markdown-based repos for my documentation and everything else. Everything I build is just a markdown-based repo, and then I've gotten really good at using Cursor for this." And then, you know, you have this example of, uh, every- you know, not everything, I'm sure, but like, a lot of what I build are gonna be Chrome extensions. So I'm gonna make, you know, this framework, get it going, and then I can get really good at Claude Code, because I'm not relearning a technology and on top of relearning a tool. And so I do think it's, um, you get compounding effects by staying in the same technical space when you're trying to learn these vibe coding tools, because you're not trying to learn-- oh, you're not trying to learn on two fronts. You're trying to learn on just the tool, [chuckles] the tooling front, and some of the technical pieces have already been established.
- JMJoe McCormick
For sure, and I think, as we talked about before when we were starting, like, return on investment gets better and better, because building this one as my third takes half the time as the first one, and when I build the fifth, I think it's gonna take a, a fifth the time of the first one. Like, I am getting better. The Claude Skill, each time I build one, I'm just gonna feed it back in and be like: "What was the skill missing? Like, make, make it better." And so I think it makes, uh... the return on investment may turn out to be, like, two days of payback period or something crazy. Um, so yeah, it does, it does feel like it's cool to, to, I think, spread out and try many different things, but it does also just feel great to be like, "I have an idea. It is in my hands in under 30 minutes." Like, it's, it's just very cool.
- CVClaire Vo
Yeah, and again, this is one of those things where I tell people to work on their anti-to-do list. And when you have a recurring task or a recurring point of friction, where you're constantly, like, opening links from Slack in a new tab and then trying to come back to them later and read them, or, you know, you would be constantly doing this, like, let's zoom in on this image and figure out what it is, and is it something I need to worry about? When you have those recurring tasks, it's 100% worth it, instead of spending the time on the task itself, to spend the time never having to do that task again. And I like this idea of the payback period of personal software basically collapsing to zero, because it really just illustrates where we are in terms of the efficiency and value out of AI, which is, it is much more important to learn to build some of these tools than to do the task right now. Like, the payoff is so much higher to learn how to automate the task versus doing the task, and if you can just do the, um, change your muscle memory to every time you do the task, pause yourself and say, "Actually, I'm gonna learn [chuckles] how to automate the task," you can really, really create a lot of leverage in, in your, um,
- 34:55 – 36:19
Loading the extension into Chrome in developer mode
- CVClaire Vo
day-to-day life, even in your personal life.
- JMJoe McCormick
Yeah, for sure, and we are just about done here. So I was checking back in on our little, uh, to-do list, which it just finished. Um, so it's doing some final steps here, but we're just about ready to actually load this in. Um, right now, it's just kind of analyzing to see, uh... Yeah, perfect. So it's running this one last step, which is going to be, basically telling us, "Hey, go, go load this in." So once we are, uh, done with this, we need to, um, actually load it into Chrome. So Chrome has a mode for extensions called Developer Mode. You'll see I have that toggle on at the top, and it basically means that you can install extensions not from the Chrome Web Store. You can install extensions, like, from your local computer. So you don't wanna generally have that on, 'cause, like, somebody could have sideloaded in some, uh, some credit card skimmer that you've imported or something. But if you know what you're doing, uh, and you know you just built a thing, you can go in here and turn this on, and then this looks a little bit different once you have this on, compared to probably you guys who are not having this on, look in your Chrome Extensions world, because we have these options on the side here to load an unpacked extension. So this means an extension that's not, like, fully deployed in the app store. So let's hit this, and this is gonna open up our little, uh, file browser here. So let's just pop back, and we had called this... Slack summary extension. Okay, so that is now loaded in.
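"Load unpacked" points Chrome at a folder that contains a `manifest.json`. The episode does not show the generated manifest, so the following is a sketch under assumptions: the extension name, version, description, `content.js` file name, and the OpenAI host permission are all hypothetical, chosen to match what the demo appears to do (a content script on the Slack web app that calls out for a summary).

```typescript
// Hypothetical Manifest V3 contents for a Slack link-summary extension.
// Field names are standard manifest keys; the values are assumptions.
const manifest = {
  manifest_version: 3,
  name: "Slack Summary Extension",
  version: "0.1.0",
  description: "Summarize the link in the focused Slack message.",
  // Assumed: the extension calls the OpenAI API for the summary.
  host_permissions: ["https://api.openai.com/*"],
  content_scripts: [
    {
      // Only run on the web version of Slack.
      matches: ["https://app.slack.com/*"],
      js: ["content.js"],
    },
  ],
};

// Serialize to the text that would be saved as manifest.json.
function renderManifest(): string {
  return JSON.stringify(manifest, null, 2);
}
```

Note that no API key appears in the manifest; where a personal key is stored is a separate decision, and it should not be committed alongside the extension files.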
- 36:19 – 40:44
Testing and debugging the extension
- JMJoe McCormick
The moment of truth, uh, with the truth of the software is you can only test it, uh, or easily test it in Chrome. Um, so we're gonna actually try it out. Uh, whenever you download a new extension, you do need to refresh your tab, so it picks up that extension. So if right now I tried to use it, it's still working with the extensions that I had at the time I loaded it. So I'm gonna refresh this Slack, so I can pick up our new, uh, extension here. And our moment of truth is gonna be, we're gonna focus on this message, and I'm gonna hit my shortcut of Control + Shift + 1, and we'll see, did we, uh, did we nail it?
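A content script can detect the Control + Shift + 1 shortcut with a small predicate. This is a sketch, not the code Claude Code generated in the demo: `KeyPress` is a hypothetical subset of the DOM `KeyboardEvent` fields, and `isSummaryShortcut` is an illustrative name. It checks `code` rather than `key` because holding Shift on a US layout typically reports `key` as `"!"`.

```typescript
// Minimal shape of the keyboard event fields used here
// (a subset of the DOM KeyboardEvent, declared so the sketch is self-contained).
interface KeyPress {
  ctrlKey: boolean;
  shiftKey: boolean;
  altKey: boolean;
  metaKey: boolean;
  code: string; // physical key, e.g. "Digit1"
}

// True only for Control + Shift + 1 with no other modifiers held.
function isSummaryShortcut(e: KeyPress): boolean {
  return (
    e.ctrlKey &&
    e.shiftKey &&
    !e.altKey &&
    !e.metaKey &&
    e.code === "Digit1"
  );
}

// In a content script this would be wired up roughly as:
//   document.addEventListener("keydown", (e) => {
//     if (isSummaryShortcut(e)) { /* summarize the focused message */ }
//   });
```

The listener only exists in tabs where the content script was injected, which is why the Slack tab needs a refresh after the extension is loaded.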
- CVClaire Vo
Oh, look-
- JMJoe McCormick
For processing our link
- CVClaire Vo
... it even got the black color right.
- JMJoe McCormick
Yeah, it's kind of interesting. So it's processing-
- CVClaire Vo
On ramp
- JMJoe McCormick
... our link. We'll see. Did it work?
- CVClaire Vo
So maybe JSON.
- JMJoe McCormick
So it kind of worked, but we, we have JSON here, right? It's not, it's not perfect. Uh, so let's, let's work on, uh, one, one level of refinement here. So I'm gonna take a, uh, a quick snippet shortcut of, uh, this. So we're gonna take a screenshot of this, and, uh, we're gonna send this back to Claude Code and say, "Almost." So again, it'd be cool if we one-shot it. Uh, that'd have been a really cool demo, um, but it's not a perfect one shot.
- CVClaire Vo
[chuckles]
- JMJoe McCormick
So we'll make one slight tweak here, and I'm actually gonna use a custom slash command I wrote to deal with screenshots. So because I'm developing on Windows, but I actually run Claude Code in, uh, this thing called the Windows Subsystem for Linux, it doesn't have access to my Windows clipboard. So I can't do what everyone else can do, which is just, like, hit Control + V here and paste it. So I added a s- a slash command here called Paste Image that uses some PowerShell shortcuts to pull images out of my clipboard and share them with Claude. And so I can take that snippet and share it. And again, this is similarly, I was like, would copy a file, and I'd like, save it in Windows, then move it to Linux, and then import it with an @mention. And I was like: There's got to be a better way. And I use the slash command now all the time for, for building stuff out.
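Joe's Paste Image command itself is not shown, so the following only sketches one small piece that a Windows-to-WSL workflow like this tends to need: translating a Windows file path into the path the Linux side sees. It assumes WSL's default drive mounting under `/mnt/<drive letter>`; `toWslPath` is a hypothetical helper, not part of his command.

```typescript
// Hypothetical helper: convert an absolute Windows path (e.g. where a
// screenshot was saved) to its WSL equivalent, assuming the default
// automount location of /mnt/<drive>.
function toWslPath(windowsPath: string): string {
  const match = /^([A-Za-z]):[\\/](.*)$/.exec(windowsPath);
  if (!match) {
    throw new Error(`Not an absolute drive path: ${windowsPath}`);
  }
  const [, drive = "", rest = ""] = match;
  // Drive letters are lowercase under /mnt; backslashes become slashes.
  return `/mnt/${drive.toLowerCase()}/${rest.replace(/\\/g, "/")}`;
}
```

With a path in this form, the file can be referenced directly from the Linux side instead of being copied across by hand.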
- CVClaire Vo
This is, this is extreme software engineer stuff, where you're like, "Okay, I run on this OS, but I run my terminal on this, and [chuckles] now, now I can't access my clipboard, but I still like it that way. So I'm gonna write a little script to give myself a u- a two-word shortcut to make this happen."
- JMJoe McCormick
Yep, and so it ported this, and it just... It drops the screenshots in our, in my TMP directory. So I just have to say-
- CVClaire Vo
Got it
- JMJoe McCormick
... "Yep, yep, please read it." Um, so again, it's, it's saving those, uh, those two minutes every day, uh, adds up, adds up fast. Um, and so it's just gonna fix this little JSON piece here. And again, it was kind of... It was close. It, it, it got the right content, just didn't display it right. And so now it's gonna go off and work on this for another second here, and we should hopefully have a, a quick update. And the nice thing is with, uh, with Chrome extensions in the Developer Mode, there's just an easy one-click button where you will update and grab the latest copy of all of your extensions. So if as you're working on multiple at a time or whatever, you can just hit that button, and it's gonna update for us. So we'll see. It's finishing up here.
- CVClaire Vo
And one of the things that it's doing, just I'm calling this out for people who are writing, um, queries to OpenAI, it's moving the JSON out of the prompt, which it's saying, "Please return JSON." I think it's actually just returning JSON as a, um, string in text, and it actually moved that to change the response object to being JSON. So then it can actually be read by the, um, [tsks] by the Chrome extension in a, in a more structured way.
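The fix Claire describes, asking for a structured JSON response rather than hoping the model's text happens to be JSON, can be sketched as follows. The generated code is not shown in the episode, so this is an assumption-laden illustration: `LinkSummary`, `buildRequestBody`, and `parseSummary` are hypothetical names, the `title`/`takeaways` shape is invented for the example, and the request uses the `response_format: { type: "json_object" }` option of OpenAI's Chat Completions API with the model name left as a caller-supplied parameter.

```typescript
// Hypothetical shape of the summary the extension displays.
interface LinkSummary {
  title: string;
  takeaways: string[];
}

// Build a Chat Completions request body that asks for JSON mode
// instead of relying only on "please return JSON" in the prompt.
function buildRequestBody(model: string, pageText: string) {
  return {
    model,
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          'Summarize the page. Reply with a JSON object with keys "title" (string) and "takeaways" (array of strings).',
      },
      { role: "user", content: pageText },
    ],
  };
}

// Parse and validate the model's reply; return null rather than
// rendering raw JSON text if the shape is not what the UI expects.
function parseSummary(raw: string): LinkSummary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { title, takeaways } = data as Record<string, unknown>;
  if (typeof title !== "string" || !Array.isArray(takeaways)) return null;
  if (!takeaways.every((t): t is string => typeof t === "string")) return null;
  return { title, takeaways };
}
```

Validating before rendering is what separates the first attempt in the demo, which showed the JSON text verbatim, from the second, which showed formatted takeaways.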
- JMJoe McCormick
Control + Shift + 1. See if it works.
- CVClaire Vo
We're doing drumroll.
- JMJoe McCormick
Moment of truth.
- CVClaire Vo
Beautiful!
- JMJoe McCormick
We got it. Nailed it. So again, we've got these takeaways. We can now action on this, and I can decide: Do I actually care that Iterable added MCP? I do, uh, spoiler alert. Um, but yeah, again, we did it, uh, I think under, under 25 minutes here.
- CVClaire Vo
And is it working in your screen reader? Did the accessibility pieces-
- JMJoe McCormick
Yeah
- CVClaire Vo
... fall through?
- JMJoe McCormick
Fully, fully accessible. Um, so modals in general can be sometimes problematic, um, because a screen reader will sometimes read behind the modal.
- CVClaire Vo
Yep.
- JMJoe McCormick
Um, but surprisingly, although a bunch of the web is not accessible, if you tell some of the foundational models, like, "Please make this accessible," the accessibility standards are actually incredibly well documented, so they, they do actually a great job with this. So they, they use the right, uh, it's called ARIA, A-R-I-A, roles and make this modal pr- have the right focus, not let you read behind it. So, uh, out of the box, they're not gonna make everything screen reader accessible, but you say, "Hey, go do this," it, it'll gladly go follow the spec and make it
- 40:44 – 42:12
Quick recap
- JMJoe McCormick
accessible.
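The ARIA roles Joe mentions can be illustrated with the markup for a modal dialog. This is a sketch rather than the extension's real output: `renderSummaryDialog`, `escapeHtml`, and the `summary-title` id are hypothetical. It uses `role="dialog"` with `aria-modal="true"` so assistive technology treats content behind the modal as inactive, and `aria-labelledby` so the dialog is announced by its title.

```typescript
// Escape text before placing it in HTML, since page titles and
// takeaways come from untrusted web content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer for the summary modal's markup.
function renderSummaryDialog(title: string, takeaways: string[]): string {
  const items = takeaways.map((t) => `<li>${escapeHtml(t)}</li>`).join("");
  return [
    // tabindex="-1" lets a script move focus to the dialog when it opens.
    `<div role="dialog" aria-modal="true" aria-labelledby="summary-title" tabindex="-1">`,
    `<h2 id="summary-title">${escapeHtml(title)}</h2>`,
    `<ul>${items}</ul>`,
    `<button type="button">Close</button>`,
    `</div>`,
  ].join("");
}
```

The attributes alone are not the whole job: the script that opens the dialog still has to move focus into it, keep focus inside while it is open, and restore focus to the Slack message on close.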
- CVClaire Vo
So this is a meta question, and maybe before, before I get into the meta question, let's just recap for folks what we saw. So you built a Chrome extension that's focused to the web version of Slack. Um, that, that Chrome extension that's running locally because you've toggled on Developer Mode in your Chrome settings, will take a focused link that's shared by a colleague or somebody in Slack. It will go out. It will parse that link. It'll see if there are any links in it, it'll parse it. It'll take some key takeaways. The way you built this is you bopped into VS Code, you dictated a short PRD, um, you let AI kind of build that out. You made minor tweaks to it, but basically shipped it. You used Claude Code, including some custom slash commands, [lips smack] um, and a Claude skill, specifically around building, uh, Chrome extensions, to then scaffold out that Chrome extension. Um, you showed us Control + G in Claude Code, where you can actually just modify prompts and inputs as code, which is much more efficient, both from an accessibility perspective and just [chuckles] general user experience perspective. And you, uh, showed us a custom screenshot, so your very special, um, as I say, you know, unique snowflake, software engineer environment can operate as if you, as you want, even if there are some technical hurdles. And then now we have this great little extension that I want running on, on my app. So
- 42:12 – 49:17
Lightning round and final thoughts
- CVClaire Vo
this is great, Joe. I love this. I wanna hop into some lightning round questions, and this has given me an idea of one that I really wanna ask you, which is about... MCPs.
- JMJoe McCormick
Yeah.
- CVClaire Vo
So one of the things that I think are, is so interesting about MCPs is it allows you to bypass all UI and just get to the bones of what a SaaS product does. And I can imagine that while there are lots of okay accessible enterprise software products, not all of them are building for maximum accessibility, either in their design or in their kinda underwa- underlying way they're implemented. Have you found MCPs and just that interface into some of these SaaS tools has improved accessibility for you? Has not? What are your thoughts there?
- JMJoe McCormick
I think the ultimate goal, uh, most of, of mine is, uh, I would love to do everything in one place and not have to switch tools, like, whether it's a context switch cost or just a, a switch cost. So having MCPs has been great for that. Um, luckily, I actually think lots of enterprise software surprisingly, is being built pretty accessibly. Like, I wanna really give strong kudos to, like, Google Docs. Google Docs, for what it does, is so crazy accessible, and the work that, uh, people don't know that goes in to make it that, like, every single thing that is being done is being communicated to the screen reader, basically letter by letter, um, via this, like, secreting- uh, this secret system that people don't know about called, like, ARIA live announcements, is kinda crazy. Um, but I do find, like, "Hey, I need to get something from three sites," like, that's kind of painful. Like, can I just use the Notion MCP and Google Docs MCP or Glean in there is great. Um, and frankly, No- Notion is one that is a little bit harder. I think they, they do their best from an accessibility standpoint in some ways, but there's a lot going on in a Notion post, uh, in a Notion article. So I think that's another example where it's like, yeah, I can just pull this down and work in the markdown version. That's gonna be a lot easier. Um, and again, I've got... It's way easier for me to navigate with these, like, keyboard shortcuts and the folding feature, so I will pull down some Notion posts and just be like: Dump this into a markdown file for me, and then use my little code shortcuts to help navigate some of those pieces.
- CVClaire Vo
Yeah, I love that. And so my second question, again, is around personal software and the ability to translate. Sorta as you said, you can take a pretty complex Notion page, turn it into markdown. That allows you to, you know, read and parse it in a much more efficient way. And so I think this ability to translate files or formats is a really exciting part of AI, and we've hinted at a couple things we've seen in this episode, a little bit of, like, image to text, a little bit of voice to text. But I'm curious for you, what are you most excited about in sort of like the multimodal world of AI? And, you know, what recently has come out that's, you know, caught your eye and made you excited, or what are you hoping to see in the next couple months or years that you think could really open up stuff, either for you personally or just as a product builder?
- JMJoe McCormick
Yeah, I'll talk in the personal space. So I have, uh, I have two kids. I have a five-year-old and a three-year-old, and, uh, reading books to them is, is a challenge. I don't know braille. I've, uh, been trying to learn, but it's a hard skill to pick up at 33. Um, and so I've memorized a handful of books, um, and I'll read those, but it's, uh, it's not really reading, it's fake, fake reading. Um, but I- uh, big shout-out to the Gemini app and its, uh, like, live share features. I can now read any book. Uh, it's not gonna be me reading it, but me and my three-year-old, Cole, will sit on the couch, he'll bring a book over, and I'll be like: "Hey, let's... I, I can read this one," or, "No, Gemini can read this one for us." And we'll turn the pages and say, like, "Gemini, next page," and Gemini will read that page, and then we'll turn to the next one, and it'll read the content of that. And so I think just, like, equitable access to everything, um, is, is great, and that piece is one thing that I was always afraid of, is like: "Can I read stories?" I can memorize stories, I can tell stories, but there is something to just, like, your son being like, "I wanna read this book," and you having to be like, "Sorry, I can't." And now that, "Sorry, I can't," becomes like, "Sorry, I can," with the assistance of so many different tools now. But I think the Gemini one is particularly useful, and I found that one to be the strongest for just, like, easy sharing, just saying, "next page." It knows all the context and immediately starts reading it. Um, I have, like, Meta glasses, I have the ChatGPT Pro, I've got all these things, but I, I think Gemini is doing the best job of it right now, and this is before me trying any of the Gemini Pro, uh, 3 that just came out to see if any of that makes it even better.
- CVClaire Vo
Well, that is a very, very sweet story. And yes, I was just thinking, I wish, um, the Meta glasses, which I love, um, and, and use every day, would, would, would also help do a better job there. But it's awesome to hear that Gemini can add to that, that special time with kids, which, as you and I were talking before the show, you know, we both are boys- uh, parents of boys, is just, is, is such a special time. So my last question is, when AI is not listening, and I'm curious if you type or if you speak this, what are your prompting techniques? Have you ever Wispr Flow yelled at your [chuckles] AI, AI? Or, you know, do you have any tricks for us when AI or Claude gets really, really stuck?
- JMJoe McCormick
It's kind of, I think, a, a nerdy answer, which makes sense for, for me, but, uh, I... My typical mode is basically, like, clear context- clear the context, and, uh, and start fresh as much as possible. I think, I think a lot of people, uh, will, will try and, like, keep massaging it and being like, "If I just send this one extra prompt in this conversation, it'll figure it out." It's like, no, you just have to start from scratch and take the learnings that you have from the last time. Um, and so sometimes I'll be like: "This hasn't been going great. What did you, what did you learn about this?" Take that and feed that into the next prompt. But most of the time, I used to be like: "Let's start from scratch." Something clearly got poisoned in this context, and we start from scratch, I feel like it just, everything just feels smoother.
- CVClaire Vo
I love it. Well, Joe, thank you so much for showing this. I think it's just one of those, uh, workflows that we haven't seen before. Everybody can find a use case. I am thinking of all sorts of little micro frictions in my own life, where a keyboard shortcut or two could really make things a little bit better for me. So where can we find you, and how can we be helpful to you?
- JMJoe McCormick
Yeah, um, so I, I mentioned I'm, uh, at Babylist. Babylist is very actively hiring. Um, and if you are somebody who likes using AI in your day-to-day building of, of software, and you're a, a software engineer, uh, we are a Ruby on Rails and React shop, um, but hiring across the board, all different levels. So, uh, check us out on, uh, on babylist.com. And, uh, personally, I'm on LinkedIn as well, so feel free, especially if you have any accessibility questions or any questions on some of the Chrome extension piece, I'm always happy to answer on LinkedIn.
- CVClaire Vo
Well, thanks for joining How I AI.
- JMJoe McCormick
Thank you. [upbeat music]
- CVClaire Vo
Thanks so much for watching. If you enjoyed the show, please like and subscribe here on YouTube, or even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time! [upbeat music]
Episode duration: 49:17
Transcript of episode sibufEEhH6A