Aakash Gupta
Inside a $400K AI Product Sense Interview (Amazon, Meta, Google, OpenAI)
EVERY SPOKEN WORD
45 min read · 9,483 words
- 0:00 → 1:48
The AI PM round that decides your offer
- AGAakash Gupta
AI PM jobs are exploding, but even PMs with 10+ years of experience keep failing the interviews
- AVAnkit Virmani
These are the questions you have to prepare to answer
- AGAakash Gupta
OpenAI and Anthropic have a 5% interview pass rate. If you bring the old playbook, you are going to fail
- AVAnkit Virmani
My mentees have 30 to 40% pass rates at these very same interviews, and that is still climbing
- AGAakash Gupta
Ankit Virmani just finished job searching at all of the top companies and cracked every interview with multiple offers
- AVAnkit Virmani
One round that keeps showing up at the top AI companies is AI product sense, and it's a completely different animal from our traditional product sense interviews
- AGAakash Gupta
You mentioned a couple of the top companies there. How much do the companies actually pay AI PMs today?
- AVAnkit Virmani
Based on the stories we have heard, the, the sky is the limit for a lot of these companies if you are an absolute rockstar. But even otherwise, there is a pretty solid understanding that the numbers are absurd
- AGAakash Gupta
All of my knowledge and all of Ankit's knowledge, we are giving it away to you right now
- AVAnkit Virmani
This is the last interview you should watch before going to crack your AI product sense interview
- AGAakash Gupta
Before we go any further, do me a favor and check that you are subscribed on YouTube and following on Apple and Spotify podcasts. And if you wanna get access to amazing AI tools, check out my bundle, where if you become an annual subscriber to my newsletter, you get a full year free of the paid plans of Mobbin, Arise, Relay app, Dovetail, Linear, Magic Patterns, DeepSky, Reforge Build, Descript, and Speechify. So be sure to check that out at bundle.aakashg.com. And now into today's episode.
- 1:48 → 3:41
Why behavioral gets you in but AI product sense gets you paid
- AGAakash Gupta
You land all your interviews at top companies, but you never pass. That's because top companies ask really tough questions. How would you position ChatGPT to beat Claude in coding? How would you help xAI become the number one AI? These aren't easy questions. They involve product sense, but they involve a lot more than that. Ankit Virmani has seen it all. He's been at BCG, Amazon, Meta. He was a group product manager at Meta, so he himself interviewed hundreds of PMs in product sense rounds. He is what we call a calibrated person. He actually understands what the bar really is at different levels within FAANG, so he has unparalleled insight that you are not gonna get from watching another video. Let's be honest, 99.9% of the videos you get on YouTube are one of two things: one, somebody who worked at FAANG like 10 years ago, or two, somebody really junior who just cracked FAANG. Here we have somebody very experienced who has actually just gone through and cracked offers at Uber, Stripe, Cisco, and other top AI companies in the last few weeks here in 2026. So he has updated calibration, and today we have prepared a deep dive into AI product sense unlike anything on YouTube. He's gonna walk through exactly what questions he was asked. We're gonna do a full live mock, but we're not just gonna end it at a mock. You know most of those mock videos that are 30 minutes long. We're actually gonna analyze the mock so that you can replicate the successful techniques in this mock interview. So if you stay till the end, you will not only walk away with an understanding of what questions they're gonna ask and how to answer them, but how you yourself can get to that level. So Ankit, thanks so much for being on.
- 3:41 → 6:59
Ankit's job search and the round that kept showing up
- AVAnkit Virmani
Thank you so much, Aakash. Very excited to be here again.
- AGAakash Gupta
Ankit, you just completed an AI PM job search, and you said something that literally blew my mind, [chuckles] which is that you received very few AI interviews. They were all just general interviews, a lot of behavioral interviews, in fact. But you did face one AI-specific round. Can you walk us through exactly what that was and what happened?
- AVAnkit Virmani
Definitely. And Aakash, this is also keeping in mind that every role that I was recruiting for within these companies was an AI role. And here's what really surprised me. Previously, when I went through rounds with Meta, Google, and more recently with Stripe, Uber, and a few others, the majority of the rounds, I would say 70 to 80% of them, are still very much the traditional behavioral, product sense, and execution interviews, such as, you know, "Tell me about a time you influenced without authority," "Walk me through a product that you shipped." Very standard stuff. But the one round that keeps showing up at the top AI companies, whether that be OpenAI, Anthropic, Google's AI teams, um, Meta's GenAI org, is AI product sense, and it's a completely different animal from our traditional product sense interviews. Um, in a traditional product sense interview, as you know, you're designing features for a very deterministic system where, let's say, a user clicks a button and something predictable happens. You can use CIRCLES. You can follow a bit of a template, and honestly, you can pattern match your way through it. AI product sense completely flips that on its head. You are designing for a probabilistic, non-deterministic system, and the model's output varies every single time. Um, it can hallucinate. It can cost real money per query, and safety isn't a nice-to-have. It is critical to the system itself. So for me, the moment it truly clicked was when I got a question, um, which was very AI product heavy, and I realized that every decision I made, from segmentation to solution, had to account for what a model can and cannot do, what it costs to run, what happens when it is wrong, et cetera. That's not something your traditional frameworks can truly solve for or teach you. Um, and this is the kicker. This is the round that truly decides your offer.
It decides whether you get the money at the level you want, and whether you have any negotiation leverage going into an offer conversation. Behavioral will get you through the door, but AI product sense is what separates the candidates who get truly large offers from the ones that don't.
- AGAakash Gupta
That's a really important insight that I wanna double-click for you guys that I've seen not just with Ankit, but a lot of the people we've been advising in our Land a PM Job cohort program. The people who are really, really good at AI product sense, they're actually getting the offer at the level that they want. The people who come back to me and say, "Aakash, I was really expecting L5, I got L4," a lot of times it comes down to the weakness in their AI product sense round. And so let's make this
- 6:59 → 10:03
The 3 tiers of companies running AI product sense
- AGAakash Gupta
concrete. You named a couple of the companies where you have seen this. You have a really, really good view into the landscape because you work with different people in our cohort. What companies specifically are asking AI product sense?
- AVAnkit Virmani
Um, that is a great question, Aakash, and the truth is this is spreading really, really fast. Right now, as we talk, more and more companies are adopting the AI version of the product sense interview, but I can broadly categorize these companies into three tiers. So the very first and the very obvious ones are the AI-native companies, where it is a very core round. You know, uh, OpenAI, Anthropic, Google DeepMind. These companies have a clearly dedicated AI product sense round in every single PM loop. So for example, at OpenAI, you would get a question like: How would you double ChatGPT image creation with just three engineers, right? Um, then there are the Tier 2 companies, which are the big tech companies that have added it to their loops. Now, very recently, Meta is the big one here. They literally just added a fourth interview to the PM loop, um, which is the product sense with AI. It is very specific to certain teams or orgs within the company, and to certain levels as well. But you would get a product case, and you're expected to use AI tools live in the interview. You know, we call it vibe coding, and so does Meta. Now, you prototype with the interviewer watching. Google's AI teams, Amazon's GenAI org, Nvidia, they're all running these AI product sense rounds. Then you have the Tier 3 companies, where they're weaving AI into the existing product sense rounds. Now, they may not ask about these questions upfront, but for example, in LinkedIn interviews, they will potentially ask you: How are you gonna leverage GenAI capabilities while thinking through certain products and solutions? What are the implications of OpenAI and Anthropic launching and capturing newer markets, and how should their products rethink their strategy based on this changing environment? If you answer these questions without referencing the change in the markets, without model capabilities and so on, you're not going to make it.
So to summarize it, the trend is very clear, right? Twelve months ago this was a niche round, and now it's in nearly every single top-paying PM interview process. And most candidates miss this: even at companies that don't have a dedicated AI round, AI fluency is being evaluated inside the regular, traditional product sense round, and the bar has very much shifted.
- AGAakash Gupta
That's something I'd triple-click on, which is that even if they don't have AI product sense in the list of interviews that your recruiter sent you, if you're
- 10:03 → 12:04
What AI PMs actually get paid in 2026
- AGAakash Gupta
interviewing for an AI PM role, expect this round. And you mentioned a couple of the top companies there. How much do the companies actually pay AI PMs today?
- AVAnkit Virmani
Well, based on the stories we have heard, the, the sky is the limit for a lot of these companies if you are an absolute rock star. But even otherwise, there is a pretty solid understanding that the numbers are absurd. Uh, these roles pay you absurdly well, and the data that we have seen from the market truly backs this up. So OpenAI, um, is very much top of the market right now. Median PM comps are in the 800K range, and, uh, the, the overall range runs anywhere from the 300, 400K mark to north of a million. Now, that's median. That's not the ceiling. Staff level PMs are definitely clearly seeing seven figures. Um, Google is right behind. Senior PM comps at Google teams run half a million, um, for the median, but can range very much into the multiple millions for a director and VP role. And the, the best part is that the equity is public. Anthropic, again, similar numbers starting from half a million, ranging up all the way into the, the high six figures. Now, their equity is pre-IPO, um, and their current valuation trajectory is pretty strong, which is great news for anyone who wants to join. Now, Meta also comes in with a solid half a million or so, uh, but ranges all the way up to north of, uh, a couple of, uh, millions for very senior levels. And the, the story is pretty consistent across every company that we are, we are talking about here, Aakash.
- AGAakash Gupta
And this is actual offers that we've seen plus Levels.fyi data, so this is no cap, guys. This is actual offers that people are getting at the median. So if you're highly experienced, you can expect to be at the top end or even above some of these ranges. All
- 12:04 → 17:06
Mock: 10x Claude Code weekly active users
- AGAakash Gupta
right. So shall we mock one?
- AVAnkit Virmani
Let's do it.
- AGAakash Gupta
All right. So the hottest company in AI right now is Anthropic. As you talked about, that valuation trajectory, that revenue trajectory, actually, if we look at January 2025, one billion; people are saying in March 2026 they hit 25 billion. [chuckles] This company 25x'd. One of the drivers of that, with two and a half billion in run rate on its own, is Claude Code. Now here's the question they ask: How would you, Ankit, increase Claude Code weekly active users 10x?
- AVAnkit Virmani
That is fascinating, Aakash. Thank you for that question. Um, before I dive into answering this question, I'd love to ask a few clarifying questions. Um, first, I'm gonna assume that I am a PM at Anthropic working on Claude Code as a product. Now, this does matter because Anthropic has a safety-first mission, and that will very clearly shape how I think about growth. Um, the next assumption I'd like to make, and please tell me if this is fair, this is that let's assume a global market, specifically since Claude Code is available internationally already and the developer community is also global. Is that fair?
- AGAakash Gupta
Yes. Let's put focus on global, and you are a member of technical staff on Claude Code.
- AVAnkit Virmani
Okay, perfect. Um, and then the last assumption I'd like to make is I'll assume weekly active users to mean unique users who run at least one Claude Code session across any surface. We know that Claude Code today is present across terminal, um, VS Code or other IDEs, web, desktop, mobile, et cetera. So weekly active user means one session at least in any given week. Um, is that a fair assumption?
- AGAakash Gupta
Yes, and API. And if they just type in Claude into their terminal, they're not considered a weekly active user. They have to at least send one message beyond turning it on.
- AVAnkit Virmani
That is a great clarifying, uh, point, Aakash. Thank you. So at least one message. Um, great. Now, before I try to answer the question, let me lay out how I want to approach this. So first, I'd like to walk you through, um, Claude Code's strategic context and why this growth question truly matters to Anthropic right now. [coughs] Excuse me. Then, um, I'd like to identify the ecosystem players and then pick a very specific segment to focus on. From there, I would like to map this segment's journey in Claude Code, um, either before they start using Claude Code or while they're using it, and identify key pain points, which of course will depend on the segmentation. Uh, it'll be different for a developer, it'll be different for a non-technical user. Um, once we have come up with the pain points for the prioritized segment, I'll then brainstorm some solutions, and we'll make sure we separate what we can solve at the application layer from what requires, let's say, model-level changes, given that Claude Code is more of a harness, and the harness can use Opus or Sonnet or Haiku, um, and we'll focus on the harness part of this. Uh, once we have done so, I'll prioritize one of the solutions, which we'll take into V1 implementation. Does that sound like a reasonable approach?
- AGAakash Gupta
Yeah. The only thing I would say is that as a member of the technical staff on the app Claude Code, you still need to make requests to the model team.
- AVAnkit Virmani
Mm-hmm.
- AGAakash Gupta
So most recently, a couple of requests that we got through, for example: one was a one-million-token context window in Claude Code for all plans across all models. So that actually came from the app team as a request to the model team. So don't just restrict yourself to the harness per se.
- AVAnkit Virmani
Definitely. That, that's great. So if in the solutioning we naturally have any model-specific requirements, then I'll incorporate it into the plan as well.
- AGAakash Gupta
Sounds good.
- AVAnkit Virmani
Perfect. Thank you. Um, is it okay if I take maybe thirty seconds to a minute to jot down my thoughts?
- AGAakash Gupta
Take your time.
- AVAnkit Virmani
Thank you.
- 17:06 → 20:05
Strategic context for Claude Code
- AVAnkit Virmani
Um, great. So let's start with why a 10x growth for Claude Code even matters right now, right? Um, today I see Claude Code as representing a fundamental shift in how software gets built, right? Um, we're moving from an era of AI-assisted typing, such as autocompletion, to an era where AI operates as a very clear and autonomous coding agent, where I, as a user, describe the outcome and the agent drives the execution. Um, and Claude Code has organically turned out to be Anthropic's flagship product in this transition. Right? Now, this matters enormously to Anthropic's business. Claude Code today, I think, is estimated at around two point five billion in annualized revenue, and it's growing pretty fast from there. This is the single largest revenue driver in their portfolio, um, or rather in our portfolio, right? And growing weekly active users by 10x doesn't just grow a product metric, it actually strengthens the flywheel that Anthropic is building, um, towards its safety research mission. Now, why is this very important right now? Because, um, there are a couple of, uh, reasons. First is that OpenAI's Codex CLI just launched and is very quickly gaining traction and even some preference amongst the developer community for its token efficiency. Um, there are plenty of screenshots on Reddit where users are frustrated by how quickly the tokens for Claude are getting used. Um, the second is that Claude Code has just shipped a pretty large wave of features in Q1 2026. Um, again, it seems like almost every single day there were at least one to two features launched and announced, totaling up to something in the order of like 70 or 80 features over the quarter. Um, and then the final one, as you talked about, and this is the most interesting part, is we're seeing an unexpected user segment emerge here, which is non-developers, right?
Product managers, um, such as you and I, founders, data scientists, operations teams, they are building real products and shipping fast with Claude Code. So with that in mind, I think what makes Anthropic very uniquely positioned is a combination of a best-in-class code-quality model, which is Opus 4.6, and a very safety-first approach towards, uh, building these agentic workflows. Now, with that, I'd love to come up with a mission statement for this growth, um,
- 20:05 → 22:53
Mission and the Cowork curveball
- AVAnkit Virmani
proposal that we're making. And the mission I'd like to propose here is Claude, uh, making you, or rather making anyone with an idea capable of building working software safely and autonomously. Before I proceed from here, Aakash, I'd love to pause and, um, just hear your thoughts if this seems reasonable. If this does, then I'd love to move into ecosystem players, which we would then break down into segments.
- AGAakash Gupta
It was really well articulated in all the spots that you covered, but there's one sort of blind spot, which is code is the basis of Cowork, and Cowork is actually our other fastest and most important growing product. And Cowork is not just about building working software safely and autonomously. It's actually about completing enterprise-grade workflows. And we don't talk about this super publicly. We're actually really worried internally because what we're seeing is that especially with the latest grade of this model, it can do everything a junior employee does.
- AVAnkit Virmani
Right.
- AGAakash Gupta
And [chuckles] on this team, we actually want to be able to harness that power so that it effectively is doing everything a junior employee does.
- AVAnkit Virmani
Um, that is a great point. Maybe this was, um, an inherent assumption in my articulation, that I saw Cowork as almost an independent product outside the scope of Claude Code. Now, I do completely understand that Cowork is very much built on top of Claude Code as the machinery, and Cowork reduces the amount of complexity that is needed for a user; it's primarily built for non-technical users. But based on your feedback, um, I'd love to make sure I have very clear action items as I think through this. Are you suggesting that we should also think about ways to grow Cowork in addition to Claude Code? Or are you suggesting that we can assume that Cowork is a separate entity or a separate product altogether?
- AGAakash Gupta
Actually, Cowork isn't a separate product team. Cowork is almost a, um, surface area that the Claude Code team maintains.
- AVAnkit Virmani
Mm-hmm.
- AGAakash Gupta
So the way to think about it is that's just one surface area that I can use.
- AVAnkit Virmani
Definitely. Okay. Um, thank you. That is, that is super helpful feedback. Um, I would like to take a, a minute or so to really think through segmentation. And of course, I'll make sure I'm incorporating the, the idea of Cowork as being another surface area in addition to the ones we have already talked about.
- AGAakash Gupta
Sounds good.
- 22:53 → 25:33
Ecosystem mapping and segmentation
- AVAnkit Virmani
Thank you. Great. Um, thank you for the time, Aakash. I definitely had a chance to think more deeply about the segments while taking into consideration the feedback that you provided on Claude Cowork as a critical part of the overall ecosystem. And before I lay out the segments, I first want to talk about the ecosystem players in the space, and there are a few. Um, first, we have the professional developers, right? Um, these are primarily operating within the core terminal or using IDEs and shipping code almost on a daily basis, right? Then you have the non-technical knowledge workers. Think of them as PMs, research analysts, um, legal and finance, uh, folks, and so on and so forth. Now, Cowork is designed for them, but just to call it out, adoption is still relatively early amongst this cohort. The third, um, ecosystem player is the non-technical builder, uh, or creator, such as, let's say, founders, students, hobbyists who want to build software but can't code, right? Um, they sit somewhere between developers and knowledge workers. Um, they need something more than Cowork's document processing approach, um, but they're definitely very intimidated by the terminal. Um, the next ecosystem player is enterprise organizations that are deploying Claude Code or Cowork across teams at scale. And then finally, um, we have, let's call them the miscellaneous players in the ecosystem. So MCP connector developers, um, skill creators, plugin builders, and so on and so forth. Now, for the 10x weekly active user question that we are trying to solve here, I would very much want to focus on the ecosystem player which is the individual user, not enterprise accounts. And even within the individual user, I would want to segment them based on what they are trying to accomplish, uh, using Claude's agentic capabilities. So let's first talk about the segmentation heuristics and what my primary motivation is.
Um, I am looking for what their relationship with code is, which to me will fundamentally determine which surface they use and what friction point they encounter. Okay? Um, and with that in mind, I have
- 25:33 → 31:37
Three segments: coder, builder, knowledge automator
- AVAnkit Virmani
three segments that I'd like to propose. So the first one, not surprisingly, is the professional coder, right? Um, the professional coder is an experienced developer. They're using Claude Code within the IDE, terminal, et cetera, as their primary coding agent. They are motivated by throughput and shipping more and faster, right? Um, they live in the terminal, they review AI output critically, um, they push the product's limits, and they already know the product well. Now, their issue today is rate limits and reliability, not discovery and onboarding. The second segment that I'd like to talk about is the aspiring builder. Now, these folks are non-technical or semi-technical people, um, who want to create software products, right? They want to make apps, tools, dashboards, et cetera, but they can't code professionally. Um, these may be PMs who want to prototype without engineering bottlenecks, founders building MVPs and such. Um, but we know that the terminal scares them, and the web surface for Claude Code is still somewhat developer-oriented. Cowork alone doesn't fully serve them because it's designed around file and document tasks. I very much fall in this category, where I am somewhat familiar with using Claude Code in the terminal, but when I tried using Cowork, it just wasn't sufficient for me given the very knowledge-worker approach it has to problem-solving. And then finally, there is definitely the third segment, where Cowork fits pretty well. Um, and that is, let's call them the knowledge automators, right? These are non-technical professionals. Um, they want to automate existing work but not necessarily build new products, right? Um, for them, the motivation is efficiency, which is getting the repetitive and tedious high-effort, low-complexity tasks off their plate. Now, if I were to very quickly rank these segments on the two dimensions of reach and underserved degree, um, I would like to then prioritize them.
So going through the three segments very quickly. The professional coder is a medium reach. They are a pretty large segment, but they're already substantially penetrated, right? Um, and hence their underserved degree is also medium. The product is built for them, and their bottleneck is rate limits. Um, for the aspiring builder, I would say the reach is pretty high. Um, there are orders of magnitude more people who want to build software than can code. Um, this cohort, interestingly, is also getting into the mental model of building for themselves, which is they are able to think through bespoke solutions and products that would satisfy their own needs, even if they're not satisfying other people's needs. And their underserved degree is extremely high. They fall somewhere between the surfaces. Cowork is too limited, and Claude Code is too intimidating, right? Um, and then the third category, as we said, is the knowledge automator, and for them, the reach is high, of course, pretty massive. This is eventually every knowledge worker with a repetitive task. But the underserved degree, I would say, is medium-high, um, because of course, Cowork exists to solve for them and serve them. Um, but there are a lot of other competitive products that are very clearly getting into the niches of this persona, and hence I wouldn't call them significantly underserved. With that, though, I still think this is the segment, the knowledge automator segment, that I would like to prioritize, because for them, Cowork is the unlock, and for me, just the sheer reach is massive, which means that the 10x math is going to be extremely compelling, um, and could be a lower-friction path for us to 10x. Does that sound reasonable to you? If it does, I'd love to flesh out a very quick persona for someone like this.
- AGAakash Gupta
I actually thought you were gonna prioritize aspiring builder, so just flesh it out for me a little bit more why you prefer the knowledge worker.
- AVAnkit Virmani
Sure. Um, I would say there were three reasons for me, and I'm happy to shift to the aspiring builder as well. That was a very close second for me. And the primary reason for prioritizing the knowledge automator for me was, again, Cowork being the unlock, where Anthropic has already built the right product for this segment. And it has the same agentic architecture as Claude Code, right? Um, very much wrapped in an approachable desktop interface. The core experience works pretty well, and the gap isn't necessarily product capability, it's very much access and awareness. Now, the other reason why I selected this was that the 10x math was compelling for me. Um, there are tens of millions of professional developers, uh, or potential users, worldwide. But when we think of knowledge workers, there are hundreds of millions of knowledge workers, not tens of millions, right? Um, and the knowledge worker deals with documents, spreadsheets, and very repetitive daily tasks. So if Cowork's agentic capabilities were accessible at a pro-level price point, um, with a very clear onboarding path, I think the addressable weekly active user expansion could be enormous. And then of course, this is probably the lowest-friction path to 10x, right? The knowledge automator segment has a product that works.
- AGAakash Gupta
Makes sense.
- AVAnkit Virmani
Again, so-
- AGAakash Gupta
Let's focus on knowledge worker.
- 31:37 → 32:55
Stephanie, the persona
- AVAnkit Virmani
Okay, lovely. Um, I would love to very quickly flesh out a persona, um, or talk about the persona that I have in mind. Okay, um, let's call this knowledge worker Stephanie. She is a senior financial analyst at a healthcare company. Um, every quarter she manually pulls data from 20, 30, maybe 50 PDFs, um, normalizes it into a spreadsheet, creates comparison tables, you know, drafts summaries, um, for the executive team, and it takes her an entire week to put all of this together. Now, she's heard of Claude as a chatbot for writing emails. That is her level of understanding, and she has no idea that something like Cowork exists or how it could help her. She did hear about it from some friends or colleagues, but she's skeptical it could handle the level of complexity that she needs. Um, and she definitely doesn't use a terminal. She lives in Excel, PowerPoint, and, um, uses her company's shared drive. Does that sound like a good persona to start diving into pain points?
- AGAakash Gupta
Hundreds of millions, so I guess that's the reason we went with it. Makes sense to me.
- AVAnkit Virmani
Perfect. Um, give me a minute and I will get back to you.
- 32:55 → 38:35
Three pain points
- AVAnkit Virmani
Great. Um, thank you, Aakash. So let's go through the top pain points for our persona, Stephanie. So let's say that she's found Cowork and she's trying to use it, um, and think about what goes wrong. To me, the three key pain points are, first, that Claude treats every session as a blank slate, right? It doesn't necessarily understand Stephanie's recurring workflows, her company's formats, or, um, the standards that she's held to. The second pain point is around Claude not reliably reasoning across many heterogeneous documents simultaneously, such as, you know, it misses data, it hallucinates connections, um, it flattens nuances when processing the actual files. And then finally, Claude is reactive. Um, it waits for Stephanie to describe the tasks, when the highest-value version of an AI agent would be proactively identifying work that needs doing, right? Um, let's dive deep into each of these three and at the same time assess them on the rubrics of both frequency and severity so that we can prioritize one pain point. Right. Um, the first pain point that we had talked about was Claude treating every session, um, as a blank slate. So every quarter, Stephanie runs the same competitive analysis, right? Um, every time she opens Cowork, she re-explains the same sources, the same specific metrics to extract, the column structure, let's say, that her VP expects to see, the formatting that her team uses. Claude produces a different structure every single time, and she spends an hour just getting the outputs to match the consistency from last quarter. Um, the AI doesn't have a persistent understanding of her work patterns, essentially. Uh, the frequency is extremely high here, which is this hits every single recurring workflow, um, and this is a majority of knowledge work for most people.
Uh, and the severity is also pretty high because it transforms what should be a ten-minute delegation into an hour-long prompt engineering session, eroding what we think of as the, the core value prop of an agentic AI. Uh, the second one was Claude can't reliably reason across a heterogeneous set of documents, and it hallucinates, um, when it can't find the right data. So Stephanie has, let's say, 15 quarterly reports in different formats, PDFs, documents, Excels, et cetera, and Claude might read 12 of these 15 correctly. But if it misreads, uh, one table in one PDF, it's gonna make up some numbers, right? Now, Stephanie might not catch this, but her VP might flag that. Um, and at this point of time, this breaks trust for Stephanie, right? The, the multi-document reasoning has degraded, and Stephanie has not necessarily realized what has happened or what led to this error state. The frequency here is pretty high: most real-world analytical tasks are gonna involve multiple documents in mixed formats. Um, and I would say the severity is also pretty high, because silent errors are pretty common, and silent errors in analytical outputs destroy trust permanently. In the future, Stephanie won't delegate these tasks where numbers and processing are involved. And the, the last one was that Claude is reactive. It waits for Stephanie to describe the tasks, and in this case, the quarterly report is generated at the end of every quarter. Um, Claude doesn't know this. Now, we know that Claude Code has ways to build in some kind of proactivity, but that hasn't really transferred to Cowork in a non-technical manner, right? There are no ways to set up cron jobs of sorts with Cowork. Now, what would this look like if the pain point was solved? Which is, "Hey, your quarterly analysis is due in five days. I see 12 of the, the 15 reports are already in your folder. 
Do you want me to start processing, and can you give me data points X, Y, Z to really fill this out?" Now, I know we're not getting into solutions yet, but still, this really illustrates what that pain point looks like. In this case, though, the frequency is medium. It applies mainly to recurring and time-triggered workflows, and I would call the severity also medium. Um, it's a missed opportunity more than a pure failure. Um, but it is the difference between an agent that does what you ask and an agent that genuinely works like a teammate who anticipates your needs. So, with that in mind, after having gone through the frequency and severity for the three pain points, the pain point that I'd like to prioritize is the first one, which is the blank slate problem, and Claude's inability to learn and persist Stephanie's recurring workflows, formats, quality standards, et cetera. It is deeply an AI-native problem as well because it is about persistent memory, workflow learning, personalized output calibration, and these are areas where Opus as a model truly excels. Um, and all we need to do is make sure that this is baked into the harness that is leveraged for Cowork. If that seems reasonable, I'd love to deep dive into solutions as next steps.
- 38:35 – 46:06
Three solutions
- AGAakash Gupta
Yeah. Let's prioritize the blank slate problem.
- AVAnkit Virmani
Perfect. Okay, um, so let's talk through three solutions, um, or potential parts of a roadmap, uh, Aakash. The first one I'd like to talk about is what I would call a workflow memory, um, and it has both a model and an application layer. The second one is, um, what I think of as output calibration, very much a model capability. And then the final one is a proactive agent, which would be, uh, again, a combination of, uh, an application and a model layer. Let me dive into each of these three sequentially. So the first one is, let's say, the workflow memory, which is: after Stephanie completes a task, Claude offers, um, "Hey, Stephanie, this looks like a recurring workflow. Do you want me to remember how you have done this?" And once Stephanie agrees or provides feedback, Claude distills the session into a persistent workflow memory, capturing the steps, formatting preferences, metric extraction logic, even certain files that it referenced, um, while doing so. And next quarter, when Stephanie says, "Run my quarterly competitive analysis," Claude will execute from the memory, only confirming where there is any differential or what is incremental, right? Now, in this case, the, the model layer is distilling the messy sessions into structured reusable workflows and very clearly distinguishing a one-time correction from what would be persistent preferences, right? On the other hand, your application layer is going to store versions of these formatting preferences and surface them at the right moment. I do want to call out some safety considerations here, um, which are that workflow memories are very clearly user-owned, editable, and reviewable. Um, here Claude should show what it remembers before executing, Stephanie should be able to approve the plan, and the agent handles the execution. 
Uh, I mean, even Claude Chat does this, where it creates documents that are viewable in the, the pane, and in the case of these templates or workflow preferences, Claude might have to make this editable in some fashion, right? Um, very quickly, if I were to rank this on a rubric of impact as well as effort, I would say this is high impact but high effort as well, given that it has both model and application layer work associated with it. The second solution that I'd like to talk about is the output calibration, where every time Stephanie edits Claude's output, um, say reformatting a table, fixing labels, or adjusting tone, Claude observes this delta between the output and her requested or desired, uh, version. And over multiple sessions, it builds what I think of as almost a quality profile, um, which includes formatting preferences, rounding conventions, right? Um, summary tone, emphasis patterns, and so on. And Claude's first drafts should progressively converge on what Stephanie would have edited them into. Now, this to me is primarily a model layer feature, where these implicit preferences are being learned from, or built on top of, these edit patterns, and it very clearly incorporates individual-level personalization through context engineering. Right? Now, the application layer is primarily tracking these edits and deltas across different sessions. Now, from a safety standpoint, this to me only affects formatting and style, not necessarily analytical substance. So Stephanie can still review and correct her quality profile at any time, for what it's worth, right? But low-confidence preferences trigger questions rather than assumptions here, right? Um, so with that in mind, I would qualify this as medium-high impact, but medium on effort, primarily because, again, it's only model-side changes. And then we have our third solution, which is the proactive agent, right? 
Uh, in this case, Stephanie does set a trigger, which is, "When a new report appears in my folder blah, please start a competitive analysis," of sorts, right? Um, in this case, Claude will continuously monitor the folder via some kind of, you know, we, we call it a cron job, but let's call it /loop infra, which, uh, we already have for Claude Code. And when Cowork detects any matching files, it will send a notification in some form, um, if on macOS then via dispatch, say, "Twelve new quarterly reports detected. Um, based on your saved workflow, do you want to kick this off?" It can be where you check in with them and ask, or you can proactively suggest, "I will have the draft ready in fifteen minutes," right? And our persona, Stephanie, in case we need approval from her on, on some dimension, can approve this directly from her phone. Um, and when she opens her laptop, the draft is waiting for her. Now, this is very much an application layer and model layer product, where the application layer extends your /loop or, or cron job architecture and dispatch to Cowork, um, for the folders that are being watched. The model layer, on the other hand, will recognize whether the new files match the expected patterns, um, and it handles any partial sets gracefully. So, with that in mind, the, the safety consideration I have top of mind here is that every proactive action should require explicit approval, at least at the start, right? Claude should never modify files unprompted, and users very clearly control which folders are monitored. This is nothing different, though, from how Claude is already approaching file safety, um, and guardrails. So very much in line with the approach we already follow. Um, in this case, the impact I would say is a, is a medium, um, as it's mainly applicable to repetitive tasks, and the effort also is medium given most of the infra already exists. 
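The /loop-style folder trigger described above, a watcher that detects new report files and then proposes kicking off a saved workflow instead of acting unprompted, could be sketched roughly as follows. This is a hypothetical illustration, not Anthropic's implementation: the polling approach, the file pattern, and the `on_new_files` callback are all invented for the example.

```python
# Hypothetical sketch of a folder-watching trigger (invented for illustration).
# It polls a watched folder and, when new files matching a pattern appear,
# surfaces a proposal via a callback instead of modifying anything unprompted.
import fnmatch
import os
import time


def scan(folder, pattern):
    """Return the set of filenames in `folder` matching `pattern`."""
    return {f for f in os.listdir(folder) if fnmatch.fnmatch(f, pattern)}


def watch(folder, pattern, on_new_files, poll_seconds=60, max_polls=None):
    """Poll `folder`; call `on_new_files(new)` when matching files appear.

    The callback only *proposes* starting the saved workflow, e.g. by sending
    a notification; explicit user approval happens outside this function.
    """
    seen = scan(folder, pattern)  # baseline: files already present
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(poll_seconds)
        current = scan(folder, pattern)
        new = current - seen
        if new:
            # e.g. notify: "N new quarterly reports detected. Kick off workflow?"
            on_new_files(sorted(new))
        seen = current
        polls += 1
```

A production version would likely use OS-level file events rather than polling, but the shape, detect, propose, wait for approval, is the same.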
[coughs] Given the rubrics and the assessment of the solutions against those rubrics, I would recommend workflow memory, our first solution, as the proposed one that we start with.
- 46:06 – 47:29
The recommendation
- AVAnkit Virmani
Now, we have to remember that these solutions are not necessarily contradicting each other. They can very much be a part of a roadmap that we build on iteratively. But I would recommend starting with workflow memory since it attacks the highest-friction moment, right? Um, the, the weekly or monthly reteaching tax that prevents your occasional users from becoming habitual weekly users, right? It also very much creates an AI-native flywheel where you're starting from usage, then moving on to learning, improving the output as a result of that, which then drives more usage for Claude Cowork or Claude Code, right? Um, and it also builds a very clear organic switching cost, which is: once Claude knows your workflows, moving to a competitor becomes much harder. It means you have to start from scratch, and that starts to create the moat around personalization that most products thrive on. Does that seem like the, the right approach to take, Aakash? I'm happy to flesh out this V1 more and potentially talk about risks.
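The workflow memory being recommended here, where Claude distills a session into a persistent, user-owned record of steps and formatting preferences that can be reviewed and replayed next quarter, could be modeled along these lines. This is a hedged sketch assuming a simple in-memory key-value store; `WorkflowMemory` and `MemoryStore` are names invented for the illustration, not part of any Anthropic product.

```python
# Hypothetical data model for the "workflow memory" solution discussed above.
# Memories are user-owned: they can be inspected (show) and edited before reuse.
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class WorkflowMemory:
    name: str                    # e.g. "quarterly competitive analysis"
    steps: List[str]             # ordered task steps distilled from the session
    formatting: Dict[str, str]   # persistent output preferences (columns, tone)
    source_files: List[str] = field(default_factory=list)


class MemoryStore:
    """User-owned store: memories stay reviewable and editable before reuse."""

    def __init__(self) -> None:
        self._memories: Dict[str, WorkflowMemory] = {}

    def save(self, memory: WorkflowMemory) -> None:
        self._memories[memory.name] = memory

    def recall(self, name: str) -> Optional[WorkflowMemory]:
        return self._memories.get(name)

    def show(self, name: str) -> str:
        # Render what is remembered so the user can approve before execution.
        memory = self.recall(name)
        if memory is None:
            return f"No memory saved for {name!r}"
        return json.dumps(asdict(memory), indent=2)
```

The `show` step is what makes the safety story work: the agent surfaces the remembered workflow for approval before executing it, rather than silently applying stale preferences.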
- 47:29 – 50:26
Defending the 10x math
- AGAakash Gupta
What I mainly want to understand is how are we going to be sure that this is going to help us 10X?
- AVAnkit Virmani
That is a great question. Um, I would see this as the first step towards 10X-ing. I do see the rest of the portfolio as also being critical on our path to 10X-ing. And, uh, give me maybe 30 seconds to really flesh out my thoughts on this, and I'll, I'll provide a, a structured approach.
- AGAakash Gupta
Sounds good.
- AVAnkit Virmani
Thank you. Great. Aakash, I think there are a few levers, if we were to think of this portfolio of solutions, on how we can get to the 10X math as such. So, the first one would be, um, activation of existing Pro subscribers, which I think is the, is the biggest lever, right? Today, we have millions of Pro and Max subscribers already using Claude Chat, and a lot of them are knowledge workers who have never tried Cowork. So, um, the workflow memory really changes their retention curve with the product, and without it, I would say Cowork's value resets to zero every session. Um, the second lever I would want to talk about is in terms of retention of current Claude Code or Claude Cowork users. Where today the blank slate problem, and, and I'm hypothesizing of course, is a key churn driver: users try Cowork, then they try reteaching, but the costs are too high for the value to be, uh, delivered. Then they revert to chat or to manual work. Um, the solutions that we had talked about, workflow memory and a couple of the other ones, directly reduce this churn, which is turning a high-setup-cost, variable output into a zero-setup, consistent output. And the compounding effect really matters here. And to me, the, the third lever towards the 10x math would be word-of-mouth acquisition of new subscribers. We've seen Claude Code truly spike because of this word of mouth, primarily in the developer community. For this to happen with something like Cowork, um, we need to unlock this zero-setup experience, and word of mouth is the most powerful growth signal, um, where Stephanie is telling her finance team that Claude does her quarterly reports in fifteen minutes. And the story only happens when the output is reliable, consistent, requires minimal setup, has persistent workflows, and so on and so forth. 
And every single, let's call it knowledge automator, becomes an evangelist within their organization, especially when they can share their workflow templates with their colleagues, right? So this is how your flywheel towards 10x really gets accelerated.
- 50:26 – 52:30
Summarizing for Dario
- AVAnkit Virmani
Does that seem fair?
- AGAakash Gupta
Yep. We're almost out of time here on this interview. If you had to just summarize everything, let's say Dario's walking by, um, you wanted to summarize like your 10x plan and get his feedback really quick. How would you summarize it quickly?
- AVAnkit Virmani
Yes, definitely. Um, so to me, the key insight is that Claude Code's and even Cowork's current weekly active user bottleneck is driven by a developer-centric base, and the knowledge automator segment, such as analysts, product managers, finance, uh, folks, is maybe ten to fifty times larger than professional developer pools, right? Many of them may already have a Pro or, um, even a Max subscription and Cowork access. The product exists, the pricing works for them. What doesn't work is retention, where they hit a blank slate, uh, every time that they try to use Cowork. Now, the solution that I'd like to propose to tackle this is workflow memory, which fixes the retention equation across multiple levers, including activation, where we are converting existing chat-only Pro subscribers into weekly active Cowork users, um, based on workflows that actually recur, which could be a massive weekly active user multiplier. Second would be retention, where value for existing Code or Cowork users compounds across sessions instead of resetting to zero. And then finally, this also drives the word-of-mouth flywheel within organizations and even outside. Now, 10x isn't just one lever, it is a combination of the classic growth funnel of activation, retention, and organic growth, all unblocked by targeting this, um, segment that we have, uh, prioritized.
- 52:30 – 55:15
Feedback: 9/10, seven things Ankit nailed
- AGAakash Gupta
Awesome. Great. So that concludes our interview. Let me give my feedback, or my take, on this. So I would give this a nine out of ten, strong pass, and there are seven things that I think you did really well. Number one is your strategic context section might have been one of the best I've ever seen in a mock. A very, very strong, opinionated description of the current position in the AI market. And most people underestimate this: when you're gonna go interview at a company, have a really strong understanding of their strategic context, and think about the bar that Ankit set here. You wanna do it that well. The second thing that you did really well is I had this Cowork pivot that I basically built into your question. I asked about Claude Code, you were ready to go on Claude Code, then I gave you the Cowork pivot. And you did a very strong pivot to Claude Cowork, even insofar as your final solution was more in the Cowork space, if we think about it. And so you really took direction well. And you know, guys, your interviewer is trying to help you, and Ankit read the tea leaves extremely well. The third thing is that you specifically talked about how you have used Cowork, and that is really good because you're basically sprinkling in, like, "I actually tested your product before coming to this interview." And you'd be surprised, guys, people still aren't doing this. They'll go to interview with the Sora or the Codex team, and they will not have extensively used that product. The fourth thing you did extremely well is you went really deep. When you were describing each of your user segments, I could feel that you have, you know, a very high percentile ability to empathize with your customers. The next thing, the fifth thing that I thought was really good, was the very clear framework that you used for prioritizing segments, pain points, and solutions. So across all three, you applied a framework. I could clearly follow it. That was great. 
Then the sixth thing you did really well was a really clear solution description for all three of them, and you incorporated my point earlier around model feedback. So you made it super clear how you're gonna give feedback to the model team, how you're gonna incorporate the app. So again, you took the interviewer's lead better than most people do. And then the seventh thing is that you took my 10x question very seriously. You said, "Can I have thirty seconds?" And most people fail to take the thirty seconds to actually prepare their thoughts. When an interviewer's asking a question like this, it's probably, like, on the rubric. They're probably asking other people. So you don't wanna just give a mediocre answer, you wanna give an awesome answer, and so you took the time to do it.
- 55:15 – 57:43
What would get this to a 10
- AGAakash Gupta
Now, how do I think you could have gotten to a ten out of ten? There are four things I think you could have done. Number one is actually when we were going through the user segments, and I brought this up to you, the way you talked about your prioritization framework led to one group that you didn't prioritize seeming like the group to prioritize. So when you did aspiring builders, you said they're underserved and have high reach. You gave them both high. And then when you did knowledge workers, you said extremely high reach, but you said they're low on underserved, but then you prioritized them anyway. So your framework didn't really help your prioritization, and it led to this kind of awkward conversation where you had to defend yourself. So make sure that you do a quick sanity check, after creating your framework and walking through it, that it actually leads to the group you wanna prioritize. The second thing is your mission, when you reiterated it about halfway through, was still very focused on code, but I had given you that feedback on Cowork, so you might wanna adjust your mission a little bit more. The third thing is, I think that the way the Claude Code team ships, you said it yourself, 120 features in 90 days, is a lot of little daily features that they promote on Twitter and other channels. So your final solution to me was a broad strategic direction, but I would've loved for you to then paint out, these are like the 10 to 20 little features. So that's how we customize the solution to the company we're talking to. And then the fourth thing is manage your time a little bit better. There might have been some places where your descriptions of your frameworks, or walking through your frameworks, ended up taking a long time. And so we couldn't talk about something that I hope most people reserve time for, which is the flip side, the risks. 
And so when you go through your solutions, if you can almost critique your own solution, then they'll say, "Ah, he doesn't think that his solution is, you know, golden. He can easily see both sides." And so if you can save time for that, that also helps. Does that resonate with you?
- AVAnkit Virmani
No, that is some wonderful feedback. I was starting to feel the crunch in time, uh, Aakash, and definitely wanted to cover the risks, but I, um, understand that we are usually constrained by about like a 30 to 35-minute window, which means that we are responsible for managing our time well. So great call-out on that, and thank you for the feedback.
- 57:43 – 1:00:47
AI product sense vs traditional product sense
- AGAakash Gupta
All right, guys. So we only have a little bit of time left, but we wanna pack this last little bit with a ton of information. So you gave a nine out of ten answer. I would say 90% of people are not giving this quality of answer. What are the key things that you focused on in doing this AI product sense answer versus what you might have done in a traditional product sense interview?
- AVAnkit Virmani
Sure thing, Aakash. Um, I, I think there are a couple of very key differences between these two. First, I would say, use the model capabilities as a constraint, and not just as features, uh, during a brainstorm. So in a regular product sense, um, question, let's say improve Gmail, you would start with users' pain points, et cetera. With this Claude Code question, I really have to think through what Claude can and cannot do today and how this is going to evolve. Your Cowork, uh, curveball was sort of an addition on top of that, which is now I have to think about Claude Code, the speed with which it is evolving, how the underlying model is evolving, and how Cowork is already built as a potential solution for the knowledge worker. The second, uh, I would say, is that safety was very clearly in my solution design, not in the appendix. So when I proposed these three solutions, every single solution had a safety consideration. And note that my answer wasn't just to add a specific filter in there. Um, it's almost like safety has to be a strategic decision while thinking through the product outcome. And then finally, um, I evaluated every solution against the potential model improvement trajectory as well. That is, the solution wasn't just a feature improvement in Cowork, it has very clear model implications as well, and that has to be baked into your thinking whenever you are proposing an AI-based solution. So those would be the three key differences that I'd like to call out.
- AGAakash Gupta
Awesome. And a couple other things he did really well. He had a custom framework. It seemed like it was specific to the question that was asked, and he was constantly checking in with me, which is really important as well. So the framework that he kind of walked through, and that you can generalize, is: mission, vision, and strategy; key users; user problems; creative solutions; prioritize; and summarize. And when you think about what interviewers are looking at, or why people fail, this isn't their actual rubric, but why people fail: not having a custom framework, just boring them to death with an old framework, is one thing. Not covering the model and the application layer is another. Not being collaborative, not covering safety and ethics proactively, and not ruthlessly prioritizing with math. So Ankit really covered those five areas pretty well, which is really, really important.
- 1:00:47 – 1:02:27
Your roadmap to crack this round
- AGAakash Gupta
So if you want to get good at the AI product sense interview, here's your roadmap. Build your AI foundations, study AI product patterns, then get to practicing the interview, and sharpen and calibrate with people like us who are actually calibrated, and that's why we have the Land PM Job Program. If you want deeper insight, more mocks, if you want to co-solve some of these with us, join us at the Land PM Job Program. The next cohort starts on May fourth, just a few days from now, and we generally are very selective with who we bring into this program. So if you are not serious about getting a PM job, it's not gonna be for you. But if you are ready to revamp your LinkedIn, revamp your resume, come to interviews with us, do one-on-ones with Ankit, then put in an application, and we hope to see you there. Until then, have a good one. I hope you enjoyed that episode. If you could take a moment to double-check that you have followed on Apple and Spotify podcasts, subscribed on YouTube, left a rating or review on Apple or Spotify, and commented on YouTube, all these things will help the algorithm distribute the show to more and more people. As we distribute the show to more people, we can grow the show and improve the quality of the content and the production to get you better insights to stay ahead in your career. Finally, do check out my bundle at bundle.aakashg.com to get access to nine AI products for an entire year for free. This includes Dovetail, Mobbin, Linear, Reforge Build, Descript, and many other amazing tools that will help you as an AI product manager or builder succeed. I'll see you in the next episode.
Episode duration: 1:02:27
Transcript of episode RQiMP_GtcnU