Aakash Gupta
AI for Product Managers: 10X Growth with Smart Experimentation
At a glance
WHAT IT’S REALLY ABOUT
How AI removes experimentation bottlenecks and improves measurement for PMs
- AI’s biggest impact on experimentation is eliminating the build bottleneck by generating variations and experiment code from prompts, sketches, or mockups in minutes instead of sprints.
- Product managers and data scientists remain essential “humans in the loop,” with PMs providing business context and hypotheses while data scientists validate metrics, bias, and statistical rigor.
- The ML wave (circa 2016) improved targeting, traffic allocation, and analysis via techniques like intent scoring, multi-armed bandits, contextual bandits, and automated opportunity detection.
- The GenAI wave (since 2022) expanded experimentation capabilities through content generation, RAG-based assistants inside tools, and “vibe experimenting” (prompt-based experimentation) that democratizes testing.
- Measuring AI features requires going beyond usage to outcomes and experience, plus technical eval metrics (accuracy, relevance, context quality) and methods like LLM-as-judge for RAG evaluation.
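As a concrete illustration of the LLM-as-judge method mentioned in the last point, here is a minimal sketch that asks a grader model to score a RAG answer for relevance, faithfulness to the retrieved context, and completeness. The rubric, model name, and 1-to-5 scale are illustrative assumptions, not details from the video; it assumes the OpenAI Python client with an API key in the environment.

```python
# Minimal LLM-as-judge sketch for scoring one RAG answer.
# The rubric, model choice, and 1-5 scale are assumptions for illustration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are grading a RAG system's answer.
Question: {question}
Retrieved context: {context}
Answer: {answer}

Rate 1-5 on each dimension and reply as JSON:
{{"relevance": int, "faithfulness_to_context": int, "completeness": int}}"""

def judge_rag_answer(question: str, context: str, answer: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(
                question=question, context=context, answer=answer
            ),
        }],
        response_format={"type": "json_object"},  # ask for parseable JSON
    )
    return json.loads(response.choices[0].message.content)
```

In practice you would run this over a fixed evaluation set and track the aggregated scores alongside the usage, outcome, and experience metrics the episode describes.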
IDEAS WORTH REMEMBERING
5 ideas
The build step, not ideation, is why most teams under-experiment.
Teams often have plenty of test ideas, but development bandwidth turns experiments into a scarce resource, forcing prioritization meetings and multi-sprint waits; AI-generated variations aim to collapse that cycle time.
AI makes experimentation faster, but it doesn’t remove the need for accountability.
PMs must supply business constraints, customer context, and a clear hypothesis of success, while data scientists/analysts challenge AI outputs for plausibility, bias, and proper measurement design.
Treat AI as “UX memory” by connecting it to past experiments.
If an AI can retrieve prior test results across teams, it can warn you when an idea has already been tested, summarize what happened, and suggest whether it’s worth re-testing due to changed users or product context.
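A toy sketch of that "UX memory" lookup: given a new test idea, surface similar past experiments so an assistant can warn that something has already been tested. The records and the matching logic are hypothetical; a production version would use embeddings over a shared experiment repository rather than plain word overlap.

```python
# Hypothetical "experiment memory" lookup. Word overlap keeps the sketch
# self-contained; real systems would use embedding similarity.

PAST_EXPERIMENTS = [  # illustrative records, not real results
    {"idea": "shorter checkout form with fewer fields",
     "team": "growth", "year": 2023, "outcome": "+4% completion, shipped"},
    {"idea": "AI generated headlines on article pages",
     "team": "content", "year": 2024, "outcome": "flat, not shipped"},
]

def _tokens(text: str) -> set:
    return set(text.lower().split())

def similar_past_tests(new_idea: str, threshold: float = 0.2) -> list:
    """Return past experiments whose Jaccard word overlap exceeds threshold."""
    query = _tokens(new_idea)
    hits = []
    for exp in PAST_EXPERIMENTS:
        doc = _tokens(exp["idea"])
        overlap = len(query & doc) / len(query | doc)
        if overlap >= threshold:
            hits.append((overlap, exp))
    return [exp for _, exp in sorted(hits, key=lambda h: h[0], reverse=True)]

print(similar_past_tests("test AI headlines on the article page"))
```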
Use multi-armed bandits when speed matters more than statistical certainty.
Multi-armed bandits optimize performance by shifting traffic toward the current best variant early, trading some accuracy for faster gains—useful in high-velocity contexts like media headlines.
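For reference, here is a minimal Thompson-sampling bandit, one common way to implement this traffic-shifting behavior. The headline click-through rates in the simulation are made up for illustration.

```python
# Minimal Thompson-sampling multi-armed bandit (one standard strategy;
# the simulated click-through rates are made up).
import random

class ThompsonBandit:
    def __init__(self, n_arms: int):
        # Beta(1, 1) prior per arm: one (successes, failures) pair each
        self.wins = [1] * n_arms
        self.losses = [1] * n_arms

    def choose(self) -> int:
        # Sample a plausible conversion rate per arm and play the best sample;
        # traffic drifts toward the current leader without freezing out others.
        samples = [random.betavariate(w, loss)
                   for w, loss in zip(self.wins, self.losses)]
        return samples.index(max(samples))

    def update(self, arm: int, converted: bool) -> None:
        if converted:
            self.wins[arm] += 1
        else:
            self.losses[arm] += 1

# Simulate 3 headlines with hidden click-through rates (illustrative only)
true_ctr = [0.04, 0.06, 0.05]
bandit = ThompsonBandit(len(true_ctr))
for _ in range(10_000):
    arm = bandit.choose()
    bandit.update(arm, random.random() < true_ctr[arm])
print("traffic per headline:",
      [w + loss - 2 for w, loss in zip(bandit.wins, bandit.losses)])
```

After 10,000 impressions most traffic ends up on the strongest headline, while weaker variants still receive occasional exploratory traffic; that is the accuracy-for-speed trade the episode describes.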
Use contextual bandits for personalization, but only when you have enough traffic.
Contextual bandits attempt to learn which variant works best per user context (hyper-personalization), which demands significant behavioral data and typically high traffic to learn reliably.
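A sketch of LinUCB, one standard contextual-bandit algorithm, makes the traffic requirement concrete: each variant maintains its own regression over context features, and each regression needs enough observations to fit. The episode names contextual bandits as a technique, not this specific algorithm, and the context features below are invented for illustration.

```python
# LinUCB sketch: per-arm ridge regression plus an uncertainty bonus.
# One standard contextual-bandit algorithm, shown here as an assumption;
# the video names the technique, not this implementation.
import numpy as np

class LinUCB:
    def __init__(self, n_arms: int, n_features: int, alpha: float = 1.0):
        self.alpha = alpha  # exploration strength
        # Per arm: ridge-regression state A (d x d) and b (d)
        self.A = [np.eye(n_features) for _ in range(n_arms)]
        self.b = [np.zeros(n_features) for _ in range(n_arms)]

    def choose(self, x: np.ndarray) -> int:
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # estimated weights
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # uncertainty bonus
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm: int, x: np.ndarray, reward: float) -> None:
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Usage with a made-up 3-feature context (e.g. new_user, mobile, evening)
bandit = LinUCB(n_arms=2, n_features=3)
x = np.array([1.0, 0.0, 1.0])
arm = bandit.choose(x)
bandit.update(arm, x, reward=1.0)
```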
WORDS WORTH SAVING
5 quotes
You have an idea, then you make an assumption: if you release that feature in production, it will increase that metric by X, because of these reasons. Then you build the experiment, the variations... And then you look at your results. So it's a simple loop.
— Frederic De Todaro
Most tools still rely a lot on developers. They are already busy building the next features in your roadmap. And so as a result, most teams do not A/B test the majority of what they deliver to their users.
— Frederic De Todaro
It means that you can turn any idea into a running experiment just by prompting an AI.
— Frederic De Todaro
But the real question to me isn't can you build it? It's really: should you build it?
— Frederic De Todaro
Product discovery will tell you what users say they want. Experimentation tells you what they actually do.
— Frederic De Todaro