Aakash Gupta

What AI PMs REALLY Need to KNOW in 2026 (Agents, Discovery, EVERYTHING)

Aakash Gupta and Todd Olson on the 2026 AI PM playbook: technical fluency, evals, roadmaps, governance, and execution.

Aakash Gupta (host) · Todd Olson (guest)

Dec 3, 2025 · 1h 21m · Watch on YouTube ↗
Topics: AI PM hiring trends and AI-washing · Upskilling via hands-on model experimentation · 5-layer AI PM technical pyramid · Model selection tradeoffs (quality, speed, cost, privacy) · RAG, embeddings, vector databases, context windows · Prompt engineering as instruction and context design · Trace analysis for agent/tool orchestration · PM–engineering boundaries and ownership tension · Cost/performance optimization and gross margin realities · Evals, experimentation, and outcome-based metrics · AI roadmap strategy: workflows, unique assets, feature killing · Board/stakeholder narrative control · AI governance: privacy, bias, safety, regional regulation toggles · Pendo demos: agent analytics, dashboards, MCP, discovery automation

In this episode, Aakash Gupta and Todd Olson lay out a 2026 AI PM playbook covering technical fluency, evals, roadmaps, governance, and execution. Their starting point: AI PM demand is rising fast, but the "AI PM" label is a marketing and scarcity game that will be heavily scrutinized in hiring due to rampant AI-washing.

At a glance

WHAT IT’S REALLY ABOUT

2026 AI PM playbook: technical fluency, evals, roadmaps, governance, execution

  1. AI PM demand is rising fast, but “AI PM” labeling is a marketing and scarcity game that will be heavily scrutinized in hiring due to rampant AI-washing.
  2. To upskill, PMs should build firsthand model fluency (tradeoffs, privacy, residency), understand data pipelines/RAG, and practice prompt engineering as a core communication layer with LLMs.
  3. As products become agentic, PMs need working knowledge of observability (trace analysis), production realities (SRE boundaries, access controls), and cost/performance optimization tied to gross margins.
  4. Evals become a PM-owned domain because PMs best define quality and user outcomes, while engineers supply the harnesses and infrastructure to run evaluation at scale.
  5. Strong AI roadmaps avoid shiny-object chasing by focusing on hard workflows, leveraging unique data/context, killing weak features quickly, and communicating a clear narrative to stakeholders and boards.
  6. Pendo’s demos illustrate “agent analytics,” rage prompts, hybrid journeys, workflow automation for discovery, and enterprise-wide qualitative synthesis as practical examples of AI product execution.

IDEAS WORTH REMEMBERING

10 ideas

Don’t claim “AI PM” unless you can back it up in depth.

Olson warns that premium pay invites deeper interrogation, and AI-washing is increasingly obvious in resumes and company positioning; credible experience means shipping successful AI features in production at scale.

AI is both a work accelerant and a product capability—PMs must do both.

PMs should use AI for prototyping, competitive research, and speed (e.g., Deep Research, rapid prototyping tools), while also identifying where LLM APIs can quietly improve product features even without “AI” marketing.

Model fluency is table stakes: tradeoffs beat brand loyalty.

Teams should continuously test models (OpenAI, Anthropic, Gemini, open source) across use cases, weighing quality, latency, cost, data residency, and vendor/legal friction like DPAs and country availability.
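The tradeoff-over-brand-loyalty point above can be sketched as a scorecard. This is a toy illustration, not a real benchmark: the model names, scores, and weights are all placeholders, and a real comparison would first gate on hard constraints (DPAs, country availability) before scoring at all.

```python
# Toy scorecard for comparing candidate models across the tradeoffs
# mentioned above. Models, scores, and weights are illustrative.

CANDIDATES = {
    "frontier-large":    {"quality": 9, "latency": 4, "cost": 3, "residency": 6},
    "frontier-small":    {"quality": 7, "latency": 8, "cost": 8, "residency": 6},
    "open-source-tuned": {"quality": 6, "latency": 7, "cost": 9, "residency": 9},
}

# Weights encode what this particular use case cares about.
WEIGHTS = {"quality": 0.4, "latency": 0.2, "cost": 0.2, "residency": 0.2}

def score(model_scores, weights):
    """Weighted sum over the tradeoff dimensions."""
    return sum(weights[k] * model_scores[k] for k in weights)

ranked = sorted(CANDIDATES, key=lambda m: score(CANDIDATES[m], WEIGHTS),
                reverse=True)
print(ranked[0])
```

Rerunning the same scorecard per use case, with different weights, is one way to operationalize "continuously test models" instead of defaulting to a favorite vendor.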

RAG and data pipelines are foundational because context quality determines output quality.

Understanding embeddings, vector stores, ingestion, retrieval, and context window limits helps PMs reason about relevance, confusion from too much context, latency, and scaling constraints in real products.
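The embed → index → retrieve → fit-in-context pipeline described above can be made concrete with a minimal sketch. Real systems use learned embeddings and a vector database; here a bag-of-words vector stands in so the shape of the pipeline, including the context-window budget, is visible.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in embedding: word counts instead of a learned vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

DOCS = [
    "retention dashboards for AI features",
    "vector databases store embeddings for retrieval",
    "pricing and gross margin for LLM products",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]  # the "vector store"

def retrieve(query, k=2, context_budget_words=12):
    """Top-k by similarity, then trim to a context budget -- too much
    context adds latency and can confuse the model."""
    ranked = sorted(INDEX, key=lambda d: cosine(embed(query), d[1]),
                    reverse=True)
    context, used = [], 0
    for doc, _ in ranked[:k]:
        words = len(doc.split())
        if used + words > context_budget_words:
            break
        context.append(doc)
        used += words
    return context

print(retrieve("how do embeddings and retrieval work"))
```

Even at this toy scale, the two failure modes Olson flags show up: a weak similarity function retrieves the wrong context, and a tight budget forces a choice about which relevant chunks to drop.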

Trace analysis matters for agentic systems, but PMs must navigate ownership boundaries.

As agents call tools/other agents, tracing helps pinpoint failures and inefficiencies, yet larger orgs often reserve deep ops/debug work for engineering/SRE due to division of labor and access controls.
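The kind of question a PM asks of a trace, where did it break and where did the time go, can be sketched against a hypothetical span list. The field names and span structure below are made up for illustration; real tracing tools expose richer data, but the reading exercise is the same.

```python
# Hypothetical agent trace: a flat list of spans from one request.
trace = [
    {"span": "agent.plan",         "tool": None,        "ms": 420,  "status": "ok"},
    {"span": "tool.search_docs",   "tool": "search",    "ms": 1310, "status": "ok"},
    {"span": "tool.create_ticket", "tool": "ticketing", "ms": 95,   "status": "error"},
    {"span": "agent.respond",      "tool": None,        "ms": 510,  "status": "ok"},
]

def failures(spans):
    """Where did the agent break?"""
    return [s["span"] for s in spans if s["status"] != "ok"]

def slowest(spans):
    """Where did the time go?"""
    return max(spans, key=lambda s: s["ms"])["span"]

print(failures(trace), slowest(trace))
```

In larger orgs the PM may only get read access to traces like this, with the deep debugging reserved for engineering/SRE, but even read access is enough to localize "the agent failed" to a specific tool call.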

Cost is product strategy—gross margin forces architecture change later.

LLM-heavy products can have unsustainable margins; Olson expects a shift toward smaller/tuned models, caching, and performance optimizations, and notes many “vibe-coded” apps ignore cost until painful rewrites.
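The gross-margin pressure described above is easy to see in a back-of-envelope model. All prices and numbers below are illustrative assumptions, not real rates; the point is only that cache hit rate (one of the optimizations Olson mentions) flows directly into margin.

```python
# Back-of-envelope unit economics for an LLM feature. Assumed numbers:
PRICE_PER_1K_TOKENS = 0.01   # blended input+output price, illustrative
TOKENS_PER_REQUEST = 3000
REVENUE_PER_REQUEST = 0.05   # what one request is "worth", illustrative

def gross_margin(cache_hit_rate):
    """Cached requests cost ~0, so margin improves with hit rate."""
    cost = (1 - cache_hit_rate) * TOKENS_PER_REQUEST / 1000 * PRICE_PER_1K_TOKENS
    return (REVENUE_PER_REQUEST - cost) / REVENUE_PER_REQUEST

cache = {}
def answer(prompt, model_call):
    # Exact-match cache; production systems often add semantic caching.
    if prompt not in cache:
        cache[prompt] = model_call(prompt)
    return cache[prompt]

print(f"{gross_margin(0.0):.0%} -> {gross_margin(0.5):.0%}")
```

Swapping to a smaller or tuned model changes `PRICE_PER_1K_TOKENS` in the same formula, which is why Olson expects architecture to shift once margin pressure bites.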

Evals are the PM’s domain because they define ‘good’ for users and the business.

Unlike traditional automated tests, AI eval quality hinges on what you measure and how; PMs should author/manage evaluation sets and use experimentation to iterate cheaply on prompts, models, and behaviors.
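The division of labor above, PM authors the eval set and the definition of "good", engineering supplies the harness, can be sketched minimally. The cases, grader, and threshold here are invented for illustration; keyword grading is the simplest rubric, and LLM-as-judge graders are a common step up.

```python
# A PM-authored eval set: golden inputs plus what a good output must contain.
EVAL_SET = [
    {"input": "Summarize this churned account", "must_include": ["churn"]},
    {"input": "Draft a renewal email",          "must_include": ["renewal"]},
]

def grade(case, output):
    """Simplest possible rubric: required keywords present."""
    return all(k in output.lower() for k in case["must_include"])

def run_evals(model, threshold=0.9):
    """The harness engineering would run on every prompt/model change."""
    passed = sum(grade(c, model(c["input"])) for c in EVAL_SET)
    rate = passed / len(EVAL_SET)
    return rate, rate >= threshold

# Stand-in "model" that just echoes its input:
rate, ok = run_evals(lambda prompt: prompt)
print(rate, ok)
```

The PM's artifacts are `EVAL_SET`, `grade`, and the threshold; the experimentation loop is rerunning `run_evals` against candidate prompts or models, which is far cheaper than shipping and measuring in production.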

AI roadmaps should optimize workflows and leverage unique data—not wrap ChatGPT.

Olson advocates solving hard, tedious problems (e.g., discovery scheduling) using proprietary context and being ruthless about turning off low-retention AI features to protect trust and adoption.

Outcome metrics beat activity metrics as agents become more autonomous.

For some AI products, DAU matters less than task completion and value delivered (e.g., cost-per-ticket resolved); leading indicators may include frustration signals like “rage prompts” and sentiment patterns.
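The shift from activity metrics to outcome metrics can be sketched over hypothetical session data. The session shape, thresholds, and the rage-prompt heuristic below are all illustrative assumptions, not a real detection algorithm.

```python
# Hypothetical agent sessions: did the task complete, what did it cost,
# and what did the user type along the way?
sessions = [
    {"resolved": True,  "cost": 0.04, "prompts": ["summarize account"]},
    {"resolved": False, "cost": 0.09,
     "prompts": ["fix this", "fix this!!", "this is wrong, fix this"]},
    {"resolved": True,  "cost": 0.03, "prompts": ["draft email"]},
]

def completion_rate(sessions):
    """Outcome metric: tasks done, not users active."""
    return sum(s["resolved"] for s in sessions) / len(sessions)

def cost_per_resolution(sessions):
    """e.g., cost-per-ticket-resolved: total spend over resolved tasks."""
    resolved = [s for s in sessions if s["resolved"]]
    return sum(s["cost"] for s in sessions) / len(resolved)

def is_rage(prompts):
    """Crude leading indicator: several near-repeated prompts in a row."""
    firsts = [p.split()[0] for p in prompts]
    return len(prompts) >= 3 and len(set(firsts)) < len(prompts)

print(completion_rate(sessions),
      sum(is_rage(s["prompts"]) for s in sessions))
```

A dashboard built on these three numbers says more about an autonomous agent's value than DAU does, which is the point of the "rage prompt" framing in the Pendo demos.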

Build governance ‘toggles’ so you can adapt to safety and regulatory shifts.

Privacy, bias, guardrails, country/industry rules (e.g., Germany worker councils, HIPAA/financial services) require configurable controls; hardwired AI behavior can force product withdrawal when rules change.
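Governance-as-configuration can be sketched as policy overlays: a rule change flips a switch instead of forcing a rewrite or product withdrawal. The region codes, policy fields, and restrictions below are illustrative placeholders, not actual regulatory requirements.

```python
# Toggle tables: a default policy plus restrictive overlays per region
# or industry mode. Fields and values are illustrative.
POLICIES = {
    "default": {"ai_suggestions": True,  "train_on_customer_data": True},
    "DE":      {"ai_suggestions": False, "train_on_customer_data": False},
    "HIPAA":   {"ai_suggestions": True,  "train_on_customer_data": False},
}

def effective_policy(region=None, industry_mode=None):
    """Start from defaults, then apply overlays; restrictions only tighten."""
    policy = dict(POLICIES["default"])
    for overlay in (region, industry_mode):
        for key, val in POLICIES.get(overlay, {}).items():
            policy[key] = policy[key] and val
    return policy

print(effective_policy(region="DE"))
print(effective_policy(industry_mode="HIPAA"))
```

The design choice worth noting is the "restrictions only tighten" rule: when a new regulation lands, you add one overlay row rather than re-auditing hardwired behavior across the product.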

WORDS WORTH SAVING

7 quotes

You better damn well be good and know what you're talking about if you're gonna call yourself an AI PM.

Todd Olson

RAG is kind of a de facto way to build… you wanna give the right context.

Todd Olson

This is a real issue… how you build and design systems affects your cost of goods sold… and gross margin.

Todd Olson

The PM is probably the best suited human being to author and manage these [eval] sets.

Todd Olson

Throw it away. Do not hold onto it.

Todd Olson

If you show up at a board meeting with no narrative… you're gonna get crushed.

Todd Olson

If you build it in a way where you can't change how it works, you are screwed… we build… toggles, switches.

Todd Olson

QUESTIONS ANSWERED IN THIS EPISODE

5 questions

Where exactly is the line between “PM using AI” and a true “AI PM” role in Todd’s view, and how would he test for that in interviews?

What’s a concrete checklist a PM can use to decide whether a feature is ‘just a ChatGPT wrapper’ versus leveraging unique product data/context?

In practice, what eval artifacts should PMs own (golden datasets, rubrics, graders, acceptance thresholds), and which pieces should engineering own?

How would you design an eval strategy for a workflow agent where outcomes are ambiguous (unlike ‘tickets resolved’), and what KPIs would you pair with it?

Todd mentioned model availability and DPAs blocking launches—what are the most common legal/security failure modes PMs should anticipate early?
