Becoming evidence-guided | Itamar Gilad (Gmail, YouTube, Microsoft)

Itamar Gilad is a product coach, author, and speaker with over two decades of experience in senior product roles at Google, Microsoft, and various startups. He is also the author of Evidence-Guided: Creating High-Impact Products in the Face of Uncertainty and publishes a popular product management newsletter.

In today’s episode, we discuss:
• What it means to be “evidence-guided”
• How to think about your KPIs as metric trees
• How to prioritize ideas using the “confidence meter”
• The GIST model for roadmapping
• Common mistakes with ICE
• Advice for using evidence to challenge gut-driven founders

Brought to you by:
• Ezra (the leading full-body cancer screening company): http://www.ezra.com/lenny
• Vanta (automate compliance, simplify security): https://vanta.com/lenny
• LinkedIn Ads (reach professionals and drive results for your business): https://www.linkedin.com/podlenny

Find the full transcript at: https://www.lennysnewsletter.com/p/becoming-evidence-guided-itamar-gilad

Where to find Itamar Gilad:
• Twitter/X: https://twitter.com/ItamarGilad
• LinkedIn: https://www.linkedin.com/in/itamargilad/
• Website: https://itamargilad.com/

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• Twitter/X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

In this episode, we cover:
(00:00) Itamar’s background
(04:35) How his time working on Gmail shaped his philosophy of “opinion-based” development
(08:35) Lessons from developing Gmail’s tabbed inbox
(13:40) A brief overview of Itamar’s book, Evidence-Guided
(14:30) Balancing founder creativity with an evidence-based approach
(17:32) Advice on how to push back against founders
(19:36) Signs you aren’t as evidence-guided as you may think
(21:13) Itamar’s GIST model for becoming more evidence-guided
(23:51) How to set overarching goals using his “value exchange loop”
(28:45) North star metrics vs. KPIs
(33:47) Using “ICE” to assess the value of ideas
(37:39) Itamar’s confidence meter
(44:28) Speed of delivery vs. speed of discovery
(46:14) How to apply Itamar’s frameworks based on company type and stage
(49:09) First steps in becoming more evidence-guided
(50:21) Next steps in testing
(55:41) The task layer in the GIST framework
(1:02:54) Thoughts on roadmapping
(1:04:56) How OKRs fit into the whole picture
(1:07:11) Lightning round

Referenced:
• Itamar’s presentation slides: https://itamargilad.com/wp-content/uploads/2023/09/Podcast-Slides.pdf
• What differentiates the highest-performing product teams | John Cutler (Amplitude, The Beautiful Mess): https://www.lennyspodcast.com/what-differentiates-the-highest-performing-product-teams-john-cutler-amplitude-the-beautiful-mess/
• Evidence-Guided: Creating High-Impact Products in the Face of Uncertainty: https://itamargilad.com/book-evidence-guided/
• The co-founders of Google in Forbes: https://www.forbes.com/profile/larry-page-and-sergey-brin
• Kanban: https://www.atlassian.com/agile/kanban
• Jira: https://www.atlassian.com/software/jira
• The ultimate guide to OKRs | Christina Wodtke (Stanford): https://www.lennyspodcast.com/the-ultimate-guide-to-okrs-christina-wodtke-stanford/
• Amplitude: https://amplitude.com/
• The ultimate guide to A/B testing | Ronny Kohavi (Airbnb, Microsoft, Amazon): https://www.lennyspodcast.com/the-ultimate-guide-to-ab-testing-ronny-kohavi-airbnb-microsoft-amazon/
• ICE framework: https://growthmethod.com/ice-framework/
• Sean Ellis on LinkedIn: https://www.linkedin.com/in/seanellis/
• RICE scoring model: https://www.productplan.com/glossary/rice-scoring-model/
• Idea Prioritization with ICE and the Confidence Meter: https://itamargilad.com/the-tool-that-will-help-you-choose-better-product-ideas/
• Assumptions Mapping: https://designsprintkit.withgoogle.com/methodology/phase2-define/assumptions-mapping
• What is Dog Fooding, Fish Fooding a Product?: https://matt-rickard.com/fishfooding-dogfooding-product
• SVPG books: https://www.svpg.com/books/
• The Lean series: https://theleanstartup.com/the-lean-series
• Dreaming Spanish: https://www.youtube.com/c/DreamingSpanish
• ElevenLabs: https://elevenlabs.io/
• Lennybot: https://www.lennybot.com/

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com. Lenny may be an investor in the companies discussed.

Itamar Gilad (guest), Lenny Rachitsky (host)

Sep 21, 2023 | 1h 12m

CHAPTERS

4:42 – 8:47
From Google+ to a turning point: why “plan-and-execute” can waste years
Itamar recounts joining Gmail in 2011 and being pulled into Google+ integration—an enormous company bet that ultimately failed. The experience becomes his personal inflection point for recognizing the dangers of opinion-led, top-down product building without enough learning loops.
- Google+ was a strategic, top-down bet driven by fear of Facebook’s growth
- Years of Gmail feature work and broader Google investment didn’t create durable user value
- Opportunity cost: while Google chased a social network, other products (e.g., WhatsApp-style mobile social) surged
- The failure illustrates “opinion-based development” and the need for a different system
- Scale of investment matters: once committed, organizations struggle to course-correct
8:47 – 13:33
Tabbed Inbox as the counterexample: small idea, rigorous discovery, massive impact
The Gmail tabbed inbox starts as a doubtful idea, forcing the team to get clearer on the user problem and validate solutions through research and iterative testing. The result is a feature that many power users initially dismiss—but a huge portion of the broader user base loves.
- Problem focus: passive users overwhelmed by clutter (promotions/social notifications)
- Team pressure-tested the idea instead of immediately building it
- Progressive validation: dogfooding, external testers, usability studies, data mining, ML categorization
- Surprising adoption: the feature serves the ‘silent majority’ more than expert inbox managers
- Lesson: successful teams balance judgment with evidence instead of eliminating judgment
13:33 – 14:23
What “Evidence-Guided” is—and why it’s harder than it sounds
Lenny introduces Itamar’s book and frames the promise: helping product teams move from opinions to evidence-backed decisions. Itamar positions the book as a practical, organization-level system for adopting modern product management under uncertainty.
- Target audience: product leaders trying to change how decisions get made
- Not just theory: the focus is a system for shifting organizational behavior
- Evidence-guided ≠ data-only; it’s judgment ‘supercharged’ by evidence
- A ‘meta-framework’ approach intended to unify common modern PM practices
- Designed for real-world adoption challenges, not idealized environments
14:23 – 19:29
Founder creativity vs evidence: how to challenge big bets without shutting them down
They explore whether top-down founder-driven direction can be valid—and how it should be handled. Itamar argues founders are vital for idea generation, but ideas still need to face critical scrutiny and evidence-building (including myths about the iPhone’s creation story).
- Founders should have space to propose bold ideas, especially early-stage
- The key question becomes: “Where’s your evidence?” not “Stop having ideas”
- Reframing the iPhone story: discovery, failed attempts, and evidence shaped the final product
- Evidence enables more objective dialogue vs seniority-driven opinion battles
- Tactically, small experiments can create leverage, even in hierarchical cultures
19:29 – 21:05
Are you actually evidence-guided? The telltale signs of regression
Itamar lists patterns that reveal ‘evidence-guided’ is mostly a label rather than a practice. The signals tend to show up in goal quality, metric hygiene, planning intensity, lack of experimentation, and team engagement.
- Unclear/vague or output-centric goals; misalignment across functions
- Missing user-facing metrics (only business metrics like revenue)
- Excessive time spent on roadmaps and planning perfection
- Experimentation without real learning, or no experimentation at all
- Engineering teams disengaged and measured primarily on delivery/output
21:05 – 25:26
The GIST model (Goals, Ideas, Steps, Tasks): a meta-framework to connect discovery to delivery
Itamar introduces GIST as a simple structure for transforming how product organizations operate. The model separates what you’re trying to achieve, how you might achieve it, how you’ll learn/build iteratively, and how work is managed day-to-day.
- Goals define the end state; Ideas are hypotheses; Steps are build-measure-learn loops; Tasks are execution units (Jira/Kanban)
- GIST is a meta-framework combining Lean, design thinking, discovery, and growth practices
- Emphasis on principles + frameworks + process (process is hardest and must be adapted)
- Strategy and research sit adjacent: important, but not the primary focus of GIST
- Common failure: companies confuse ‘goals’ with planning/resources/timelines
25:26 – 33:39
Setting real goals: the value-exchange loop, North Star metric, and metric trees
They go deep on the goals layer: aligning the organization around value delivered to users and value captured by the business. Itamar differentiates ‘true’ North Star metrics from generic “top metric” answers and shows how metric trees drive alignment and ownership.
- Value exchange loop: measure value delivered (user) and value captured (business) as a feedback loop for growth
- North Star metric = value created for users; top KPI = value captured (e.g., revenue/profit)
- Examples: WhatsApp messages sent; Airbnb nights booked; Amplitude’s ‘active learning users’ concept
- Metric trees break top metrics into drivers, revealing key overlaps/leverage points
- Metric trees help alignment, team ownership, and even organizational/team topology decisions
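The idea of breaking top metrics into drivers and looking for overlaps can be sketched as a small tree structure. This is a minimal illustration, not Itamar's tooling: the `Metric` class, the `all_driver_names` helper, and every driver name below are hypothetical (only "nights booked" as a North Star comes from the episode).

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    drivers: list[Metric] = field(default_factory=list)


def all_driver_names(metric: Metric) -> set[str]:
    """Collect the names of every metric below `metric` in the tree."""
    names: set[str] = set()
    for driver in metric.drivers:
        names.add(driver.name)
        names |= all_driver_names(driver)
    return names


# Hypothetical drivers, invented for illustration only.
active_listings = Metric("active listings")
repeat_guests = Metric("repeat guests")

# North Star tree: value delivered to users.
north_star = Metric("nights booked", [Metric("searches"), repeat_guests, active_listings])
# Top KPI tree: value captured by the business.
top_kpi = Metric("revenue", [Metric("average booking fee"), repeat_guests, active_listings])

# Drivers that appear in both trees are candidate leverage points.
shared = all_driver_names(north_star) & all_driver_names(top_kpi)
print(sorted(shared))  # ['active listings', 'repeat guests']
```

A driver that shows up under both trees moves user value and business value together, which is one plausible reading of the "overlaps/leverage points" mentioned above.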
33:39 – 37:27
Ideas layer: prioritizing in uncertainty with ICE (and why confidence is often misused)
Itamar explains how idea selection often devolves into politics or HiPPO (highest-paid person’s opinion) decision-making. ICE offers a simple, transparent heuristic, but only if teams treat confidence as evidence strength rather than gut feel.
- Organizations are flooded with ideas from founders, stakeholders, research, and competitors
- Common anti-pattern: battle of opinions / HiPPO wins instead of rational comparison
- ICE = Impact (on goals), Confidence (strength of evidence), Ease (effort inverse)
- Impact and ease are estimates; breaking the debate into these dimensions improves decisions
- RICE vs ICE: Itamar prefers folding reach into impact for simplicity
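As a rough sketch of how ICE scoring can be applied, the snippet below multiplies the three scores and ranks ideas by the result. The summary does not state how the scores are combined, so this assumes the common convention of multiplying three values on a 0-10 scale. The idea names and numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: float      # estimated effect on the goal, 0-10 (assumed scale)
    confidence: float  # strength of supporting evidence, 0-10 (assumed scale)
    ease: float        # inverse of effort, 0-10, where 10 is easiest (assumed scale)

    @property
    def ice(self) -> float:
        # Assumed convention: multiply the three scores.
        return self.impact * self.confidence * self.ease


# Hypothetical ideas and scores.
ideas = [
    Idea("Founder's big redesign", impact=9, confidence=0.5, ease=2),
    Idea("Onboarding checklist", impact=4, confidence=3, ease=7),
    Idea("Copy competitor's feature", impact=6, confidence=1, ease=4),
]

ranked = sorted(ideas, key=lambda idea: idea.ice, reverse=True)
for idea in ranked:
    print(f"{idea.ice:6.1f}  {idea.name}")
```

Because the scores multiply, a high-impact idea backed only by opinion (low confidence) ranks below a modest idea with some evidence behind it. The ranking is a conversation aid, not a verdict.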
37:27 – 44:41
The Confidence Meter: turning “we’re sure” into a measurable evidence conversation
To prevent teams from giving high confidence to untested ideas, Itamar introduces the Confidence Meter. It categorizes evidence from mere conviction to real-world tests and experiments, helping teams right-size investment and avoid costly commitments too early.
- Confidence ranges from opinion-based support (near-zero) to strong test-driven validation
- Pitch decks, themes (e.g., ‘AI’), and stakeholder alignment provide minimal confidence
- Competitor copying is weak evidence, since competitors aren’t necessarily right
- Medium/high confidence requires building and testing (fake doors, prototypes, A/B, etc.)
- Use confidence to scale investment: do cheap learning first; don’t over-test low-risk changes
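One way to picture the Confidence Meter is as a lookup from evidence type to a score. The sketch below is an assumption-heavy illustration: the categories loosely follow the bullets above, but the numeric values and the rule of taking the strongest single piece of evidence are choices made for this example, not figures from the book.

```python
# Illustrative evidence levels, weakest to strongest.
# The numeric scores are assumptions for this sketch, not Itamar's values.
EVIDENCE_SCORES = {
    "self_conviction": 0.1,
    "pitch_deck": 0.1,
    "thematic_support": 0.2,
    "stakeholder_alignment": 0.2,
    "competitor_has_it": 0.5,
    "fake_door_test": 3.0,
    "prototype_usability_test": 5.0,
    "ab_experiment": 8.0,
}


def confidence(evidence: list[str]) -> float:
    """Return the score of the strongest piece of evidence, or 0.0 if none."""
    return max((EVIDENCE_SCORES[e] for e in evidence), default=0.0)


opinions_only = confidence(["self_conviction", "pitch_deck", "stakeholder_alignment"])
tested = confidence(["self_conviction", "fake_door_test"])
print(opinions_only, tested)  # 0.2 3.0
```

Taking the maximum rather than a sum means that piling up opinions never adds up to high confidence, which matches the point that only building and testing moves an idea into the medium or high range.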
44:41 – 50:05
Speed of delivery vs speed of discovery: optimizing for time-to-outcome
They challenge the false tradeoff between learning and shipping. Itamar argues the real metric is time-to-outcome—getting the right bits into production—especially under high uncertainty in startups and early-stage product work.
- Either/or framing is misleading: strong teams learn and build simultaneously
- Opinion-based delivery often looks fast but wastes resources building the wrong thing
- Evidence-guided approaches learn earlier and reduce expensive late-stage rework
- Approach varies by stage: early startups may skip heavy OKRs/metric trees, but still need value framing
- Two kinds of companies benefit most: those transitioning into modern PM, and those that regressed from it
50:05 – 55:46
Steps layer (AFTER): a practical ladder of tests from “fake it” to experiments to release learning
Itamar lays out the AFTER model—Assessment, Fact-finding, Tests, Experiments, Release results—to show teams they can validate assumptions progressively and cheaply. He shares a memorable Wizard-of-Oz Gmail tabbed inbox test that produced evidence before building real functionality.
- Assessment: alignment checks, business modeling, ICE, assumption mapping, stakeholder risk review
- Fact-finding: analytics, surveys, competitive analysis, interviews, field observation (ideally ongoing)
- Tests: fake door, smoke tests, Wizard-of-Oz, concierge, usability tests (learn without full build)
- Experiments: controlled A/B or multivariate testing (scientific definition of experiment)
- Release learning: staged rollouts, percentage launches, holdbacks to keep validating post-launch
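The last rung of the ladder (percentage launches with a holdback group) is often implemented with deterministic hash bucketing. The sketch below shows one common way to do it. It is not taken from the episode, and the function names, the salt, and the 5% holdback default are all hypothetical.

```python
import hashlib


def bucket(user_id: str, salt: str = "tabbed-inbox") -> int:
    """Map a user to a stable bucket in [0, 100) using a hash of salt + id."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def variant(user_id: str, rollout_pct: int, holdback_pct: int = 5) -> str:
    """Assign 'holdback', 'new', or 'old' for the current rollout stage."""
    b = bucket(user_id)
    if b < holdback_pct:
        return "holdback"  # never gets the feature; baseline for comparison
    if b < holdback_pct + rollout_pct:
        return "new"
    return "old"


# Widening the rollout from 10% to 50% only adds users to "new";
# nobody who already has the feature loses it.
users = [f"user-{i}" for i in range(1000)]
stage_1 = {u for u in users if variant(u, rollout_pct=10) == "new"}
stage_2 = {u for u in users if variant(u, rollout_pct=50) == "new"}
print(stage_1 <= stage_2)  # True
```

Hashing the user id rather than drawing a random number per request keeps each user's experience stable across sessions, and the fixed holdback group gives a baseline to keep measuring against after launch.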
55:46 – 1:04:45
Tasks layer & the GIST Board: reconnecting delivery teams to goals, learning, and outcomes
Itamar describes the gap between planning/roadmaps and the ticket-moving agile world, and how it exhausts PMs and disengages engineers. The GIST Board bridges that gap by keeping goals, active ideas, and learning steps visible and regularly reviewed by the whole team.
- Anti-pattern: planning happens without the people doing the work; delivery becomes output-only
- PMs become the ‘glue’ and burn out managing roadmaps + backlogs without time for discovery
- GIST Board structure: team key results (goals), current ideas (often ICE-scored), and next validation steps
- Cadence: frequent team reviews to adjust ideas/steps based on evidence and progress
- Roadmapping guidance: avoid low-confidence “release roadmaps”; prefer outcome roadmaps and switch to delivery only when confidence is high
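The board structure described above (key results, the ideas under each, and the next validation step per idea) maps naturally onto a small nested data model. The following is a minimal sketch under that reading. The class names, the `review` helper, and the example content are hypothetical and not taken from Itamar's materials.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Step:
    description: str
    done: bool = False


@dataclass
class Idea:
    name: str
    ice: float
    steps: list[Step] = field(default_factory=list)

    def next_step(self) -> Step | None:
        """Return the first validation step that is not finished yet."""
        return next((s for s in self.steps if not s.done), None)


@dataclass
class KeyResult:
    description: str
    ideas: list[Idea] = field(default_factory=list)


def review(board: list[KeyResult]) -> list[str]:
    """Produce one line per active idea, highest ICE first, for a team review."""
    lines = []
    for kr in board:
        for idea in sorted(kr.ideas, key=lambda i: i.ice, reverse=True):
            step = idea.next_step()
            status = step.description if step else "no open step, decide what to do next"
            lines.append(f"{kr.description} | {idea.name} (ICE {idea.ice:g}) | next: {status}")
    return lines


# Hypothetical board content.
board = [
    KeyResult("Raise week-1 retention from 30% to 35%", [
        Idea("Weekly digest email", ice=24, steps=[Step("Fake door test")]),
        Idea("Onboarding checklist", ice=84, steps=[
            Step("Usability test with 5 users", done=True),
            Step("A/B test on 10% of signups"),
        ]),
    ]),
]
lines = review(board)
for line in lines:
    print(line)
```

Keeping the goal, the idea, and the next learning step on one line is the point of the exercise: anyone on the team can see why a piece of work exists and what evidence it is meant to produce.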
1:04:45 – 1:12:51
How OKRs fit, where to start, and the lightning round wrap-up
They connect metric trees and team missions to OKR creation, emphasizing supplementary health OKRs as needed. The episode closes with practical advice on choosing the first layer to fix (goals vs ideas vs steps vs tasks), where to find the book, and a quick lightning round.
- OKRs draw from mission + metric trees + team missions; alignment happens top-down and bottom-up
- Start where pain is highest: unclear goals, politicized prioritization, low learning, or disengaged delivery
- Confidence Meter is broadly useful as an early adoption tool across contexts
- Book/resources: itamargilad.com, evidenceguided.com (and Amazon)
- Lightning round highlights: SVPG + Lean series book recs; niche-audience design interview prompt; ElevenLabs as a favorite AI product
