Aakash Gupta

This AI Expert's Method Will Change How You Do Customer Research

AI for user research is unreliable. But Caitlin Sullivan, one of the world's leading experts in user research, knows exactly how to fix it. In this episode, she demos the complete workflow for analyzing surveys and interviews with AI, using Claude, Claude Code, and agentic workflows that cut analysis time in half without hallucinating.

Complete write-up: https://www.news.aakashg.com/p/caitlin-sullivan-podcast

Timestamps:
0:00 - Intro
1:54 - What Good AI Research Actually Looks Like
8:22 - Step 0: Loading Context Into Claude
11:34 - Why Claude Is the Best Model for Analysis
16:12 - Step 1: Per-Participant Analysis Prompting
26:06 - Step 2: Verification & Contradiction Checking
34:51 - Survey Analysis: Why You Must Code First
46:18 - Adding Emotional Intensity Ratings
51:31 - Step 3: Auditing AI's Own Work
57:42 - Claude Code: The Agentic Parallel Version
1:09:01 - Final Output & Results

🧠 Key Takeaways:
1. Replicate the human process - Good AI analysis mirrors how experienced researchers work: comb through data first, then synthesize. Never jump straight to "give me themes."
2. Use multi-step prompting - Load context in one prompt, run per-participant analysis in the next, then verify. Cramming everything into one prompt degrades quality.
3. Code before you count - For surveys, apply inductive coding labels to every response before asking for patterns. Skipping this step leads to miscategorized, unreliable results.
4. Always audit AI's work - Force the model to re-check its own analysis. It catches contradictions, overexaggerated intensity ratings, and miscoded responses regularly.
5. Claude wins on nuance, Gemini wins on frequency - Claude gives more thorough, complete analysis by default. Gemini surfaces top-frequency themes faster but misses smaller patterns.
6. Define everything explicitly - Quotes, ratings, emotional intensity levels, contradiction types. If you assume the model shares your definitions, you'll get inconsistent results.
7. Markdown files beat raw transcripts - Converting transcripts to structured markdown improves accuracy and helps you work around token limits on non-Max plans.
8. Parallelize with Claude Code agents - Set up agent markdown files for interview and survey analysis, then run both simultaneously. Cuts total analysis time in half again.

🏆 Sponsors:
1. Maven: Get 15% off Caitlin’s courses with code AAKASHxMAVEN - https://bit.ly/4rHCCrb
2. Pendo: The #1 software experience management platform - http://www.pendo.io/aakash
3. Jira Product Discovery: Plan with purpose, ship with confidence - https://www.atlassian.com/software/jira/product-discovery
4. Kameleoon: AI experimentation platform - http://www.kameleoon.com/
5. Amplitude: The market-leader in product analytics - https://amplitude.com/session-replay?utm_campaign=session-replay-launch-2025&utm_source=linkedin&utm_medium=organic-social&utm_content=productgrowthpodcast

👨‍💻 Where to find Aakash:
Twitter: https://www.x.com/aakashg0
LinkedIn: https://www.linkedin.com/in/aagupta/
Newsletter: https://www.news.aakashg.com
Premium Bundle: https://bundle.aakashg.com

Where to find Caitlin:
LinkedIn: https://www.linkedin.com/in/caitlindsullivan/
Maven: https://bit.ly/4rHCCrb

#aitools #userresearch

🧠 About Product Growth:
The world's largest podcast focused solely on product + growth, with over 200K listeners.

🔔 Subscribe and turn on notifications to get more videos like this.

Aakash Gupta (host) · Caitlin Sullivan (guest)
Feb 11, 2026 · 1h 12m · Watch on YouTube ↗

At a glance

WHAT IT’S REALLY ABOUT

A rigorous, multi-step AI workflow for trustworthy customer research analysis

  1. Good AI research mirrors rigorous human research by separating analysis, verification, and synthesis rather than jumping straight to themes.
  2. A Step 0 “context load” prompt onboards the model with business goals and product details to reduce wrong assumptions and instruction drop-off.
  3. Interview analysis is strengthened by per-participant extraction (e.g., value anchors and fragile points) followed by contradiction checks to prevent cherry-picking and hallucinations.
  4. Survey analysis should start with inductive coding before counting frequencies, then add calibrated emotional intensity ratings to prioritize what matters most.
  5. Agentic workflows in Claude Code can parallelize survey and interview analysis, output structured markdown deliverables, and cut analysis time dramatically—while still requiring audits and human judgment.
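
Point 5 can be sketched as a Claude Code agent markdown file. Everything below is illustrative: the file path and frontmatter follow Claude Code's subagent-markdown convention, but the folder names, field wording, and output paths are my assumptions, not Caitlin's actual files from the episode.

```markdown
<!-- Hypothetical: .claude/agents/interview-analyst.md -->
---
name: interview-analyst
description: Per-participant interview analysis with a verification pass
---

You analyze one interview transcript at a time. Never synthesize
themes across participants; a separate synthesis step does that.

For each transcript in ./interviews:
1. Extract value anchors and fragile points, each with a verbatim quote.
2. Assign a churn/stability rating with a one-line justification.
3. Re-scan the transcript for statements that contradict your extraction.
4. Write the result to ./output/interviews/<participant>.md
```

A matching survey-analysis agent can then run simultaneously, which is where the "cuts total analysis time in half again" claim comes from.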

IDEAS WORTH REMEMBERING

5 ideas

Don’t start with synthesis; start with granular analysis.

The workflow forces the model to comb through each file/response first (like a human researcher would) before summarizing themes, which reduces missed nuance and overconfident generalizations.

Separate “context loading” from task prompts to prevent instruction loss.

A dedicated Step 0 prompt onboards the model on goals and product/tier details and ends with “do not run analysis yet,” improving focus and reducing incorrect product assumptions.
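
A minimal sketch of what such a Step 0 prompt might look like, with bracketed placeholders for your own details (the placeholders and example goal are illustrative, not from the episode):

```markdown
# Step 0 — Context load

You are helping analyze customer research for [product name].

Business context:
- Research goal: [e.g., understand churn drivers among paid users]
- Product and tiers: [what the product does and how the tiers differ]
- Who was studied: [segment, sample size, method]

Internalize this only. Do not run analysis yet.
```

Ending on the "do not run analysis yet" line keeps the model from spending this turn on premature output.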

Per-participant extraction creates traceability and better foundations.

Extracting value anchors, fragile points, quotes, and a churn/stability rating per participant replicates line-by-line human review and produces evidence you can later synthesize confidently.
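
The per-participant pass might be prompted along these lines. The output fields (value anchors, fragile points, quotes, churn/stability rating) come from the episode; the exact wording and the rating scale are my assumptions:

```markdown
# Step 1 — Per-participant analysis

Work through one interview file at a time. Do not summarize themes yet.

For each participant, output:
- Value anchors: what they clearly get value from, each backed by a
  verbatim quote (their exact words, never paraphrased).
- Fragile points: friction, doubt, or unmet expectations, each backed
  by a verbatim quote.
- Churn/stability rating: [e.g., 1–5] with a one-line justification.
```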

Add a verification pass specifically designed to catch contradictions.

Having the model re-scan for conflicting statements (and defining what counts as a contradiction) prevents cherry-picking one narrative when the participant’s account is inconsistent.
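
A verification pass in that spirit might look like the following; the three contradiction types are illustrative examples of "defining what counts as a contradiction," not a list from the episode:

```markdown
# Step 2 — Verification

Re-read the same transcript and check it against your Step 1 output.

Treat as a contradiction, for example:
- Praising and criticizing the same feature in different answers
- Stated intent that conflicts with described behavior
- Two answers giving incompatible facts

For each contradiction, show both quotes side by side, and flag any
Step 1 finding that rests on only one side of it.
```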

For surveys, code first—then count.

Inductive open coding (with rules like mutually exclusive primary codes) produces a defensible codebook and prevents the model from miscategorizing or forcing responses into premature themes.
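
The "mutually exclusive primary codes" rule might translate into prompt instructions like these (illustrative wording, not the episode's exact prompt):

```markdown
# Survey analysis — inductive coding first

Read every open-text response before proposing any codes.

Coding rules:
- Derive codes from the responses themselves; do not start from a
  predefined theme list.
- Assign exactly one primary code per response; primary codes are
  mutually exclusive. Secondary codes are optional.
- Define each code in one sentence, with one example response.

Do not count frequencies or report themes yet — that comes after the
codebook has been applied to every response.
```

Emotional intensity ratings come later in the workflow, after coding and counting, with each rating level defined explicitly, as the episode stresses.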

WORDS WORTH SAVING

5 quotes

Good AI customer research and analysis actually looks like replicating the way that we do rigorous analysis as humans.

Caitlin Sullivan

What most people do… is jumping straight ahead to synthesis, and that's exactly what we don't wanna do.

Caitlin Sullivan

Internalize this only. Do not run analysis yet.

Caitlin Sullivan

When we're working with survey responses or short customer feedback, we want to code first.

Caitlin Sullivan

I’ll call this the CYA way to use AI. Cover your ass.

Aakash Gupta

Replicating human research rigor with AI
Step 0 context loading and instruction management
Claude vs Gemini vs ChatGPT for analysis tradeoffs
Per-participant interview analysis (value anchors, fragile points, churn risk)
Verification: contradiction checking and audit passes
Survey workflow: inductive coding → quantification → intensity ratings
Agentic/parallel workflows with Claude Code and markdown files

High quality AI-generated summary created from speaker-labeled transcript.
