Lenny's Podcast

Reganti & Badam: Why most AI products fail in production

Why treating LLMs as non-deterministic APIs and earning autonomy beats hype; human-in-the-loop calibration prevents the failures that sink AI products.

Lenny Rachitsky (host), Aishwarya Naresh Reganti (guest), Kiriti Badam (guest)

Jan 10, 2026 · 1h 26m

WHAT IT’S REALLY ABOUT

Why AI products fail: managing non-determinism, autonomy, and feedback loops

Aishwarya Reganti and Kiriti Badam explain why many AI products fail: teams treat LLMs like deterministic software and rush to fully autonomous agents without earning trust.
They argue AI product development must account for non-deterministic inputs/outputs and the agency–control trade-off, which changes how you design, ship, and operate products.
Their core prescription is to start with low-risk, high-control versions (human-in-the-loop), build measurement and learning flywheels, and gradually increase autonomy as surprises diminish.
They introduce a CI/CD-inspired framework, continuous calibration / continuous development (CC/CD), that combines scoped datasets, evals and monitoring, behavior analysis, and iterative fixes, with leadership and culture as key enablers.

IDEAS WORTH REMEMBERING

5 ideas

Treat LLMs as non-deterministic APIs, not normal software components.

Unlike traditional UIs and workflows, users express intent in countless ways and LLM outputs vary with phrasing and context. You must design for probabilistic behavior, not predictable state machines.

Autonomy must be earned—more agency means less control and higher risk.

Every added capability (tool use, decisions, actions) increases the chance of costly mistakes and trust erosion. Start with constrained decision-making and expand only after reliability is demonstrated.

Build AI products in versions that progressively increase autonomy.

Examples: support agent (suggest → draft to customer → issue refunds/actions), coding assistant (inline snippets → refactors/tests → open PRs), marketing assistant (copy drafts → run campaigns → auto-optimize). This reduces blast radius while you learn failure modes.
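The support-agent progression above can be sketched as a simple gate that runs only the actions the current version has earned and escalates everything else to a human. This is a minimal illustration, not something described in the episode: the names `AutonomyLevel`, `ProposedAction`, and `route` are hypothetical, and the three levels are an assumed mapping onto suggest, draft, and act.

```python
from dataclasses import dataclass
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Hypothetical levels mirroring the support-agent example."""
    SUGGEST = 1  # agent suggests a reply to the human support rep
    DRAFT = 2    # agent drafts the reply that goes to the customer
    ACT = 3      # agent issues refunds or takes other actions


@dataclass
class ProposedAction:
    description: str
    required_level: AutonomyLevel  # minimum autonomy needed to run unattended


def route(action: ProposedAction, granted: AutonomyLevel) -> str:
    """Run the action only if the product has earned that level; otherwise escalate."""
    if action.required_level <= granted:
        return "execute"
    return "needs_human_approval"


# A product currently shipped at the DRAFT level:
granted = AutonomyLevel.DRAFT
reply = ProposedAction("draft reply to customer", AutonomyLevel.DRAFT)
refund = ProposedAction("issue refund", AutonomyLevel.ACT)
print(route(reply, granted))   # within the earned level
print(route(refund, granted))  # beyond it, so a human decides
```

Raising `granted` is then a deliberate product decision made once the lower level stops producing surprises, rather than something the agent decides for itself.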

Use human-in-the-loop stages to create a learning flywheel, not just safety.

Copilot phases let you log edits, accept/reject decisions, and trace behavior—turning human oversight into training data and product insight to improve prompts, tools, and workflow design over time.
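One way to picture the flywheel is a small review log that records what the model suggested and what the human finally sent, then summarizes how often suggestions were accepted as-is, edited, or rejected. This is a sketch under assumptions: `ReviewEvent` and `summarize` are hypothetical names, and the three-way split is just one possible signal, not a metric named in the episode.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ReviewEvent:
    """One human review of a model suggestion (hypothetical schema)."""
    suggestion: str   # what the model proposed
    final_text: str   # what the human actually sent ("" if rejected)
    accepted: bool    # False if the human discarded the suggestion


def summarize(events: list[ReviewEvent]) -> dict[str, float]:
    """Return the share of suggestions accepted as-is, edited, or rejected."""
    if not events:
        return {}
    counts = Counter()
    for event in events:
        if not event.accepted:
            counts["rejected"] += 1
        elif event.final_text != event.suggestion:
            counts["edited"] += 1
        else:
            counts["accepted_as_is"] += 1
    total = len(events)
    return {key: counts[key] / total
            for key in ("accepted_as_is", "edited", "rejected")}


log = [
    ReviewEvent("Refund approved.", "Refund approved.", True),
    ReviewEvent("Refund approved.", "Refund approved.", True),
    ReviewEvent("Please wait.", "Please allow 3 days.", True),
    ReviewEvent("Contact sales.", "", False),
]
print(summarize(log))
```

The edited and rejected events are the useful ones: comparing `suggestion` against `final_text` shows where prompts, tools, or workflow design need work.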

Successful AI adoption is a people-and-process transformation, not only technical.

They highlight a “success triangle”: leaders who rebuild intuition hands-on, a culture that empowers SMEs (vs. replacement fear), and technical rigor grounded in workflow understanding and right-tool selection.

WORDS WORTH SAVING

5 quotes

Most people tend to ignore the non-determinism.

— Aishwarya Naresh Reganti

Every time you hand over decision-making capabilities to agentic systems, you're kind of relinquishing some amount of control on your end.

— Aishwarya Naresh Reganti

When you start small... one easy slippery slope is to keep thinking about complexities of the solution and forget the problem that you're trying to solve.

— Kiriti Badam

It’s not about being the first company to have an agent… It’s about, have you built the right flywheels in place so that you can improve over time?

— Aishwarya Naresh Reganti

Pain is the new moat.

— Kiriti Badam

Non-determinism in AI UX (input and output)
Agency vs. control trade-off for agents
Stepwise autonomy progression (V1→V3 patterns)
Reliability and trust in enterprise deployments
Continuous calibration / continuous development (CC/CD) framework
Evals vs. production monitoring vs. “vibes”
Leadership hands-on learning and org culture (SME buy-in)
Workflow decomposition, context engineering, messy enterprise data
Multi-agent systems: misunderstood coordination patterns
Future: proactive/background agents; multimodal experiences
“Pain is the new moat” (persistence as competitive advantage)

High quality AI-generated summary created from speaker-labeled transcript.