Aakash Gupta

AI Agents for PMs in 69 Minutes — Masterclass with IBM VP

Armand Ruiz, VP of AI Platform at IBM, explains why most enterprise AI implementations fail and what Fortune 500 companies are actually building that works. He breaks down the difference between chatbots and agents, the 4-step framework powering real AI systems, and why RAG dominates 90% of enterprise use cases.

Transcript: https://www.news.aakashg.com/p/armand-ruiz-podcast

⏰ Timestamps:
00:00 Intro
02:39 What Makes AI Agents Special
04:40 The Four Steps of AI Agents
07:14 AI Agent Development Frameworks
12:59 RAG Explained
16:55 Ads
18:46 Common RAG Mistakes
26:48 Managing Multiple AI Agents
31:39 Ads
33:57 How AI Changes Product Management
37:43 Problem Investigation vs Feature Factory
41:22 Roadmap to Build AI Agents
43:30 Can Open Source AI Win?
51:39 IBM's AI Strategy
59:32 Career Journey: Intern to VP
1:02:36 Building 200K LinkedIn Followers
1:08:18 Outro

🏆 Thanks to our sponsors:
1. Kameleoon: Prompt-based experimentation platform - kameleoon.com/prompt
2. AI Evals Course for PMs & Engineers: Get $800 off https://maven.com/parlance-labs/evals?promoCode=ag-product-growth
3. Vanta: Security and compliance for fast-moving teams - https://www.vanta.com/lp/demo-1k
4. Amplitude: Mobile user engagement analytics - https://amplitude.com/digital-maturity-model
5. Product Faculty: Product Strategy Certificate for Leaders (Get $550 off) https://maven.com/product-faculty/ai-product-management-certification?promoCode=AAKASH25

Key Takeaways:
1. AI Agents vs Chatbots: Chatbots respond to queries while agents execute complete workflows. It is the difference between getting suggestions and getting finished work.
2. Four-Step Agent Framework: Every agent needs Thinking (reasoning), Planning (task breakdown), Action (system execution), and Reflection (learning from outcomes).
3. RAG Dominates Enterprise: 90% of enterprise AI uses RAG to connect LLMs to proprietary data. Success requires 95%+ accuracy through sophisticated evaluation.
4. Vision RAG Unlocks Value: Most business data lives in charts and tables that traditional text-only RAG completely misses.
5. Framework Selection Matters: Use coding frameworks (LangGraph, CrewAI) for complex systems. Use no-code tools (Lindy, n8n) for rapid prototyping.
6. PM Ratios Transform: Traditional 1:6-10 PM-to-developer ratios become 1:20-30 when agents handle research and documentation.
7. Prototypes Beat PRDs: Show working systems instead of 20-page documents that teams misinterpret. AI enables functional demos.
8. Open Source Wins: Despite closed-source capabilities, enterprises choose open source for licensing control and infrastructure flexibility.
9. Technical Literacy Essential: Understanding agents, RAG, and frameworks becomes baseline knowledge for everyone, not just developers.
10. Implementation Reality: Enterprise RAG needs heavy data engineering. Teams underestimate accuracy requirements and engineering complexity.

👨‍💻 Where to find Armand:
LinkedIn: linkedin.com/in/armandruiz
IBM AI Platform: ibm.com/ai

👨‍💻 Where to find Aakash:
Twitter: twitter.com/aakashg0
LinkedIn: linkedin.com/in/aagupta/

#AIAgents #EnterpriseAI #RAGSystems #ProductManagement

🧠 About Product Growth: The world's largest podcast focused solely on product + growth, with over 185K listeners. Hosted by Aakash Gupta, who spent 16 years in PM, rising to VP of product, this 2x/week show covers product and growth topics in depth.

🔔 Subscribe and turn on notifications to master AI agent implementation!

Host: Aakash Gupta · Guest: Armand Ruiz
Sep 5, 2025 · 1h 9m

CHAPTERS

  1. Why AI agents are “the wall of automation” beyond chatbots

    Aakash and Armand frame AI agents as the next step after predictive analytics and chatbots—systems that can automate real work end-to-end. Armand shares why enterprises (and CIOs) now prioritize agents, but also why safe, secure production deployment is still the hard part.

    • Agents as the fulfillment of AI’s automation promise vs. chat-only interfaces
    • Evolution: predictive analytics → chatbots → agentic automation
    • Enterprise urgency: agents high on CIO agendas
    • Core tension: enabling experimentation while keeping production secure
    • Variation in risk tolerance and innovation appetite across companies
  2. The 4-step mental model: Think → Plan → Act → Reflect

    Armand walks through his simple four-step diagram that explains what an agent does internally. The model clarifies how agents reason, decompose tasks, take actions in real systems, and improve via reflection loops over time.

    • Think: LLM reasoning (spending more tokens at inference time for better reasoning)
    • Plan: break tasks into subtasks and goals; challenge prior outputs
    • Act: execute in tools (CRM, email, Workday-style systems); enabled by protocols like MCP
    • Reflect: learn from history and human feedback to improve future runs
    • Why reflection is key to moving from ‘raw’ to reliable agents
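
    The four steps above can be sketched as a small loop. This is a minimal illustration, not Armand's implementation: the LLM is replaced by plain string handling, and every name here (`AgentState`, `think`, `plan`, `act`, `reflect`, `run_agent`, the `tools` dictionary) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    goal: str
    notes: list = field(default_factory=list)  # lessons recorded by reflect()
    log: list = field(default_factory=list)    # trace of every step taken


def think(state):
    # Stand-in for LLM reasoning: restate the goal plus lessons learned so far.
    return f"goal={state.goal}; lessons={len(state.notes)}"


def plan(goal):
    # Stand-in for task decomposition: one subtask per comma-separated clause.
    return [part.strip() for part in goal.split(",") if part.strip()]


def act(subtask, tools):
    # Route the subtask to the first tool whose keyword appears in it.
    for keyword, tool in tools.items():
        if keyword in subtask:
            return tool(subtask)
    return None  # no tool matched this subtask


def reflect(state, subtask, result):
    # Record failures so a later run (or a human reviewer) can adjust.
    if result is None:
        state.notes.append(f"no tool for: {subtask}")


def run_agent(goal, tools):
    state = AgentState(goal)
    state.log.append(("think", think(state)))
    subtasks = plan(goal)
    state.log.append(("plan", subtasks))
    for subtask in subtasks:
        result = act(subtask, tools)
        state.log.append(("act", subtask, result))
        reflect(state, subtask, result)
    return state


# Hypothetical tools standing in for real systems such as a CRM or email.
tools = {
    "email": lambda task: f"sent: {task}",
    "crm": lambda task: f"updated: {task}",
}
state = run_agent("update crm record, email the customer, book travel", tools)
print(state.notes)
```

    In this sketch the third subtask has no matching tool, so the reflect step leaves a note about it. A production agent would feed such notes and human feedback back into the next run.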
  3. Choosing an agent-building approach: code frameworks vs no-code builders

    They categorize agent development tooling into two camps: programming frameworks that provide maximum control and low/no-code tools that speed up experimentation. The discussion highlights popular options and when to use each.

    • Code-first frameworks: LangGraph, CrewAI, LlamaIndex, AutoGen (often Python-based)
    • Low/no-code tools: LangFlow, Lindy, n8n, Stack AI, Flowise (faster iteration)
    • When to prefer code: complex agentic implementations needing control and flexibility
    • Open-source innovation loop: GitHub issues/PRs drive rapid evolution
    • PM-friendly takeaway: use no-code to start; partner with engineers for robustness
  4. RAG demystified: adding fresh enterprise context to LLMs

    Armand explains Retrieval-Augmented Generation (RAG) as the dominant method for injecting up-to-date knowledge into LLM outputs. He contrasts RAG with fine-tuning and shares why RAG became the default enterprise pattern post-ChatGPT.

    • LLMs are trained at a point in time; enterprises need updated/private context
    • RAG connects LLMs to knowledge bases and databases (structured + unstructured)
    • Fine-tuning vs RAG: fine-tuning isn’t ideal for frequently changing data
    • Enterprise reality: “90% of use cases” initially were RAG-driven
    • RAG as a goldmine for companies sitting on large internal datasets
  5. RAG inside agent workflows: enterprise search becomes ‘answer + action’

    They position RAG as a core component of agentic systems, especially during planning where agents fetch needed data. Examples show how RAG turns traditional enterprise search into direct, usable intelligence for decisions and downstream work.

    • RAG fits naturally into the Plan phase as a ‘fetch data’ step
    • Moves beyond metadata search to document-level understanding
    • Example: analyzing top customers, extracting insights from internal reports
    • PM use case: faster assumption-building and feature prioritization from internal data
    • Enterprise explosion of use cases as access to intelligence broadens
  6. RAG architecture building blocks (and why it’s mostly data engineering)

    Armand outlines the real components behind RAG pipelines—embeddings, vector databases, filtering/ranking, and orchestration. The key message: most RAG failures and successes are driven by data engineering complexity, not just the LLM choice.

    • Core components: embedding models, vector DBs, retrieval, filtering, ranking
    • Embedding models vary by speed, language performance, and quality
    • Tooling split: app-layer frameworks (LangChain/LlamaIndex) vs data stack (Spark/Airflow)
    • Scaling data connections is complex in real enterprise environments
    • ‘AI problems are data engineering problems’ framing
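
    The components listed above (embedding, index, retrieval, filtering, ranking) can be seen end to end in a toy pipeline. This is a sketch under loud assumptions: the "embedding" is a bag-of-words count rather than a trained embedding model, the "vector DB" is a Python list, and the documents, field names, and functions (`embed`, `cosine`, `retrieve`, `build_prompt`) are invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    # Toy embedding: word counts. Real pipelines use a trained embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical internal documents with metadata used for filtering.
docs = [
    {"id": "q3-report", "dept": "finance", "text": "q3 revenue grew on enterprise renewals"},
    {"id": "hr-policy", "dept": "hr", "text": "vacation policy and travel approvals"},
    {"id": "q2-report", "dept": "finance", "text": "q2 revenue was flat"},
]

# Stand-in for a vector database: each document stored next to its vector.
index = [(doc, embed(doc["text"])) for doc in docs]


def retrieve(query, dept=None, k=2):
    query_vec = embed(query)
    # Filter on metadata first, then rank by similarity and keep the top k.
    candidates = [(d, v) for d, v in index if dept is None or d["dept"] == dept]
    ranked = sorted(candidates, key=lambda dv: cosine(query_vec, dv[1]), reverse=True)
    return [d for d, _ in ranked[:k]]


def build_prompt(query, hits):
    # The retrieved text becomes context that is sent to the LLM with the question.
    context = "\n".join(f"[{d['id']}] {d['text']}" for d in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


hits = retrieve("how did q3 revenue grow", dept="finance")
print(build_prompt("how did q3 revenue grow", hits))
```

    Even in this toy version, most of the code deals with storing, filtering, and ranking data rather than with the model, which is the point of the data engineering framing.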
  7. Vision RAG: extracting value from charts, tables, and rich PDFs

    The conversation expands RAG from text-only to multimodal information retrieval. Vision RAG enables agents to understand charts/tables and visually dense documents, unlocking industries where critical data lives in non-text formats.

    • Vision RAG = classic RAG + multimodal extraction/understanding
    • Use cases: complex PDFs, charts, tables, healthcare/finance visuals
    • Need both: strong extraction pipeline + capable multimodal models
    • IBM example: DocLing (open-source) for document parsing across formats
    • Enables new ‘previously inaccessible’ enterprise knowledge workflows
  8. Common RAG mistakes: accuracy expectations, ‘vanilla’ pipelines, and weak evals

    Armand focuses on the gap between consumer tolerance for imperfect answers and enterprise requirements for accuracy and trust. Teams often deploy generic templates without rigorous evaluation, leading to frustration and unreliable systems.

    • Enterprise standard: 70% accuracy isn’t acceptable for critical workflows
    • Mistake: using off-the-shelf/vanilla RAG templates without iteration
    • Accuracy is a system-level/data problem, not just a prompt problem
    • Define “acceptable business accuracy” per use case
    • RAG’s power is huge—but doing it right is engineering-intensive
  9. Evals everywhere: how to test agent/RAG systems like real software

    They argue evaluation must happen at multiple steps in an agentic workflow, not only at the final answer. Armand explains evals as a way to inject human expertise, scale SME input, and continuously improve systems in production.

    • Multi-step workflows require stepwise evaluation, not only end-output checks
    • Evals as human expertise validating system behavior and trustworthiness
    • Blend approaches: synthetic data, ground-truth datasets, SME review loops
    • IBM ‘Evaluation Studio’ as a GUI approach to bring SMEs into evals
    • Operational reality: continuous monitoring/maintenance, not one-and-done testing
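
    Stepwise evaluation can be illustrated by scoring retrieval and the final answer separately against a small ground-truth set. This is a minimal sketch, not IBM's Evaluation Studio: the cases, field names, and the `evaluate` helper are made up, and the 0.95 threshold simply mirrors the "95%+ accuracy" figure from the key takeaways.

```python
# Hypothetical ground-truth cases, as an SME might label them.
cases = [
    {"expected_doc": "q3-report", "retrieved": ["q3-report", "q2-report"],
     "answer": "Revenue grew on renewals", "must_contain": "renewals"},
    {"expected_doc": "hr-policy", "retrieved": ["hr-policy"],
     "answer": "See the travel policy", "must_contain": "approvals"},
    {"expected_doc": "q2-report", "retrieved": ["q3-report"],
     "answer": "Revenue was flat", "must_contain": "flat"},
    {"expected_doc": "q3-report", "retrieved": ["q3-report"],
     "answer": "I am not sure", "must_contain": "renewals"},
]


def retrieval_hit(case):
    # Step-level check: did retrieval surface the document the SME expected?
    return case["expected_doc"] in case["retrieved"]


def answer_ok(case):
    # End-output check: does the answer contain the fact the SME requires?
    return case["must_contain"].lower() in case["answer"].lower()


def evaluate(cases, threshold):
    total = len(cases)
    retrieval = sum(retrieval_hit(c) for c in cases) / total
    answer = sum(answer_ok(c) for c in cases) / total
    # The system passes only if every measured step clears the threshold.
    return {"retrieval": retrieval, "answer": answer,
            "passes": min(retrieval, answer) >= threshold}


report = evaluate(cases, threshold=0.95)
print(report)
```

    Scoring the steps separately shows where a failure originates: a case with a correct retrieval but a wrong answer points at generation, while a missed retrieval points at the data pipeline.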
  10. Managing 10–20 agents: orchestration as a new knowledge-worker skill

    Armand describes a near-future where employees supervise fleets of specialized agents. The challenge becomes orchestration—assigning tasks, setting approvals, and judging outputs—especially in traditional companies where adoption takes longer.

    • Function-based agent portfolios (e.g., marketing copy, creative, A/B testing)
    • Human-in-the-loop design to prevent brand/quality failures
    • Orchestration as the emerging skill: coordinating agents and validating outputs
    • Adoption timeline: AI-native startups vs traditional enterprise journey
    • AI literacy + hands-on practice as the path to better orchestration
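
    One way to picture the approval side of orchestration is a routing rule that sends risky agent outputs to a human and lets the rest through. The sketch below is an assumption for illustration only: the `Draft` type, the `customer_facing` flag, and the `route` function are hypothetical, and real approval policies would be richer than a single flag.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    agent: str             # which specialized agent produced the output
    content: str
    customer_facing: bool  # hypothetical risk flag set by the orchestrator


def route(drafts):
    # Customer-facing work waits for human approval; internal work is auto-approved.
    auto_approved, needs_review = [], []
    for draft in drafts:
        (needs_review if draft.customer_facing else auto_approved).append(draft)
    return auto_approved, needs_review


drafts = [
    Draft("marketing-copy", "Launch email draft", customer_facing=True),
    Draft("ab-testing", "Variant B summary for the team", customer_facing=False),
    Draft("creative", "Ad banner concept", customer_facing=True),
]
auto_approved, needs_review = route(drafts)
print([d.agent for d in needs_review])
```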
  11. How AI reshapes product management: fewer PMs, broader scope, more leverage

    They explore how agents can change PM-to-engineer ratios and expand a PM’s coverage area. Armand maps agents across the PM lifecycle—from competitive research to feedback synthesis to PRD drafting and prototyping.

    • Potential ratio shift: 1 PM per 6–10 devs → 1 PM per 20–30 devs (with agents)
    • Agent examples for PMs: competitive intel, market monitoring, sales enablement inputs
    • Feedback triage: combine SaaS usage data with NPS/social/user feedback
    • PRDs accelerated: AI can draft 80–90% before human refinement
    • Prototype and validate earlier—compressing the idea-to-validation loop
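
    The ratio shift is easier to grasp with a worked number. The sketch below applies the ratios quoted above to a hypothetical organization of 120 developers; the headcount and the `pms_needed` helper are assumptions, not figures from the episode.

```python
import math


def pms_needed(developers, devs_per_pm):
    # Round up: a partially loaded PM is still a PM.
    return math.ceil(developers / devs_per_pm)


developers = 120  # hypothetical org size

# Traditional ratio: 1 PM per 6-10 developers.
before = (pms_needed(developers, 10), pms_needed(developers, 6))
# With agents: 1 PM per 20-30 developers.
after = (pms_needed(developers, 30), pms_needed(developers, 20))

print(f"before: {before[0]}-{before[1]} PMs, after: {after[0]}-{after[1]} PMs")
```

    For 120 developers this gives 12 to 20 PMs under the traditional ratio and 4 to 6 under the agent-assisted one.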
  12. Prototype-first vs write-first: avoiding ‘feature factory’ while moving faster

    Armand shares a career story where a prototype beat slides and PRDs in an exec meeting, illustrating why prototypes communicate better. They also address the risk of rushing into solutions without deep problem investigation and customer understanding.

    • Personal example: prototype won leadership buy-in when English/slides wouldn’t
    • Prototypes reduce loss in translation across global teams and handoffs
    • AI makes ‘vibe coding’ and rapid prototyping accessible to PMs
    • Risk: jumping to prototypes can create feature-factory behavior
    • Counterbalance: customer-first discovery and deep problem investigation
  13. Roadmap for learning and building agents: concepts → one agent → deeper tooling

    Armand gives a practical learning sequence: start with fundamentals, build a single useful agent, then progress toward more advanced tools as needed. He emphasizes hands-on exploration as the only way to learn the ‘art of the possible.’

    • Start with concepts: LLMs, reasoning, RAG, agent basics
    • Build one agent using a no/low-code tool to internalize workflows
    • Then explore vibe-coding tools and (optionally) Python for deeper control
    • Use targeted short courses (e.g., quick RAG courses) to level up fast
    • Leadership vs practitioner paths: change management vs domain-specific agents
  14. Can open source AI win? Why enterprises default to open ecosystems

    Armand argues open source tends to win in enterprise contexts due to deployability, control, and ecosystem momentum. He also acknowledges the reality that closed-source labs may stay ahead temporarily, but open source catches up over time.

    • Enterprise needs: deploy anywhere, keep data private, avoid vendor lock-in
    • Open licenses matter; deploy models on-prem/private cloud/hybrid
    • Community innovation: open source narrows gaps even if behind briefly
    • Ecosystem examples: Kubernetes, vLLM, PyTorch underpin modern AI stacks
    • Not just models—tools, runtimes, and frameworks are open-source-driven
  15. IBM’s AI strategy: flexibility, Granite models, scaling inference, and governance

    Armand describes IBM’s positioning around deployment flexibility (any cloud/on-prem), a family of models (Granite), and enterprise-grade governance. He emphasizes that compliance and policy management must be designed in from the start.

    • Core bet: flexibility to run AI close to data across hybrid environments
    • Granite models: small, cost-efficient models tuned for enterprise tasks
    • Manage and govern access to many AI ‘engines’ across environments
    • Scaling inference across clusters as a key enterprise requirement
    • Governance: inventory of use cases, compliance readiness, evolving regulation
  16. Career + creator playbook: intern-to-VP journey and building 200k followers

    Armand closes by sharing how intentional moves, consistency in AI through ‘winters,’ and customer proximity accelerated his career. He also breaks down his daily LinkedIn system, why he now uses less AI in writing, and how targeting the right audience beats chasing virality.

    • Career strategy: intentional path to Silicon Valley + consistent AI focus
    • Customer closeness: travel and deep discovery as a differentiator
    • Corporate growth levers: network, add value, show results fast, ‘show not tell’
    • Content system: daily posting routine, idea capture, formatting, metric awareness
    • AI in content: previously leaned on AI heavily for research and virality; now relies more on human thinking for differentiation
