Aakash Gupta
AI Agents for PMs in 69 Minutes — Masterclass with IBM VP
CHAPTERS
Why AI agents are “the wall of automation” beyond chatbots
Aakash and Armand frame AI agents as the next step after predictive analytics and chatbots—systems that can automate real work end-to-end. Armand shares why enterprises (and CIOs) now prioritize agents, but also why safe, secure production deployment is still the hard part.
- •Agents as the fulfillment of AI’s automation promise vs. chat-only interfaces
- •Evolution: predictive analytics → chatbots → agentic automation
- •Enterprise urgency: agents high on CIO agendas
- •Core tension: enabling experimentation while keeping production secure
- •Variation in risk tolerance and innovation appetite across companies
The 4-step mental model: Think → Plan → Act → Reflect
Armand walks through his simple four-step diagram that explains what an agent does internally. The model clarifies how agents reason, decompose tasks, take actions in real systems, and improve via reflection loops over time.
- •Think: LLM reasoning (spending more inference-time tokens tends to yield better reasoning)
- •Plan: break tasks into subtasks and goals; challenge prior outputs
- •Act: execute in tools (CRM, email, Workday-style systems); enabled by protocols like the Model Context Protocol (MCP)
- •Reflect: learn from history and human feedback to improve future runs
- •Why reflection is key to moving from ‘raw’ to reliable agents
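The four steps above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the diagram or implementation from the talk: the names `think`, `plan`, `act`, `reflect`, and `AgentMemory` are hypothetical, the LLM call is replaced by a deterministic stub, and the tools are plain Python callables rather than real CRM or email integrations.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """History the agent consults on later runs (feeds the Reflect step)."""
    lessons: list[str] = field(default_factory=list)


def think(goal: str, memory: AgentMemory) -> str:
    # Stand-in for an LLM reasoning call; a real agent would prompt a model here.
    hints = "; ".join(memory.lessons) or "none"
    return f"Goal: {goal}. Lessons from earlier runs: {hints}"


def plan(thought: str) -> list[str]:
    # Break the task into subtasks. Hard-coded here; normally model-generated.
    return ["look up customer", "draft email", "send email"]


def act(step: str, tools: dict) -> str:
    # Execute one subtask through a tool, if one is registered for it.
    tool = tools.get(step)
    return tool() if tool else f"no tool for '{step}'"


def reflect(results: list[str], memory: AgentMemory) -> None:
    # Record failures so the next run's Think step can take them into account.
    for result in results:
        if result.startswith("no tool"):
            memory.lessons.append(result)


def run_agent(goal: str, tools: dict, memory: AgentMemory) -> list[str]:
    thought = think(goal, memory)
    steps = plan(thought)
    results = [act(step, tools) for step in steps]
    reflect(results, memory)
    return results
```

Calling `run_agent` twice with the same `AgentMemory` shows the reflection loop: whatever failed in the first run appears in the second run's Think input.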
Choosing an agent-building approach: code frameworks vs no-code builders
They categorize agent development tooling into two camps: programming frameworks that provide maximum control and low/no-code tools that speed up experimentation. The discussion highlights popular options and when to use each.
- •Code-first frameworks: LangGraph, CrewAI, LlamaIndex, AutoGen (often Python-based)
- •Low/no-code tools: Langflow, Lindy, n8n, Stack AI, Flowise (faster iteration)
- •When to prefer code: complex agentic implementations needing control and flexibility
- •Open-source innovation loop: GitHub issues/PRs drive rapid evolution
- •PM-friendly takeaway: use no-code to start; partner with engineers for robustness
RAG demystified: adding fresh enterprise context to LLMs
Armand explains Retrieval-Augmented Generation (RAG) as the dominant method for injecting up-to-date knowledge into LLM outputs. He contrasts RAG with fine-tuning and shares why RAG became the default enterprise pattern post-ChatGPT.
- •LLMs are trained at a point in time; enterprises need updated/private context
- •RAG connects LLMs to knowledge bases and databases (structured + unstructured)
- •Fine-tuning vs RAG: fine-tuning isn’t ideal for frequently changing data
- •Enterprise reality: initially, “90% of use cases” were RAG-driven
- •RAG as a goldmine for companies sitting on large internal datasets
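The core RAG idea above (retrieve relevant private context, then put it in front of the model) can be sketched in a few lines. This is a toy illustration, assuming keyword overlap as the retrieval method and a hypothetical prompt template; production systems typically use embedding-based retrieval, and the function names here are made up for the example.

```python
def tokenize(text: str) -> set[str]:
    # Lowercase words with trailing punctuation stripped.
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    # Score each document by how many question words it shares, keep the top k.
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: list[str], k: int = 2) -> str:
    # Augment the question with retrieved context before it goes to the LLM.
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Because the knowledge lives in `documents` rather than in model weights, updating an answer means updating a document, which is why this pattern suits frequently changing data better than fine-tuning.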
RAG inside agent workflows: enterprise search becomes ‘answer + action’
They position RAG as a core component of agentic systems, especially during planning where agents fetch needed data. Examples show how RAG turns traditional enterprise search into direct, usable intelligence for decisions and downstream work.
- •RAG fits naturally into the Plan phase as a ‘fetch data’ step
- •Moves beyond metadata search to document-level understanding
- •Example: analyzing top customers, extracting insights from internal reports
- •PM use case: faster assumption-building and feature prioritization from internal data
- •Enterprise explosion of use cases as access to intelligence broadens
RAG architecture building blocks (and why it’s mostly data engineering)
Armand outlines the real components behind RAG pipelines—embeddings, vector databases, filtering/ranking, and orchestration. The key message: most RAG failures and successes are driven by data engineering complexity, not just the LLM choice.
- •Core components: embedding models, vector DBs, retrieval, filtering, ranking
- •Embedding models vary by speed, language performance, and quality
- •Tooling split: app-layer frameworks (LangChain/LlamaIndex) vs data stack (Spark/Airflow)
- •Scaling data connections is complex in real enterprise environments
- •‘AI problems are data engineering problems’ framing
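The building blocks listed above (embedding, vector store, filtering, ranking) fit together roughly as follows. This is a sketch under loud assumptions: `embed` is a toy word-count embedding standing in for a real embedding model, `VectorStore` is a hypothetical in-memory class rather than any vendor's API, and the sample rows are invented.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy embedding: word counts. A real pipeline would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str, dict]] = []

    def add(self, text: str, **metadata: str) -> None:
        self.rows.append((embed(text), text, metadata))

    def search(self, query: str, k: int = 3, **filters: str) -> list[str]:
        query_vector = embed(query)
        # Filtering: keep only rows whose metadata matches every filter.
        candidates = [
            row for row in self.rows
            if all(row[2].get(key) == value for key, value in filters.items())
        ]
        # Ranking: order the surviving rows by similarity to the query.
        ranked = sorted(
            candidates, key=lambda row: cosine(query_vector, row[0]), reverse=True
        )
        return [text for _, text, _ in ranked[:k]]
```

Even in this toy version, most of the code is about storing, filtering, and ordering data rather than about the LLM, which matches the "data engineering" framing.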
Vision RAG: extracting value from charts, tables, and rich PDFs
The conversation expands RAG from text-only to multimodal information retrieval. Vision RAG enables agents to understand charts/tables and visually dense documents, unlocking industries where critical data lives in non-text formats.
- •Vision RAG = classic RAG + multimodal extraction/understanding
- •Use cases: complex PDFs, charts, tables, healthcare/finance visuals
- •Need both: strong extraction pipeline + capable multimodal models
- •IBM example: Docling (open-source) for document parsing across formats
- •Enables new ‘previously inaccessible’ enterprise knowledge workflows
Common RAG mistakes: accuracy expectations, ‘vanilla’ pipelines, and weak evals
Armand focuses on the gap between consumer tolerance for imperfect answers and enterprise requirements for accuracy and trust. Teams often deploy generic templates without rigorous evaluation, leading to frustration and unreliable systems.
- •Enterprise standard: 70% accuracy isn’t acceptable for critical workflows
- •Mistake: using off-the-shelf/vanilla RAG templates without iteration
- •Accuracy is a system-level/data problem, not just a prompt problem
- •Define “acceptable business accuracy” per use case
- •RAG’s power is huge—but doing it right is engineering-intensive
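One way to make "acceptable business accuracy" concrete is to write the bar down per use case and check measured accuracy against it. The sketch below is illustrative only: the use-case names and threshold values in `REQUIRED_ACCURACY` are hypothetical, and exact-match scoring is a simplifying assumption.

```python
# Hypothetical thresholds; real values are a business decision per use case.
REQUIRED_ACCURACY = {
    "internal_faq": 0.80,
    "contract_review": 0.98,
}


def accuracy(predictions: list[str], expected: list[str]) -> float:
    # Fraction of predictions that exactly match the labeled answers.
    if not expected or len(predictions) != len(expected):
        raise ValueError("predictions and expected must be non-empty and equal length")
    correct = sum(p == e for p, e in zip(predictions, expected))
    return correct / len(expected)


def meets_bar(use_case: str, predictions: list[str], expected: list[str]) -> bool:
    # The same measured accuracy can pass one use case and fail another.
    return accuracy(predictions, expected) >= REQUIRED_ACCURACY[use_case]
```

With these assumed thresholds, a system that is right 9 times out of 10 clears the FAQ bar but not the contract-review bar.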
Evals everywhere: how to test agent/RAG systems like real software
They argue evaluation must happen at multiple steps in an agentic workflow, not only at the final answer. Armand explains evals as a way to inject human expertise, scale SME input, and continuously improve systems in production.
- •Multi-step workflows require stepwise evaluation, not only end-output checks
- •Evals as human expertise validating system behavior and trustworthiness
- •Blend approaches: synthetic data, ground-truth datasets, SME review loops
- •IBM ‘Evaluation Studio’ as a GUI approach to bring SMEs into evals
- •Operational reality: continuous monitoring/maintenance, not one-and-done testing
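Stepwise evaluation can be sketched by scoring the retrieval step and the final answer separately against a small ground-truth set. This is a minimal sketch, not IBM's Evaluation Studio: `EvalCase`, `evaluate`, and the two pipeline callables are hypothetical names, and substring matching stands in for whatever scoring an SME would actually define.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    expected_doc: str      # ground truth for the retrieval step
    expected_answer: str   # ground truth for the final answer


def evaluate(
    retrieve: Callable[[str], str],
    answer: Callable[[str, str], str],
    cases: list[EvalCase],
) -> dict[str, float]:
    if not cases:
        raise ValueError("need at least one eval case")
    retrieval_hits = 0
    answer_hits = 0
    for case in cases:
        # Step 1: did the pipeline fetch the right document?
        doc = retrieve(case.question)
        retrieval_hits += int(doc == case.expected_doc)
        # Step 2: does the final answer contain the expected fact?
        final = answer(case.question, doc)
        answer_hits += int(case.expected_answer.lower() in final.lower())
    total = len(cases)
    return {"retrieval": retrieval_hits / total, "answer": answer_hits / total}
```

Reporting the two scores separately shows where a failure originates: a low answer score with a high retrieval score points at generation, while both being low points at the data side.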
Managing 10–20 agents: orchestration as a new knowledge-worker skill
Armand describes a near-future where employees supervise fleets of specialized agents. The challenge becomes orchestration—assigning tasks, setting approvals, and judging outputs—especially in traditional companies where adoption takes longer.
- •Function-based agent portfolios (e.g., marketing copy, creative, A/B testing)
- •Human-in-the-loop design to prevent brand/quality failures
- •Orchestration as the emerging skill: coordinating agents and validating outputs
- •Adoption timeline: AI-native startups vs traditional enterprise journey
- •AI literacy + hands-on practice as the path to better orchestration
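The orchestration pattern above, with human approval for sensitive outputs, might look like the following. This is a sketch under assumptions: `Agent`, `Orchestrator`, and the `needs_approval` flag are hypothetical names, and each agent is a plain function rather than a real LLM-backed worker.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    name: str
    run: Callable[[str], str]
    needs_approval: bool = False  # e.g. anything customer-facing or on-brand


@dataclass
class Orchestrator:
    agents: dict[str, Agent] = field(default_factory=dict)
    pending: list[tuple[str, str]] = field(default_factory=list)    # awaiting a human
    published: list[tuple[str, str]] = field(default_factory=list)  # released output

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def dispatch(self, agent_name: str, task: str) -> str:
        # Route the task to one agent, then gate its output if required.
        agent = self.agents[agent_name]
        output = agent.run(task)
        queue = self.pending if agent.needs_approval else self.published
        queue.append((agent_name, output))
        return output

    def approve(self, index: int) -> None:
        # A human reviewer releases one pending output.
        self.published.append(self.pending.pop(index))
```

The design choice is that the approval rule lives with the agent definition, so adding a tenth or twentieth agent does not change how the human reviews work.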
How AI reshapes product management: fewer PMs, broader scope, more leverage
They explore how agents can change PM-to-engineer ratios and expand a PM’s coverage area. Armand maps agents across the PM lifecycle—from competitive research to feedback synthesis to PRD drafting and prototyping.
- •Potential ratio shift: 1 PM per 6–10 devs → 1 PM per 20–30 devs (with agents)
- •Agent examples for PMs: competitive intel, market monitoring, sales enablement inputs
- •Feedback triage: combine SaaS usage data with NPS/social/user feedback
- •PRDs accelerated: AI can draft 80–90% before human refinement
- •Prototype and validate earlier—compressing the idea-to-validation loop
Prototype-first vs write-first: avoiding ‘feature factory’ while moving faster
Armand shares a career story where a prototype beat slides and PRDs in an exec meeting, illustrating why prototypes communicate better. They also address the risk of rushing into solutions without deep problem investigation and customer understanding.
- •Personal example: a prototype won leadership buy-in where spoken English and slides could not
- •Prototypes reduce loss in translation across global teams and handoffs
- •AI makes ‘vibe coding’ and rapid prototyping accessible to PMs
- •Risk: jumping to prototypes can create feature-factory behavior
- •Counterbalance: customer-first discovery and deep problem investigation
Roadmap for learning and building agents: concepts → one agent → deeper tooling
Armand gives a practical learning sequence: start with fundamentals, build a single useful agent, then progress toward more advanced tools as needed. He emphasizes hands-on exploration as the only way to learn the ‘art of the possible.’
- •Start with concepts: LLMs, reasoning, RAG, agent basics
- •Build one agent using a no/low-code tool to internalize workflows
- •Then explore vibe-coding tools and (optionally) Python for deeper control
- •Use targeted short courses (e.g., quick RAG courses) to level up fast
- •Leadership vs practitioner paths: change management vs domain-specific agents
Can open source AI win? Why enterprises default to open ecosystems
Armand argues open source tends to win in enterprise contexts due to deployability, control, and ecosystem momentum. He also acknowledges the reality that closed-source labs may stay ahead temporarily, but open source catches up over time.
- •Enterprise needs: deploy anywhere, keep data private, avoid vendor lock-in
- •Open licenses matter; deploy models on-prem/private cloud/hybrid
- •Community innovation: open source narrows gaps even if behind briefly
- •Ecosystem examples: Kubernetes, vLLM, PyTorch underpin modern AI stacks
- •Not just models—tools, runtimes, and frameworks are open-source-driven
IBM’s AI strategy: flexibility, Granite models, scaling inference, and governance
Armand describes IBM’s positioning around deployment flexibility (any cloud/on-prem), a family of models (Granite), and enterprise-grade governance. He emphasizes that compliance and policy management must be designed in from the start.
- •Core bet: flexibility to run AI close to data across hybrid environments
- •Granite models: small, cost-efficient models tuned for enterprise tasks
- •Manage and govern access to many AI ‘engines’ across environments
- •Scaling inference across clusters as a key enterprise requirement
- •Governance: inventory of use cases, compliance readiness, evolving regulation
Career + creator playbook: intern-to-VP journey and building 200k followers
Armand closes by sharing how intentional moves, consistency in AI through ‘winters,’ and customer proximity accelerated his career. He also breaks down his daily LinkedIn system, why he now uses less AI in writing, and how targeting the right audience beats chasing virality.
- •Career strategy: intentional path to Silicon Valley + consistent AI focus
- •Customer closeness: travel and deep discovery as a differentiator
- •Corporate growth levers: network, add value, show results fast, ‘show not tell’
- •Content system: daily posting routine, idea capture, formatting, metric awareness
- •AI in content: previously used heavily for research and virality; now relies more on human thinking for differentiation