How I AI: How Intercom 2X'd engineering velocity with Claude Code | Brian Scanlan
CHAPTERS
- 0:00 – 5:01
Meet Brian Scanlan & Intercom’s urgency to “meet the moment” with AI
Claire introduces Brian and frames Intercom as a company that embraced AI both in customer-facing product and internally in engineering. Brian explains why the organization felt urgency: early tools hinted at value, but they were waiting for a true inflection point that would make adoption unquestionably transformative.
- •Intercom’s AI-first product posture made internal adoption easier to justify
- •Early coding tools felt incremental rather than transformative
- •A sense of impatience: if AI is huge for product, it must be huge for building too
- •The team was primed for a ‘breakthrough moment’ to go all-in
- 5:01 – 7:02
The inflection: Opus 4.6, Christmas break, and deciding to go all-in on Claude Code
Brian and Claire describe a sharp capability jump with the late-2025 models (Opus 4.6), after which prompting shifted from ‘tool babysitting’ to ‘ideas at speed.’ Intercom returned from the holidays convinced the world had changed and committed to standardizing on Claude Code rather than a mix of tools.
- •‘Imagination becomes the constraint’ once models are strong enough
- •Christmas break amplified experimentation and viral learning via social channels
- •Intercom moved from tool fragmentation (Cursor/Augment/etc.) to a focused bet
- •Personal hacking time becomes a stealth AI upskilling mechanism
- 7:02 – 12:50
Proving velocity: tracking merged PRs per R&D head and setting a 2× goal
Brian shows how Intercom operationalized adoption with measurable goals, using merged pull requests per R&D head as a leading indicator. Claire contextualizes this as ‘treating the org like a product,’ and they discuss why PR throughput—while imperfect—can still be a useful adoption signal in a high-trust culture.
- •CTO-set target: double R&D throughput using PRs as a proxy metric
- •Metrics include all of R&D (engineers, PMs, designers, TPMs) shipping code
- •Hiring growth means raw PR counts rose even more than the normalized 2× metric
- •CI became a bottleneck and was fixed; code review later became the new bottleneck
- •High-trust assumption: focus on outcomes over gaming the metric
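As a rough illustration of the normalized metric discussed in this chapter, the sketch below divides merged PRs by R&D headcount and compares two periods. The `Period` type and every number in it are made up for illustration; none of it is Intercom data.

```python
from dataclasses import dataclass


@dataclass
class Period:
    label: str
    merged_prs: int     # merged pull requests in the period (hypothetical)
    rnd_headcount: int  # everyone in R&D, not only engineers (hypothetical)


def prs_per_head(period: Period) -> float:
    """Merged PRs normalized by R&D headcount."""
    if period.rnd_headcount <= 0:
        raise ValueError("headcount must be positive")
    return period.merged_prs / period.rnd_headcount


def throughput_multiple(baseline: Period, current: Period) -> float:
    """How many times higher the normalized throughput is now."""
    return prs_per_head(current) / prs_per_head(baseline)


# Invented numbers: headcount grows 20% while merged PRs grow 2.4x.
baseline = Period("baseline", merged_prs=3000, rnd_headcount=300)
current = Period("current", merged_prs=7200, rnd_headcount=360)

normalized = throughput_multiple(baseline, current)
raw = current.merged_prs / baseline.merged_prs
print(f"{normalized:.1f}x normalized, {raw:.1f}x raw")  # 2.0x normalized, 2.4x raw
```

With these invented figures the raw PR count grows faster than the per-head figure, which is the effect of hiring growth noted in the bullets above.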
- 12:50 – 14:27
Agent-first work: reimagining technical workflows from first principles
Brian argues the real change isn’t ‘work faster,’ but redesigning how work happens when agents are the default. They describe an “agent-first” future where alarms, planning, and delivery involve agents doing the first pass, freeing humans for higher-level concerns and better quality.
- •Shift mindset from speed pressure to workflow redesign
- •Expectation: most technical work becomes agent-first in the near term
- •Even with today’s models, orgs can keep migrating work into agentic flows
- •Requires openness to change and systematic enablement, not just tool access
- 14:27 – 21:22
Cost tradeoffs: AI token spend as an investment (for now)
Claire presses on the exploding bill as usage scales. Brian explains Intercom’s current stance: run the best models broadly (e.g., Opus with large context) to maximize learning and compounding gains, postponing optimization until after major benefits are captured.
- •AI spend rises like ‘hiring whole new offices’—material and fast-growing
- •Intercom prefers speed and learning now; optimize costs later
- •Acknowledges not every company can take the same ‘run Opus everywhere’ posture
- •Cheaper/faster models exist; model routing is a future optimization phase
- 21:22 – 24:03
Demo: shipping a Rails monolith redirect with Claude Code
Brian demonstrates a small change in Intercom’s large Ruby on Rails monolith: adding a redirect with a lobster emoji. The demo becomes a window into how Intercom uses AI to accelerate routine work while keeping guardrails for correctness and workflow consistency.
- •Using Claude Code inside a mature, multi-million-line Rails monolith
- •AI rapidly finds the right location for changes and proposes a PR flow
- •Humans still sanity-check details (e.g., correct URL)
- •The goal is to eliminate tedious work (like trivial redirects) from human focus
- 24:03 – 26:33
Raising PR description standards: custom ‘Create PR’ skill + enforcement hooks
Intercom discovered that AI-generated PR descriptions were degrading: they restated the code changes instead of explaining intent. They built a ‘Create PR’ skill that uses session context to produce better intent-driven descriptions, then enforced it via hooks that block PR creation unless the skill is used.
- •Problem: AI PR descriptions regurgitated code instead of explaining intent
- •Built an LLM judge to detect declining PR description quality over time
- •Solution: ‘Create PR’ skill leverages session context to write better descriptions
- •Enforcement: hook blocks GitHub CLI PR creation unless the skill is used
- •Result: measured improvement in PR description quality (per internal evaluation)
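The enforcement idea can be sketched as a small decision function. This is not Intercom's hook: the event shape (`tool_input.command`), the `skill_used` flag, and the blocking exit code are all assumptions made for illustration, and the wiring that reads the event and exits the process is left out.

```python
import shlex

# Assumed convention: a non-zero exit code tells the agent the call was blocked.
BLOCK_EXIT_CODE = 2


def invokes_gh_pr_create(command: str) -> bool:
    """Return True if the shell command contains `gh pr create`."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()
    return any(
        tokens[i:i + 3] == ["gh", "pr", "create"]
        for i in range(len(tokens) - 2)
    )


def handle_event(event: dict, skill_used: bool) -> tuple[int, str]:
    """Decide whether to block a pending shell command.

    `event` is a hypothetical payload describing the command the agent
    is about to run; `skill_used` stands in for however the real system
    records that the 'Create PR' skill ran in this session.
    """
    command = event.get("tool_input", {}).get("command", "")
    if invokes_gh_pr_create(command) and not skill_used:
        return BLOCK_EXIT_CODE, "Use the 'Create PR' skill instead of calling gh pr create directly."
    return 0, ""


code, message = handle_event(
    {"tool_input": {"command": "git push && gh pr create --fill"}},
    skill_used=False,
)
print(code, message)
```

Returning a message along with the exit code gives the agent a concrete next step rather than a bare failure.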
- 26:33 – 30:15
Toward a ‘software factory’: predictable standards without killing craftsmanship
Claire and Brian discuss how skills and hooks mirror CI/CD but move standards upstream into the act of building. Brian frames the approach as building a “software factory” that produces consistent, high-quality outputs—while Claire notes this can actually improve developer experience and morale.
- •Skills provide deterministic guardrails earlier than traditional CI/CD
- •Factory metaphor: predictable, repeatable quality standards at scale
- •Upstream guardrails reduce reliance on wikis/SOPs and human memory
- •Quality constraints can make fast shipping feel better—not micromanaged
- 30:15 – 39:20
Telemetry stack: Honeycomb skill tracking + session collection to S3 + insights tools
Brian shows how Intercom avoids ‘flying blind’ by instrumenting usage. Skill invocations are tracked in Honeycomb for visibility and adoption, while raw Claude Code session data is collected (with anonymization) into S3 for deeper analysis and personalized coaching feedback.
- •Honeycomb dashboards track which skills are invoked and when
- •Shared instrumentation enables skill creators to understand real adoption
- •Sessions are collected to S3 and anonymized to protect privacy
- •Internal tooling gives users personalized usage insights and improvement feedback
- •Goal: identify drop-off patterns, effective skills, and areas needing refinement
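One way to reconcile anonymized session data with per-user feedback is keyed pseudonymization. The sketch below is a minimal example under stated assumptions: the session fields (`user`, `messages`), the email pattern, and the use of HMAC are illustrative choices, not a description of Intercom's pipeline, and the upload to S3 is omitted.

```python
import hashlib
import hmac
import re

# Simple email pattern for illustration; real redaction would need broader rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(user_id: str, secret: bytes) -> str:
    """Map a user id to a stable token that is hard to reverse without the secret."""
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_session(session: dict, secret: bytes) -> dict:
    """Return a copy of a (hypothetical) session record with identifiers removed."""
    return {
        "user": pseudonymize(session["user"], secret),
        "messages": [EMAIL_RE.sub("[email]", text) for text in session["messages"]],
    }


secret = b"rotate-me"  # placeholder; a real key would come from a secret store
session = {
    "user": "jane.doe",
    "messages": ["Fix the redirect", "Ping jane.doe@example.com when done"],
}
print(anonymize_session(session, secret))
```

Because the same user always maps to the same token, usage can still be aggregated per person without storing the name alongside the session.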
- 39:20 – 42:16
Skills repository at scale: distribution via IT sync, core vs team-specific skills, evals
Brian tours the internal GitHub repo powering Intercom’s plugin/skills ecosystem and explains how they reliably distribute it. They bypassed flaky plugin update mechanisms by syncing to laptops via IT, and they maintain a quality bar (including evals) for foundational ‘everyone gets it’ skills.
- •Repository houses a growing set of plugins/skills contributed across teams
- •Distribution strategy: IT sync to laptops for reliability (avoid flaky plugin updates)
- •Layering: minimal base plugin for everyone + higher-bar ‘developer tools’ set
- •Quality controls: evals/tests required for core foundational skills
- •Close partnership with IT becomes a deployment ‘cheat code’
- 42:16 – 56:18
Deep dive: ‘Flaky specs’ skill and the ‘and then’ workflow to reach ~100× capability
Brian explains how a flaky-test-fixing skill evolved from ‘as good as a human’ to something approaching the level of a distinguished engineer. The breakthrough is the iterative “and then” pattern: fix one, document the novel fix, update the skill, fan out to similar failures, and keep learning via feedback loops and real test runs.
- •Start with a clear, testable goal and a backlog of real flaky failures
- •Use historical data to build checklists and classify failure patterns
- •Feedback loops: run builds, validate fixes, and iteratively refine instructions
- •Self-improvement: when a fix is novel, update the skill/playbook in-session
- •Fan-out: find similar specs impacted by the same root cause and fix them too
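The “and then” pattern in the bullets above can be sketched as a loop over a backlog. Everything here is hypothetical: the `Failure` and `Playbook` types, the pre-labeled `root_cause` field (standing in for the diagnosis an agent would do), and the log strings that stand in for real fixes and test runs.

```python
from dataclasses import dataclass, field


@dataclass
class Failure:
    spec: str        # name of the flaky spec (hypothetical)
    root_cause: str  # stand-in for the result of diagnosing the failure


@dataclass
class Playbook:
    known_causes: set[str] = field(default_factory=set)


def run_and_then_loop(backlog: list[Failure], playbook: Playbook) -> list[str]:
    """Fix one failure, document novel causes, then fan out to siblings."""
    log: list[str] = []
    remaining = list(backlog)
    while remaining:
        failure = remaining.pop(0)
        log.append(f"fix {failure.spec}")
        # Self-improvement: a cause not yet in the playbook gets documented.
        if failure.root_cause not in playbook.known_causes:
            playbook.known_causes.add(failure.root_cause)
            log.append(f"document {failure.root_cause}")
        # Fan-out: apply the same fix to other specs with the same cause.
        for sibling in remaining:
            if sibling.root_cause == failure.root_cause:
                log.append(f"fan-out fix {sibling.spec}")
        remaining = [f for f in remaining if f.root_cause != failure.root_cause]
    return log


backlog = [
    Failure("spec_a", "time-dependent"),
    Failure("spec_b", "order-dependent"),
    Failure("spec_c", "time-dependent"),
]
for step in run_and_then_loop(backlog, Playbook()):
    print(step)
```

The playbook grows as the loop runs, so later failures with a known cause need no new documentation step.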
- 56:18 – 1:03:49
Customer implications: making SaaS agent-friendly (CLIs, hints, and onboarding flows)
Intercom’s internal agent experience changes how Brian thinks about product UX: agents increasingly ‘decide’ solutions, sometimes building instead of buying. He argues SaaS must become agent-friendly through discoverable automation surfaces (CLIs/MCP/APIs) and helpful hints that guide agents through signup/onboarding steps like email verification and content setup.
- •Agents often default to ‘build it’ unless products are easy for agents to adopt
- •Agent-friendly surfaces: CLIs, MCP, REST APIs, and possibly ephemeral/multi-step APIs
- •Use ‘helpful hint’ style guidance to steer agents through critical steps
- •Onboarding should anticipate agent workflows (e.g., verifying email via accessible tools)
- •Discoverability and documentation must work for non-human users too
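A ‘helpful hint’ can be as simple as a field in a response that tells the agent what to do next. The sketch below is an invented example: the `signup_response` function, the status values, and the hint wording are assumptions, not any product's actual API.

```python
def signup_response(email_verified: bool) -> dict:
    """Build a (hypothetical) signup status payload with a next-step hint."""
    if email_verified:
        return {
            "status": "ready",
            "hint": "Account is active. Next, add your first content via the API or CLI.",
        }
    return {
        "status": "pending_verification",
        "hint": "Check the signup inbox for a verification link, open it, then retry this call.",
    }


print(signup_response(email_verified=False)["hint"])
```

Putting the next step in the payload means an agent does not have to infer it from documentation written for human readers.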
- 1:03:49
Invisible conversion drop-off in agent workflows + lightning round on culture and skeptics
Claire highlights a new risk: in agent-driven onboarding, drop-off can be invisible—users just hit escape and switch approaches. In the lightning round, Brian describes improved team fun and energy, the importance of leaders granting permission and absorbing risk, and how faster feedback loops make work more varied and satisfying.
- •Agent conversion drop-off can be as simple as ‘press escape’—harder to instrument
- •Culture impact: more fun, faster feedback loops, broader individual impact
- •Leadership lesson: give permission, push boundaries responsibly, and own failures
- •Skeptic management: set a clear vision (agent-first) and enable experimentation
- •Closing reflections: builders converge across roles; backlog zero feels achievable
Quality and customer value: shipping faster without ‘slop’
They address skepticism that higher PR volume means lower quality. Brian shares leading and trailing indicators: reduced time from first code to customer-visible updates, increased feature volume, incident monitoring, and external analysis suggesting code quality is improving.
- •Lead-to-ship time (first line of code → customer update) is tracked and trending down
- •Higher PR volume appears to correlate with real shipped customer features
- •Incident/outage monitoring shows no alarming increase in production issues
- •Stanford research collaboration suggests measured code quality is improving
- •AI magnifies strengths/weaknesses—mature delivery practices help Intercom benefit