
Successfully coding with AI in large enterprises: Centralized rules, workflows for tech debt, & more
Claire Vo (host), Zach Davis (guest)
In this episode of How I AI, Claire Vo talks with Zach Davis about successfully coding with AI in large enterprises: centralized rules, in-repo documentation, tech-debt workflows, and hiring rigor. They argue that “vibe coding” doesn’t translate to enterprise-scale software, where reliability, maintainability, and team-wide consistency matter.
LaunchDarkly improved AI-agent performance by making the repo easier for both humans and LLMs: moving scattered docs into the repo, and centralizing guidance in a single rules source of truth that multiple tools can reference.
They demonstrate practical workflows: using Devin (and its wiki/knowledge) to generate and refine documentation/rules, and using Cursor/Claude to plan and execute incremental tech-debt cleanup with checklists that both bots and humans can follow.
They also show a hiring workflow where a custom GPT evaluates interview scorecard quality against a rubric and produces coachable feedback (including Slack-ready messages) to keep hiring standards consistent.
Key Takeaways
Treat AI adoption as an engineering enablement program, not an individual hack.
Zach emphasizes assigning clear responsibility to someone close to the code who experiments, identifies failure modes, and ensures skeptics get an early “aha” instead of a first-run failure.
What’s good for humans is good for LLMs—fix your repo’s “usability.”
They moved key guides (style, accessibility, frontend organization) into a repo docs directory so both engineers and tools can reliably find and apply standards without hunting through Confluence/Docs.
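One way to keep that "docs live in the repo" standard from eroding is a small CI check. The sketch below assumes hypothetical file names (LaunchDarkly's actual layout isn't given in the episode) and simply fails the build when a required guide is missing:

```python
# Hypothetical CI check: report key engineering guides that are
# missing from the repo's docs directory. The file names below are
# illustrative, not LaunchDarkly's actual layout.
from pathlib import Path

REQUIRED_DOCS = [
    "docs/style-guide.md",
    "docs/accessibility.md",
    "docs/frontend-organization.md",
]

def missing_docs(repo_root: str) -> list[str]:
    """Return the required guides that are absent from the repo."""
    root = Path(repo_root)
    return [doc for doc in REQUIRED_DOCS if not (root / doc).is_file()]
```

A CI job can call `missing_docs(".")` and exit nonzero if the list is non-empty, so both humans and LLMs can always count on the guides being where the rules say they are.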
Centralize rules once, then point every tool to the same source of truth.
Instead of maintaining separate per-tool rules files (for Claude, Cursor, and so on), they keep one centralized rules file that every tool references, so guidance is written and updated in a single place.
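The episode doesn't specify how the single source of truth fans out to each tool; a minimal sketch, assuming a canonical `docs/agent-rules.md` and illustrative per-tool target paths (check each tool's own docs for the real locations), is to copy the canonical file into every tool-specific location on each change:

```python
# Hypothetical sync script: one rules file is the source of truth,
# copied to the per-tool locations each assistant reads. Target
# paths are illustrative assumptions, not confirmed tool defaults.
from pathlib import Path

SOURCE = Path("docs/agent-rules.md")  # single source of truth

def sync_rules(source: Path, targets: list[Path]) -> int:
    """Copy the canonical rules into every tool-specific file."""
    text = source.read_text()
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)
    return len(targets)
```

Run it from a pre-commit hook or CI step so the per-tool copies can never drift from the canonical file.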
Use agents to draft rules/docs, but review with a fine-tooth comb.
Zach seeds structure using Devin’s wiki and agent output, then manually verifies correctness and trims rules to be concise, linking to deeper docs when needed.
Write rules that prevent domain ambiguity and common tool mistakes.
LaunchDarkly had cases where models confused “feature flags in the LaunchDarkly product” with “feature flags in code,” so they wrote explicit flagging guidance to improve correctness and automation success.
Turn tech debt into an agent-executable backlog with tiers and checkboxes.
They captured noisy test output, had an LLM categorize/quantify offenders, then created a migrations checklist (tiered tasks) so any agent (Cursor, Devin, background agents) can pick the “next unit of work,” produce PRs, and get reviewed/merged incrementally.
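The categorize-and-checklist step can be sketched in a few lines. This is an assumption-laden toy (real test logs need real parsing, and the LLM did the categorizing in the episode), but it shows the shape of the output: identical warnings grouped, ranked by frequency, and emitted as a tiered markdown checklist any agent can pick work from:

```python
# Hypothetical sketch: count repeated warnings from test output,
# rank the worst offenders, and emit a tiered markdown checklist
# that agents (or humans) can work through one checkbox at a time.
from collections import Counter

def build_checklist(log_lines: list[str], tier_size: int = 2) -> str:
    """Group identical warnings, then tier them by frequency."""
    counts = Counter(line.strip() for line in log_lines if "Warning" in line)
    ranked = counts.most_common()
    out = ["# Tech-debt migrations"]
    for i in range(0, len(ranked), tier_size):
        out.append(f"\n## Tier {i // tier_size + 1}")
        for msg, n in ranked[i:i + tier_size]:
            out.append(f"- [ ] Fix `{msg}` ({n} occurrences)")
    return "\n".join(out)
```

Because the checklist lives in the repo as markdown, each agent run can claim the next unchecked item, open a PR, and tick the box on merge.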
Use AI to raise hiring quality by coaching the coaches.
A custom GPT grades interview scorecards (not candidates) against a rubric, highlights strengths/improvements, and drafts a Slack message—helping a conflict-avoidant leader deliver consistent feedback and improving interviewer calibration over time.
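The episode doesn't share the actual GPT instructions; a hedged sketch of how such a prompt could be assembled (rubric items here are invented examples) looks like this:

```python
# Hypothetical sketch of the scorecard-grading prompt: the GPT grades
# the *scorecard* (not the candidate) against a rubric and drafts a
# Slack-ready message. The rubric items are illustrative assumptions.
RUBRIC = [
    "Cites specific evidence from the interview",
    "Ties observations to the role's competencies",
    "Gives a clear hire/no-hire signal with reasoning",
]

def grading_prompt(scorecard_text: str) -> str:
    """Assemble the instruction sent to a custom GPT."""
    criteria = "\n".join(f"- {item}" for item in RUBRIC)
    return (
        "Grade the interview SCORECARD below (not the candidate) against "
        "this rubric. List strengths, list improvements, then draft a "
        "short, kind Slack message to the interviewer.\n\n"
        f"Rubric:\n{criteria}\n\nScorecard:\n{scorecard_text}"
    )
```

Keeping the rubric in code (or a shared doc the prompt pulls from) is what makes the feedback consistent across interviewers over time.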
Notable Quotes
“Vibe coding is not an acceptable enterprise development strategy.”
— Claire Vo
“What’s good for humans is also good for LLMs.”
— Zach Davis
“Everyone was on their own journey… and that just doesn’t scale very well.”
— Zach Davis
“If it’s hard to get Devin up and running, it’s probably hard for your human developers to get up and running.”
— Zach Davis
“Technical debt is my favorite use case for AI to supercharge a medium-sized organization.”
— Zach Davis
Questions Answered in This Episode
What was the minimum viable structure of your .agents directory (top-level files/folders) before it started noticeably improving outputs across tools?
Can you share an example of a rule that specifically stopped the “feature flag product vs. code” confusion, and what the before/after behavior looked like?
How do you decide what belongs in concise agent rules vs. longer human docs—do you have a heuristic besides “keep it short”?
You mentioned tool-specific chunking (e.g., ~200-line chunks in Cursor). How do you validate what each tool actually reads and ensure your most important constraints are never skipped?
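The episode doesn't answer this question in the preview, but one way such a check could work, assuming an invented `MUST:` marker convention for non-negotiable constraints and a ~200-line read budget, is a lint pass over each rules file:

```python
# Hypothetical validation sketch: if a tool only reliably reads the
# first ~200 lines of a rules file, flag files over budget and flag
# must-follow constraints that appear after the cutoff. "MUST:" is
# an invented convention here, not a feature of any tool.
def check_rules(text: str, budget: int = 200):
    """Return (over_budget, late_musts) for one rules file."""
    lines = text.splitlines()
    over_budget = len(lines) > budget
    late_musts = [
        i + 1 for i, line in enumerate(lines)
        if "MUST:" in line and i >= budget
    ]
    return over_budget, late_musts
```

Running this in CI keeps the most important constraints near the top, where every tool is most likely to actually read them.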
In the tech-debt workflow, what acceptance criteria do you require before merging an agent-generated cleanup PR (tests, lint, screenshots, manual QA)?
Transcript Preview
vibe coding is not an acceptable enterprise development strategy. I love it. I can do a hundred commits a week by myself, on my side project, on my startup, but when you're working on a code base in a platform like LaunchDarkly that powers trillions and trillions of experiences every day, you can't take the same strategies and tactics that a vibe coder could take.
One of the things that I realized is what's good for humans is also good for LLMs, and so I really started with: How do we make sure that the repo is well set up for humans to know how to work in it? So we have front-end organization, we have accessibility, we have a JS style guide. So all... It's this very detailed documentation that we've put into the repo itself, rather than have it in other places. And this way, LLMs can access it, humans can access it, et cetera.
I think all the engineers out there are, like, crossing their fingers and hoping that there's one rules protocol to rule them all that shows up. And I think what you've shown is you can just create that yourself, and then that makes it much more scalable.
[upbeat music] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, we have a great episode for anybody trying to deploy AI agents in a real engineering team with a real codebase, not just vibe coding. We have Zach Davis, director of engineering at LaunchDarkly, who's gonna show us how he sets up centralized rules and docs for all his AI agents, uses AI to burn down tech debt, and keeps his hiring bar high. Let's get to it.
This episode is brought to you by WorkOS. AI has already changed how we work. Tools are helping teams write better code, analyze customer data, and even handle support tickets automatically. But there's a catch: these tools only work well when they have deep access to company systems. Your copilot needs to see your entire codebase. Your chatbot needs to search across internal docs. And for enterprise buyers, that raises serious security concerns. That's why these apps face intense IT scrutiny from day one. To pass, they need secure authentication, access controls, audit logs, the whole suite of enterprise features. Building all that from scratch is a massive lift. That's where WorkOS comes in. WorkOS gives you drop-in APIs for enterprise features, so your app can become enterprise-ready and scale upmarket faster. Think of it like Stripe for enterprise features. OpenAI, Perplexity, and Cursor are already using WorkOS to move faster and meet enterprise demands. Join them and hundreds of other industry leaders at workos.com. Start building today.
Zach, I'm so excited to have you here because I feel like I maybe turned you into an AI fiend at this point.
[chuckles] Before the show, we were talking about how many tools you're now using. So before we dive in, can you just tell us a quick list, maybe not-so-quick list, of all the AI tools the technology team at LaunchDarkly are now using?