Aaron Levie and Steven Sinofsky on the AI-Worker Future

What exactly is an AI agent, and how will agents change the way we work? In this episode, a16z general partners Erik Torenberg and Martin Casado sit down with Aaron Levie (CEO, Box) and Steven Sinofsky (a16z board partner; former Microsoft exec) to unpack one of the hottest debates in AI right now.

They cover:
- Competing definitions of an “agent,” from background tasks to autonomous interns
- Why today’s agents look less like a single AGI and more like networks of specialized sub-agents
- The technical challenges of long-running, self-improving systems
- How agent-driven workflows could reshape coding, productivity, and enterprise software
- What history, from the early PC era to the rise of the internet, tells us about platform shifts like this one

The conversation moves from deep technical questions to big-picture implications for founders, enterprises, and the future of work.

Timecodes:
0:00 Introduction: The Evolution of AI Agents
0:36 Defining Agency and Autonomy
1:39 Long-Running Agents and Feedback Loops
4:27 Specialization and Task Division in AI
6:04 Anthropomorphizing AI and Economic Impact
9:10 Predictions, Progress, and Platform Shifts
11:31 Recursive Self-Improvement and Technical Challenges
13:13 Hallucinations, Verification, and Expert Productivity
16:16 The Role of Experts and Tool Adoption
22:14 Changing Workflows: Agents Reshaping Work Patterns
45:55 Division of Labor, Specialization, and New Roles
48:47 Verticalization, Applied AI, and the Future of Agents
54:44 Platform Competition and the Application Layer

Resources:
Find Aaron on X: https://x.com/levie
Find Martin on X: https://x.com/martin_casado
Find Steven on X: https://x.com/stevesi

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details, please see a16z.com/disclosures.

Erik Torenberg (host), Martin Casado (host), Steven Sinofsky (guest)

Aug 25, 2025 · 56m

CHAPTERS

From chat UI to background “workers”: what AI agents are becoming
The discussion opens by reframing the “talking to a chatbot” form factor as a temporary phase. The panel argues the endpoint is autonomous, background-running software that does real work with minimal user intervention.
Defining agency vs autonomy: long-running tasks and self-feedback
They separate autonomy (running for a long time) from true agency (taking outputs and feeding them back as new inputs). This introduces the technical and safety constraints of closed-loop behavior and why check-ins are needed.
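As a rough illustration of the closed-loop idea in this chapter, the sketch below feeds each output back in as the next input and pauses for a check-in every few steps. The names `run_agent`, `toy_step`, and `toy_check_in` are hypothetical, and the step function merely stands in for a model call; none of this comes from the episode itself.

```python
from typing import Callable, List


def run_agent(
    step: Callable[[str], str],
    check_in: Callable[[int, str], bool],
    initial_input: str,
    max_steps: int = 10,
    check_every: int = 3,
) -> List[str]:
    """Run a closed loop: each output becomes the next input.

    Every `check_every` steps, `check_in` decides whether the loop
    may continue; `max_steps` is a hard upper bound on iterations.
    """
    history: List[str] = []
    current = initial_input
    for i in range(1, max_steps + 1):
        current = step(current)      # output of this step...
        history.append(current)      # ...is recorded and fed back in
        if i % check_every == 0 and not check_in(i, current):
            break                    # the reviewer stopped the run
    return history


def toy_step(text: str) -> str:
    # Hypothetical stand-in for a model call: just grows the text.
    return text + "!"


def toy_check_in(step_number: int, output: str) -> bool:
    # Hypothetical reviewer: allow the run to continue while output is short.
    return len(output) < 10


history = run_agent(toy_step, toy_check_in, "plan")
```

Under these assumptions, a merely long-running task would correspond to a large `max_steps`, while the feedback of `current` into `step` is what the chapter calls agency; the check-in is the point where a person or policy can halt drift.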
Why multi-agent decomposition is winning (and the return of Unix-style tools)
Rather than one monolithic AGI-like system, they see a practical architecture emerging: many specialized agents orchestrated together. Smaller, scoped tasks reduce drift and increase reliability, echoing the Unix philosophy of small composable tools.
Stop anthropomorphizing AI: clarifying AGI and the economics of impact
They argue AGI talk often imports human/robot narratives that distort economic reality. Even very capable systems don’t automatically imply job destruction or immediate equilibrium shifts—costs, incentives, and deployment constraints still matter.
Predictions, timelines, and exponential progress: why “by 2027” is a trap
The group critiques date-based forecasting, noting that exponential improvement breaks intuition and makes metrics contentious. Instead, they suggest focusing on capability drivers like compute, data, and model/tool integration.
Recursive self-improvement: feedback loops are real, but not magic
They unpack “recursive self-improvement” as a slogan that hides difficult control-theory questions. Feedback loops can converge, diverge, or asymptote; improvement doesn’t imply runaway superintelligence, especially without well-defined distributions and constraints.
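To make the "converge, diverge, or asymptote" point concrete, here is a toy sketch that assumes a simple linear feedback rule, x_next = gain * x + bias. This model and the name `iterate_feedback` are illustrative assumptions, not something the panel describes; real systems are far less tidy.

```python
def iterate_feedback(gain: float, bias: float = 1.0, x0: float = 0.0, steps: int = 50) -> float:
    """Apply x <- gain * x + bias repeatedly and return the final value.

    For |gain| < 1 the value approaches the fixed point bias / (1 - gain);
    for |gain| > 1 it grows without bound.
    """
    x = x0
    for _ in range(steps):
        x = gain * x + bias
    return x


settled = iterate_feedback(gain=0.5)   # approaches 1 / (1 - 0.5) = 2.0
runaway = iterate_feedback(gain=1.5)   # grows larger with every step
```

In this toy model the same loop structure either levels off at a ceiling or runs away, depending only on the gain, which is why the existence of a feedback loop says little by itself about where it ends up.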
Hallucinations to verification culture: enterprise adoption is maturing
Enterprise attitudes have shifted from initial excitement to concern about hallucinations to a more nuanced operational stance. As model quality improves and tooling (RAG, context handling) matures, companies adopt AI for more critical tasks—paired with systematic review.
Experts get supercharged: tool mastery, prompting, and ‘formal language’ returning
They argue AI amplifies experts first because experts can ask better questions and detect errors. Prompting isn’t disappearing; it’s becoming more like jargon/formal language—efficient communication among domain experts—yielding better outputs with richer instructions.
Workflows invert: tools don’t just automate work—work adapts to tools
A core theme is the moment when people stop forcing new tech into old processes and instead redesign processes around the new capability. They draw analogies to phones losing keypads, expense reporting evolving from forms to receipts, and email wiping out formatted agendas.
Abdicating logic vs reducing work: platform shifts and lost control
They debate whether using LLMs means apps are ‘abdicating logic’ to third parties, contrasting with prior shifts that mostly abstracted resources (cloud) or devices (drivers). The broader point: each platform shift changes both user interaction and what developers build against.
Parallel work via background agents: PR-level control and context-rot constraints
They explore why senior engineers run many background coding agents and review at the pull-request layer. The driver is practical: context windows degrade (“context rot”), so partitioning work across scoped agents (often aligned to microservices) improves reliability and throughput.
Division of labor accelerates: agents reshape org design and task serialization
Agents enable parallelization of work that was previously serialized by human bandwidth and tooling constraints. They forecast a shift where individuals orchestrate many sub-agents across workstreams (events, legal matters, etc.), with new ‘AI productivity’ roles emerging.
Verticalization and applied AI: why domain-specific agents create thousands of companies
They argue the future is highly vertical: agents that do specific jobs deeply (payroll specialist, signing, niche workflows). As pretraining’s broad generalization gives way to post-training, RL, and enterprise-private data, applied companies gain durable advantage.
Platform competition and the application layer: why model providers won’t eat everything
They push back on fears that foundation model companies will subsume all apps, citing historical overestimation of incumbents’ ability to dominate every category. Aggressive ‘Sherlocking’ chills ecosystems, and it’s operationally hard to go deep in dozens of verticals—leaving room for specialists.

