No Priors Ep. 41 | With Imbue Co-Founders Kanjun Qiu and Josh Albrecht

00:00 - Introduction to Imbue
04:55 - The Spectrum of Agent Tasks
10:23 - Specialization and Generalization With Agents
14:08 - Code and Language in AI Agents
21:00 - Evaluating AI Development Tools Efficiently
26:39 - Prioritizing GPU Usage

Sarah Guo (host), Kanjun Qiu (guest), Josh Albrecht (guest), Elad Gil (host)

Nov 15, 2023, 32 min

WHAT IT’S REALLY ABOUT

Imbue Envisions Reliable Reasoning Agents To Transform Everyday Computer Use

Imbue co-founders Kanjun Qiu and Josh Albrecht discuss their mission to build AI agents that can reason, act autonomously, and write robust code, moving beyond today’s chatbots and token-prediction LLMs.
They argue that current computers require constant micromanagement and that the next revolution is agents that understand goals, plan, and reliably execute tasks on users’ behalf.
A core focus is reasoning and reliability: building architectures, evaluation methods, and tooling that transform fragile agent loops into production-ready systems, with coding agents as a primary proving ground.
They foresee a future where everyone effectively becomes a “software engineer” through natural-language programming of agents, leading to more, better, and highly customized software built with heavy compute but relatively small teams.

IDEAS WORTH REMEMBERING

5 ideas

Treat agent reliability as a first-class engineering problem.

Imbue frames current agent limitations primarily as reliability issues—getting the system to consistently choose good plans, know when it’s uncertain, and correct its own errors—rather than expecting a single massive model to magically solve everything.

Use real, high-frequency workflows to drive research (“serious use”).

They deliberately build agents for tasks they themselves need daily (coding, internal operations, recruiting), which exposes failure modes, forces better tooling, and incrementally pushes reliability from ~60% toward production-grade performance.

Combine general agents with specialized sub-agents and smaller models.

Agent workflows can mix large general-purpose models for planning with specialized, smaller, cheaper models or sub-agents for repeated subtasks, achieving both cost efficiency and strong performance.

Leverage code as both a reasoning medium and an evaluation goldmine.

Coding tasks provide objective signals (tests passing, type checks, style constraints) and allow a smooth spectrum between fuzzy language-based reasoning and concrete hard-coded logic, making them ideal for building and measuring reasoning agents.

Continuously decompose evaluation into granular, measurable criteria.

Rather than only asking “is the output correct?”, they break evaluation into sub-metrics such as style, minimal diffs, variable naming, and test quality, which yields richer feedback and better training signals for agents.
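The idea of splitting evaluation into sub-metrics can be illustrated with a minimal Python sketch. It is not Imbue's tooling: the `Candidate` fields, the two criteria (`pass_rate`, `diff_minimality`), and their scoring formulas are all hypothetical, chosen only to show how a single "is it correct?" judgment might be replaced by a per-criterion score report.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Candidate:
    """Hypothetical summary of one agent-produced code change."""
    tests_passed: int
    tests_total: int
    lines_changed: int
    lines_in_file: int


def pass_rate(c: Candidate) -> float:
    """Fraction of tests that pass (0.0 when there are no tests)."""
    return c.tests_passed / c.tests_total if c.tests_total else 0.0


def diff_minimality(c: Candidate) -> float:
    """Illustrative score: 1.0 for an untouched file, lower for larger diffs."""
    if c.lines_in_file == 0:
        return 0.0
    return max(0.0, 1.0 - c.lines_changed / c.lines_in_file)


# Each criterion maps a candidate to a score in [0, 1].
# Further criteria (naming, style, test quality) would be added here.
CRITERIA: Dict[str, Callable[[Candidate], float]] = {
    "pass_rate": pass_rate,
    "diff_minimality": diff_minimality,
}


def evaluate(c: Candidate) -> Dict[str, float]:
    """Return one score per criterion instead of a single pass/fail verdict."""
    return {name: score(c) for name, score in CRITERIA.items()}


if __name__ == "__main__":
    report = evaluate(Candidate(tests_passed=8, tests_total=10,
                                lines_changed=5, lines_in_file=100))
    for name, value in report.items():
        print(f"{name}: {value:.2f}")
```

Keeping the scores separate, rather than collapsing them into one number, preserves the richer feedback the summary describes: a change can pass every test and still be flagged for an unnecessarily large diff.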

WORDS WORTH SAVING

5 quotes

Our computers today need to be micromanaged. Nothing really happens unless I'm in front of it turning all these little knobs.

— Kanjun Qiu

The real promise of AI is if we can get systems that can actually act on our behalf and accomplish goals.

— Josh Albrecht

Writing agents today feels like writing code in assembly.

— Kanjun Qiu

If you want something to actually execute the general algorithm for addition, you need a thing that works in a different way than a pure language model.

— Josh Albrecht

In the future everyone will be a software engineer, and so everyone will need dev tools.

— Kanjun Qiu

Origins of Imbue and early experience with autonomous recruiting agents
Why agents need different architectures than pure large language models
Reliability, reasoning, and evaluation as central challenges for agents
Tradeoffs between general-purpose models, specialization, and compute cost
Code as a core domain for reasoning, evaluation, and agent capabilities
Internal strategy: serious-use agents, incremental autonomy, and tooling
Long-term vision: natural-language programming and personalized software agents

High quality AI-generated summary created from speaker-labeled transcript.
