No Priors Ep. 47 | With Sourcegraph CTO Beyang Liu


No Priors · Jan 18, 2024 · 46 min

Sarah Guo (host), Beyang Liu (guest)

- Origin and core thesis of Sourcegraph as a code understanding and search company
- Design and capabilities of Cody, Sourcegraph’s AI coding assistant
- Retrieval-augmented generation (RAG), graph context, and search pipelines for code
- Handling real-world code quality and customizable context in enterprise codebases
- Inner loop vs. outer loop in software development and where AI fits today
- Paths toward autonomous “AI engineers” and the limits of current agent architectures
- Future of software engineering skills, CS fundamentals, and team coordination with AI


Sourcegraph CTO Beyang Liu on AI’s Real Impact on Coding Workflows

Sourcegraph CTO and co-founder Beyang Liu explains how their AI assistant Cody builds on a decade of work in code search and code understanding to make developers dramatically more productive. He details how retrieval-augmented generation (RAG), graph-based code context, and search-style pipelines are as critical as the underlying large language models. The conversation explores near-term “inner loop” tools like completions and targeted commands versus the longer-term goal of AI engineers that can go from issue description to production-ready pull requests. Liu also shares a forward-looking view of software development where AI compresses boilerplate work, magnifies the importance of CS fundamentals and domain expertise, and helps teams operate with far greater cohesion and visibility.

Key Takeaways

Context and retrieval quality are at least as important as the base model.

Cody’s performance jumps “night and day” when its context engine pulls the right code and docs into the model’s window, showing that smart search, indexing, and graph traversal can rival or surpass benefits from only upgrading to bigger models.


Treat AI coding systems as search pipelines attached to LLMs.

Sourcegraph uses a two-stage search architecture—multiple retrievers (keyword, regex, embeddings, symbol-aware indexing) plus a re-ranking layer—to maximize recall then precision, mirroring classic search engines and generalizing well to other domains.
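The two-stage architecture described above can be sketched in a few lines. This is a minimal, illustrative pipeline — the retrievers, scoring, and function names here are stand-ins, not Sourcegraph's actual implementation; the re-ranker in particular is a trivial keyword-overlap score standing in for a learned model.

```python
# Sketch of a two-stage retrieval pipeline: several cheap retrievers
# maximize recall, then a single re-ranking stage restores precision.
import re


def keyword_retriever(query: str, corpus: dict[str, str]) -> set[str]:
    """Recall stage A: ids of files containing any query keyword."""
    words = set(query.lower().split())
    return {fid for fid, text in corpus.items()
            if words & set(text.lower().split())}


def regex_retriever(pattern: str, corpus: dict[str, str]) -> set[str]:
    """Recall stage B: ids of files matching a regex."""
    rx = re.compile(pattern)
    return {fid for fid, text in corpus.items() if rx.search(text)}


def rerank(query: str, candidates: set[str], corpus: dict[str, str],
           top_k: int = 2) -> list[str]:
    """Precision stage: score candidates by keyword overlap (a stand-in
    for a learned re-ranking model) and keep only the best few."""
    words = set(query.lower().split())
    scored = sorted(
        candidates,
        key=lambda fid: -len(words & set(corpus[fid].lower().split())))
    return scored[:top_k]


def retrieve(query: str, pattern: str, corpus: dict[str, str]) -> list[str]:
    # Stage 1: union the retrievers' results to maximize recall.
    candidates = keyword_retriever(query, corpus) | regex_retriever(pattern, corpus)
    # Stage 2: re-rank for precision before filling the context window.
    return rerank(query, candidates, corpus)
```

Unioning retrievers before re-ranking mirrors classic search-engine design: it is cheap to over-fetch, and the expensive precision model only runs on the shortlist.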


AI can be steered toward better code by curating what it sees.

Enterprises want Cody to ignore legacy or “anti-pattern” areas and focus on “golden” code regions, and RAG makes it straightforward to filter or weight context sources so generated code reflects desired standards and architectures.


Near-term value is in accelerating the developer inner loop, not full autonomy.

Current AI tools shine in in-editor completions, chat, and targeted commands like generate-tests or explain-code that reduce toil, while fully automatic issue-to-PR workflows remain limited by compounding failure and model reliability.


There are two main strategies toward autonomous AI engineers, both with trade-offs.

You can decompose tasks into many smaller chained steps (suffering from compounding error) or attempt large, context-heavy one-shot diffs with sampling and validation (suffering from cost and search space explosion), so better context and validation are crucial.


Small, well-chosen models plus strong context can rival large proprietary models.

Cody uses models like StarCoder and Mistral; with good retrieval these smaller models can match or approach larger systems for completions, bringing major speed and cost advantages—though very large models still help for complex chat reasoning.


CS fundamentals and domain expertise become more, not less, important with AI.

Liu argues AI will compress middleware and boilerplate, but humans who understand data structures, algorithms, and the problem domain will be best positioned to design the right abstractions and high-impact features that AI then helps implement.


Notable Quotes

The bread and butter really is, you have code search that lets you pinpoint the needle in the haystack, and then you walk the reference graph of code.

Beyang Liu

The dirty secret is that keyword search can probably get you more than 90% of the way there.

Beyang Liu

When our context fetching engine works, the quality of code generated by Cody is like night and day.

Beyang Liu

We always want to do the simplest thing first, because oftentimes someone comes along with a much dumber, cheaper baseline that works as well or better than the fancy model.

Beyang Liu

My maybe contrarian hot take is that CS fundamentals, if anything, are going to grow in importance.

Beyang Liu

Questions Answered in This Episode

How far can retrieval and context quality be pushed before you truly need bespoke, fine-tuned code models to see additional gains?


What concrete metrics does Sourcegraph use to evaluate 'good' context and end-to-end success for Cody beyond simple accuracy or latency?


How should engineering leaders restructure teams and processes if individual developers can own much larger feature scopes with AI tools like Cody?


Where is the practical tipping point at which fully autonomous issue-to-PR agents become safe and economically viable for production systems?


For students learning to code in an AI-native world, what specific CS fundamentals and domain skills should they prioritize to stay relevant over the next decade?


Transcript Preview

Sarah Guo

(instrumental music plays) Hi, listeners, and welcome to another episode of No Priors. This week, we're talking to Beyang Liu, the co-founder and CTO of Sourcegraph, which builds tools that help developers innovate faster. Their most recent launch was an AI coding assistant called Cody. We're excited to have Beyang on to talk about how AI changes software development. Welcome.

Beyang Liu

Cool. Thanks, Sarah. It's great to be on. Thanks for having me.

Sarah Guo

Yeah. So you guys founded Sourcegraph all the way back in 2013, right? I feel like I met you and Quinn at GopherCon either that year or the year after. Do you remember?

Beyang Liu

Uh, yeah, I think that's right. We met at one of those, like, after conference, uh, events. And I remember you asked me a bunch of questions about developer productivity and, and code search and what we were doing back then.

Sarah Guo

Many listeners to the podcast are technical, but can you describe the core thesis of the company?

Beyang Liu

Quinn and I are both developers by background. We felt that there was kind of, like, this gap between the promise of programming, being in flow and getting stuff done and creating something, uh, new that everyone experiences. It's probably the reason that many of us got into programming in the first place, the joy of creation. Then you compare that with, uh, the day-to-day of most professional software engineers, which is, uh, a lot of toil and a lot of drudgery. When we kind of drilled into that, you know, why is that, I think we both realized that we're spending a lot of our time in the process of reading and understanding the existing code, uh, rather than, uh, building new features, 'cause all that is a prerequisite for being able to build, uh, quickly and efficiently. And that was a pain point that we saw again and again, both with the people that we collaborated with, uh, inside, uh, the company we were working at at the time, Palantir, as well as a lot of the enterprise customers that Palantir was working with. So we were kind of dropshipping into large banks and Fortune 500 companies and building software, kind of embedded with their software teams. And if anything, the, the pain points they had around understanding legacy code, uh, and figuring out the context, uh, of the code base so they could work, uh, effectively was, you know, 10X, 100X of, of the challenges that we were experiencing. So it was partially, you know, scratching our own itch and partially like, hey, like, the pain we feel is reflected across all these different industries trying to build software.

Sarah Guo

Yeah, and we're gonna come back to context and how important it is for, um-

Beyang Liu

(laughs)

Sarah Guo

... using this generation of AI. But I want to go, actually go back to, like, some roots you have in, in thinking about AI and your interning at, um, the Stanford AI Research Lab way back when.
