How to digest 36 weekly podcasts without spending 36 hours listening | Tomasz Tunguz

How I AI · Aug 25, 2025 · 35m

Tomasz Tunguz (guest), Claire Vo (host)

- Podcast downloading + transcription automation
- Local-first tooling: Parakeet, Ollama, DuckDB
- FFmpeg audio conversion workflow
- Transcript cleanup prompts (Gemma 3)
- Daily summaries: themes, quotes, theses
- Entity extraction: libraries vs LLMs
- Blog drafting + iterative grading (AP English rubric)
- Terminal UX and developer latency
- Vector search with LanceDB over past posts
- Model “duke it out” technique (Claude vs Gemini)

Tomasz Tunguz’s terminal pipeline transcribes podcasts and drafts blog posts fast

Tomasz Tunguz built a “podcast ripper” to keep up with 36 podcasts without spending 36 hours listening, converting daily episodes into cleaned transcripts and structured summaries.

His pipeline downloads audio from podcast feeds, converts files with FFmpeg, transcribes locally (Whisper initially; now NVIDIA Parakeet), cleans transcripts with an LLM (Gemma 3), and tracks processing in DuckDB.

From each day’s transcripts, he generates summaries, key topics/themes, quotable highlights, startup/company mentions for potential CRM enrichment, draft tweets, and venture “investment theses.”
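One way to model that daily output is as a structured record that gets flattened into a reading document. The field names below are illustrative guesses at the sections the episode describes, not the actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class DailySummary:
    """One day's digest across all transcribed episodes (illustrative fields)."""
    date: str
    summaries: list[str] = field(default_factory=list)
    themes: list[str] = field(default_factory=list)
    quotes: list[str] = field(default_factory=list)        # quotable highlights
    companies: list[str] = field(default_factory=list)     # for CRM enrichment
    draft_tweets: list[str] = field(default_factory=list)
    investment_theses: list[str] = field(default_factory=list)

def render(day: DailySummary) -> str:
    """Flatten the record into a plain-text daily reading document."""
    sections = [
        ("Summary", day.summaries),
        ("Key themes", day.themes),
        ("Quotes", day.quotes),
        ("Companies mentioned", day.companies),
        ("Draft tweets", day.draft_tweets),
        ("Investment theses", day.investment_theses),
    ]
    lines = [f"Podcast digest for {day.date}"]
    for title, items in sections:
        if items:  # skip empty sections
            lines.append(title + ":")
            lines.extend("- " + item for item in items)
    return "\n".join(lines)
```

Keeping the output structured (rather than free-form prose) is what lets each section feed a different downstream workflow, such as CRM enrichment or tweet drafting.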

He also experiments with a blog-post generator that uses his archive (~2,000 posts) as style context, then iteratively grades and revises drafts using an “AP English teacher” rubric to reach an A-/~91 score before manual final editing/publishing.
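The grade-and-revise loop can be sketched as follows. The `grade` and `revise` callables stand in for LLM calls (the actual "AP English teacher" prompt isn't shown in the episode), and the 91 target corresponds to the A- threshold he mentions:

```python
from typing import Callable

def revise_until_a_minus(
    draft: str,
    grade: Callable[[str], int],       # e.g. an LLM prompted as an AP English teacher
    revise: Callable[[str, int], str], # e.g. an LLM asked to fix the graded weaknesses
    target: int = 91,                  # roughly an A-
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Grade the draft, revise, and repeat until it hits `target` or rounds run out."""
    score = grade(draft)
    for _ in range(max_rounds):
        if score >= target:
            break
        draft = revise(draft, score)
        score = grade(draft)
    return draft, score
```

The `max_rounds` cap matters in practice: without it, a model that plateaus below the target would loop forever, so the loop returns the best draft it reached for manual editing.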

Key Takeaways

Turn listening into a searchable reading workflow.

Tunguz prefers reading to listening because he can skim and jump ahead; converting podcasts to text makes high-volume audio content quickly digestible and easier to reuse.

A simple pipeline beats a “perfect” app for personal fit.

Instead of waiting for an off-the-shelf product, he built a terminal-based system that matches his exact workflow, then iterates quickly when requirements change.

Local-first can work—until a task needs bigger “brains.”

He aimed to run everything locally (Ollama, Parakeet, libraries), but found named-entity extraction improved dramatically when using more powerful LLMs rather than classic ML packages.
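A hedged sketch of the LLM-based extraction he switched to: the prompt wording and the JSON-array convention are illustrative, and the parsing helper tolerates the surrounding chatter models often add. Whatever model call sits in between (local via Ollama or a hosted API) is omitted here:

```python
import json

def company_extraction_prompt(transcript: str) -> str:
    """Ask an LLM to do the named-entity extraction that classic
    ML packages handled poorly on messy podcast transcripts."""
    return (
        "List every company or startup mentioned in this podcast transcript. "
        "Return ONLY a JSON array of name strings, no commentary.\n\n"
        + transcript
    )

def parse_companies(llm_reply: str) -> list[str]:
    """Pull the JSON array out of the model's reply, ignoring any prose around it."""
    start, end = llm_reply.find("["), llm_reply.rfind("]")
    if start == -1 or end == -1:
        return []
    return [str(name) for name in json.loads(llm_reply[start : end + 1])]
```

Constraining the output format this way is what makes the mentions usable downstream, for instance when feeding company names into CRM enrichment.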

Transcript cleanup mattered more early than it does now.

Cleaning transcripts helped traditional NER tools recognize companies/proper nouns; once he switched to stronger LLM extraction, cleanup became less critical.

Quotes are the highest-leverage output for decision-making.

Among summaries, themes, and topics, he values quotable passages most because they’re fast to scan and often spark concrete next steps like market maps or thesis work.

Use structured outputs to drive downstream actions.

Daily documents include investment theses, noteworthy observations for tweets, and mentioned startups; these outputs are designed to feed real workflows (internal discussion, CRM enrichment, content creation).

Style transfer in writing remains hard; iterate with grading.

Even with 2,000 prior posts as context and fine-tuning attempts, AI drafts still "sound like a computer." ...

Terminal UX can be a feature, not a bug.

He prefers the terminal for low latency and scriptability (batch actions, automations, integrations), arguing it reduces friction and frustration compared to GUI-heavy workflows.

Notable Quotes

“I have a list of 36 podcasts, but I don't have 36 hours every week to listen to 36 podcasts.”

Tomasz Tunguz

“The part that's most valuable for me are these quotes.”

Tomasz Tunguz

“Everything that I can easily replace with a single prompt is not going to have any value.”

Tomasz Tunguz

“One of the techniques that I found the most effective… is to ask it to grade it like an AP English teacher.”

Tomasz Tunguz

“An AI will only deliver you a grammatically perfect specimen.”

Tomasz Tunguz

Questions Answered in This Episode

What’s the exact schema of your daily summary output (sections, formatting, and constraints), and which parts ended up being most/least reliable?

How do you source and manage the podcast feed list (RSS discovery, authentication for paid feeds, episode deduping, and backfilling old episodes)?

What are the biggest accuracy differences you’ve seen between Whisper and NVIDIA Parakeet on real-world podcasts (speaker diarization, jargon, accents, crosstalk)?

Can you share the DuckDB tables you use to track processed episodes and prevent reprocessing—what keys and metadata do you store?

You said transcript cleanup is “not that useful anymore”—in what scenarios does cleaning still measurably improve outcomes (quotes, company extraction, summaries)?

Transcript Preview

Tomasz Tunguz

I have a list of 36 podcasts, but I don't have 36 hours every week to listen to 36 podcasts. So what I did is I created a system that goes through each of those podcasts every day and downloads the podcast files, and then transcribes them.

Claire Vo

Can you show us how it's actually built? Like, where do you get this feed? It sounds like you run it locally. How does this all work?

Tomasz Tunguz

I wrote this thing called the Parakeet Podcast Processor, and this podcast processor basically takes in a file, and what it'll do is it will read the file, it'll download it, and then it will convert it via FFmpeg. Then that will take the audio and convert it to text. So here's the podcast summaries for today. There's Lenny's podcast, the host, the guests, a comprehensive summary. So here's a conversation with Bob Baxley, key topics, and then key themes. The part that's most invaluable for me are these quotes, and those quotes, I'll read them. It'll suggest a bunch of actionable investment theses for a venture capital firm, which is put into the prompt, like, "Okay, maybe we should be looking at AI-assisted design tools."

Claire Vo

You've gotten not only the content you want, but the user experience you want. You control it end to end, and you can build this hyper-personalized software experience.

[upbeat music] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, I have Tomasz Tunguz, a legend in the enterprise software business and founder of Theory Ventures, which invests in early-stage enterprise AI, data, and blockchain companies. Tom is followed by over a half a million folks on his blog and LinkedIn, and he's gonna show us today how he uses AI to keep up with all the podcasts, including this one, and draft blog posts that would be approved by your AP English teacher. Let's get to it.

This episode is brought to you by Notion. Notion is now your do-everything AI tool for work. With new AI meeting notes, enterprise search, and research mode, everyone on your team gets a note-taker, researcher, doc drafter, brainstormer. Your new AI team is here, right where your team already works. I've been a longtime Notion user, and have been using the new Notion AI features for the last few weeks. I can't imagine working without them. AI meeting notes are a game changer. The summaries are accurate, and extracting action items is super useful. For stand-ups, team meetings, one-on-ones, customer interviews, and, yes, podcast prep, Notion's AI meeting notes are now an essential part of my team's workflow. The fastest-growing companies, like OpenAI, Ramp, Vercel, and Cursor, all use Notion to get more done. Try all of Notion's new AI features for free by signing up with your work email at notion.com/howiai.

To celebrate twenty-five thousand YouTube followers on How I AI, we're doing a giveaway. You can win a free year to my favorite AI products, including v0, Replit, Lovable, Bolt, Cursor, and of course, ChatPRD, by leaving a rating and review on your favorite podcast app and subscribing to YouTube. To enter, simply go to howiaipod.com/giveaway, read the rules, and leave us a review and subscribe. Enter by the end of August, and we will announce our winners in September. Thanks for listening.

Okay, Tom, I'm so happy to have you here, because you are gonna show us how you are solving a problem I'm creating for you.
