How I AI

How to digest 36 weekly podcasts without spending 36 hours listening | Tomasz Tunguz

Tomasz Tunguz is the founder of Theory Ventures, which invests in early-stage enterprise AI, data, and blockchain companies. In this episode, Tomasz reveals his custom-built “Parakeet Podcast Processor,” which helps him extract value from 36 podcasts weekly without spending 36 hours listening. He walks through his terminal-based workflow that downloads, transcribes, and summarizes podcast content, extracting key insights, investment theses, and even generating blog post drafts. We explore how AI enables hyper-personalized software experiences that weren’t feasible before recent advances in language models.

*What you’ll learn:*

1. How to build a terminal-based podcast processing system that downloads, transcribes, and extracts key insights from multiple podcasts daily
2. A workflow for using NVIDIA’s Parakeet and other AI tools to clean transcripts and generate structured summaries of podcast content
3. How to extract actionable investment theses and company mentions from podcast transcripts using AI prompting techniques
4. A systematic approach to generating blog post drafts with AI that maintains your personal writing style through iterative feedback
5. Why using an “AP English teacher” grading system can help improve AI-generated content through multiple revision cycles
6. How to leverage Claude Code for maintaining and updating personal productivity tools with minimal friction

*Brought to you by:*

• Notion—The best AI tools for work: https://www.notion.com/howiai
• Miro—A collaborative visual platform where your best work comes to life: http://miro.com/

*25k giveaway:* To celebrate 25,000 YouTube followers, we’re doing a giveaway. Win a free year of my favorite AI products, including v0, Replit, Lovable, Bolt, Cursor, and, of course, ChatPRD, by leaving a rating and review on your favorite podcast app and subscribing to the podcast on YouTube. To enter: https://www.howiaipod.com/giveaway

*Where to find Tomasz Tunguz:*

• Blog: https://tomtunguz.com/
• Theory Ventures: https://theory.ventures/
• LinkedIn: https://www.linkedin.com/in/tomasztunguz/
• X: https://x.com/ttunguz

*Where to find Claire Vo:*

• ChatPRD: https://www.chatprd.ai/
• Website: https://clairevo.com/
• LinkedIn: https://www.linkedin.com/in/clairevo/
• X: https://x.com/clairevo

*In this episode, we cover:*

(00:00) Introduction to Tomasz Tunguz
(03:32) Overview of the podcast ripper system and its components
(05:06) Demonstration of the transcript cleaning process
(06:59) Extracting quotes, investment theses, and company mentions
(10:20) Why Tomasz prefers terminal-based tools
(12:38) The benefits of personalized software versus off-the-shelf solutions
(15:31) A workflow for generating blog posts from podcast insights
(17:34) Using the “AP English teacher” grading system for blog posts
(18:25) Challenges with matching personal writing style using AI
(22:00) Tomasz’s three-iteration process for improving blog posts
(26:13) The grading prompt and evaluation criteria
(28:16) AI’s role in writing education
(30:28) Final thoughts

*Tools referenced:*

• Whisper (OpenAI): https://openai.com/research/whisper
• Parakeet: https://build.nvidia.com/nvidia/parakeet-ctc-0_6b-asr
• Ollama: https://ollama.com/
• Gemma 3: https://deepmind.google/models/gemma/gemma-3/
• Claude: https://claude.ai/
• Claude Code: https://claude.ai/code
• Gemini: https://gemini.google.com/
• FFmpeg: https://ffmpeg.org/
• DuckDB: https://duckdb.org/
• LanceDB: https://lancedb.com/

*Other references:*

• 35 years of product design wisdom from Apple, Disney, Pinterest, and beyond | Bob Baxley: https://www.lennysnewsletter.com/p/35-years-of-product-design-wisdom-bob-baxley
• Dan Luu’s blog post on latency: https://danluu.com/input-lag/
• GitHub CEO: The AI Coding Gold Rush, Vibe Coding & Cursor: https://www.readtobuild.com/p/github-ceo-the-ai-coding-gold-rush
• Stanford Named Entity Recognition library: https://nlp.stanford.edu/software/CRF-NER.html

_Production and marketing by https://penname.co/._
_For inquiries about sponsoring the podcast, email jordan@penname.co._

Tomasz Tunguz (guest) · Claire Vo (host)
Aug 24, 2025 · 35m · Watch on YouTube ↗

At a glance

WHAT IT’S REALLY ABOUT

Tomasz Tunguz’s terminal pipeline transcribes podcasts and drafts blog posts fast

  1. Tomasz Tunguz built a “podcast ripper” to keep up with 36 podcasts without spending 36 hours listening, converting daily episodes into cleaned transcripts and structured summaries.
  2. His pipeline downloads audio from podcast feeds, converts files with FFmpeg, transcribes locally (Whisper initially; now NVIDIA Parakeet), cleans transcripts with an LLM (Gemma 3), and tracks processing in DuckDB.
  3. From each day’s transcripts, he generates summaries, key topics/themes, quotable highlights, startup/company mentions for potential CRM enrichment, draft tweets, and venture “investment theses.”
  4. He also experiments with a blog-post generator that uses his archive (~2,000 posts) as style context, then iteratively grades and revises drafts using an “AP English teacher” rubric to reach an A-/~91 score before manual final editing/publishing.
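The download-convert-transcribe-track flow in point 2 can be sketched in a few lines. This is an illustrative reconstruction, not Tunguz's actual code: the helper names are invented, and stdlib `sqlite3` stands in for the DuckDB processing log so the sketch is self-contained.

```python
"""Minimal sketch of the podcast pipeline: convert audio with FFmpeg,
then track which episodes have been processed so daily runs skip repeats.
All names here are illustrative assumptions; sqlite3 stands in for DuckDB."""
import sqlite3


def ffmpeg_to_wav_cmd(src: str, dst: str) -> list[str]:
    # ASR models such as Parakeet typically expect 16 kHz mono WAV input,
    # hence the -ac 1 (mono) and -ar 16000 (sample rate) flags.
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


class EpisodeLog:
    """Processing log keyed by episode URL (DuckDB in the real workflow)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS episodes (url TEXT PRIMARY KEY, status TEXT)"
        )

    def is_done(self, url: str) -> bool:
        row = self.db.execute(
            "SELECT 1 FROM episodes WHERE url = ? AND status = 'done'", (url,)
        ).fetchone()
        return row is not None

    def mark_done(self, url: str) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO episodes (url, status) VALUES (?, 'done')", (url,)
        )
        self.db.commit()
```

A daily cron job would then fetch each feed, skip URLs where `is_done` is true, run the FFmpeg command via `subprocess`, transcribe, and call `mark_done`.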

IDEAS WORTH REMEMBERING

5 ideas

Turn listening into a searchable reading workflow.

Tunguz prefers reading to listening because he can skim and jump ahead; converting podcasts to text makes high-volume audio content quickly digestible and easier to reuse.

A simple pipeline beats a “perfect” app for personal fit.

Instead of waiting for an off-the-shelf product, he built a terminal-based system that matches his exact workflow, then iterates quickly when requirements change.

Local-first can work—until a task needs bigger “brains.”

He aimed to run everything locally (Ollama, Parakeet, classic NLP libraries), but found that named-entity extraction improved dramatically when he swapped traditional ML packages for more powerful LLMs.

Transcript cleanup mattered more early than it does now.

Cleaning transcripts helped traditional NER tools recognize companies/proper nouns; once he switched to stronger LLM extraction, cleanup became less critical.
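The LLM-based extraction that replaced classic NER amounts to prompting for structured output and parsing it defensively. A minimal sketch, assuming a generic `complete` callable for the model call (Ollama, Claude, or Gemini in practice) and an invented prompt wording:

```python
"""Sketch of LLM-based company extraction from a transcript.
The prompt text and the `complete` callable are assumptions standing in
for a real LLM call; only the parse-and-filter pattern is the point."""
import json

EXTRACT_PROMPT = (
    "List every startup or company mentioned in the transcript below. "
    "Respond with a JSON array of company names only, no commentary.\n\n"
    "{transcript}"
)


def extract_companies(transcript: str, complete) -> list[str]:
    # `complete` is any callable that sends a prompt to an LLM and returns text.
    raw = complete(EXTRACT_PROMPT.format(transcript=transcript))
    try:
        names = json.loads(raw)
    except json.JSONDecodeError:
        return []  # LLM output can be malformed; fail soft rather than crash.
    return [n.strip() for n in names if isinstance(n, str)]
```

Unlike a CRF-based tagger, the model sees full context, so messy transcripts with mis-cased or mis-spelled proper nouns matter less, which is why cleanup became less critical.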

Quotes are the highest-leverage output for decision-making.

Among summaries, themes, and topics, he values quotable passages most because they’re fast to scan and often spark concrete next steps like market maps or thesis work.

WORDS WORTH SAVING

5 quotes

“I have a list of 36 podcasts, but I don't have 36 hours every week to listen to 36 podcasts.”

Tomasz Tunguz

“The part that's most valuable for me are these quotes.”

Tomasz Tunguz

“Everything that I can easily replace with a single prompt is not going to have any value.”

Tomasz Tunguz

“One of the techniques that I found the most effective… is to ask it to grade it like an AP English teacher.”

Tomasz Tunguz
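The grade-and-revise loop behind that quote, per the episode, targets an A- (~91) within about three passes. A sketch under those assumptions, with `grade` and `revise` as placeholder callables standing in for LLM calls:

```python
"""Sketch of the "AP English teacher" refinement loop: grade the draft,
revise if it scores below target, stop at the target score or after a
fixed number of passes. `grade` and `revise` stand in for LLM calls."""


def refine(draft, grade, revise, target=91.0, max_passes=3):
    """Return (final_draft, final_score) after iterative grading."""
    score = grade(draft)
    for _ in range(max_passes):
        if score >= target:
            break
        draft = revise(draft, score)  # feed the grade back into the revision
        score = grade(draft)
    return draft, score
```

The cap on passes matters: as the final quote notes, an AI converges on a grammatically perfect but flavorless draft, so a human edit still follows.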

“An AI will only deliver you a grammatically perfect specimen.”

Tomasz Tunguz

• Podcast downloading + transcription automation
• Local-first tooling: Parakeet, Ollama, DuckDB
• FFmpeg audio conversion workflow
• Transcript cleanup prompts (Gemma 3)
• Daily summaries: themes, quotes, theses
• Entity extraction: libraries vs LLMs
• Blog drafting + iterative grading (AP English rubric)
• Terminal UX and developer latency
• Vector search with LanceDB over past posts
• Model “duke it out” technique (Claude vs Gemini)
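The vector search over past posts, used to pull stylistically relevant archive material into the blog-draft context, reduces to embed-and-rank by cosine similarity. A toy stdlib sketch with a bag-of-words "embedding"; the real workflow would use proper embeddings and a vector store like LanceDB:

```python
"""Toy sketch of retrieving similar past posts for style context.
Bag-of-words counts stand in for real embeddings, and a sorted scan
stands in for a vector store like LanceDB."""
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Crude stand-in for a learned embedding: lowercase token counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_posts(query: str, posts: list[str], k: int = 3) -> list[str]:
    q = embed(query)
    return sorted(posts, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]
```

With ~2,000 archived posts, the highest-scoring matches become few-shot style examples for the draft generator.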

High quality AI-generated summary created from speaker-labeled transcript.
