How to digest 36 weekly podcasts without spending 36 hours listening | Tomasz Tunguz

Tomasz Tunguz is the founder of Theory Ventures, which invests in early-stage enterprise AI, data, and blockchain companies. In this episode, Tomasz reveals his custom-built “Parakeet Podcast Processor,” which helps him extract value from 36 podcasts weekly without spending 36 hours listening. He walks through his terminal-based workflow that downloads, transcribes, and summarizes podcast content, extracting key insights, investment theses, and even generating blog post drafts. We explore how AI enables hyper-personalized software experiences that weren’t feasible before recent advances in language models.

*What you’ll learn:*
1. How to build a terminal-based podcast processing system that downloads, transcribes, and extracts key insights from multiple podcasts daily
2. A workflow for using Nvidia’s Parakeet and other AI tools to clean transcripts and generate structured summaries of podcast content
3. How to extract actionable investment theses and company mentions from podcast transcripts using AI prompting techniques
4. A systematic approach to generating blog post drafts with AI that maintains your personal writing style through iterative feedback
5. Why using an “AP English teacher” grading system can help improve AI-generated content through multiple revision cycles
6. How to leverage Claude Code for maintaining and updating personal productivity tools with minimal friction

*Brought to you by:*
• Notion, the best AI tools for work: https://www.notion.com/howiai
• Miro, a collaborative visual platform where your best work comes to life: http://miro.com/

*25k giveaway:*
To celebrate 25,000 YouTube followers, we’re doing a giveaway. Win a free year of my favorite AI products, including v0, Replit, Lovable, Bolt, Cursor, and, of course, ChatPRD, by leaving a rating and review on your favorite podcast app and subscribing to the podcast on YouTube. To enter: https://www.howiaipod.com/giveaway

*Where to find Tomasz Tunguz:*
• Blog: https://tomtunguz.com/
• Theory Ventures: https://theory.ventures/
• LinkedIn: https://www.linkedin.com/in/tomasztunguz/
• X: https://x.com/ttunguz

*Where to find Claire Vo:*
• ChatPRD: https://www.chatprd.ai/
• Website: https://clairevo.com/
• LinkedIn: https://www.linkedin.com/in/clairevo/
• X: https://x.com/clairevo

*In this episode, we cover:*
(00:00) Introduction to Tomasz Tunguz
(03:32) Overview of the podcast ripper system and its components
(05:06) Demonstration of the transcript cleaning process
(06:59) Extracting quotes, investment theses, and company mentions
(10:20) Why Tomasz prefers terminal-based tools
(12:38) The benefits of personalized software versus off-the-shelf solutions
(15:31) A workflow for generating blog posts from podcast insights
(17:34) Using the “AP English teacher” grading system for blog posts
(18:25) Challenges with matching personal writing style using AI
(22:00) Tomasz’s three-iteration process for improving blog posts
(26:13) The grading prompt and evaluation criteria
(28:16) AI’s role in writing education
(30:28) Final thoughts

*Tools referenced:*
• Whisper (OpenAI): https://openai.com/research/whisper
• Parakeet: https://build.nvidia.com/nvidia/parakeet-ctc-0_6b-asr
• Ollama: https://ollama.com/
• Gemma 3: https://deepmind.google/models/gemma/gemma-3/
• Claude: https://claude.ai/
• Claude Code: https://claude.ai/code
• Gemini: https://gemini.google.com/
• FFmpeg: https://ffmpeg.org/
• DuckDB: https://duckdb.org/
• LanceDB: https://lancedb.com/

*Other references:*
• 35 years of product design wisdom from Apple, Disney, Pinterest, and beyond | Bob Baxley: https://www.lennysnewsletter.com/p/35-years-of-product-design-wisdom-bob-baxley
• Dan Luu’s blog post on latency: https://danluu.com/input-lag/
• GitHub CEO: The AI Coding Gold Rush, Vibe Coding & Cursor: https://www.readtobuild.com/p/github-ceo-the-ai-coding-gold-rush
• Stanford Named Entity Recognition library: https://nlp.stanford.edu/software/CRF-NER.html

_Production and marketing by https://penname.co/._
_For inquiries about sponsoring the podcast, email jordan@penname.co._

Tomasz Tunguz (guest), Claire Vo (host)

Aug 25, 2025 · 35m

CHAPTERS

0:00 – 3:32
Why Tomasz built a “podcast ripper” to keep up with 36 shows
Tomasz explains the core problem: he wants insights from dozens of weekly podcasts but doesn’t have time to listen. His solution is an automated pipeline that downloads episodes, transcribes them, and produces skimmable outputs he can read quickly.
3:32 – 5:06
Architecture overview: downloading feeds, converting audio, transcribing locally
Tomasz walks through the system he built (Parakeet Podcast Processor) and the main plumbing that turns audio files into text. The toolchain emphasizes local processing and modular steps that can be swapped as models improve.
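The modular, swappable shape described above can be sketched roughly as follows. This is an illustration, not Tomasz's actual code: the `Pipeline` class, the stage names, the file names, and the 16 kHz mono target for FFmpeg are all assumptions made for the example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


def ffmpeg_to_wav_command(src: Path, dst: Path) -> List[str]:
    """Build an FFmpeg command line that converts audio to mono WAV.

    The 16 kHz sample rate is an assumption (a common speech-model input),
    not something stated in the episode.
    """
    # -y: overwrite output, -i: input file, -ar: sample rate, -ac: channels
    return ["ffmpeg", "-y", "-i", str(src), "-ar", "16000", "-ac", "1", str(dst)]


@dataclass
class Pipeline:
    """Three independent stages, each a plain function that can be swapped."""
    download: Callable[[str], Path]    # episode URL -> local audio file
    convert: Callable[[Path], Path]    # audio file -> WAV file
    transcribe: Callable[[Path], str]  # WAV file -> transcript text

    def run(self, episode_url: str) -> str:
        audio = self.download(episode_url)
        wav = self.convert(audio)
        return self.transcribe(wav)


# Stub stages stand in for the real downloader, FFmpeg call, and speech model.
demo = Pipeline(
    download=lambda url: Path("episode.mp3"),
    convert=lambda path: path.with_suffix(".wav"),
    transcribe=lambda path: f"transcript of {path.name}",
)
result = demo.run("https://example.com/feed/episode.mp3")
command = ffmpeg_to_wav_command(Path("episode.mp3"), Path("episode.wav"))
```

Because each stage is just a callable, replacing one transcription model with another would only mean passing a different `transcribe` function.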
5:06 – 6:59
Transcript cleanup: using an LLM as a transcript editor
After transcription, Tomasz cleans transcripts by removing filler words while preserving technical content and length. He demonstrates a “transcript editor” prompt and describes why cleaning mattered more earlier in the project than it does now.
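A “transcript editor” step along these lines might look like the sketch below. The prompt wording, the `clean_chunk` helper, and the length guard are assumptions for illustration; the episode's actual prompt is not reproduced here. The model call is passed in as a plain function so that any local or hosted model could be plugged in.

```python
from typing import Callable

# Illustrative filler words only; the real prompt may list others.
FILLER_EXAMPLES = "um, uh, you know, sort of"


def build_cleanup_prompt(chunk: str) -> str:
    """Assemble a hypothetical transcript-editor prompt for one chunk."""
    return (
        "You are a transcript editor. Remove filler words (for example: "
        f"{FILLER_EXAMPLES}) and fix obvious transcription errors. "
        "Preserve all technical content and keep the length roughly the same. "
        "Do not summarize.\n\n"
        f"Transcript:\n{chunk}"
    )


def clean_chunk(chunk: str, llm: Callable[[str], str], min_ratio: float = 0.7) -> str:
    """Clean one chunk, falling back to the original if too much was cut.

    The min_ratio guard is an assumption: a much shorter output suggests
    the model summarized instead of editing.
    """
    cleaned = llm(build_cleanup_prompt(chunk)).strip()
    if len(cleaned) < min_ratio * len(chunk):
        return chunk
    return cleaned


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: strips two fillers from the transcript part."""
    text = prompt.split("Transcript:\n", 1)[1]
    return text.replace(", um,", "").replace(", uh,", "")


raw = "So, um, the model runs locally and, uh, keeps every technical term."
cleaned = clean_chunk(raw, fake_llm)
```

The fallback keeps the pipeline conservative: a bad cleanup pass leaves the raw transcript in place rather than losing content.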
6:59 – 10:20
Orchestration + storage: tracking processed episodes with DuckDB
To make the workflow reliable, Tomasz stores processing metadata locally so episodes aren’t reprocessed unnecessarily. He describes an orchestrator that pulls daily transcripts from the database and runs summarization prompts in batches.
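The bookkeeping described above could be sketched as follows. Note the substitution: this example uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs with the standard library alone. The table layout, column names, and helper functions are hypothetical.

```python
import sqlite3
from typing import List, Sequence, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local store. The schema is hypothetical."""
    con = sqlite3.connect(path)
    con.execute(
        """
        CREATE TABLE IF NOT EXISTS episodes (
            guid       TEXT PRIMARY KEY,  -- unique id from the feed
            podcast    TEXT NOT NULL,
            published  TEXT NOT NULL,     -- ISO date such as 2025-08-25
            transcript TEXT NOT NULL
        )
        """
    )
    return con


def record_episode(con: sqlite3.Connection, guid: str, podcast: str,
                   published: str, transcript: str) -> bool:
    """Store an episode once; return False if it was already processed."""
    cur = con.execute(
        "INSERT OR IGNORE INTO episodes VALUES (?, ?, ?, ?)",
        (guid, podcast, published, transcript),
    )
    con.commit()
    return cur.rowcount == 1


def transcripts_for_day(con: sqlite3.Connection, day: str) -> List[Tuple[str, str]]:
    """Fetch (podcast, transcript) pairs published on the given day."""
    return con.execute(
        "SELECT podcast, transcript FROM episodes WHERE published = ? ORDER BY podcast",
        (day,),
    ).fetchall()


def batches(rows: Sequence, size: int) -> List[Sequence]:
    """Split rows into fixed-size batches for summarization."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


con = open_store()
first = record_episode(con, "ep-1", "Show A", "2025-08-25", "transcript one")
again = record_episode(con, "ep-1", "Show A", "2025-08-25", "transcript one")
record_episode(con, "ep-2", "Show B", "2025-08-25", "transcript two")
record_episode(con, "ep-3", "Show A", "2025-08-24", "older transcript")
todays = transcripts_for_day(con, "2025-08-25")
```

Keying the table on the feed's episode id is what makes reruns safe: a second attempt to record the same episode is ignored.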
10:20 – 12:38
Daily digest outputs: summaries, key topics/themes, and the most valuable quotes
Tomasz shows what the daily generated document looks like: each podcast gets host/guest context, a comprehensive summary, key topics, and key themes. He emphasizes that curated quotes are the highest-signal output for his workflow.
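One way to picture the per-podcast entry is as a small record rendered to plain text. The field names, section order, and sample values below are assumptions for illustration and may differ from the actual document layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EpisodeDigest:
    """Hypothetical shape of one podcast's entry in the daily digest."""
    podcast: str
    host: str
    guest: str
    summary: str
    topics: List[str] = field(default_factory=list)
    themes: List[str] = field(default_factory=list)
    quotes: List[str] = field(default_factory=list)


def render(digest: EpisodeDigest) -> str:
    """Render one entry as plain text, skipping any empty sections."""
    lines = [f"{digest.podcast} ({digest.host} with {digest.guest})", "",
             "Summary", digest.summary]
    sections = (("Key topics", digest.topics),
                ("Key themes", digest.themes),
                ("Quotes", digest.quotes))
    for title, items in sections:
        if items:
            lines += ["", title]
            lines += [f"- {item}" for item in items]
    return "\n".join(lines)


entry = EpisodeDigest(
    podcast="Example Show",
    host="Host A",
    guest="Guest B",
    summary="A placeholder summary.",
    topics=["local transcription"],
    quotes=["A placeholder quote."],
)
text = render(entry)
```

Keeping the sections in a single tuple makes reordering them a one-line change.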
12:38 – 15:31
From content to action: investment theses, tweets, and company discovery
Beyond summarization, the pipeline generates venture-style “investment theses,” draft tweets, and lists of companies mentioned in episodes. These outputs connect podcast listening to concrete next steps like market maps, CRM enrichment, and outreach.
15:31 – 17:34
Why the terminal: speed, low latency, and scriptability
Claire probes why Tomasz stays in the terminal instead of building a UI. Tomasz argues the terminal offers the lowest interaction latency, reduces frustration, and makes it easy to script automations across email, CRM actions, and AI tools.
17:34 – 18:25
Hyper-personal tools vs off-the-shelf apps: “glove-like fit” with modern AI
They discuss why personalized internal tools are now practical: AI reduces the cost of building and modifying bespoke software. Tomasz highlights how quickly he can tweak workflows (like reordering sections or emailing digests) using tools like Claude Code.
18:25 – 22:00
Podcast insights to blog drafts: the blog post generator pipeline
Tomasz introduces a second system that turns a specific podcast quote/topic into a blog post draft. It uses the podcast transcript as context and pulls relevant prior writing to shape content and tone, though a demo bug appears during search.
22:00 – 26:13
The “AP English teacher” grader: setting a quality bar and iterating
To improve drafts, Tomasz uses an evaluation prompt that grades the post like an AP English teacher, then revises until it reaches roughly an A−. He describes why hooks and conclusions matter most and how he runs multiple improvement passes.
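The grade-then-revise loop can be sketched as below. The grade scale, function names, and stub grader are assumptions; in practice `grade_fn` and `revise_fn` would each wrap a model call. The three-pass cap mirrors the three-iteration process mentioned in the show notes.

```python
from typing import Callable, List, Tuple

# Assumed letter scale, lowest to highest (ASCII hyphen used for minus).
GRADE_ORDER: List[str] = ["F", "D", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]


def meets_bar(grade: str, bar: str = "A-") -> bool:
    """True when the grade is at or above the quality bar."""
    return GRADE_ORDER.index(grade) >= GRADE_ORDER.index(bar)


def improve(
    draft: str,
    grade_fn: Callable[[str], Tuple[str, str]],  # draft -> (grade, feedback)
    revise_fn: Callable[[str, str], str],        # (draft, feedback) -> new draft
    bar: str = "A-",
    max_passes: int = 3,
) -> Tuple[str, str, int]:
    """Revise until the draft meets the bar or the pass budget runs out."""
    grade, feedback = grade_fn(draft)
    passes = 0
    while not meets_bar(grade, bar) and passes < max_passes:
        draft = revise_fn(draft, feedback)
        grade, feedback = grade_fn(draft)
        passes += 1
    return draft, grade, passes


def fake_grade(draft: str) -> Tuple[str, str]:
    """Stub grader: the grade rises with each revision marker."""
    revisions = draft.count("(revised)")
    return ["B-", "B+", "A-"][min(revisions, 2)], "Sharpen the hook and the conclusion."


def fake_revise(draft: str, feedback: str) -> str:
    """Stub reviser: tags the draft instead of rewriting it."""
    return draft + " (revised)"


final, grade, passes = improve("First draft.", fake_grade, fake_revise)
```

Capping the number of passes keeps the loop from running indefinitely when a draft never reaches the bar.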
26:13 – 28:16
Style matching is hard: model voices, personal quirks, and linking limitations
Both agree AI struggles to capture a writer’s authentic rhythm—especially in short-form. Tomasz compares model “personalities,” notes AI’s tendency toward grammatical perfection, and shares unresolved challenges like automatically linking to related posts.
28:16 – 30:28
Prompt + rubric details: what his generator optimizes for (brevity, no headers, flow)
Tomasz reveals the blog generator’s prompt design and the structural constraints he enforces based on his own analytics. He explains how the generator extracts style dynamically from related posts and why he avoids headers, citing their effect on dwell time.
30:28 – 35:14
AI for writing education + wrap-up: future tiny teams and “AI model cage matches”
They broaden to education: AI can handle first-pass grammar/structure feedback, freeing teachers to focus on creativity. Tomasz then answers rapid questions: his vision of a 30-person $100M company and his tactic of using multiple models to critique each other when outputs degrade.
