Y Combinator
Better AI Models, Better Startups
CHAPTERS
- 0:00 – 2:09
Why every new model release scares (and helps) startups
The hosts open by noting how each OpenAI launch triggers anxiety that “this kills my startup.” They argue the opposite is often true: better base models raise the ceiling for what startups can ship, especially when improvements are a simple model swap.
- Startups fear platform companies will bundle and replace them after each release
- Model upgrades can instantly improve existing products with minimal code changes
- New modalities/capabilities expand what startups can build on top
- Winning depends on product quality plus distribution strength
- 2:09 – 5:12
What improved models change in practice: code, tools, and reliable structured output
They discuss how capability gains aren’t just “smarter chat,” but include practical workflow improvements—like having models write and run code to solve problems. They also highlight GPT-4o’s better structured/JSON output as a big integration unlock for startups.
- Models can outperform pure prompting by writing code and executing it
- Multimodal capability increases the number of product surfaces (voice, vision, etc.)
- Structured outputs (e.g., JSON) reduce brittle parsing and glue code
- Easier integration makes LLM features more dependable in real apps
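To make the structured-output point concrete, here is a minimal sketch of what the integration side can look like when a model reliably returns JSON. It uses only the Python standard library and calls no real model API. The field names (`category`, `priority`), the function `parse_structured_reply`, and the sample reply are all hypothetical, chosen for illustration.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_FIELDS = {"category": str, "priority": int}


def parse_structured_reply(raw: str) -> dict:
    """Parse a reply that is supposed to be a JSON object and validate its fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name} should be {expected_type.__name__}")
    return data


# Stand-in for a model response; no real API is called here.
sample_reply = '{"category": "billing", "priority": 2}'
ticket = parse_structured_reply(sample_reply)
print(ticket["category"], ticket["priority"])
```

With free-form text output, the same step would typically need regular expressions or retry prompts. With dependable JSON, the app-side code shrinks to a parse plus a validation pass.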
- 5:12 – 8:06
GPT-4o vs. Gemini 1.5: modular add-ons vs. true multimodal Mixture-of-Experts
Diana contrasts OpenAI’s approach (GPT-4 with added speech/vision modules) with Gemini 1.5’s “from the ground up” multimodal Mixture-of-Experts design. The takeaway: 4o’s headline is modality and product polish, while Gemini’s underlying architecture may be more technically exciting and efficient.
- 4o adds speech/video on top of a primarily text transformer foundation
- OpenAI likely ‘bolts on’ components like Whisper and image models
- Gemini 1.5 uses Mixture-of-Experts with specialized network paths per modality
- Google’s TPU infrastructure enables expensive large-scale training and efficiency
- 8:06 – 9:47
Huge context windows and the ‘Is RAG dead?’ debate
Gemini’s million-token window raises the question of whether retrieval-augmented generation tooling becomes obsolete. The group argues large context is powerful for consumer experiences, but RAG remains important for privacy, control, and reliable retrieval—especially in enterprise settings.
- Gemini 1.5’s 1M-token window (with research claims of up to 10M) shifts product possibilities
- Consumer ‘personal assistant with all my context’ becomes more feasible
- RAG still matters for privacy, data control, and model customization
- Enterprises need auditable retrieval, access controls, and logging
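The enterprise requirements above (access controls, auditable retrieval, logging) can be sketched in a few lines. This is a toy under stated assumptions, not how any particular RAG product works: word overlap stands in for a real embedding search, and the `Doc` type, the `retrieve` function, the group names, and the documents are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def retrieve(query, docs, user_groups, audit_log, top_k=2):
    """Return ids of the best-matching documents the caller is allowed to see."""
    query_terms = set(query.lower().split())
    scored = []
    for doc in docs:
        if not (doc.allowed_groups & user_groups):
            continue  # access control is applied before any scoring
        overlap = len(query_terms & set(doc.text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc.doc_id))
    # Highest overlap first; ties broken by id so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    hits = [doc_id for _, doc_id in scored[:top_k]]
    # Every lookup is recorded so it can be audited later.
    audit_log.append({"query": query, "groups": sorted(user_groups), "returned": hits})
    return hits


docs = [
    Doc("hr-1", "salary bands for engineering", frozenset({"hr"})),
    Doc("eng-1", "engineering onboarding guide", frozenset({"eng", "hr"})),
    Doc("eng-2", "deployment runbook for payments", frozenset({"eng"})),
]
audit_log = []
print(retrieve("engineering salary guide", docs, {"eng"}, audit_log))
```

Filtering happens before scoring, so a restricted document never reaches the prompt. Pasting an entire corpus into one large context window gives no such guarantee.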
- 9:47 – 14:19
Why big context still isn’t perfect: retrieval accuracy, memory, and caching analogies
They caution that “bigger context” doesn’t guarantee precise recall; founders report spotty specificity in practice. They compare the likely long-term architecture to computer caching layers: multiple memory tiers where RAG becomes a durable foundational pattern.
- Field reports: very large context windows can struggle with precise retrieval
- A smaller, reliable window can beat a larger, fuzzier one in production
- ChatGPT ‘Memory’ illustrates summarization + selective retention workflows
- Analogy: modern systems still use multiple caching layers despite bigger RAM
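The caching analogy can be sketched as a two-tier lookup: a small, fast tier standing in for the context window, backed by a larger store standing in for a retrieval index. The `TieredMemory` class and its keys are hypothetical. The least-recently-used eviction policy is an assumption made for illustration, not something the hosts specify.

```python
from collections import OrderedDict


class TieredMemory:
    """Small bounded tier (like a context window) in front of a large store."""

    def __init__(self, context_capacity, long_term_store):
        self.context = OrderedDict()      # fast tier, least recently used first
        self.capacity = context_capacity
        self.store = long_term_store      # slow tier, e.g. a retrieval index

    def get(self, key):
        if key in self.context:
            self.context.move_to_end(key)  # mark as most recently used
            return self.context[key], "context"
        value = self.store[key]            # raises KeyError if unknown everywhere
        self.context[key] = value          # promote into the fast tier
        if len(self.context) > self.capacity:
            self.context.popitem(last=False)  # evict the least recently used entry
        return value, "store"


memory = TieredMemory(2, {"a": 1, "b": 2, "c": 3})
print(memory.get("a"))  # first access falls through to the store
print(memory.get("a"))  # second access is served from the fast tier
```

A larger fast tier lowers the miss rate but does not remove the need for the slower tier. This is the sense in which the hosts expect RAG to persist alongside bigger context windows.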
- 14:19 – 16:58
A healthier startup ecosystem needs multiple top-tier model providers
The hosts emphasize that startup risk is highest when one model dominates. Competition among OpenAI, Google, Anthropic, Meta, and others enables routing, price competition, and healthier margins—making it easier for many startups to thrive.
- Monopoly model dominance creates dependency and pricing risk for startups
- Multiple strong models create a marketplace with better economics
- Founders already mix models for prototyping vs. scaling and use routers/ops tools
- Meta (future LLaMA releases) could reshape the competitive landscape
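As a rough illustration of routing across providers, the sketch below picks the cheapest model that clears a quality bar. The model names, prices, and quality scores are made up for the example and do not describe real products. The `route` function is likewise hypothetical, not the API of any actual router tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # made-up price for illustration
    quality: int               # made-up score from 1 to 10


def route(options, min_quality):
    """Pick the cheapest option whose quality meets the requested bar."""
    eligible = [m for m in options if m.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality bar")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


options = [
    ModelOption("provider-a-large", 0.03, 9),
    ModelOption("provider-b-medium", 0.01, 7),
    ModelOption("provider-c-small", 0.002, 5),
]
print(route(options, min_quality=6).name)  # routine task: cheaper model suffices
print(route(options, min_quality=9).name)  # hard task: only the large model qualifies
```

This only works when several providers are good enough to be interchangeable for a given task, which is the dynamic the hosts credit with improving startup margins.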
- 16:58 – 21:26
History repeats: Google/Facebook era lessons about incumbents ‘killing’ startups
Jared compares today’s OpenAI anxiety to 2005–2010 fears of Google/Facebook copying startups. The group notes that some categories were predictably strategic for incumbents (hard to survive), while others succeeded by changing UI, data, and monetization—especially in verticalized ways.
- Investor question then: ‘What if Google does this?’ mirrors today’s OpenAI fear
- Head-on competition (e.g., general search) usually fails for startups
- Some verticalized products won by owning unique workflows/data integrations
- Success often required different monetization, not incumbent-like business models
- 21:26 – 22:19
Bundling risk and the Dropbox example: when ‘obvious’ incumbent moves don’t end you
They revisit how Dropbox faced “GDrive will kill you” doom narratives—yet survived and became huge. The conversation frames bundling as a classic incumbent weapon, but not always decisive if startups deliver superior product focus and user experience.
- Google Drive leaks sparked widespread belief Dropbox was finished
- Incumbents can subsidize/bundle (e.g., ‘infinite storage’ / ‘infinite tokens’)
- Despite bundling risk, focused products can win on execution and UX
- Lesson: incumbent moves can be predictable, but outcomes aren’t guaranteed
- 22:19 – 25:10
Where OpenAI is likely heading: multimodal desktop assistant and the danger zone for consumer apps
Garry sketches the trajectory implied by GPT-4o plus the desktop app: an always-available assistant with access to files, apps, IDE, browser, and transactions. They warn that if your product looks like a near-term OpenAI demo, you may be building directly in the blast radius.
- Desktop assistant implies deep OS/app/file access and transactional capability
- OpenAI’s ‘Her’-like direction suggests a broad general-purpose interface
- If your product is easy to imagine as OpenAI’s next release, it may well become one
- Hardest place to compete: the core general assistant experience
- 25:10 – 27:51
Build the valuable but ‘unsexy’ products incumbents won’t demo
They propose a survival strategy: focus on workflows that are useful but don’t match the sci‑fi narrative OpenAI markets. Examples like Perplexity show how a product can win by optimizing for a specific job (research with sources) rather than general chat magic.
- Perplexity succeeds by returning sources/links and supporting research workflows
- Incumbents optimize for impressive demos, not necessarily utilitarian UX
- Unsexy, workflow-driven products can be durable wedges
- New markets are bigger than people expect; many winners can coexist
- 27:51 – 30:51
B2B is the massive opportunity: proprietary workflows, regulation, and ROI-driven pricing
The group argues B2B is less likely to be owned by consumer-focused incumbents and is constrained by privacy/regulatory requirements. They cite fintech/compliance and niche workflow automation as places where startups can deliver immediate ROI and charge more as models improve.
- B2B requires sales, domain nuance, and edge-case handling incumbents avoid
- Regulated/proprietary data (fintech/healthcare) limits centralized assistant models
- Examples: construction permitting, KYC, compliance, payments ops automation
- Better models enable upsells; customers pay for outcomes, not model branding
- 30:51 – 33:36
Better models → better startup economics: faster growth, more revenue, more automation
They connect model progress to SaaS-like (or larger) market expansion: automating jobs, not just augmenting them. YC observations suggest dramatic revenue ramps are possible when automation creates clear ROI and pricing power.
- AI automation opportunity could rival ‘all SaaS combined’ in market size
- AI can turn labor costs into software revenue with higher margins
- YC anecdotes: extremely fast revenue growth over a single batch
- Startups should fear competitors more than OpenAI; speed and execution matter
- 33:36 – 37:11
Consumer opportunities in ‘edgy’ zones: deepfakes, satire, companions, and incumbents’ risk limits
They pivot back to consumer and argue incumbents are constrained by legal/PR risk, creating openings for startups. They discuss AI companions and deepfake-enabled media tools as examples of weird-but-real markets with strong retention and virality that big platforms may avoid.
- Incumbents avoid legally/PR-risky features; startups can explore them
- AI companions (Replika/Character.AI) show deep retention and non-obvious demand
- Infinity AI example: turning scripts into movies with famous characters
- Election season amplifies the gray area between satire, memes, and deception
- 37:11 – 41:06
Lightning round: the most exciting updates—emotion, translation, robotics, and cost curves
In response to an audience question, each host highlights what stood out most: emotional voice, pocket translation, unified-model implications for robotics, and major cost reductions. They close by connecting cheaper, more stable models to new devices and product categories.
- OpenAI voice: emotional prosody makes speech feel meaningfully more human
- Live translation: a ‘universal translator’ changes travel, relationships, and work
- Unified model direction may accelerate robotics progress (eventually)
- Cost drops signal maturation and enable on-device/low-power compute paths