a16z
Where does consumer AI stand at the end of 2025?
CHAPTERS
Consumer AI market snapshot: signs of winner-take-most
The team frames 2025 as the year big model providers pushed aggressively into consumer, and asks whether general-purpose assistants are consolidating into a small number of winners. They cite usage and paying behavior data suggesting most consumers stick to one primary assistant, with ChatGPT still far ahead—though momentum is shifting late in the year.
Who ‘won’ consumer AI in 2025: incumbents vs. fast-changing momentum
They benchmark the leader (ChatGPT) against challengers’ relative usage, and discuss how quickly the competitive landscape can turn with a viral model or distribution lever. The conversation highlights Gemini’s accelerating growth versus ChatGPT’s slower growth rate, and the idea that different players may “own” different user segments.
Big 2025 launches: OpenAI’s ‘inside ChatGPT’ strategy vs. Google’s many surfaces
Justine contrasts how OpenAI and Google shipped consumer functionality. OpenAI largely consolidated features inside ChatGPT (with Sora as a notable standalone), while Google spread experiences across Gemini, AI Studio, Labs, and standalone sites—enabling more tailored UIs but also fragmentation.
Image & video models went viral: from ‘Ghibli moment’ to NanoBanana, Veo, and Sora
They argue the most viral consumer breakthroughs in 2025 were in image and video generation rather than text. OpenAI’s image generation (the “Ghibli moment”) and Sora competed with Google’s Veo and NanoBanana releases, driving new waves of mainstream experimentation.
What actually improved in multimodal: realism, consistency, and ‘reasoning’
The team explains how image/video models progressed from style/aesthetics toward realism and detail-level coherence. They highlight improved multi-input reasoning (multiple images + text), better text rendering (infographics), and character consistency that encourages iterative creation.
The under-hyped unlock: search + accuracy inside creative generation
Anish argues NanoBanana’s most underrated advantage is search integration, which boosts factual accuracy for historically correct or product-accurate images. They relate this to Veo’s virality, where adding audio alongside video made outputs feel more complete and shareable.
Under-hyped productivity primitives: Pulse and connectors (but execution gaps)
They discuss OpenAI’s attempts to become more proactive and integrated into daily workflows through features like Pulse and connectors to calendar/email/docs. While the primitives are compelling, they note reliability and UX issues that limit daily adoption so far.
Prosumer tools & power-user workflows: Perplexity’s Comet browser as a standout
Olivia calls Perplexity’s Comet browser the most impressive product she used this year, even though Perplexity isn’t her main assistant. The key is agentic browsing plus reusable workflows that run on schedules or triggers—in her view, sustaining interest better than even ChatGPT’s browser, Atlas.
Gemini vs. ChatGPT: distribution and brand—plus the importance of first-step UX
They debate whether Gemini can catch ChatGPT, noting Google’s distribution advantages (Android, Chrome, Docs/Gmail) but ChatGPT’s brand dominance as the default ‘Kleenex’ of AI. Product sensibility matters: ChatGPT’s trending prompts/templates lower activation energy, while Gemini’s blank-slate UI can confuse users.
Social features in AI: group chat and Sora 2’s TikTok-like feed—bearish on social retention
Bryan is skeptical that productivity-first assistants naturally expand into social, because the emotional drivers differ (self-improvement vs. entertainment/status/connection). Sora 2 succeeded more as a creator tool than a standalone social network; content tends to spread on existing platforms where consumption already lives.
Challengers’ positioning: Claude for power users, Meta’s hidden strengths, Grok’s rapid climb
They evaluate the challengers as increasingly differentiated. Claude wins hearts among technical users with artifacts, skills, and deep-work capability but struggles with mainstream discoverability; Meta’s strongest models (e.g., SAM 3) are developer-oriented despite the company’s consumer DNA; and Grok’s image/video progress is described as the steepest improvement curve, paired with entertainment-first bets.
Predictions for 2026: enterprise pull-through, apps ecosystems, and ‘anything-in/anything-out’ multimodality
The team predicts enterprise adoption and app ecosystems will reshape consumer usage, with ChatGPT’s enterprise push potentially driving default consumer habit. They also expect deeper multimodality—editing and transforming across text/image/video—driven by efforts to unify separate model capabilities into more general systems.
Startup opportunities: opinionated UX, multi-model advantages, compute constraints, and monetization
They argue the labs excel at models and incremental core improvements but often fail at opinionated standalone consumer UIs—leaving room for startups. Startups can also be multi-model (picking the best tool per task) and avoid the labs’ compute tradeoffs between viral entertainment and serious inference. Monetization is shifting too, with usage-based expansion yielding net revenue retention above 100%.
What to try today: recommended products and workflows for builders and users
They close by sharing specific tools worth experimenting with to understand the current frontier. Recommendations span multimodal ad generation, multi-model creative suites, audio reading, slides, meeting notes, AI-native browsing, and app generation/coding tools.