No Priors

No Priors Ep. 63 | With Sarah Guo and Elad Gil

This week on No Priors, hosts Sarah and Elad catch up on the latest AI news. They discuss recent developments such as Meta’s new AI assistant and the latest in music generation. If you’re interested in generative AI music, stay tuned for next week’s interview! Sarah and Elad also get into device-resident models and AI hardware, and ask just how smart smaller models can really get. They compare these hardware constraints to the hurdles AI platforms continue to face, including compute constraints, energy consumption, context windows, and how best to integrate these products into apps that users already know.

Have a question for our next host-only episode, or feedback for our team? Reach out to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil

Show Notes:
0:00 Intro
1:25 Music AI generation
4:02 Apple’s LLM
11:39 The role of AI-specific hardware
15:25 AI platform updates
18:01 Forward thinking in investing in AI
20:33 Unlimited context
23:03 Energy constraints

Sarah Guo (host) | Elad Gil (host)
May 9, 2024 | 29m

CHAPTERS

  1. 0:00 – 1:17

    Cold open: Hats, Bitcoin halving, and “H100-priced” swag jokes

    Sarah and Elad kick off with light banter about Elad’s hats, crypto, and the running gag that podcast merch should cost an H100 GPU (or a Bitcoin). It sets a playful tone before diving into the week’s AI topics.

    • Elad shows off a Bitcoin halving hat and a “Make AI Great Again” hat
    • Jokes about monetizing guest swag (hats/tequila)
    • Humor about paying in H100s vs. Bitcoin
  2. 1:17 – 3:00

    Why AI music is suddenly breaking through (Suno, Udio)

    Elad describes the rapid rise of AI music tools like Suno and Udio as the next major generative format wave. He frames it as a continuation from text to images to chat to video—and now music.

    • Suno and Udio as breakout music-generation products
    • Generative media progressing through multiple formats over time
    • Music models enable style control, lyrics, and vocals
  3. 3:00 – 4:02

    From generation to personalization: Do people actually want to make music?

    Sarah questions how many users will become creators when creation becomes easy and “good enough.” They explore the idea of a personalized soundtrack for everyday life, potentially even in the style/voice of famous artists.

    • Creator vs. consumer ratios on media platforms
    • Lowering friction may expand who creates (and how much)
    • Personalized music as a new consumer category
    • Implications of “in the style/voice of” popular artists
  4. 4:02 – 5:21

    Apple enters open models: Small LLMs and the edge-device opportunity

    Sarah breaks down Apple’s release of relatively small models and why developers want capable 1B–3B parameter models for on-device use. The big shift is in latency and cost, and in enabling passive/proactive experiences without constant cloud inference.

    • Apple’s small-model release and developer interest
    • Edge deployment changes latency and UX possibilities
    • On-device inference reduces ongoing compute costs
    • Prediction: Apple will build first-class local model interfaces
  5. 5:21 – 8:52

    OS-native assistants vs. third-party apps: Can independent “LLM launchers” survive?

    Elad describes early Mac/iPhone resident LLM apps (local indexing, desktop integrations) and the likelihood that these experiences become standard OS features. Sarah asks whether such products can endure independently given platform power dynamics.

    • Rise of OS-resident LLM utilities: indexing, embeddings, desktop assistants
    • The likely path to OS-bundled AI experiences
    • Platform risk: OS vendors integrate what becomes most valuable
    • Cross-platform reach as a potential differentiator
  6. 8:52 – 11:29

    What small models can’t do: Reasoning, knowledge, multimodality, and the device/cloud boundary

    Elad outlines a practical framework for the limits of small on-device models: reasoning capability, synthesis, multimodality, and how much knowledge can be packed into the weights. They discuss the shifting line between device-resident and cloud-required capabilities.

    • Capability buckets: reasoning, synthesis, multimodality, embedded knowledge
    • Small-model constraints vs. cloud model strengths
    • Need for clearer analysis of what’s feasible on-device today
    • Device capability improvements will steadily move the boundary
  7. 11:29 – 12:30

    AI-specific hardware and new consumer form factors: Passive, always-on sensing

    Sarah and Elad discuss where new hardware might matter beyond phones and watches. The key question is whether AI needs a new form factor for continuous data collection (e.g., glasses) versus simply adding sensors to existing devices.

    • Compute distribution: client vs. cloud vs. hybrid architectures
    • Potential value of passive, always-on vision capture
    • Ray-Ban Meta glasses as an example of a new sensing-first form factor
    • Debate: new device category vs. augmenting existing devices
  8. 12:30 – 14:00

    Meta AI product launch: Multi-modality and distribution strategy

    They evaluate Meta’s AI launch as impressively executed across modalities, but note it hasn’t yet been pushed aggressively into Meta’s core apps. Elad imagines Meta AI as a chat contact/channel inside WhatsApp or Messenger, and highlights playful creation features.

    • Meta AI launched as a standalone product (meta.ai) despite huge distribution
    • Expectation that Meta will phase AI into existing surfaces
    • Chat-based distribution via WhatsApp/Messenger bots as a natural fit
    • One-click animation and family-friendly image creation features
  9. 14:00 – 15:24

    Meta’s open-source splash and the scale-vs-efficiency debate

    Sarah argues Meta’s compute scale changes the research tradeoffs: simply training longer or bigger can still yield gains beyond “supposedly optimal” points. They discuss what this implies about the importance of efficiency versus brute-force scaling for demos vs. real-world serving.

    • Meta’s massive GPU availability enables continued scaling experiments
    • Evidence that performance may keep improving beyond expected asymptotes
    • Efficiency matters for serving costs/latency, even if demos can brute-force
    • Large firms gain advantage by investing aggressively at scale
  10. 15:24 – 17:59

    Data/compute platforms (Snowflake, Databricks): Do they need to own models?

    Elad asks whether data platforms must build proprietary models or can rely on open-source and third-party models. Sarah describes a “coopetition” landscape where platforms host external models, train their own, and distribute across hyperscalers—making the competitive map messy.

    • Platforms hosting third-party models while also training in-house
    • Models distributed across clouds (e.g., Azure) blurring boundaries
    • Near-term value: expertise in tuning, deployment, and customer enablement
    • Owning models may be less necessary than operational competence + marketing
  11. 17:59 – 20:31

    The investing question: How much AI CapEx is rational, and who can afford the frontier?

    They tackle investor anxiety about whether spending levels are justified. Sarah contextualizes AI compute spend against historical infrastructure buildouts, while Elad notes frontier model economics push sponsorship toward hyperscalers and the few players who can invest far ahead of ROI.

    • AI spending decisions concentrated among a small set of high-conviction actors
    • Context: hyperscalers spending on the order of $200B/year on AI compute
    • Analogies to oil majors, broadband infrastructure, rail CapEx, chip fabs
    • Likely bifurcation: frontier models vs. specialized mid/small models
  12. 20:31 – 22:51

    “Unlimited context” and what it unlocks: codebases, legal corpora, support logs, biology

    Sarah asks about the push toward “unlimited context,” and Elad explains long-context breakthroughs (e.g., Magic’s multi-million token window, Gemini 1.5). They explore how longer context changes prompting and enables new applications, including surprising gains in biology/protein folding.

    • Magic’s early long-context work (multi-million token windows)
    • Long context enables entire repos, document sets, and queues in one prompt
    • Potential shift toward 10M+ token contexts over time
    • Unexpected impact: biology models improving with larger context
  13. 22:51 – 28:20

    Energy and physical-world bottlenecks: Data centers, permitting, nuclear, and geopolitics

    They move from compute constraints to energy as the next limiting factor, discussing the scale of 500MW–1GW data centers and why co-location matters for training. Sarah and Elad argue that permitting and infrastructure (atoms) can slow progress, and debate nuclear power’s role and geopolitical implications.

    • Training requires co-located GPUs due to interconnect/data transfer needs
    • Potential near-term bottlenecks: chips → packaging → data centers → energy
    • Permitting and infrastructure constraints can slow “software-speed” scaling
    • Nuclear power as a solvable lever; broader implications for national security and geopolitics
  14. 28:20 – 29:09

    Wrap-up: Back to hats, merch-for-H100 jokes, and where to follow the show

    They close by returning to the hat joke and mock “trade” offers for GPUs, then share where listeners can find the podcast, transcripts, and subscriptions. The episode ends on the same playful note it began with.

    • Callbacks to hats and “infrastructure” jokes
    • Offer: tequila/mockquila + No Priors hat in exchange for H100s
    • Call to action: Twitter, YouTube, podcast platforms
    • Website for emails and transcripts
