No Priors

No Priors Ep. 15 | With Kelvin Guu, Staff Research Scientist, Google Brain


Sarah Guo (host) · Kelvin Guu (guest) · Elad Gil (host)
May 4, 2023 · 37m · Watch on YouTube ↗

Episode Details

EPISODE INFO

Released: May 4, 2023
Duration: 37m
Channel: No Priors
Watch on YouTube ↗

EPISODE DESCRIPTION

How do you personalize AI models? A popular school of thought in AI is to just dump all the data you need into pre-training or fine-tuning. But that's costly and less controllable than using AI models as a reasoning engine over an external data source, which is why the intersection of retrieval and LLMs has become an increasingly interesting topic. Kelvin Guu, Staff Research Scientist at Google, wants to make machine learning cheaper, easier, and more accessible. Kelvin joins Sarah and Elad this week to talk about the newer methods his team is working on in machine learning, training, and language understanding. He did some of the earliest work on retrieval-augmented language models (REALM) and on training LLMs to follow instructions (FLAN).

00:00 - Introduction
01:44 - Kelvin's background in math, statistics, and natural language processing at Stanford
03:24 - The questions driving the REALM paper
07:08 - Frameworks around retrieval augmentation & expert models
10:16 - Why modularity is important
11:36 - The FLAN paper and instruction following
13:28 - Updating model weights in real time and other continual-learning methods
15:08 - The Simfluence paper & explainability for large language models
18:11 - The ROME paper, "model surgery," and exciting research areas
19:51 - Personal opinions and thoughts on AI agents & research
24:59 - How the human brain compares to AGI regarding memory and emotions
28:08 - How models become more contextually available
30:45 - Accessibility of models
33:47 - Advice to future researchers
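The retrieval-augmented pattern the description alludes to can be sketched in a few lines: fetch the passages most relevant to a query from an external corpus and let the model reason over them in context. This is a minimal illustrative sketch only; the toy bag-of-words similarity and the assembled prompt are placeholders, not REALM's learned dense retriever.

```python
# Minimal sketch of retrieval-augmented generation: score a small corpus
# against the query, keep the top passages, and prepend them to the prompt
# so the model reasons over external text instead of memorized facts.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; REALM instead learns a dense retriever
    # jointly with the language model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

corpus = [
    "REALM augments language model pre-training with a latent knowledge retriever.",
    "FLAN fine-tunes language models on instructions phrased in natural language.",
    "Mixture-of-experts models route each input to a small subset of parameters.",
]
query = "How does REALM use retrieval?"
context = "\n".join(retrieve(query, corpus))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # in practice this prompt would be sent to an LLM
```

Because the knowledge lives in the corpus rather than the weights, it can be updated or swapped per user without retraining, which is the personalization argument made above.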

SPEAKERS

  • Sarah Guo (host)
  • Kelvin Guu (guest)
  • Elad Gil (host)

EPISODE SUMMARY

In this episode of No Priors, Kelvin Guu, a staff research scientist at Google Brain, discusses the evolution from pre-trained language models like BERT to retrieval-augmented models (REALM), mixture-of-experts architectures, and instruction-tuned systems such as FLAN. He explains why modularity and retrieval are increasingly important, especially for personalization, enterprise adaptation, and up-to-date knowledge, alongside emerging techniques like prompt tuning, model surgery, and training data attribution (Simfluence). The conversation covers the limitations of current agents; challenges in memory, safety, and hallucination; and how neuroscience-inspired ideas like fear, chunking, and consolidation might inform AI design. Guu closes with thoughts on open training, "family values" alignment, and advice for future researchers in a world where technical execution is increasingly assisted by AI and creativity and problem formulation become the key differentiators.
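For readers unfamiliar with the instruction tuning that FLAN popularized, the sketch below shows the core data-construction idea: existing supervised examples are rewritten into natural-language instruction/response pairs via templates, and the model is then fine-tuned on the mixture. The templates and the NLI example here are invented for illustration; the real FLAN mixture spans dozens of datasets with many templates per task.

```python
# Hypothetical FLAN-style instruction-tuning data construction.
TEMPLATES = [
    "Does the premise entail the hypothesis?\nPremise: {premise}\nHypothesis: {hypothesis}",
    "Premise: {premise}\nQuestion: is \"{hypothesis}\" implied? Answer yes or no.",
]

nli_examples = [
    {"premise": "A dog runs in the park.", "hypothesis": "An animal is outside.", "label": "yes"},
]

def to_instruction_pairs(examples, templates):
    # Each raw example becomes several (instruction, target) pairs, one per
    # template, so the model learns the task rather than a single surface form.
    for ex in examples:
        for tpl in templates:
            yield tpl.format(premise=ex["premise"], hypothesis=ex["hypothesis"]), ex["label"]

for instruction, target in to_instruction_pairs(nli_examples, TEMPLATES):
    print(instruction, "->", target)
```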

RELATED EPISODES

Amex Global Business Travel: The World’s First AI Take Private with Long Lake CEO Alexander Taubman

Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud

No Priors Ep. 27 | With Sarah Guo & Elad Gil

No Priors Ep. 105 | With Director of the Center of AI Safety Dan Hendrycks

No Priors Ep. 6 | With Daphne Koller from Insitro

No Priors Ep. 5 | With Huggingface’s Clem Delangue
