No Priors

No Priors Ep. 15 | With Kelvin Guu, Staff Research Scientist, Google Brain


Sarah Guo (host) · Kelvin Guu (guest) · Elad Gil (host)
May 4, 2023 · 37m · Watch on YouTube ↗

Episode Details

EPISODE INFO

Released: May 4, 2023
Duration: 37m
Channel: No Priors
Watch on YouTube ↗

EPISODE DESCRIPTION

How do you personalize AI models? A popular school of thought in AI is to just dump all the data you need into pre-training or fine-tuning. But that's costly and less controllable than using AI models as a reasoning engine over an external data source, which is why the intersection of retrieval and LLMs has become an increasingly interesting topic. Kelvin Guu, Staff Research Scientist at Google, wants to make machine learning cheaper, easier, and more accessible. Kelvin joins Sarah and Elad this week to talk about the newer methods his team is working on in machine learning, training, and language understanding. He did some of the earliest work on retrieval-augmented language models (REALM) and on training LLMs to follow instructions (FLAN).

00:00 - Introduction
01:44 - Kelvin's background in math, statistics, and natural language processing at Stanford
03:24 - The questions driving the REALM paper
07:08 - Frameworks around retrieval augmentation & expert models
10:16 - Why modularity is important
11:36 - The FLAN paper and instruction following
13:28 - Updating model weights in real time and other continual-learning methods
15:08 - The Simfluence paper & explainability for large language models
18:11 - The ROME paper, "model surgery," and exciting research areas
19:51 - Personal opinions and thoughts on AI agents & research
24:59 - How the human brain compares to AGI regarding memory and emotions
28:08 - How models become more contextually available
30:45 - Accessibility of models
33:47 - Advice to future researchers
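The retrieval-augmented pattern the description alludes to can be sketched in a few lines: fetch the passages most relevant to a query from an external corpus and let the model reason over them in context. This is a minimal illustrative sketch only; the toy bag-of-words similarity and the assembled prompt are placeholders, not REALM's learned dense retriever.

```python
# Minimal sketch of retrieval-augmented generation: score a small corpus
# against the query, keep the top passages, and prepend them to the prompt
# so the model reasons over external text instead of memorized facts.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; REALM instead learns a dense retriever
    # jointly with the language model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

corpus = [
    "REALM augments language model pre-training with a latent knowledge retriever.",
    "FLAN fine-tunes language models on instructions phrased in natural language.",
    "Mixture-of-experts models route each input to a small subset of parameters.",
]
query = "How does REALM use retrieval?"
context = "\n".join(retrieve(query, corpus))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # in practice this prompt would be sent to an LLM
```

Because the knowledge lives in the corpus rather than the weights, it can be updated or swapped per user without retraining, which is the personalization argument made above.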

SPEAKERS

  • Sarah Guo (host)
  • Kelvin Guu (guest)
  • Elad Gil (host)

EPISODE SUMMARY

In this episode of No Priors, Kelvin Guu, a staff research scientist at Google Brain, discusses the evolution from pre-trained language models like BERT to retrieval-augmented models (REALM), mixture-of-experts architectures, and instruction-tuned systems such as FLAN. He explains why modularity and retrieval are increasingly important, especially for personalization, enterprise adaptation, and up-to-date knowledge, alongside emerging techniques like prompt tuning, model surgery, and training data attribution (Simfluence). The conversation covers the limitations of current agents; challenges in memory, safety, and hallucination; and how neuroscience-inspired ideas like fear, chunking, and consolidation might inform AI design. Guu closes with thoughts on open training, "family values" alignment, and advice for future researchers in a world where technical execution is increasingly assisted by AI and creativity and problem formulation become the key differentiators.
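For readers unfamiliar with the instruction tuning that FLAN popularized, the sketch below shows the core data-construction idea: existing supervised examples are rewritten into natural-language instruction/response pairs via templates, and the model is then fine-tuned on the mixture. The templates and the NLI example here are invented for illustration; the real FLAN mixture spans dozens of datasets with many templates per task.

```python
# Hypothetical FLAN-style instruction-tuning data construction.
TEMPLATES = [
    "Does the premise entail the hypothesis?\nPremise: {premise}\nHypothesis: {hypothesis}",
    "Premise: {premise}\nQuestion: is \"{hypothesis}\" implied? Answer yes or no.",
]

nli_examples = [
    {"premise": "A dog runs in the park.", "hypothesis": "An animal is outside.", "label": "yes"},
]

def to_instruction_pairs(examples, templates):
    # Each raw example becomes several (instruction, target) pairs, one per
    # template, so the model learns the task rather than a single surface form.
    for ex in examples:
        for tpl in templates:
            yield tpl.format(premise=ex["premise"], hypothesis=ex["hypothesis"]), ex["label"]

for instruction, target in to_instruction_pairs(nli_examples, TEMPLATES):
    print(instruction, "->", target)
```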

RELATED EPISODES

Amex Global Business Travel: The World’s First AI Take Private with Long Lake CEO Alexander Taubman

Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud

No Priors Ep. 27 | With Sarah Guo & Elad Gil

No Priors Ep. 105 | With Director of the Center of AI Safety Dan Hendrycks

No Priors Ep. 6 | With Daphne Koller from Insitro

No Priors Ep. 5 | With Huggingface’s Clem Delangue
