No Priors Ep. 11 | With Matei Zaharia, CTO of Databricks
Episode Details
EPISODE INFO
- Released: April 25, 2023
- Duration: 40m
- Channel: No Priors
EPISODE DESCRIPTION
If you have 30 dollars, a few hours, and one server, then you are ready to create a ChatGPT-like model that can do what’s known as instruction-following. Databricks’ latest launch, Dolly, foreshadows a potential move in the industry toward smaller and more accessible but extremely capable AIs. Plus, Dolly is open source, requires less computing power, and fewer data parameters than its counterparts. Matei Zaharia, Cofounder & Chief Technologist at Databricks, joins Sarah and Elad to talk about how big data sets actually need to be, why manual annotation is becoming less necessary to train some models, and how he went from a Berkeley PhD student with a little project called Spark to the founder of a company that is now critical data infrastructure that’s increasingly moving into AI.
00:00 - Introduction
01:29 - Origin of Databricks
04:30 - Work at Stanford Lab
05:29 - Dolly and Role of Open Source
12:30 - Industry focus on high parameter count, understanding reasoning at small model scale
18:42 - Enterprise applications for Dolly & chat bots
25:06 - Making bets as an academic turned CTO
36:23 - The early stages of AI and future predictions
SPEAKERS
- Sarah Guo: host
- Matei Zaharia: guest
- Elad Gil: host
- Narrator: other
EPISODE SUMMARY
In this episode of No Priors, hosts Sarah Guo and Elad Gil talk with Databricks CTO Matei Zaharia about democratizing data, open LLMs, and AI’s future. Zaharia traces Databricks’ origins from UC Berkeley research and Apache Spark to a billion-dollar cloud data and ML platform that unifies data engineering, warehousing, and machine learning. He explains how Databricks and his Stanford lab are pushing scalable systems and language-model applications that combine LLMs with search, APIs, and other reliable data sources. A major focus is democratizing instruction-following models like Dolly using open-source foundations, challenging the assumption that only giant proprietary models can be conversational and useful. Zaharia also discusses enterprise needs, tooling gaps, where traditional ML still drives ROI, and why he believes model scale will commoditize while data quality, application design, and domain-specific systems become the real moat.




