No Priors Ep. 11 | With Matei Zaharia, CTO of Databricks

If you have 30 dollars, a few hours, and one server, then you are ready to create a ChatGPT-like model that can do what’s known as instruction-following. Databricks’ latest launch, Dolly, foreshadows a potential move in the industry toward smaller and more accessible but extremely capable AIs. Plus, Dolly is open source and requires less computing power and fewer data parameters than its counterparts. Matei Zaharia, Cofounder & Chief Technologist at Databricks, joins Sarah and Elad to talk about how big data sets actually need to be, why manual annotation is becoming less necessary to train some models, and how he went from a Berkeley PhD student with a little project called Spark to the founder of a company that is now critical data infrastructure and is increasingly moving into AI.

00:00 - Introduction
01:29 - Origin of Databricks
04:30 - Work at Stanford Lab
05:29 - Dolly and Role of Open Source
12:30 - Industry focus on high parameter count, understanding reasoning at small model scale
18:42 - Enterprise applications for Dolly & chat bots
25:06 - Making bets as an academic turned CTO
36:23 - The early stages of AI and future predictions

Sarah Guo (host), Matei Zaharia (guest), Elad Gil (host)
Apr 25, 2023 · 40m

EPISODE INFO

Released: April 25, 2023
Duration: 40m
Channel: No Priors

SPEAKERS

  • Sarah Guo (host)
  • Matei Zaharia (guest)
  • Elad Gil (host)
  • Narrator (other)

EPISODE SUMMARY

In this episode of No Priors, hosts Sarah Guo and Elad Gil talk with Matei Zaharia, CTO of Databricks, about democratizing data, open LLMs, and AI’s future. Zaharia traces Databricks’ origins from UC Berkeley research and Apache Spark to a billion-dollar cloud data and ML platform that unifies data engineering, warehousing, and machine learning. He explains how Databricks and his Stanford lab are pushing scalable systems and language-model applications that combine LLMs with search, APIs, and other reliable data sources. A major focus is democratizing instruction-following models like Dolly by building on open-source foundations, which challenges the assumption that only giant proprietary models can be conversational and useful. Zaharia also discusses enterprise needs, tooling gaps, and where traditional ML still drives ROI. He argues that model scale will commoditize, while data quality, application design, and domain-specific systems become the real moat.
