EVERY SPOKEN WORD
50 min read · 9,584 words
- 0:00 – 3:00
Introduction
- SGSarah Guo
(music plays) Hi, No Priors listeners. Time for a host-only episode. This week, Elad and I talk about the path to better model quality from here, the potential of fine-tuning, RLHF, RLAIF, RAG and retrieval systems generally, Meta's sponsorship of the open source model ecosystem, and finally, the beginning of a new set of consumer applications and social networks. Thanks for tuning in. So one thing everybody is thinking about, uh, is what it takes to get to 10X or 100X better AI systems. Like, I-I think it'd be useful just to sort of enumerate the- the elements of getting sort of step-function better. Elad, what do you think?
- EGElad Gil
Yeah. You know, it's interesting because there's- there's a few different aspects of that that people always talk about. There's scalability of datasets and compute and parameters and all these things. But the reality is, I think a lot of people believe that in order to 10X or even 100X use cases and usages for AI, outside of that there's things that could just be done on existing models today. So you don't need to wait for GPT-7 or whatever. You could start with GPT-4 or GPT-3.5 and add these things. And I think they are kinda bucketed into five or six areas. Number one is multimodality. So that means being able to use text or voice or images or videos, both input and output. So you should be able to talk to a model, type to it, upload an image and ask about the image (laughs) , and then it could output anything from code to a short video for you. Um, second is long-context windows. So basically when you prompt a model, you basically are feeding it data or commands or other things, and everybody realizes that you need longer and longer and longer context windows. So Magic, for example, is doing that for code. You know, you should be able to dump in an entire code repo into a coding model instead of having to do it piecemeal. Um, third, which we're gonna talk about today, is model customization. So that's things like fine-tuning, something known as RAG, r-, uh, there's data cleaning, there's labeling, there's a bunch of stuff that just makes models work better for you. Uh, fourth is some form of memory, so the AI actually remembers what it's doing. Uh, fifth is some form of recursion, so looping back and reusing models. 
And then sixth, which is related, is potentially a bunch of small models that are very specialized being orchestrated by a central model or sort of AI router that says, "Well, for this, for this specific task or use case, I'm gonna route the prompt or the data or the output into this other model that's doing this other thing," which is basically how the human brain works, right? You, uh, process visual information through your visual cortex, but then you use other parts of your brain to make decisions, right? And so it's very similar to what evolution decided was an optimal approach. But I think it's really interesting because I think many people in the field know that these five or six things are absolutely coming, and they- they can dramatically improve the performance of existing systems. Again, 10X, 100X better for certain things. And so it's more just a matter of when, right? It's not really an if anymore. A bunch of people are working on different aspects of this, and, you know, I think it's all coming really fast. And so... you know, there are two things that came out in the last week or two that are really relevant to this; it'd be great to get your thoughts on them.
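The router pattern Elad describes can be sketched in a few lines: a small central model labels each request, then dispatches it to a specialized model. Everything below — the keyword classifier standing in for the router model, and the named specialists — is invariably invented for illustration, not any production system:

```python
def classify(task):
    """Stand-in for a small 'router' model that labels the incoming request."""
    t = task.lower()
    if any(w in t for w in ("image", "photo", "picture")):
        return "vision"
    if any(w in t for w in ("code", "function", "bug")):
        return "code"
    return "general"

# Hypothetical specialist models; in practice each would be an LLM call.
SPECIALISTS = {
    "vision": lambda task: f"[vision model] analyzing: {task}",
    "code": lambda task: f"[code model] handling: {task}",
    "general": lambda task: f"[general model] answering: {task}",
}

def route(task):
    """Dispatch the request to whichever specialist the router picked."""
    return SPECIALISTS[classify(task)](task)
```

In a real system the classifier would itself be a (small, cheap) model, and the specialists would differ in size, cost, and training data — the routing logic stays this simple.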
- 3:00 – 8:59
AI Models, Open AI Advances, and Fine Tuning
- EGElad Gil
One is OpenAI announcing that they're now gonna allow people to fine-tune models, and the second is Google, where they looked at human-generated feedback versus AI-generated feedback for models and sort of fine-tuning models that way. So I don't know if you wanna tell people a bit more about what happened with OpenAI and why that's important.
- SGSarah Guo
Yeah. So fine-tuning as a capability has been offered by OpenAI for several years, right? But they've made, like, a- a specific investment in allowing people to do that with more sophisticated models, in particular, like, 3.5, and then also making it possible for, uh, more enterprise use cases, right? And if you think about sort of, like, why that matters at all, as you said, like, you know, you have a bunch of these labs who are working on general capability and working on this sort of direction of scaling laws. Like, transformers predictably improve with scale, data, and compute. But I think what's really interesting is, like, the way every... the way these models end up being used in many business or even consumer application contexts is against a specific task, right? And so we've talked a lot about, like, where research effort is being w- put or compute is being spent in the industry right now, and there's a really... I think there's a really interesting question of we don't even know how good models can be at certain scale, right? At 70 or 30 or 100 billion parameters or more, but not at GPT-4 scale based on really high-quality data and curation of that data, um, because it hasn't- hasn't been explored. And so I think we should talk about some of the different ways you get these models to actually operate against a specific task, uh, with either fine-tuning with RLHF, uh, against, you know, um, the reward for your task, uh, or with- with RAG, as you said, in terms of retrieving from a dataset that you've specified, right? And there's reasons you would do all three of these. But I- I think it's actually a pretty big step for OpenAI to enable this because I- I think there was... A- at certain points in the- in the research world, there's been a narrative that, like, fine-tuning doesn't really matter, right? The general model matters. 
And I'd be curious if you think that's a change in research point of view or just a commercial decision in terms of the labs wanting to make money or that being more important than ever.
- EGElad Gil
Yeah, I think everybody realized that fine-tuning works really well when ChatGPT came out, because what ChatGPT is, is they took this model, GPT-3.5, which existed at the time, and it wasn't seeing as much usage, at least from, you know, people just going in and querying it, unless they were really good at prompts. And they basically hired a bunch of people, and the people ranked the output of the model, and they effectively fine-tuned the model against that feedback from the people who were assessing, "Is this the answer that I wanted based on the prompt that I put in," right? And so fine-tuning really just means you create a lot of feedback, usually, at least today, through people responding to output and saying is it good or bad. And it created a dramatic step function in the utility of GPT-3.5 for end consumers or end users or students or lawyers, or all sorts of different types of people, and it really helped, uh, it was kind of the starting gun for this whole AI revolution right now, because everybody suddenly realized how powerful these models were. And the model underlying it fundamentally hadn't really changed that much. What they'd done is they fine-tuned it with, uh, with reinforcement learning from human feedback, or RLHF. And so I think that created this, uh, viewpoint that these, these types of fine-tunings or, you know, we can talk about RAG in a minute, love to get your thoughts on that, can fundamentally change the user affinity for a product. And so you could imagine in an enterprise you say, "Well, I really wanna fine-tune this model so that it reflects medical data that I have that's proprietary, that could help make a better doctor assistant." Or, "I wanna fine-tune it against this, you know, set of HR responses that are unique to my company, so that if I have a, uh, an employee who really wants a good answer to a question, they can get a really good answer back."
And so it really gets into those sorts of things where you can dramatically improve the output of a model against something that you specialize. Do you wanna talk about how RAG ties into that? 'Cause I think that's a really key component of it too.
- SGSarah Guo
I think the sort of basic premise with RAG that everybody should understand is you want to retrieve against a specific corpus, right? And so you ca- you're still going to reason, you might have a generation or an answer based on that corpus, but if you pick a set of documents, it could be legal cases, it could be internal company documents, it could be medical information, as you said, right? So you still want the reasoning capabilities of the model, right? A diagnosis requires reasoning, uh, but you want it to come from a specific set of data versus, like, let's say all of the pre-training data of, you know, random information on the internet about whether or not you have this disease, right? And every, um, piece of, uh, forum conversation about this disease that has ever happened. So, you know, I think of the, um, the core driver as, like, trustworthiness, right? Citation, control of information source. And, and so now you have this architecture where people are using, um, think of it as, like, traditional information retrieval techniques and search in combination with these models. I think the other drivers besides trustworthiness on these RAG approaches are two things. One is cost, and the other is, like, freshness, right? So every time you, uh, retrain a model or even fine-tune a model, like, there is compute involved. So the idea of being able to incorporate new information without retraining, just using the reasoning capabilities of the models, I think is very attractive to people, and very... That's also related to the freshness point of view, which is like, you actually want the most recent medical research, or the cases from this past year. I think that's, that's sort of a, a set of the drivers behind people being excited to take this approach and use it against their private data sets.
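The retrieval loop Sarah describes can be sketched in a few lines: pick a corpus, rank its documents against the query, and ground the prompt in the top hits so the model reasons only over those sources. The toy corpus, the bag-of-words similarity (real systems use learned embedding models and vector stores), and the prompt format below are all illustrative assumptions, not any particular vendor's API:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    """Rank documents in the chosen corpus by similarity to the query."""
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    """Ground the answer in retrieved text so the model can cite its sources."""
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

corpus = [
    "Case A (2023): the court held the contract was void.",
    "Case B (2019): damages were limited to direct losses.",
    "Unrelated memo about office parking policy.",
]
query = "Which case held a contract void?"
docs = retrieve(query, corpus)
prompt = build_prompt(query, docs)
# `prompt` would then go to the LLM, whose reasoning produces the cited answer.
```

The trustworthiness, cost, and freshness points all fall out of this structure: the model only sees the retrieved sources (citable), nothing is retrained when the corpus changes (cheap), and adding this year's cases to `corpus` takes effect immediately (fresh).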
- EGElad Gil
Yeah, and that actually helps a lot with, uh, hallucinations,
- 8:59 – 13:22
Addressing Hallucinations in AI Models
- EGElad Gil
right? And so I think it's important to sort of explicitly point that out, because one of the knocks on the current set of AI technologies is that they may hallucinate or say, um, you know, say things that aren't necessarily true or cite a legal case that doesn't exist, and by using RAG you can actually help say, "Okay, I'm only gonna use things that, that I know exist, or I'm gonna filter for things that are gonna be, um, answers that fit well with, you know, the, the current set of knowledge that people have relative to th- these sets of issues." So to your point on trustworthiness, I think it's really important to call out hallucinations explicitly since that's something people, people keep bringing up as sort of naysayers, "Oh my gosh, what if it hallucinates and some terrible misinterpretation happens and therefore we need to regulate this thing?" Right? (laughs) So, uh, it's kind of interesting. You know, I guess related to that, there's this reinforcement learning from human feedback versus AI feedback, and Google just came out with a really interesting paper on that where, you know, they showed that you can have an AI similarly provide feedback to whether the AI itself is generating good output, and for certain use cases, that works as well as people. And so suddenly instead of hire- having to hire an army of people to go and help fine-tune these models, you can actually have an AI help fine-tune this model. And I think the early signs that that was gonna be true was actually Med-PaLM 2, where Google showed that they trained a model specifically on medical data, and the output from the model tended to be more correct than human physician experts. And so for certain use cases, we were already seeing AI provide more accurate answers than specialists, experts, right?
And in RLAIF you're trying to sort of generalize that and say, "What are all the different ways that instead of using expensive people to do this, we can use really cheap AI models to provide that same feedback and sort of train things?" And so there's all these techniques and technologies that are coming now as part of this sort of list of six, six big innovations (laughs) that are part of the future AI 10X or 100X roadmap that are starting to fall into place. I think it's a very exciting time, and I think, th- d- you know, in the next year we'll keep seeing stuff like that. So there's a few other announcements that have come out related to this in terms of using different data sets or different models, but coming from social networks. So for example, Twitter, or I guess now we should call it X, um, said it will train ML models off of Twitter data, and that may have really interesting consumer applications or, um, outcomes. And then Meta is really now emerging as a primary sponsor for open source models. LLaMA and LLaMA 2 have really taken off in sort of the developer and enterprise ecosystem around LLMs. So it'd be great to hear what you think in terms of why are they doing this? You know, why are, why are they becoming the primary sponsor for open source AI, and how do you think they're gonna apply it within their own company?
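The RLAIF idea Elad describes boils down to swapping the human rater for a judge model that ranks candidate responses into preference pairs, which then train a reward model exactly as in RLHF. The `judge_score` heuristic below stands in for that judge model and is purely illustrative — it is not the setup from the Google paper, where the judge is itself an LLM prompted with grading criteria:

```python
def judge_score(prompt, response):
    """Toy AI judge: reward responses that address the prompt and stay concise.
    A real RLAIF judge is an LLM given a rubric, not a keyword heuristic."""
    relevance = sum(1 for w in prompt.lower().split() if w in response.lower())
    brevity_penalty = max(0, len(response.split()) - 40) * 0.1
    return relevance - brevity_penalty

def label_preferences(prompt, candidates):
    """Produce (chosen, rejected) pairs — the same format human raters produce
    in RLHF, and the input for training a reward model."""
    ranked = sorted(candidates, key=lambda r: judge_score(prompt, r), reverse=True)
    return [(ranked[0], r) for r in ranked[1:]]

prompt = "Explain what fine-tuning does"
candidates = [
    "Fine-tuning adjusts a pretrained model's weights on task-specific data.",
    "I like turtles.",
]
pairs = label_preferences(prompt, candidates)
```

The cost argument is visible here: once the judge is a model call rather than a person, generating millions of preference pairs is a compute bill instead of an army of contractors.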
- SGSarah Guo
I really draw an analogy from the current sponsorship of Meta and Zuck of, you know, LLaMA and the open source model ecosystem to, like... MySQL, right? So for those of us who remember, like, wh- what happened with these open source database companies, MySQL was originally made by, um, this guy Monty Widenius at a Swedish company, it became part of Sun, it became part of Oracle, um, and in the early days, like, MySQL would crash and corrupt data and there were, uh, some early internet scale companies like Facebook who wanted to use it, wanted to not be beholden to commercial database vendors, made it scale, made it more robust, and contributed back, right? And I, I think, like, it's a reasonable analogy in terms of, like, some core technology to your company where you don't want to have a vendor, uh, you don't see it as part of your core business model, but you want there to be open source options, right? And so I have a lot of admiration for what Meta is doing and I think, like, I think that it's very likely to be a big mover in the ecosystem because if they sponsor some baseline of models that are big enough to be valuable, high quality enough to be valuable with Facebook AI research, and then enough people find these models useful and strategic and they create a developer ecosystem, it's hard for me to picture them not being sustained as an important ecosystem, an alternative to these, you know, research labs that in many ways compete with Facebook or Meta in different ways and are very expensive to maintain.
- EGElad Gil
But if you look at the history of open source, is that really true? So say
- 13:22 – 16:23
Open Source Models in Consumer Engagement
- EGElad Gil
for example you look at Linux, right? And Linux in part was very much sponsored by IBM throughout the late '90s to the tune of in some, in some years a billion dollars a year, and so even these external ecosystems tend to get quite expensive, you know? And the, the reason that IBM sponsored Linux was to provide a real offset to Microsoft, right? They basically said Microsoft is dominant on the desktop, they're really getting aggressive on sort of the server and infrastructure world and so therefore, let's fund this offset for open source. How do you think that analog applies to Meta, or does it, or do you just think it's a different reason in terms of why they're pursuing it?
- SGSarah Guo
Um, well, I think they're pursuing it because they want to use it and they don't want to be trapped, right?
- EGElad Gil
Oh sure, but they don't have to open source it, right? They could just continue to develop it like they have been and so why open source it?
- SGSarah Guo
One piece of it is, like, wanting to offset the development costs and the, um, compute costs at some point, right? And, and just, like, ex- but that's sort of one of the core premises of open source. They've also done, like, other really related things like the Open Compute Project, um, but, you know, if you think about why that analogy does or doesn't apply, right? Like one is does Meta want to make money off of this in some sort of, like, B2B way? If they keep open sourcing it the answer is no, right? They want to use it in their core consumer businesses. And then two, like, for, for this to work I think one of the ways the analogy breaks down is very much, um, like, the need for centralized training today, right? It's a complicating factor, like, can you really coordinate that with the politics and slow decision-making of open source communities? I don't know, I think that's challenging. There are, um, there are interesting folks working on at least the sort of, like, technical coordination of, of this as well, right? Like Foundry and Together. Um, but if you, just to like make explicit, like, why might they care, my guess is, like, the ability to use these models, it applies in sort of the more traditional ways, like, we can use them to make the data center, like, more energy efficient. We can, and I, there's been publishing about this, we can use these models to improve, um, like, ad serving, right? Uh, like lots of things that matter to the core Meta business, but it's also just one of the most interesting things to happen in consumer in a long time, right? You have things like Character, Inflection, Midjourney, Pika, experiments like Can of Soup, like, these things, they have caught the attention of consumers in a way that few things have over the last few years and so I, I think it's known that there are Instagram chatbots being tested, right?
And so if this is a path to consumer engagement and then therefore ads and it's going to be a really important element, I think they just want to have access to it without being beholden to a sponsor. What's your view?
- EGElad Gil
Yeah, I mean, I think it's amazing that Meta has decided to make this move, and I think it's really beneficial to the ecosystem overall. So, you know, at this point I think LLaMA 2 is really emerging as a model that a lot of people are rallying around and obviously that may change over time, but for now I think it's one of the primary models people are using on the open source side and that people view as quite high quality, um, so I think it's super impressive. I think more broadly in social and AI,
- 16:23 – 21:53
New Trends in Social Content Creation
- EGElad Gil
it's kind of striking that the last large social network in some sense was TikTok which was launched seven years ago now, so it's been a while since we've seen a major shift and part of that is because large scale social products have already been established and so now you need to sort of pry users away from existing products which is much harder than just filling time other ways. I, I remember talking once with Jack, the founder of WikiHow, which was like a how-to, you know, community driven website and he said that the main way that they lost people who were contributing to WikiHow was they went to social gaming, they were just playing games instead, right? So it was sort of this time and attention shift 10 years ago when, when he mentioned this to me, right? And so, um, number one is you have to displace other people, um, number two, you know, a lot of the innovation in social kind of stagnated a little bit for startups, right? It became a lot more let's do Twitter but more woke or more right wing, um, or let's do early Facebook again as a mobile app versus hey, we're gonna reinvent the modality or we're gonna reinvent the use case or the communication channel, whatever it may be, and it feels like generative AI is the first thing in many years to sort of create that new window or opening. And I think the big social networks like Meta and Twitter and others may actually be the biggest beneficiaries of this new wave, but there also should be room for startups and there's some new things, you know, Can of Soup was in the recent YC batch and they're doing kind of interesting things and I think it's almost like asking what's the gen AI native modality and use case? And typically when you look at social products you used to have this two by two or some people had like a two by three of... you know, is it broadcast versus, um, mutual follows in terms of network structure? What's the modality? Is it images, is it video, etc.?
And then what's the length and persistence of it? Is it long form, is it ephemeral, etc.? So for example, Snapchat started off as, um, you know, short form, uh, broadcasts and one-on-one that was ephemeral, right? And so, uh, you could kind of map out the whole social world against those dimensions, and now there's this new interesting thing of, you know, new forms of content creation potentially upending one or two of those quadrants. So, it seems like a very exciting time overall.
- SGSarah Guo
Yeah. Yeah, I had a, a, you know, longtime obsession with Toutiao and TikTok and some of the Chinese social companies that really started as, like, AI-native content aggregators, right? If you think about what they did, um, they really figured out this, like, cold start problem in terms of they... Like, Toutiao originally, they aggregated, um, news content from other places and then bootstrapped your preferences. They didn't require explicit user input to say, like, "I am interested in these topics." They analyzed your social profile for your interests. They collected, like, location and demo and analyzed articles for, like, quality and topics. So they had these, like, rich per-user models of engagement based on interaction data, and then you had this magical experience of, like, a better content feed that then drove the iteration around better labeling. And I think exactly as you said, if those companies figured out, like, the cold start on relevance, um, maybe the opportunity... I think one of the potential opportunities in, in this generation of social is, like, cold start on the content itself, right? Like, you've seen, um, other amazing companies like, like the Instagrams of the world, right? They, they create tools for content creation for, like, magically compelling assets that are much easier-
- EGElad Gil
Ooh.
- SGSarah Guo
... and then, like, turn it into a social network. And so generation feels like a, um, a really compelling answer in terms of, like, how to have a content feed that is both, like, really engaging for you and then giving people creation superpowers.
- EGElad Gil
Yeah, and I think, um, Midjourney and Pika are two great examples of that, to the point earlier. And then Character is sort of a form of that if you decide to create your own character or sort of interact with something that's more customized there. So it does seem like there are these really interesting, uh, shifts that are happening, and then the question is, is it more for creation and sharing or does it become a new social product or a new communication product? In other words, is it GIPHY or is it, you know, uh, F- Facebook, right? And Lensa was a good example of GIPHY, right? It was used to basically create content that you share on other social networks, and the question is what are going to be the big consumer apps that sort of emerge on top of that? And again, it may just be Meta again, right? But I think it's a super interesting question and, uh, and probably the most exciting time in social for a very long time, and it's kind of this oddly almost ignored area from an entrepreneurship and founder perspective right now. Everybody's rushing at the enterprise stuff and the infrastructure and, you know, that whole stack, and it's almost like the generation of people who are going to start social products all did them five years ago and did the, you know, "Let's do Twitter again." And the generation that's really focused now kind of grew up where SaaS was sort of the opportunity, or SaaS and dev tools were the opportunities that everybody was mining against. So, it'll be interesting to see whether or not that shifts back in any meaningful way. Um, the one other thing that I think is kind of interesting just related to entrepreneurship and AI right now, and I was talking to a founder about this, where they were trying to do something really hard, and, um, by really hard I mean addressing a really hard market by using gen AI. And early in markets, like when a new technology shifts and disrupts the whole market, you actually want to just do the easy stuff, right?
Why do the hard stuff? There's so much low-hanging fruit. Why don't you just go after the stuff that's super easy? And, uh, my, my sort of advice to founders generically on this stuff is like, don't do the hard stuff right now.
- 21:53 – 23:53
Balancing Ambition With Realistic Customer Expectations
- EGElad Gil
(laughs) Or if it's hard, do something that's technically hard that enables a giant breakthrough in terms of use case, but don't actually do the hard market, because there's so many easy markets right now. You should just, you should just go for the easy stuff, and, uh, if you're grinding and grinding and grinding and not getting customer attention, don't spend more time on it. It's just not worth it right now. Now, five years from now when the use of these technologies are a bit more saturated, that's when you have to go do the hard stuff, right? (laughs) But, you know, it's kind of interesting to, to think about, you know, prior technology waves and when should you do the easy versus hard?
- SGSarah Guo
Yeah. I was actually just talking to some of the founders that are in our accelerator right now that come from, like, really great technical and research backgrounds, and they were reaching for a problem broadly in the engineering and code generation space that was very ambitious, right? And I could see kind of a, a, a solve-it-all type problem. Um, and it's not that it's not valuable, it's just that there is so much you could do that is, as you point out, easier and valuable today, um, and, like, requires pushing the bounds of research, but you have far higher likelihood of having something that's useful to give to customers this year with far less risk. And I don't mean to, um, constrain people's ambitions, but the ability to give yourself multiple at-bats with the wind at your back in terms of the entire field progressing versus trying to get out in front of everyone else with a, um, a multi-year research goal when there's, like, just gold hanging out everywhere, you know, my, my orientation is, is I think similar here.
- EGElad Gil
Yeah, it's no GPU before product/market fit. I think that's the takeaway.
- SGSarah Guo
Elad's slogan of the year. Okay, awesome. Uh, fun to hang out and talk about the news of the week. Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way, you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.
Episode duration: 23:53
Transcript of episode UcH-ikqhVj4