No Priors

No Priors Ep. 20 | With Sarah Guo and Elad Gil

This week on No Priors, Sarah and Elad do another hangout to answer listener questions. Topics include debunking common misconceptions about AI and its implications on the world, the analogy to nuclear power and nuclear safety, the impact of larger context windows, developer productivity, incumbent announcements of AI products, and some requests for (fat) startups.

00:00 - What Are People Getting Wrong About AI Right Now? / New Capabilities of NLP
04:35 - Nuclear Power and Safety Concerns
11:12 - Emerging AI Companies and Research
15:54 - China's Hardware Sanctions and Funding Ramp
20:34 - Innovation in Heterogeneous Compute Infrastructure
28:08 - Enterprise Stack and Decision Making
33:44 - Data's Impact on the World

Sarah Guo, host · Elad Gil, host
Jun 8, 2023 · 34m · Watch on YouTube ↗

EVERY SPOKEN WORD

  1. 0:00–4:35

    What Are People Getting Wrong About AI Right Now? / New Capabilities of NLP

    1. SG

      Hello, No Priors listeners. We're excited to just do another Hangout episode with me and Elad and answer listener questions. I think a fun one to start with would always be a place where we're disagreeing with the market, so I'll ask Elad, what are people getting most wrong about AI right now?

    2. EG

      Yeah. I guess there's- there's two or three things that I wouldn't say they're necessarily getting wrong, but I just feel there's some misconceptions about. Um, the first one is, I feel like a lot of people are kind of treating this as a- as an extension of the last decade of machine learning that we've seen, in the sort of convolutional neural network and, um, RNN world. And everybody keeps talking about it as if it's that old world, and they keep emphasizing certain aspects of data and other things which are important but not as important as they used to be. And in reality, we've had a technology disruption. We've shifted to two very different architectures. Uh, diffusion-based models, which is a statistical physics model for image gen, and then, um, on the language side, we moved to, um, these large language models, which some people are now calling foundation models. And, you know, fundamentally, that's different from the prior wave of NLP in terms of capabilities, in terms of the way it works, but also in terms of insights around things like, you know, just the fact that you now have this really interesting, like, chain of logic or chain of thought style processing of information, and the ability to act and synthesize information in a way that never existed before for NLP, for example. And so I think one big misunderstanding is, "Oh, this is just ML, and we've been doing ML for 10 years, and it's the same thing," and it's totally different. So I think that's one big sort of area that I keep seeing people get things wrong. Um, or at least that, you know, there's these- these- these misassumptions. Second is I keep getting pinged by people saying, "Hey, what's working? What's working?" And there are a few things that are truly working at scale. You know, OpenAI and Midjourney and a few other things. But the reality is, it's been six months since ChatGPT came out and most people became aware of this. 
You know, like, um, I, you know, I think we both started investing or being involved with the area, uh, on the generative AI side much earlier than that. But the sort of starting shot for the industry was six months ago, and then GPT-4 came out maybe three months ago. And so everybody's acting as if this is an old thing, and again, I think this ties into the prior point. This is not a normal extension of what NLP used to be like. This is a fundamentally new set of capabilities. And so when people are saying, "Well, look, no enterprises are adopting it very much yet," you're like, "Well, it's been six months since most people realized this was that important," and six months is one planning cycle for a big enterprise, right? So people are just planning what to do. So I think that's a second one.

    3. SG

      I actually went on a, uh, walk with a, uh, growth investor that I actually have plenty of respect for, but they asked the question of like, "Oh, like what's working? What should we invest in? Do you have anything that, you know, popped off that we should keep an eye on?" I'm like, "Yes," and we actually had a very similar conversation around what adoption in the enterprise looks like. And my prediction based on the sort of, like, negative point of view that this investor had of, uh, "Oh, like it's not enterprise ready, like this is just a hype cycle, and, like, we're not actually going to be investing here, like, we explored it, we met 100 companies, we're done," is that I think other people will arrive at that conclusion, and like any other, uh, wave of enthusiasm, you'll see people abandon it as well, as a- as a technology.

    4. EG

      Yeah. I think people are just prematurely assuming it's- it's a- it's a conti- it's a continuum from before and therefore there's nothing new here, and I just think that's wrong, and I think people will realize that. I- I actually don't think it's gonna ... I think there's going to be more hype rather than less coming, simply because certain things are really starting to work at scale from a revenue perspective and really quickly. And we've seen the first wave of viral apps in terms of things like Lensa or other things that really ramped quickly and then went away, but those are signs of real traction and real usage. And so I just feel that it's very early in the hype cycle. There's more to come, but it's a different technology, and I- I just think people don't appreciate that. Or I should say, it's- it's an extension of some technologies in some ways, but fundamentally, the capabilities are very different, right? You're still using deep learning, but, you know, it- it- it really implies a different modality of- of, um, how these things work. And then I think the last place or the third area that I think people are kind of getting things wrong is the mad rush to call for regulation by people working in the industry strikes me as very unusual and a bit naive in terms of what that actually means. And so I'm on some private, you know, chat groups where people are talking about regulation and they're like, "Oh, well, the regulators will assign a panel of experts from the industry who will then decide what we're going to do." And I said, "No, that's not how it works." You know? (laughs)

    5. SG

      (laughs)

    6. EG

      What- what happens is, you know, you have some agency established, and just like people keep talking about how it's

  2. 4:35–11:12

    Nuclear Power and Safety Concerns

    1. EG

      just like nuclear, and you're like, "You do realize that once the NRC, or the Nuclear Regulatory Commission, existed, we had no new nuclear designs approved for the last 50 years due to this agency existing." It's not gonna help us. It's gonna hurt the- the collective, uh, motion forward, and there's so much at stake in terms of the positive things this technology can do for the world. Like, if you think about global equity, around healthcare, around education, this is the single biggest motive force to make anybody around the world, irrespective of their upbringing or background or any other aspect of diversity, this is the biggest single thing that can impact things for the positive for healthcare and education going forward as two examples of global equity. And so, you know, that's one of the other areas that I kind of feel like I differ from the pack. I actually am quite a bit of a doomer in the long run, but in the short run, I think that regulation or misregulation can really take things down a dark path. So those would be my three. How about you?

    2. SG

      Yeah, let's talk about, um, let's talk about nuclear first and we'll go back to that, 'cause, uh, 'cause you brought it up. I think it's actually really interesting that people want to put it in this bucket with nuclear and climate of existential risk, and I'm- I'm not saying that there isn't a version of, um, technical progress here that leads to existential risk ... as I think you, you implied. But it's funny in, in the way these things relate because as you said, like, nuclear regulation, uh, I, I, I think it's, like, broadly agreed upon in the scientific community that nuclear power is ... if you look at the trade-offs of waste management and then you look at the progression since second, third generation reactors have been developed and the absolute, like, um, complete freeze on nuclear power in the United States and in many other countries because of proliferation concerns. Like, that's a huge impediment to one of the major contributors to, um, actually, you know, at least creating some, uh, progress on the climate change front.

    3. EG

      Yeah. It's, it's proliferation and broader safety concerns 'cause the US is still at something like, um, 20-ish percent nuclear power. Japan is 30%. France is 70%. So people have been using nuclear and they've been using 60-year-old reactors this whole time without any real issues if you actually look at death rates or things (laughs) like that.

    4. SG

      Right, right. What I'm, what I'm pointing out is that there have been two generations of improvement in terms of efficiency and safety of, like, reactor development and specifically and concretely because of the regulatory cost to get a new reactor license. We're building none of something that is newer and safer.

    5. EG

      Yeah. And, uh, and to your point on safety, the existing reactors are incredibly safe. If you actually look at death rates from nuclear, they're incredibly low. Uh, you know, there maybe was one person who died during Fukushima from the nuclear reactor when, you know, the biggest earthquake and tsunami in decades hit the entire coast of Japan and there was something like a dozen other reactors on the coast that were completely unaffected, right? And so it's just very overblown and overstated, and I kinda worry that in the short run the same thing is happening on the AI side. Again, I'm a long-term doomer. Like, I think there's real risk but in the short run I think we're bundling, um, hate speech and bias and all sorts of things with existential risk. And I've never seen the technology industry rush into the arms and embrace of regulators which they don't understand and have never dealt with before. And if you look at the crypto community or you look at people who've worked in healthcare or in edtech or other areas, they understand what it really means to be regulated and the potential for capriciousness, right? There are some really great regulatory actions, but there's a lot of things that tend to be arbitrary at times too. And so I, I just think it's a, it's a very odd moment in the history of the technology industry in terms of how people are acting. But I guess on a more positive note, um (laughs) , do you wanna talk a little bit about some of the areas that you're excited about in terms of either big picture benefits of AI or I know you've been thinking a bit about CodeGen or other areas?

    6. SG

      Yeah. Um, I think maybe adding to your point on global equity, there's one way to look at these models as i- you know, encoding a huge amount of the knowledge that is available publicly on the internet most simply, right? And the idea that you don't want to give that to as broad an audience as possible, um, when it is so cheap to offer, like, some very flawed representation of knowledge in the world to me is ridiculous, right? And, and so I, I think it's, it's a sort of subpoint within education and, like, structured education with curriculums and the ability to train in many different fields from science to primary education or whatever it is. But, uh, but I, I think, like, I, I think of it very much as, like, do you want people to have a- you know, information access, uh, in a way that, um, people think of, like, the web and search as something that you want universally accessible? Yeah. On, on CodeGen. So this is one of the areas I am most excited about, not only because, like, we have seen a product, you know, as you said, one of the few that is, like, massive, um, has real impact on workflow and now generating real revenue for, for Microsoft and for, for others, um, but I, because I think we're really early, right? Uh, like today, you know, you have Copilot looking at your five open files, your five most recent files, and doing function completion. Uh, and, like, that already has such a massive impact on productivity, but I, I think we're really early in figuring out how to deliver more context, uh, to these models and have them work in different ways, right, um, and in different user experiences. And so I think, um, you know, Gravely and the whole Copilot team, they tried a bunch of different things, including, like, everybody starts with chat and, uh, and ended with this, um, continual autocomplete, which is a great experience. 
But I, I think there are a lot of really interesting ideas that are floating around now from I think the more obvious, like search or to the very fun, like, okay, well if you just had, like, you know, AI Elah, the junior developer that takes any Jira ticket or, or, um, shortcut or linear issue and makes a best effort guess at actually writing that code with the context you give it and submits a pull request. Like, I think people would really love that, uh, interface and the question is just how can you get some of this stuff to, to work? And, um, I, I think we're gonna see a lot of progress on that front.

    7. EG

      Yeah. It definitely seems like there's a few different companies starting to work in that area now too. Like, there's magic.dev. I think, uh, you know, there's another company that Redpoint just announced they were backing.

    8. SG

      Yeah. This is Jason Warner's company. I do

  3. 11:12–15:54

    Emerging AI Companies and Research

    1. SG

      think it begs this, like, broader question that's become more, more interesting as the context window for the available models has expanded, right? Um, first, OpenAI to 32K and Anthropic to 75 and 100K, and you, you see these, um, somewhat exciting research headlines like, you know, "2 million, 2 million token context window," and, and the reality is, um, first of all, that's, it's, it's not quite true, and, and second, like, y- you know, I have a strong point of view here that context will expand to fit the window. Right. Like, the- the amount of information or the instruction you can give to a general intelligence is much more than 800K and- and given the relationship of the context window to, like, attention architectures and- and the quadratic limitation here, uh, I- I think that it's going to be a continual area of investment of figuring out how to get, like, efficient context, uh, into a model as a- as a product insight, right? And I'll give one, like, very concrete example. We held a, um, like, an AI dev tools hackathon for a bunch of, like, college-aged new grad builders at the Conviction offices this past weekend and thankfully, to, uh, with the- with the support of Anthropic, we're playing around a bunch with big context windows, but even if you just want to feed in, for example, like, Kubernetes documentation, you hit that window very, very quickly, right? So I- I think there's a lot of enthusiasm from builders about what you can do with that window. But, you know, catastrophic forgetting and the a- the idea that, like, 100K or even a million tokens is going to solve all our problems and, like, you just ask your questions and dump in all context naively is ridiculous to me. I think it's a really important area of product and research work.
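The overflow problem Sarah describes here, e.g. Kubernetes documentation blowing past even a 100K-token window, is typically handled by chunking the source and packing only the most relevant chunks into the prompt. A minimal sketch, assuming a crude one-token-per-word approximation (a real system would use the model's own tokenizer and a retrieval-based relevance ranking):

```python
# Minimal sketch of fitting documents into a fixed token budget.
# Assumption: tokens are approximated as whitespace-separated words;
# chunks are assumed to arrive pre-sorted by relevance to the query.

def approx_tokens(text: str) -> int:
    """Rough token count: one token per whitespace-separated word."""
    return len(text.split())

def pack_context(chunks: list[str], budget: int) -> list[str]:
    """Greedily keep chunks, in given (relevance) order, until the budget is hit."""
    packed, used = [], 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break  # window is full; remaining chunks are dropped
        packed.append(chunk)
        used += cost
    return packed

# e.g. documentation chunks pre-ranked by relevance to the user's question
docs = [
    "kubectl apply manages resources declaratively.",
    "A Pod is the smallest deployable unit in Kubernetes.",
    "Long reference section that would overflow the window. " * 1000,
]
selected = pack_context(docs, budget=50)
```

Everything past the budget is simply dropped, which is exactly the naive behavior Sarah is pushing back on: the real product and research work is in deciding *which* chunks deserve the window.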

    2. EG

      Yeah, I think in two years it'll be reasonably solved in terms of the size of context windows that will be usable. Like, I'd be happy to take that bet, but we'll see.

    3. SG

      Okay.

    4. EG

      Like, I think- I think the context windows are going to get very large.

    5. SG

      I- I- I agree they're going to get very large, but do you agree, like, people will just fill it and, like, it will continue to be a question of how to be efficient with that?

    6. EG

      Up to a point, yes. I guess the question is, say that you had a billion token context window, is that big enough? Like, where- where does it asymptote in terms of value? And you can think of it in a few different ways. Like, in your CodeGen example, the question is, like, how big do repos get, right? Or you can ask in the context... I know some people who want to use LLMs for private data or I should say, um, you know, private data across many different users where they want to be able to differentiate and dump all the specific user data into the context window so they don't sort of cross- have any crosstalk in terms of data and, you know, it's a data privacy approach in some sense.

    7. SG

      Mm-hmm.

    8. EG

      And so the question is, like, how many tokens do you need to represent all the data that's relevant to a person relative to a specific query that you're worried about, relative to PII or other sensitive information, right? And so I agree, like, the- the windows will get bigger and bigger and bigger, but eventually some of these things asymptote. It may take years or decades to asymptote, right? That happened with CPU on your computer or bandwidth, you know, is still probably limiting for certain applications, although not all. So I think it's an interesting question of, like, what number is it enough? And you could probably come up with some heuristic for that, right? Just based on the types of documents that you want to have access to and the specific use cases that you're working against.
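The heuristic Elad alludes to can be sketched with back-of-the-envelope arithmetic; the constants below (roughly four characters per token, an assumed repo size) are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope token budget for "how big do repos get?".
# All constants are illustrative assumptions for the sketch.

CHARS_PER_TOKEN = 4  # common rough estimate for English text and code

def tokens_needed(total_bytes: int) -> int:
    """Approximate tokens required to hold `total_bytes` of text."""
    return total_bytes // CHARS_PER_TOKEN

# A hypothetical mid-sized repo: 5,000 files averaging 8 KB each
repo_bytes = 5_000 * 8_000
print(tokens_needed(repo_bytes))  # prints 10000000 -- far beyond a 100K window
```

Under these assumptions even a modest repo needs millions of tokens, which is why the asymptote question (is a billion tokens enough?) is the right one to ask.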

    9. SG

      Yeah, I guess my view would be... I don't want to take the bet against the, like, uh, billion token context window with you, but my- my point is more that we're... I don't think we're close to the asymptote today where people don't worry about it and, uh, I- I think even so, um, even if that window gets very large, like, the state of work today is that ordering matters and, like, you... An- and there's very little research as to, like, what sort of structured information. Maybe you just describe it as like, okay, as soon as your pro- your context window gets to be, you know, 30 to 100K, a million tokens, like, your, um, prompt engineering becomes a much broader field, right? That's probably the generalization. Prompt engineering and then, like, probably a little bit different than that, but just, like, how do you structure data that, um, is put into context against these models?

    10. EG

      Yeah, exactly. Yeah, I think- I think there's a data structure piece of it and then there's the sort of memory piece of it, and so I think both of those things will start to feed into, you know, how- how do you make trade-offs between different aspects of what you feed into a model and how do you do that? So yeah, I think it's very exciting times. There- there's a lot of very basic things to work out and again, I think this is back to that point of, it's very early in this field, you know? And I think a lot of people are assuming that we have this ongoing, uh, continuum from the past and are these all solved problems or why are these things still open questions? And I think the reality is it's just so fricking early. (laughs)

    11. SG

      Um, let me- let me

  4. 15:54–20:34

    China's Hardware Sanctions and Funding Ramp

    1. SG

      ask you about something that I know you've been, uh, paying attention to. So, I think, like, two fronts that are interesting with regards to China. One is, like, what the reaction to hardware sanctions as of last October have been, um, and two would be just, like, there's been a ramp, as you might expect, in terms of funding for Minimax and Baidu. What's- what's your thinking on all this?

    2. EG

      I think it's kind of the expected, um, shift, right? And so if you look at the Chinese internet or software ecosystem, it's always been walled off from the US and there's been a focus on building local heroes or local incumbents, and so that's part of what led to the rise of WeChat and a bunch of different messaging apps, because the US-based ones were just blocked in China, right? You couldn't use Twitter and you couldn't use Facebook and you couldn't use all sorts of different applications, and so that gave a lot of room for local incumbents to grow up and really become the dominant platforms in these countries. I remember, um... And this may have just been a rumor, but when I worked at Google, I visited the China office there briefly when they still had a China office, and I remember rumors of, um, the fiber to the Google data center getting repeatedly cut as a way to effectively take down the services. Baidu was just getting started, right? So it sort of gave Baidu space to grow as an alternative search engine. And again, that may be incorrect. That's just kind of the rumor that was floating around back then. And so, you know, you- you saw the ability to create these local heroes and it seems like the same thing is now happening on the LLM side. So Minimax raised $250 million at a $1.2 billion valuation. Um, to your point, Baidu just announced $145 million AI venture fund, and so it'd be surprising if the Chinese didn't invest very heavily in creating local heroes in this very important technology dislocation. And this leads back to questions around, well, how does this translate into different aspects of competition between different countries and regions, and how do you think about that relative to how you think about what to regulate or not regulate in the context of the US?... 
because fundamentally I think these, these companies are going to go at this area and this technology very hard and aggressively, because it's both very important, but also it can tie into national security or other concerns. So, I think it's definitely worth watching.

    3. SG

      Yeah, and I-I think one of the things that's also worth just noting in terms of the profiles of people starting these companies, the Minimax team is, um, former SenseTime folks, right? And for, um, our, our listeners that don't spend as much time in the, like, Chinese ecosystem, SenseTime was a, um, like, broadly a facial recognition company that did a lot of work with the Chinese government, right? And, and so I- I think, um, you know, I think one vein of discussion as part of the debate around, uh, regulation and whether or not having these models developed in the United States or in the EU is a good thing, people have been discussing, like, "Well, what does AI with American values in mind mean?" I think is probably very different than something that starts with a perspective of, like, a government facial recognition company.

    4. EG

      Yeah. Do you have any thoughts on the hardware sanctions?

    5. SG

      Yeah. I- I, um ... So right now, um, NVIDIA is the, you know, engine for everyone's progress, um, and, uh, you know, Jensen is laughing all the way to the bank about that and much deserves sitting on his, his empire. But I- I do think that if China cannot get A100s, H100s, top of the line hardware, like GPUs with good interconnect to train models, like, they will invest heavily in solving that problem. And so one of the things that has been quietly happening is, uh, investment by both the, like, networking and hardware players, um, that are domestic, like a Huawei, in, um, you know, chips and systems to support AI training. And this is not, like, a trivial pursuit, right? You have a small set of companies that have created, like, TPUs at Google or other, um, accelerators that actually, uh, work for at scale Transformers training. But, um, but I think it will happen over time, right? It's not an unsolvable problem with the right talent and enough capital. And so, so there is a- a piece of me that feels, um, that some of the regulatory attempts to control model development are misguided, and also they- they don't really take into account, like, what you, what you need to go, um, create these models and what those sort of impacts are going to be, right? Because what is happening, um, not that, not that I'm against these sanctions, uh, uh, around, like, AI training hardware, but what it's doing is encouraging a domestic industry, um, in, in the long run, right? Um, I think separately

  5. 20:34–28:08

    Innovation in Heterogeneous Compute Infrastructure

    1. SG

      from that, there's another interesting vein that you and I have talked about where, like, you're trying to ... It's probably two different technical approaches, right? One is figuring out how to make more, uh, heterogeneous ... Well, I guess it's all sort of the same thing, but making more heterogeneous compute work for both inference and training, and so this is companies like Together or Foundry or even at the, uh, compiler layer, like something like Hippo. And, and I think that, um, that's, that's interesting depending on, like, what layer of abstraction you're at, if that is a compiler or if it's scheduling an abstraction. But I think that there is more innovation to be had at the infrastructure layers, uh, given it is a constraint to the ecosystem in a way that it hasn't been, um, at least not in a way that's been, like, paid attention to in a very long time, right?

    2. EG

      Yeah. I guess one other big thing that's happened over the last week or two is all the different incumbent announcements, right? So, there's all the different, uh, things around Microsoft, Google Duet, incorporation of all these things into various products in terms of AI enablement. Do you think this impacts any startup opportunities?

    3. SG

      I think it definitely does. It also, um, is, to me, not that different from the past, right? Let's sort of, like, um, separate into the different components of whether or not this is going to actually work at the incumbents and whether or not that means, like, there's still startup opportunity around some of these platforms. Like, there was always risk in building on other people's platforms, right? Be that Facebook Games or being part of the G Suite ecosystem or building on Shopify, right? Like, s- this is not a new question for startups, and I think the name of the game has been, like, can you create sufficient value on a broad enough platform that you can break out or is it a big enough independent business, right? You look at something like Klaviyo and, like, you know, there's both relationship with Shopify as a platform vendor and, like, it's a multi-hundred million dollar business. Looks like there was enough there. And so I- I think it's, like, very idiosyncratic to every situation. I think the other consideration would just be, like, Salesforce is an amazing company in terms of many dimensions of execution, but it does have a repeated pattern of, like, announcing, you know, interesting products that, like, never seem to see the light of day or get broad customer develop- uh, deployment, right? And so I- I think that, you know, to your point, it's a really young industry. You're asking companies that have become larger and, uh, more difficult to drive at a rapid pace facing technology disruption, and I think you're gonna see varied levels of execution on that, right? So six months in, that's like one planning cycle. That planning cycle hasn't been executed at the biggest companies. They're saying they're gonna have all of these AI features, um, and AI, you know, development tooling if you're Microsoft, uh, and, uh, I think, like, we should see if any of that stuff gets shipped over the next quarter or two.

    4. EG

      Yeah, I think it'll take 12 to 18 months for a lot of larger companies to really start to have anything.

    5. SG

      Do you have a point of view on this?

    6. EG

      Yeah, I think it's overstated in terms of the impact. Like, I- I think it's expected, right? In other words, if you don't expect Microsoft to incorporate this into every product, then you're- you're probably misunderstanding what Microsoft is doing. I say that with no inside knowledge of Microsoft. I'm just saying they've been very open on it. They were very early to OpenAI. They've been moving very aggressively. It takes time to roll all these things out at scale. Same with Google, like they're gonna start incorporating it in lots of spots. Now sometimes incumbents try to do something and it backfires or doesn't work very well like Google Plus, and sometimes you do something and it works extremely well 'cause of your distribution and cross-sell like- like Microsoft Teams, right? And so...I think there's always the potential for some misexecution or the potential for bumps along the way. But you should assume that many of the more savvy incumbents are gonna adopt this reasonably soon, and reasonably soon from an incumbent may mean 12 to 24 months. But if they suddenly cross-sell everything to their existing base, then it can really hamper a startup's ability to, to function, right? And so as a startup, you should kind of be asking, "Well, what will happen if the incumbent adds this? And do I still have an advantage? And what if it takes three years versus one year, does that matter?" And if it matters, then great, you have an opportunity. If it doesn't matter, then two, three years later, you can still get really hampered, then you have to rethink what you're doing or ask about how you build defensibility or what else you build against it. So I, I, you know, I think it's, this is all expected and I don't think it's that surprising, and I think we'll see more and more announcements in the next 12, 24 months from lots of big companies saying they're doing stuff. Honestly, the thing that surprised me the most was how fast Adobe moved. 
Like, I didn't expect them to actually launch products this quickly, and so I was, I was pretty impressed by that. And Adobe, of course, is one of the better run large technology companies in the world, and you could almost measure the rapidity with which somebody adopts this technology almost as a metric of management competence, right? The most competent, uh, companies, or at least the ones closest to this technology will adopt it soonest, and the ones that are still figuring out a lot of other stuff will take a little bit longer to figure it out. But we should assume this is all coming just like mobile. Like, some companies took, you know, 18 months to, to launch a crappy mobile website, and then three years later or five years later, everybody has apps. And 10 years later, BofA has a great mobile app, right? (laughs) So, you know, I think these things happen with time.

    7. SG

      Yeah. Um, we, uh, we would all buy Elad's ETF that is, um, predicated on signals of actual product execution in AI amongst incumbents rather than announcements if you're, if you're gonna, you know, put that side hustle together.

    8. EG

      Yeah. I do think that this does create new openings to go against certain incumbents that were untouchable before. And I do think that there is almost this room for a Rippling or Ashby-like or HubSpot, like, fa- fat startup approach, right? So this feels to me like the first time that maybe Salesforce is vulnerable or certain ERP vendors are vulnerable, or others where you have a lot of very dense interconnected software with lots of connectors and integrations to other applications as a moat, right? And the big moat for many of these companies is, A, people know the brand and they wanna buy them and all the rest of it, and they're already running on them, so it's hard to displace. But secondly is just the breadth of the product, plus the breadth of integrations you have to do if you're implementing an e- enterprise resource planning system like a SAP or a NetSuite, right? Or if, if you're trying to displace Salesforce, right? And so I do think the fact that you can now build things pretty rapidly that interact in a really rich way with each other, and where you can fundamentally take advantage of sort of the, the ability to munge and interact with data and create your own connectors and everything else that AI gives you, it does seem like it creates an opening for the first time for some of these really broad-based product areas. So I think that's very exciting from a startup perspective. It actually creates some opportunities as well.

    9. SG

      Yeah. And, and so just to, um, dig into that a little bit, uh, the idea being instead of paying for the million man or woman hours to write, you know, painful integration code, maybe much of that can be generated with AI today, and that's the enabler.

    10. EG

      Yeah. It takes six months to roll out SAP, and one of the reasons it takes six months is big enterprises will hire a consulting company, an Accenture, Deloitte, whoever on the other side, to actually write a bunch of connectors into other systems they have. And then relatedly, there's all sorts of, like, pro- proprietary views you start building on top of it and customization, and those are things that should be extractable and usable from a language model perspective. So I do think you could imagine both a much faster implementation time to displace something, but also you should be able to more easily copy over all sorts of customization.

    11. SG

      What else are you, um, interested in, uh, that you wish great people were working on?

    12. EG

      I, I think

  6. 28:0833:44

    Enterprise Stack and Decision Making

    1. EG

      one other really interesting area is just the broader enterprise stack in terms of how can you enable enterprises, uh, to use this technology in a really rich way. And there's startups like Qima and others who are building out different components of the stack, but I feel like there's four or five different components that you need, and you want an integrated solution. So as you're an enterprise adopting either a proprietary solution, like you're, you're moving from GPT 3.5 to 4 using a mix of, of models, or you're using open source models like LLaMA or others, you basically need to do a lot of different things relative to that. Everything from trust and safety, to other aspects of prompt management, to other things, right? There's, like, this whole stack of stuff, and I feel like there's a lot of point solutions in the market, and then there's some people who are trying to build a more integrated view of this, and I think that's a really interesting area.

    2. SG

      Yeah. Um, uh, I think one of the areas that I, um, am trying to figure out, uh, uh, whether or not it's feasible is, um, just the general area of decision-making against, like, technical enterprise datasets, right? So, um, like, two examples of this could be if you're a large organization, you have a security operations center, right? You have analysts that process incoming information. They make decisions about how to triage events. Um, you have a correlation engine in front of that analyst. Um, you know, i- in theory, the, the SIEM product does that. In practice, like, nobody likes that and it's super, um, uh, honestly super useless because of the false negative, um, false positive rates on those products. And so if it's a security incident triage or in, in the devops space, like, um, root cause analysis or even, like, post-mortem generation as a smaller feature, I think that's going to become really interesting, especially since there's a lot of, like, cross-team communication based on different datasets, uh, and I feel like that's something that's well-suited for LLMs to go make sense of and then explain.

    3. EG

      How do you think about which of those things are just gonna be incumbents or startups that already exist adding it versus new things? So for example, when I look at supply chain, uh, software security, like what, um, Snyk is doing or what Socket is doing, right? Socket really rushed to add these features in a really smart way using LLMs to classify, you know, potential malicious code or issues in software, right? Open source software. And so it feels to me like a lot of these players may just be existing players adding that AI capability versus a- an entirely new company. Like, which of those areas do you think will be entirely new companies?

    4. SG

      Yeah, I- I think part of it is just, like, what are the data sets you are operating on and, like, does the incumbent own those data sets or not, right? Um, one of the interesting things about the SIEM market is, like, l- average large enterprise has, you know, 200 plus security products, many of which pipe data into the SIEM, and so a n- a- a new player is not necessarily on any weaker footing than an existing SIEM vendor because they don't own the collection of that data and they're- they're just making sense of it, right? So I think that's challengeable. Um, you know, we're also gonna talk to the founder of Datadog, and I think Datadog is in an extremely good place to actually go attack some of these opportunities in that they- they do the data collection natively, right? And they have a very broad product suite. And- and I think, you know, for any new company coming in, um, figuring out whether or not the incumbent can just turn on, like, let's say a feature blade on their existing data is one thing, but, uh, if- if you're competing with incumbents that don't necessarily have any edge and are taking input data the way you would be, I think that is interesting.

    5. EG

      Makes sense.

    6. SG

      One other challenge to your point about, uh, like, what an enterprise needs to do, I- I do think that, like, annotation and synthetic data and data sharing, like, continue to be, like, huge blockers for- for any enterprise, right? And there's research advancements that make annotation more interesting or RLHF more interesting and, you know, it's an unserved tooling market, um, if it is a market at all. Uh, and then there- there have been companies that have struggled to scale acceptance in th- in specific verticals around synthetic data, and m- may end up just being, like, a very, I don't know, task-specific problem, right? Where you can't get general companies out of it. But I talked to a lot of people who, you know, they- they have really interesting internal data sets, these incumbents, and, um, their ability to work on their own customer's data is quite challenged in terms of privacy and security and the agreements they have with those customers, right? And so figuring out how to actually tokenize that data, not in the, um, machine learning sense, but in the, uh, like, anonymization sense, uh, or- or generate synthetic data you can actually use for working with LLMs I- I think is, like... I don't know how big of a problem it is, but it's very interesting and, uh, like, uh, economically, is it- is it economically valuable? I'm not sure, but I do think it's a prevalent and interesting technical problem.

    7. EG

      Yeah, it's- it's, uh, it's one also where I feel like it's been a longstanding one just for traditional ML prior to this wave of LLMs and it's- it's- it's back to that question of does- do LLMs make a difference? Obviously you can do synthetic data with more intelligence, but it's back to, you know, how big of a real issue is it and then also, you know, once you have a bit more understanding of what some of the data actually means, you know, does that impact the world? Like, when I talk to bio people, they still constantly talk about data, data, data, data, and they don't seem to understand at all the new capabilities of these technologies, right? They're- they're still stuck in the old ML world.

    8. SG

      Yeah. I'll describe one last shape of company that I think is, like, here's a concrete example, but then is quite general, um, uh, like, you know,

  7. 33:4434:57

    Data's Impact on the World

    1. SG

      you mentioned, like, CRM and ERP, like, these core systems, core enterprise software systems. They, um, are databases of customer and financial records and then the workflow in them is, like, update- update the record based on some action in the world, right? And so th- one of the obvious things to me is to try to figure out whether or not you can take the event that's happening in the world and write that update automatically, right, to a database in a- in a robust way. Um, I think that's a pretty big ask, but, like, imagining translating any economic event, right? Um, uh, Eladco paid an invoice or transmitted this cash balance t- from one account to one of its suppliers. The ability to take that and, like, record the transactions from an accounting perspective and then update the financial database, like, I think that would be pretty disruptive and I've seen, um, different teams begin to try here. I think it's really interesting.

    2. EG

      Yeah, makes sense. Cool. Well, I think that's all the questions we have for this week, so, uh, thanks to everybody for submitting them. (instrumental music)

Episode duration: 34:57

Transcript of episode eTFUcPiodGU
