Y Combinator
Alexandr Wang: Why Data Quality Decides the AI Frontier
Through hard evals against real customer tasks rather than benchmarks, Scale AI shows how labeled data quality determines the performance ceiling of frontier models.
EVERY SPOKEN WORD
60 min read · 12,341 words
- 0:00 – 1:15
Intro
- Garry Tan
Since we recorded this Light Cone episode with Scale AI CEO Alexandr Wang, Meta has agreed to invest over $14 billion in Scale, valuing the company at $29 billion. Alex has also announced he will lead Meta's new AI Superintelligence Lab. Our conversation you're about to hear covers the history leading up to this investment, from Scale's early days at YC to its integral role in the training of foundational models. Let's get to it.
- Alexandr Wang
The AI industry really continues to suffer from a lack of, uh, very hard evals and very hard tests that show really like the frontier of model capabilities. The biggest thing is you just have to really, really, really care. When you interview people or when you interact with people, you can tell people who are just sort of like phone it in, versus people who sort of like hang onto their work. It's like th- so incredibly monumental and forceful and important to them that they, they do great work. Very exciting time to, to see the... how the frontier of human knowledge expands.
- 1:15 – 7:25
Alexandr’s early days at YC
- Garry Tan
Welcome to another episode of The Light Cone. Today, we have a real treat. It's Alexandr Wang of Scale AI. Jared, you worked with, uh, Alexandr way back in the beginning actually. Uh, what was that like? What year was it? Put us in the spot.
- Jared Friedman
Yeah, Alex, I mean, most of what we want to talk about today is like what Scale is doing now 'cause like th- the current stuff is like so, so awesome and so interesting. Since Scale got started at YC, I thought it just seemed appropriate to start all the way at the start. And, um, it is funny, uh, Diana and I were at MIT last month talking to college students, and like of all the founders, the one that they like most look up to and like want to emulate is actually you. Like everybody wants to be the next Alexandr Wang 'cause everybody knows the story of how you like dropped out of MIT and, and, and ended up starting Scale. But they don't know the real story. And so I thought it'd be cool to go back to the beginning and just talk about the real story of how you ended up dropping out of MIT and starting Scale.
- Alexandr Wang
So before I went to MIT, I worked at, um, Quora for a year. And so this is 2015 to 2016. Or no, sorry, 2014 to 2015 was when I worked as a software engineer. And this was already at a point in the market where ML engineers, as they were called, or like machine learning engineers, uh, made more than software engineers. So that was already like the market state at that point. I went to these summer camps, um, that were, that were organized by, um, by Rationalists, the Rationality community in San Francisco. So, um, and they were for precocious teens, but they were organized by, um, uh, many people who have become pivotal in the AI industry. So one of the organizers is this guy Paul Christiano who, um, used to, uh... who's the inventor of RLHF actually, and now he's a research director at the US AI Safety Institute. He was at OpenAI for a long time. Um, Greg Brockman came and gave a speech at one point. Eliezer Yudkowsky came and gave a speech at one point. And actually it was very like when I was... I don't know, must have been, uh, 16, I was exposed to this concept that like potentially the most important thing to work on in my lifetime was AI and AI safety. So something I was exposed to very e- early on. So then when I went to MIT, I was... started MIT when I was 18. I like studied AI quite deeply, that was most of what I did in the sort of day job. And then, um, uh, kind of got antsy, applied to YC, and then the idea was kind of like, okay, how could... Initially it was like, okay, where can you apply, um, sort of like AI to things. And this was, um, in the era of chatbots, which is like crazy to think about actually-
- Jared Friedman
Mm-hmm.
- Alexandr Wang
... um, that there was like this, like mini chatbot bubble-
- Jared Friedman
Boom. Yeah. Yeah, 100%. Yeah.
- Alexandr Wang
... (laughs) in uh, in 2016, um, which is, uh, which was I guess spurred by Magic, right? Or, or some of these apps. And, and, uh, Facebook had a big vision around chatbots and anyway, there was this little mini chatbot boom. So the initial thing that we wanted to work on, uh, and, um, was (laughs) was chatbots for doctors, right?
- Jared Friedman
(laughs)
- Alexandr Wang
(laughs)
- Jared Friedman
Which is like a funny idea because do you guys know anything about doctors? (laughs)
- Alexandr Wang
Yeah, no, not at all. Um, like basically no. I... It was just sort of like, oh, doctors are a thing, that sounds expensive and so... (laughs)
- Jared Friedman
(laughs)
- Alexandr Wang
And I think it was like, I think it's like indicative of like, I mean, I don't know, you guys see this all the time, but I feel like most of the times young founders' like first ten ideas are like alwa- First of all, they're very memetic so they're probably like there's a lot of like the same ideas
- Speaker
Yeah.
- Alexandr Wang
There's like a dating app, there's like some- something for like, you know, social li... you know, the s- the same ideas. Um, and then I think that like, I think young people have a very poor sense of alpha, like what are, what are the things that they're actually like going to be uniquely positioned to do? And I think, you know, most young people don't have a s- sense of self, so it's, you know, it's not clear. So when we were in YC, we were roommates with uh, with um, with another YC, uh, company, and we were sort of like, um, we were sort of observing this like, this like chatbot boom ahead of, uh, you know, that was happening at the time. Um, and... but it was very clear that like, um, chatbots, if you wanted to build them, and this is funny to say in retrospect, required lots of data, um, and required lots of like human elbow grease, um, to be able to get them to work effectively. And so like just like kind of off the cuff at one point I was like, oh, like what if you just did that? What if you just did the data and the, the like language data and the, the human data so to speak for the chatbot companies? We were also very lost by the way. I think you probably remember. We, we were, we were quite lost mid-batch. Um, uh, and like many YC companies I think. And so then we, um, switched to this like concept. I think the, you know, the initial idea was like API for, um, for human tasks or something along those lines. And uh, and one night I was just like tr- trolling around for domains, scaleapi.com was available and then we just bought it. We launched it, I think, a week later. We-
- Jared Friedman
On Product Hunt.
- Alexandr Wang
Yeah.
- Jared Friedman
I remember it. The, the Product Hunt page is still live.
- Alexandr Wang
(laughs)
- Jared Friedman
I was reading it last night. And I remembered the tagline, it was an API for human labor. Like, that, that, that, that's my recollection as sort of like the, like, distilled insight that you had, was like, "What if there was an A- what if you could call a human with, like, an API?"
- Alexandr Wang
Yeah. And that was, I mean, I think it was, like, three days for us to put up the landing page. It launched on Product Hunt. I think this idea captured some amount of imagination of the, like, of the startup community at the time, because it was sort of like this weird form of futurism where you have, like, humans delegated, like, APIs delegate to humans in this, in this interesting way.
- Jared Friedman
Yeah, it's like an inversion of the, yes, yeah. (laughs)
- Alexandr Wang
Yeah, yeah, yeah, exactly.
- Jared Friedman
The humans doing work for the machines instead of the other way around, yeah.
- Alexandr Wang
Yeah, yeah, yeah. It's funny because the, the initial phase, you know, we sort of, we just worked with all these engineers who reached out to us from, um, from that Product Hunt, which was a real grab bag of use cases. But then that was enough for us to raise money at the time, and like, you know, uh, and to get going. And then a few months after that, uh, we, it became clear that, like, self-driving cars was actually the first major application that we needed to focus on. And so there were many, uh, very big decisions, I would say, in the first, like, years of the company.
- 7:25 – 10:24
Dialing in on what worked
- Harj Taggar
One thing that was curious is at that point there were al- other solutions that were already the game in town, like Mechanical Turk from Amazon was sort of a thing that people were using, but you ended up capturing this whole other set of people that didn't know about it, and you had a way better API, and you kinda won.
- Alexandr Wang
Yeah.
- Harj Taggar
But it was not clear at that point because you probably were compared a lot with Mechanical Turk.
- Alexandr Wang
Yeah, so Mechanical Turk was definitely the sort of, like, um, the concept in most people's mind at the time. I mean, it was just, it was kind of one of these things where I think a lot of people had heard about it, but anyone who had used it knew it was just awful.
- Jared Friedman
(laughs)
- Harj Taggar
(laughs)
- Alexandr Wang
And so it's like, whenever you're in a space and that's kind of the, like, that's, like, the thing, is like people mention a thing but it sucks, that's usually, like, a pretty good sign. Um, and so that was, that was enough to give us, like, early confidence. But then I think the thing that, like, really, I would say the, the, um, the thing that was actually fundamental to the success of the, the company was actually focusing on this, like, on this, like, seemingly very narrow problem of, of self-driving cars. I think that, um, you know, I remember very early on when it was maybe, like, six months after we were out of YC. Basically, um, there's another YC company, Cruise, that, that had reached out to us on our website, and sort of, like, in the blink of an eye they became our largest customer.
- Jared Friedman
And they found you just from your launch or...
- Alexandr Wang
Yeah, just, yeah, I think maybe even Google. Like, I, it's not even totally obvious, but just, yeah, vaguely from our launch and vaguely... It was actually an ex-YC founder that, uh, was working on Cruise that reached out to us. So maybe some YC mumbo jumbo. (laughs)
- Harj Taggar
(laughs)
- Jared Friedman
We're a (...), you know.
- Alexandr Wang
Uh, who knows? The world works in mysterious ways.
- Jared Friedman
(laughs)
- Alexandr Wang
But, uh, and so they grew very, very large. So then early on we made this decision, and I remember we, we, we, um, went to our lead investor at the time and, you know, we had this conversation. It was like, "Hey, actually we think we should probably just focus on this self-driving thing." You know, it was actually a very interesting conversation because the reaction was like, "Oh, that's just, like, obviously way too small a market." Um, and like, "You're, you're never gonna build, like, a gigantic business that way." Um, and we were like, "We think it's probably a much bigger market than, than you think it is," because there's, like, you know, all these self-driving companies are getting crazy amounts of funding and the automotive companies are doing huge programs in self-driving, and it clearly is the future. Like, it feels like something that, that, um, that should exist. And so we were like, "If we focus on it, we think we can build, like, build the business much more quickly." And it's funny looking back because both things are true. It is both true that it enabled us to build the business to be, to get to scale pretty, very quickly, and it is also true that that was not a big enough market to sustain a gigantic business. The story of Scale in many ways is, like, this progression of, like, how do you continue? You know, AI is this incredibly dynamic space. Um, lots of things are constantly changing. And, um, a lot of, I think, what, um, what we pride ourselves on at the company is how we've been able to, um, continue building on and, and, um, contributing to this very fast-moving industry.
- Jared Friedman
When did
- 10:24 – 19:18
Model improvements, evals
- Jared Friedman
you, uh, become much more aware of the scaling laws? Because your, uh, one of the interesting facts that sort of emerged is that, uh, you're a little bit the Jensen Huang of data.
- Alexandr Wang
(laughs) I think that in self-driving, um, scaling laws were not really a thing. Um, because... And the fundamental, the biggest reason actually was that, like, one of the biggest problems in self-driving is that your whole algorithm needs to run on the car, and so you're very constrained by the amount of compute you have access to and is available to you. So, like, a lot of the engineers or, and a lot of the companies working on self-driving never really thought about scaling laws. They were just all thinking about, like, okay, how do you keep grinding these algorithms to be better and better and better that are, like, small enough to fit onto these, um, onto these cars? But then we started working with OpenAI in 2019. This was, like, GPT-2 era. Um, and I would say, like, GPT-1, G- GPT was sort of, like, this curiosity. GPT-2, um, I remember OpenAI, like, they would have a booth at these, like, large AI conferences, and they would, like, you know, their demo would be to allow researchers to, like, talk to GPT-2. And it was, like, mildly... Like, it was, it wasn't, like, particularly impressive but it was, like, kind of cool. It was, like, kind of this thing. And then, um, I think by GPT-3, uh, it was sort of this, like, that's when the scaling laws clearly, um, you know, felt very real. And that was, I mean, I think GPT-3 was 2020. Um, so it was actually, like, long before-
- Jared Friedman
Before the world caught on to what was happening.
- Alexandr Wang
Yeah.
- Jared Friedman
Did, did you know as early as 2020? Did, did you have a strong inkling that this was really gonna be, like, the next big chapter of scale? Or not until ChatGPT took off? Was, was that clear- Like three, five or fou- Yeah. Was it four?
- Alexandr Wang
I think that, like, um, in 2020, I think it was clear that scaling laws were gonna be a big thing. But it was still not totally obvious yet. I remember this, like, interaction. You know, I, I got early access to GPT-3, and then it was, like, in the playground. And then I, I was, like, playing with it with a friend of mine. And, uh, I told the friend of mine, "Oh, you can, like, talk to this model." And- During the conversation, um, uh, my friend got, like, visibly frustrated and angry at the AI, but in a way that wasn't just like, "Oh, this is a dumb, like, toy." It was like, it was in a way that was, like, somewhat personal.
- Garry Tan
Oh.
- Alexandr Wang
And that's when I was re- I realized, like, "Whoa, this is, like, somehow qualitatively different from anything that existed before."
- Garry Tan
Did it feel like it was passing the Turing test at that point? Kind of, it was like semblances.
- Alexandr Wang
Yeah, semblant- it was, like, sort of like, the gl- the- the glimpses of it potentially passing the Turing test, right? But I think the thing that really, um, caused the recognition of, I would say, generative AI, which is still even the term in some ways, it was really DALL-E, I think, that- that, um, that convinced, um, that convinced everyone. But I think, I think my- my personal, um, journey was, like, GPT-3 sh- like, was, like, highly interesting and then... and so it was, like, one of many bets at the company. And then in 2022, over the course of DALL-E and, and then, um, and then later ChatGPT and, you know, um, GPT-4, et cetera, and we worked with OpenAI on InstructGPT, which is kind of the precursor to ChatGPT. It became very obvious that that was, like, the bet the farm moment for the- for the company and for, frankly, the world.
- Garry Tan
That's when we saw it as well with the big shift in companies, because it was that 3.5 moment release, end of, uh, 2022, and we started seeing a bunch of companies and smart people changing directions and pivoting their companies in 2023, and that was that moment.
- Alexandr Wang
This dynamic that you referenced, which is kind of the, you know, Scale is the NVIDIA for data kind of thing, um, I think that became quite obvious, um, I would say GPT-4 really was the moment where it was like, it was like, wow, this is like, like scaling laws are very real. The need for data will basically, you know, grow to consume, you know, all available information-
- Garry Tan
(laughs)
- Alexandr Wang
... and, and, um, and knowledge on, uh, that humans have. And so, um, it was like, wow, this is, this is like, this, like, astronomically large opportunity.
- Garry Tan
Yeah, four seemed like the first time it was something that you could, uh, get to not hallucinate basically ever. You could actually have a zero hallu- uh, hallucination experience in limited domains, and which is, we're still sort of in that regime even at this point. You know, the classic view is that if it's hallucinating, you're not giving it the correct data in the prompt or context, or, uh, you're trying to do too much in one step.
- Alexandr Wang
Yeah, I mean, I think, I think, like, the reasoning paradigm is- is- has a lot of legs. And it's actually been interesting this last era of the- of model improvement because, um, uh, the gains are not really coming from pre-training, um, which is... so- so we're, like, moving on to a new scaling curve of- of reasoning and reinforcement learning, but it's- it's, like, shockingly effective. Um, and- and I think that, you know, it- it's- the- the analogies between, like, AI and- and Moore's law are pretty clear, which is like, you know, you'll get on different, like, technical curves, but, like, if you zoom way out, it'll just be- feel like this, like, smooth improvement of models.
- Garry Tan
One of the things that, uh, has been popping up with some of the, like, really big well-known wrappers is they're getting access to full parameter fine-tunes of the base models, especially the frontier base closed source models. Is that, like, a big part of your business or, you know, something that people are sort of coming to you for, just, like, these verticalized full-parameter fine-tune, like, data sets?
- Alexandr Wang
Yeah, I think this is going to be a, like, blueprint for the future, right? So right now, I mean, like, the total number of large scale parameter fine-tune or reinforcement fine-tune models is like still pretty small, but if you kind of think about it, like, that, like, one version of the future is that every firm's core IP is actually their specialized model or their- their own fine-tuned model. And just in the same way that like, you know, today you would generally think that the co- the- the, uh, IP of most tech companies is their code base, um, in the future you would generally think that their- their i- their specialized IP might be the model that powers all of their- all their internal, um, workflows. And what are the special things they can add on top? Well, they can add on, um, data and environments that are somehow specific, very, very specific to the day-to-day problems or information or challenges or business problems that they see, um, on a day-to-day level. And that's the kind of, like, really gritty real world information that, you know, nobody else will have because nobody else is like doing the same- the exact same business motion as them.
- Garry Tan
Yeah. There's a lot of weird tension in that though. Um, I remember, uh, friends of ours from one of the top model companies came by and they were like, "Hey, do you think YC and YC companies would give us their evals so we could train against it?" And we were like, "No, dude, what are you talking about? Why- why would they do that?" Because that's like their moat. And then I guess now that based on this conversation, it's actually, I mean, evals are pretty important as a part of L- RL cycles, and then even the evals are not really, uh, the valuable part. The valuable part is actually the, like, properly fine-tuned model for your data set and your set of, you know, sort of problems.
- Alexandr Wang
Yeah, it's like these Lego blocks, right? If you have the data and you have the environments and then you have the- you have, you know, a base model, you, like, you know, can stack those on top of each other, get- get a fine-tuned model. And obviously the evals are important. This is some of the tension, and this is basically a, you know, in a nutshell the sort of like, um, does AGI become a borg that just sort of like swallows the whole economy in like, you know, as one firm, or do you still have a specialized economy? My belief, generally speaking, is that you- you still do have a specialized economy. Like, the- like, these models are platforms, but the, like- like, alpha in the modern world will be determined by, you know, to what degree you're able to sort of like encapsulate your business problems into data sets or environments that are then conducive towards building like, you know, differentiated models or differentiated AI capabilities.
- Garry Tan
Yeah, that's why asking for evals was so crazy to me because it's like, okay, you get the evals, the base model is way better and then not- you know, now all your competitors have exactly, uh, the same thing that used to be your advantage.
- Alexandr Wang
I think we will undergo a process in AI where we learn what the bright lines are, right? I mean, I think that, like, it's, like, very obvious and intuitive to tech companies that they should not give away their codebase, and they should not give away their database. Like, they should not give away their data, they should not give away their codebase. The analogues of that in a, you know, highly AI-fueled economy, I think we'll identify over time but are, yeah, the evals, your data, your environments, et cetera.
- 19:18 – 27:47
The techno optimist view of work
- Harj Taggar
I think you have a very, uh, techno-optimistic view of what the future is gonna be with how jobs are gonna be shaped. Can you talk more about that? Because I think you hinted at it b-before, is it's gonna be more specialized. It's not that all these jobs are gonna go away, right?
- Alexandr Wang
First off, i-it's undeniably true that we're, we're, uh, at the beginning of an era of, like, a new, a new f- way of working. Like, like, you know, there's, there's this term that people have used a long time which is like, "the future of work." Well, (laughs) um, uh, we're, like, entering the future of work, or this, certainly the next era. And so work fundamentally will change. But I do think, um, humans own the future, and we, we are, we are, like, uh, we have a lot of agency actually, and a lot of, a lot of choice in how this sort of, like, reformatting of, of work, or how the reformatting of sort of, like, workflows ends up playing out. You know, I think you kind of see this play out in, uh, in coding right now, and I think coding, in some ways, is, is really the sort of, like, um, case study for other fields and, and other, you know, other areas of work where sort of the, the initial phase is the sort of, like, assistant-style thing where, um, you know, you're kind of doing your work and then the models are kind of, like, assisting you a little bit here and there. And then you go to a, you know, the sort of, like, cursor-agent mode kind of thing where you're, you're, like, um, synchronously asking the, the models to, like, carry out these workflows, and you're sort of like, you're mo- you're managing, like, one agent kind of, or you're sort of like, uh, you're kind of like pair programming with a single agent. And then, and then now, with like Codex or other systems, like, it's, it's very clear the paradigm is like, oh, you have this, like, you have this, like, swarm of agents that you're gonna deploy on, like, all these various tasks, and you're just gonna, like, sort of, like, you know, g-depl- like, um, give all these tasks and you'll have this sort of, like, um, this, this cohort of, of agents that are sort of like, you know, doing this work that you, you think is appropriate. 
And that last job, um, uh, has a, has a semantic meaning in the, in the current workforce. It's a manager, you know? You're basically managing this sort of, like, this set of agents to do, um, actual work. And so ... And, and I think that, like, AGI or, you know, AGI or doomers or whatnot, like, they take this view that, like, oh, even this job of, like, managing the agents will just be done by the agents, so, like, humans will be taken out of the, of the process entirely. But our belief, my personal belief, is that, you know, this is, um ... Management is very complicated. Um, management is also, like, more about, like, what's the vision that you have and what's the sort of, like, what's the, like, end result you're aiming towards? And those will be fundamentally, I think, like, you know, we have a h- human demand and human desire driven economy, so those will be driven by humans. And so I think the terminal state of the economy is just, is large scale humans manage agents, in a nutshell.
- Garry Tan
I have a funny story where, um, (smacks lips) a friend of mine is trying to promote, uh, one of his, you know, junior employees, but they're really, really smart and they're working on the agent infrastructure, and then he was like, "Hey do you want to," like, "you know, I'm looking for someone who can step into management. You've never managed people before. Do you, you know, if we hired some people, uh, under you, like, how would you feel about that?" And, uh, this, you know, uh, mid-20-something really smart, you know, sort of d- ... He's just like, he's an engineer, and he's like, "Why would I do that?" Like, "Just give me, like, more compute."
- Alexandr Wang
(laughs)
- Garry Tan
"Like, you know, the model, like, look at what just happened to the model literally, like, last month and, you know, I didn't have to do anything. It just started doing things that it couldn't do a month ago. Why would I want to manage people? Like, just give me, like, I will just manage more agents for you and it's fine." (laughs)
- Alexandr Wang
Okay, so what are the unique things that, that, um, that humans will do over time? I mean, I think, I think that this, like, this, like, element of vision, um, is very important. This element of, like, kind of like debugging or sort of like, um, fixing when things go wrong. Like, most of a manager's job, speaking as a manager-
- Garry Tan
(laughs)
- Alexandr Wang
... (laughs) is, is just, like, putting out fires, dealing with problems-
- Garry Tan
Yeah.
- Alexandr Wang
... dealing with, like, like, issues that come up. Like, I think intuitively, you know, I, the idealistic manager job seems like this very cushy job because you're like, "Oh, yeah. All the other people do all the work, and I'm just sort of like, I just vaguely supervise." And then the reality is obviously, like, highly chaotic. I think people have jumped to this, like, you know, extreme reality where it's like, oh, yeah, these like, you know, you're just gonna manage the agents and you're gonna sort of like live this, like, you know, kind of Victorian life where all your problems are solved. But, but no, I think it's still gonna be pretty complicated, like getting agents to, like, coordinate well with one another and, like, coordinating the workflows and, and s- and debugging the issues that come up. Like, these are still complicated issues. And, you know, having seen what happened in self-driving, which was more or less that, like, you know, it's easy to get to 90%, very, very hard to get to 99%. I think that, like, something similar will happen with large-scale agent deployments, and that, like, you know, final 10% of accuracy will be, like, you know, will require a lot of work.
- Harj Taggar
Yeah, even for, uh, self-driving cars right now, there's the remote assist for all the super edge cases, so there's still a human at the end managing the car.
- Alexandr Wang
Yeah, and the ratio, by the way, I mean, um, uh, the companies don't publish them, but I think the ratio is something like five cars to, to one teleoperator.
- Harj Taggar
Oh.
- Alexandr Wang
Um, or, or maybe even less than, maybe three cars per teleoperator. So, um, the ratio is like, n- you know, much lower than people think. I think that, like, humans are much more involved, even in self-driving cars, than I think most people appreciate.
- Harj Taggar
I mean, which, if you put it in that perspective, I think is still very optimistic. It's just the output of getting rides instead of doing, in today's world, if you're an Uber driver, you just do one car. In this world, you can do five cars, right?
- Alexandr Wang
Well, you have to believe... for this, like, for an optimistic version of the future where, you know, unemployment is still low, et cetera, you just have to believe that humans are, like, almost insatiable in their desire and their demand. Um, and that like, you know, prices will go down, things will become, you know, uh, the ef- the economy will become more efficient, and we'll just, like, want more. And I think this has been a pretty reliable trend for, like, the history of humanity is that, like, you know, um, we have somewhat insatiable demand. Um, and so I, I've, like, conviction that, like, you know, the economy can kind of get as efficient as it needs, uh, or as it, like, can get, like, hyper, hyper efficient, and then human demand will just, like, continue to sort of like fill the bucket.
- Garry Tan
Yeah. In the 20th century, uh, you know, when you said computer, maybe early 20th century, people didn't think of, like, a computer as it is today. They thought of a human being-
- Alexandr Wang
Yeah.
- Garry Tan
... that would sit in front of a punch card tabulator (laughs) and that was like what a computer was doing. I mean-
- Alexandr Wang
It was a job title.
- Garry Tan
... like, it was literally that was a real person's job.
- Alexandr Wang
(laughs)
- Garry Tan
And then, of course, now today, it's like where are all the computers? Well, they're actually real computers now. I don't know.
- Alexandr Wang
(laughs)
- Harj Taggar
That was the Apollo mission.
- Garry Tan
Right.
- Harj Taggar
It was a bunch of, uh, people just crunching numbers with the trajectories of, uh, of the Apollo and that was it. Because the, uh, computer that went on the, uh, rocket actually was a microcontroller with I think only, like, single digit hertz. It was, like, very tiny amount of computations with just humans doing it.
- Alexandr Wang
Totally. And, and even this, like, I mean, I think the concept of being a programmer is somewhat, is, like, highly esoteric, um, in the sense that, like, oh, you're, like, writing the instructions for these, like, machines to just, like, you know, just continue to do repetitively. And in some ways it's, like, the leverage boost that all humans will get is, like, similar to the leverage boost that, like, programmers have had historically. For a long time, I think a lo- like, uh, a lot of people in Silicon Valley say this. Like, the m- the closest thing to alchemy in our world pre-AI, let's say, is programming because you sort of, like, you can do something that, uh, creates, like, like, an infinite... there's these infinite replicas of whatever you build, and they can sort of, like, run an infinite number of times. And, um, and I think the entire human workforce will soon see that ki- that large of a leverage boost, which is extremely exciting because I think that, like, programmers are s- are, are, um, have, like, benefited over the past few decades from this, like, unique perch where they, they have, like, you know, one 10X or 100X engineer can, like, can build something, like, absolutely incredible and, like, very, very valuable and, like, very, um, uh, shockingly productive. And all of a sudden, I think, like, like, humans in all trades, I think, will gain this, like, level of leverage.
- 27:47 – 37:37
The turning points for Scale AI
- JFJared Friedman
Uh, so I'm curious to return to a point that you made earlier about, like, how Scale has kept reinventing itself. If you had to, like, describe the arc of Scale, like, what's, what's, what's the story then? What were the turning points?
- AWAlexandr Wang
Our initial business was all around, um, you know, producing data, um, you know, generating data for various AI applications.
- JFJared Friedman
And primarily self-driving car companies, right? For, for the early years it was really, like you were saying, you were really focused on, on that.
- AWAlexandr Wang
Yeah, for the first, like, three years fully focused on that. One of the properties of focusing on that business, uh, of building that business is over time, you know, we had this, like, obligation to really, like, get ahead of most of the waves of AI, if that makes sense. Because, you know, for AI to be successful in any vertical area, it needed data. And so, like, our demand for our, our products would precede a lot of times the actual sort of, like, evolution of AI into those industries. So, you know, as an example, we started working with OpenAI on language models in 2019. Um, we started working with the DoD on government AI applications, um, and defense AI applications in 2020. This is, like, long before I think the, you know, recent sort of, like, drone-fueled, um, you know, AI, uh, AI craze in the, in the Department of Defense. We started working with enterprises long before there was sort of like this, uh, you know, the recent sort of, like, larger waves around, uh, enterprise AI implementation. So, um, almost, uh, uh, sort of systemically or, or intrinsically, we've had to, uh, basically build ahead of the waves of AI. I think this is actually quite similar to NVIDIA. You know, whenever, like, Jensen gives his annual presentations about, you know, um, NVIDIA and its future and its outlook, like, he always is so ahead of the trends. Um, and that's because he has to get there on the trend before the trend can even happen. That's, I think, been one, um, one way in which our business has continued to adapt because AI is like this, you know, it's this, this, like, it's the fastest moving industry I think ever, um, (laughs) in the history of the world. Yeah. And so, you know, that each, each turn, um, each evolution, uh, has been... has moved incredibly quickly. The other thing that, that happened May 2021, early 2022, um, we started working on, um, applications. 
And so we started building out, uh, AI-based applications and now, um, more, much more so, uh, agentic workflows and agentic applications, um, for enterprises and government customers. And this was an interesting evolution of our business because, because historically, like, our core business is highly operational. You know, we build this, like, data foundry, we have all these processes to produce data. Um, it's a very operational process that involves, like, lots of humans and human experts to be able to produce data with quality control systems in place. That highly operational business, um, and the success of that business is what created the momentum for us to, you know, sort of dream about building an applications business. When we went into it, uh, I had studied other businesses that had basically successfully, um, added on very different businesses and what are sort of, like, the unique traits or, or why do some of those work? And one of them that is probably the most interesting, um, I think is, like, the most singular in modern, uh-... modern business history is, um, Amazon building AWS. You know, if in 2000 you had written a short story that said that like, the, you know, this large online retailer would build this like, large-scale cloud computing rent a server business, like, it would seem like nonsensical.
- JFJared Friedman
I remember when they launched AWS in 2006, Amazon stock went down because all the analysts thought it was such a terrible idea.
- GTGarry Tan
It'd never been done before.
- JFJared Friedman
Yeah.
- AWAlexandr Wang
It just like, it doesn't seem related at all to their core business. Um, it has, it's like this like weird thing. But the sort of like wisdom of that was, I think twofold. I think like first, um, and, uh, from talking to people who were like there at the ou- you know, the sort of like the genesis moment of this business, like one thing, probably the most important thing was that they had conviction that that, that, the, the sort of like underlying business model of AWS would basically be this like, this like infinitely large and growing market. Like that market would, would literally grow forever. There would be like a, this like exponential of the amount of compute that needed built up, needed to be built up in the world. And, um, if you did that, there was like sufficient cost of, you know, cost advantages from economies of scale. I think like startups, when you kind of like, um, uh, you kind of have to like switch modes at a certain point where like early on you're trying to go for very, very narrow markets, like almost the narrowest markets you can, and then you're just trying to like gain momentum and then sort of like slowly grow out from those ex- hyper-narrow markets. And then, um, at some point you, if you like, have ambitions to be $100 billion company or more, then you have to sort of like switch gears and say, "Where are the infinite markets? Um, and how do you build towards those infinite markets?" And so, um, this was sort of like, uh, the moment where we realized that. And, and the simple realization was that every business and every organization was just going to have to reformat their entire businesses, um, with AI-driven technology. Um, and now obviously like agent-driven technology, and that would just be, like over time that would swallow the entire economy. And so it was like another one of these like, okay, that's an infinite business to build out AI applications and AI deployments for large enterprises and governments.
- JFJared Friedman
I think a lot of people don't realize that you guys are in the middle of this transformation. They still think of Scale as the data labeling company, but like if you fast-forward 10 years, do you think most of Scale will actually be the agent business?
- AWAlexandr Wang
Yeah. It's, it's growing much faster at this point. I think it, it, it's an infinite market. (laughs) So, the crappy thing about most markets is that they have like a pretty shallow S-curve. Um, but then, you know, you look at hyperscalers or, or like, you know, these like mega cap tech companies and they just have like these like ridiculously large markets. So, you really want to get into these, these, these like, um, infinite markets. So our strategy so far has been to focus on building use cases for, you know, focus on a small number of customers and, um, and be quite selective. So we work with, you know, the number one pharma company in the world, the number one telco in the world, the number one, uh, bank, the number one, um, healthcare provider. Um, and we work a lot with the US government, you know, the depar- Department of Defense and, and other government agencies. And, um, the whole thing is like, how do we take a very focused approach towards building, um, stuff that resembles, you know, real differentiated AI capabilities? And all of this I think sounds so trite, but, but, um, we have this multi-hundred-million-dollar business in building all these applications. By my count, I think it's, it's one of the largest AI application businesses, um, in the industry, or certainly that's what our investors tell us. And it's fueled by our differentiation in the data business because our belief fundamentally is that, um, kind of what we talked about before, the, the end state for every enterprise or every organization is, um, some form of specialization, um, imbued to them by their own data. Our day jobs historically have been producing highly differentiated data for, you know, these like large scale model builders in the world. And then we can apply that wisdom and that capability and those operational capabilities towards enterprises and their unique problem sets and, um, and give them specialized applications.
- GTGarry Tan
Honestly, like it kind of sounds like Palantir.
- AWAlexandr Wang
At the like most zoomed out level-
- GTGarry Tan
Yeah.
- AWAlexandr Wang
... if you sort of like squint in-
- GTGarry Tan
In that you're a technology provider, but-
- AWAlexandr Wang
We're like a technology provider to like the most, you know, some of the largest organizations in the world, um, with a focus on data.
- GTGarry Tan
Yeah.
- AWAlexandr Wang
Um, and I think the key difference is like, you know, Palantir, um, has built a real focus around these data ontologies and, um, and really solving this like messy like data integration problem for enterprises. Um, and then our whole viewpoint is like, what is the like most strategic data that will enable differentiation for your AI strategy, and how do we like generate or harness that data from within your enterprise towards developing that?
- GTGarry Tan
I guess you will end up being pretty big competitors in another five, 10 years. But for now, like it's basically so greenfield honestly.
- AWAlexandr Wang
I mean, I think it's an infinitely large market is the other thing.
- GTGarry Tan
Yeah. So you might not ever meet actually, which is interesting.
- AWAlexandr Wang
Yeah, yeah. I, I think in practice now we actually like... Frankly, we work, we're more partnered with Palantir-
- GTGarry Tan
That makes sense.
- AWAlexandr Wang
... than, than competitive with them.
- GTGarry Tan
Yeah.
- AWAlexandr Wang
Um, and, uh-
- GTGarry Tan
Well that's 'cause the problems that these giant organizations face are actually so massive and intractable that they throw up their hands. It's like they have no shot at ever hiring people who could possibly solve the problem. Uh, but a company like Scale or a company like Palantir can actually hire kind of the same kind of people who would apply to YC actually. (laughs)
- AWAlexandr Wang
(laughs)
- GTGarry Tan
It's kind of like this, this... Yeah, I don't know. The, the through line in my head right now is realizing like, you know, there's plenty of capital and then the limiting agent is actually really great technical smart people who, uh, are optimistic and actually work really hard. (laughs)
- AWAlexandr Wang
(laughs)
- 37:37 – 41:55
Agentic workflows
- HTHarj Taggar
Talking about, uh, operations. You clearly are living in the future, which is super cool. I'm sure you're running Scale with all these agents and tools already to make it very efficient. Could you share some of the things that you're doing internally as a company and agents you're adopting so you can do more with less people?
- AWAlexandr Wang
You know, we saw this early because, uh, when, when the model developers were starting to develop agents and starting to develop using reinforcement learning, like actual, you know, like, reasoning models where the models could actually, like, really do end-to-end workflows, we were, uh, responsible for producing a lot of the datasets (laughs) that enabled, um, the agents to get there. And then we saw just, like, how effective that, that training process is. I think that, like, the efficacy of reinforcement learning for, um, for agent deployments is like... is pretty insane. So then once we realized that, we realized like, okay, if you can actually, like, you know, turn, um, existing human-driven workflows into environments and, and data for reinforcement learning, um, then you have this ability to convert these, like, human workflows and, like, human workflows, um, especially ones where you're like, okay, with some level of fault- faultiness and okay with a certain level of reliability, you can convert those into, um, into agentic workflows. So there's all sorts of like, you know, agent workflows that, that happen in our hiring processes and happen, um, in our quality control processes and happen to sort of just like automate away certain, like, data analyses, um, and data processes, as well as like various like sales reporting. Like, it's sort of like embedded at, you know, every major org of the company. Um, and the whole thing is like, um, it's just like mindset, like, can you identify these like very repetitive human workflows and basically like undergo this process where you convert that into datasets that enable you to build automation tools.
- GTGarry Tan
What do these datasets actually look like? I mean, for browser use, is it like... is it an environment and then, you know, here's a video of a human being going through this process of like filling out this form and decide like yes/no on this, uh, dropdown or something? I mean, you know, what's a concrete example just for the audience?
- AWAlexandr Wang
One of the processes that we go through is like, you know, you, you, um, you'll take a sort of like full packet of a... from a candidate and you'll, like, want to distill that into like, you know, a brief of some sort that sort of like gives all the salient details about that candidate for like decision by a sort of like broader committee. Um, and these kinds of cases, you know, broadly speaking, like deep research plus plus kind of things are like the lowest hanging fruit. It's just sort of like, can you take these processes that like more or less look like, you know, you have to like click around a bunch of places and pull a bunch of pieces of information and then blend them together and then pro- produce some analysis on top of that. Like that process, that fundamental like information-driven sort of like analysis process is the easiest thing to, to drive via agentic workloads and the kinds of data you need are just like, you know, um, uh, we call them kind of environments, but usually it's just like what is the task? What is the full, um, sort of like dataset that's necessary to conduct that task? And, um, what is like the rubric for how, how you conduct that effectively?
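The task/dataset/rubric structure Alex describes could be sketched, very roughly, as follows. This is a hypothetical illustration, not Scale's actual schema; every name here (`TaskEnvironment`, `grade`, the rubric criteria) is invented for the example.

```python
from dataclasses import dataclass

@dataclass
class TaskEnvironment:
    """One unit of agent-training data: the task, the data needed to do it, and a grading rubric."""
    task: str      # what the agent is asked to do
    dataset: dict  # the full set of inputs needed to conduct the task
    rubric: dict   # criterion name -> weight; weights sum to 1.0

def grade(env: TaskEnvironment, criteria_passed: set) -> float:
    """Score an agent's attempt by summing the weights of the rubric criteria it satisfied."""
    return sum(weight for criterion, weight in env.rubric.items() if criterion in criteria_passed)

# Hypothetical example: the candidate-brief workflow mentioned above.
env = TaskEnvironment(
    task="Distill a candidate's full packet into a brief for the hiring committee",
    dataset={"resume": "...", "interview_notes": "...", "references": "..."},
    rubric={"covers_salient_details": 0.5, "cites_the_packet": 0.3, "fits_one_page": 0.2},
)
print(round(grade(env, {"covers_salient_details", "fits_one_page"}), 2))  # 0.7
```

A rubric score like this can then serve as the reward signal for reinforcement learning, or simply as a quality gate in an automated workflow.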
- GTGarry Tan
Do you need RL and fine-tuning when like prompt engineering and meta-prompting seems so good?
- AWAlexandr Wang
I think that yeah, I mean, I think, I think prompting... I mean, as the models get better, prompting will get better. But like prompting gets you to a certain level and then reinforcement learning gets you beyond that level. And, um, actually this is a good point. I think that like probably most of the time in our, in our business it's mostly prompting that just is... like works really well.
- GTGarry Tan
I mean, that's the weird thing is like, oh shoot, you don't have to crack open the models. And then frankly, like the next models are going to be so good and then the evals are mainly about picking which model or, you know, at what point do you switch to the next one?
- AWAlexandr Wang
I do think startups need basically like a strategy for how they like will, um, walk up the complexity curve so to speak. Like you need to like... you know, whatever product or business you build like needs to like really benefit from like the ability to like race up this complexity curve, which is the broad- broader curve of capability of the models.
- 41:55 – 47:48
“Humanity’s Last Exam”
- HTHarj Taggar
I mean you, you actually created this leaderboard that has a lot of these super hard tasks that are trying to go into this next curve of reasoning. Could you tell us about it?
- AWAlexandr Wang
One of the things that we built, um, in partnership with the Center for AI Safety is Humanity's Last Exam.
- HTHarj Taggar
Yeah.
- AWAlexandr Wang
It was a funny name. I think unfortunately there will be yet another exam beyond it. But you know the idea was how... like let's effectively work with, you know, the, the smartest scientists in the field and, you know, um, we worked with many very brilliant professors but also very... many like individual researchers who are like quite brilliant. And we just collated and aggregated this dataset of what the smartest researchers in the world would say the hardest scientific problems they've worked on recently are. They solved them or they sort of like came to the right... you know, they were able to solve the problems but they're sort of like the hardest problems that they're aware of and know of.
- JFJared Friedman
I was curious how you came up with these problems. So each of the professors contributed new problems. So these are not... these are problems that have never appeared in any textbook or any exam ever. They just like came out of their brains and they like typed up like a new problem like from scratch. Am I understanding this right?
- AWAlexandr Wang
Yeah, yeah. And the general guidance was like, you know, what has come up recently in your research that you think is like... is a particularly hard problem, right?
- JFJared Friedman
The problems are stupidly hard.
- AWAlexandr Wang
Yeah.
- JFJared Friedman
Incidentally. They're like insane. I don't know if you guys have looked at these problems. (laughs) They're totally crazy.
- AWAlexandr Wang
Yeah, it's totally crazy. And by the way, like-
- HTHarj Taggar
They cannot be searched on the internet. It is like you need to have a lot of...... a lot of sp- expertise and actually think about them-
- AWAlexandr Wang
Yeah.
- HTHarj Taggar
... f- for quite a long time.
- AWAlexandr Wang
Yeah, they require a lot of reasoning. And recently, like, uh, right now, so we have a time limit where the models, um, can only think for, I think it's 15 minutes or 30 minutes, and one of the most recent requests from one of the labs was like, "Can you please increase that time limit to, like, a day?"
- HTHarj Taggar
(laughs)
- AWAlexandr Wang
So that the model has, like, up to a day to think about the, um, to think about the problems.
- HTHarj Taggar
Mm-hmm.
- AWAlexandr Wang
Um, but yeah, no, th- they're, they're deviously hard problems. Unless you have expertise in the specific problem, you probably don't have a chance of getting it, right? Um, but even this evaluation, like, I think when we first launched it, um, you know, and this was earlier this year, uh, the, the best models were scoring, like, 7%, 8% on it. Now the best models score north of 20%. It's moved really, really quickly, and I think, you know, I think, uh...
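The scoring setup described here, a per-problem thinking budget and an overall accuracy, could look something like this minimal sketch. It is not the actual Humanity's Last Exam harness; the `model` callable and problem format are placeholders.

```python
import time

def evaluate(problems, model, time_limit_s=15 * 60):
    """Score a model on an exam: each problem gets a wall-clock thinking budget;
    answers that exceed the limit or are simply wrong count as misses."""
    correct = 0
    for problem in problems:
        start = time.monotonic()
        answer = model(problem["question"])  # placeholder for the real model call
        elapsed = time.monotonic() - start
        if elapsed <= time_limit_s and answer == problem["answer"]:
            correct += 1
    return correct / len(problems)

# Toy run with a stub "model" that only knows one answer:
problems = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(1, 6)]
stub = lambda q: "a1" if q == "q1" else "?"
print(evaluate(problems, stub))  # 0.2, i.e. the ~20% regime mentioned above
```

Raising the budget from minutes to a day, as the labs requested, is just a change to `time_limit_s`, but it makes each full evaluation run vastly more expensive.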
- HTHarj Taggar
Do you think we're gonna get a benchmark saturation for this one as well?
- AWAlexandr Wang
(clicks tongue) I think eventually, yeah, it'll, it'll be saturated, and then we have to move on to new evaluations. I mean, I think the, like, uh, the, the s- the saving grace for the naming was that it is the last exam. The new evals will be sort of like real world tasks, real world activities, which are sort of, like, fundamentally fuzzier and more complicated.
- GTGarry Tan
Ha- have you solved any of the problems yourself, Alex?
- AWAlexandr Wang
Uh. (laughs)
- GTGarry Tan
I knew, I, I know you were a competitive math person for a long time. (laughs)
- AWAlexandr Wang
Yeah, yeah. The, I mean, the math problems require a lot of... They're, like, very deep in the fields. I think, uh, I was, I managed to get a, a handful, but, like-
- GTGarry Tan
Okay, wow.
- AWAlexandr Wang
... most of them are, like, hopeless.
- GTGarry Tan
(laughs)
- AWAlexandr Wang
Um, yeah. I looked at the ones that the models can solve, and so... (laughs)
- GTGarry Tan
(laughs)
- HTHarj Taggar
(laughs)
- 47:48 – 56:57
U.S. vs China in AI and hard tech
- AWAlexandr Wang
China open-sourcing or DeepSeek open-sourcing their models is, like, another very interesting question. Like, how does that play out and, um, and there's this awkward sort of thing that, you know, the best open source models in the world now come out of China. I mean, that's sort of this, like, awkward reality, uh, to contend with.
- GTGarry Tan
And what do you think we can do to just make sure that it's the American models that are ahead? Or, you know, is that written in the stars or... You know, something tells me that's not. (laughs)
- AWAlexandr Wang
Uh, the simplest explanation for me about why the Chinese models are so good is, is espionage. I think that there's, um, there's a lot of secrets in how these frontier models are trained. Um, and when I say secrets, they, you know, it sounds more interesting than they are, but there's just a lot of tacit knowledge. There's a lot of, like, you know, tricks and small... um, and intuitions about where to set the hyper parameters and, like, you know, ways to make these models, um, work and to get the model training to work. The Chinese labs have been, have been able to move so quickly and accelerate and m- make such fast progress, um, whereas some even, like, very talented US labs, like, have made progress less quickly, and I just purely think it's because, you know, a lot of the, the secrets about how to train these models, um, you know, those secrets leave the frontier labs and make their way back to these Chinese labs. Um, I, I think the, the only way to model the future is that China has pretty advanced models. Um, you know, the solace right now is they're not the best models. Um, they're sort of, like, a half step behind, let's say. Um, but, uh, but it's tough to model what'll happen when it's sort of truly neck and neck.
- GTGarry Tan
We're very behind on energy production, which is-
- AWAlexandr Wang
Yes.
- GTGarry Tan
... uh, just pure regulation. Like, that could be fixed in two seconds, but, you know, hasn't been yet.
- AWAlexandr Wang
That's a huge problem. I mean, if you look at, uh, you know, not that the past will be a predictor of the future, if you look at what-... US total grid production looks like. It's like, it looks flat as a pancake. And if you look at, um, you know, Chinese, uh-
- GTGarry Tan
I saw that.
- AWAlexandr Wang
... uh, aggregate, uh, you know, grid production, it's like, you know, it's doubled over the past decade. It's just like, it's just this, like, straight up-and-to-the-right curve.
- GTGarry Tan
I saw that, and it's astonishing. It's, I mean, that's just a policy failure.
- AWAlexandr Wang
China just, you know, the vast majority of that is coal, and coal's growing in China. And, um, in the United States, uh, actually renewables have grown a lot, but renewables trade off against the, uh, the sort of fossil fuels. So we've sort of like done a, done a transition of our, of our, um, energy grid, whereas they're just continuing to compound, let's say. We have this issue on power production, but we're, we're advantaged in chips. I think, like net-net, we will come out ahead on compute. Um, if you look at data, I mean, this goes towards a lot of the questions you've been, you've been asking about, but like, I mean, I think China is like fundamentally very well-positioned on data. Um, it's weird to say 'cause obviously like, you know, we help (laughs) all the American companies with data. In China, they can ignore copyright or other privacy rules, and, and they can sort of, um, you know, build these large models with abandon. And then, and then the second issue is that, um, there are actually large-scale government programs in China for data labeling. Um, there are, uh, you know, seven data labeling centers, um, like i- in various cities that have been started up by the government itself. There's large-scale subsidies for, um, for AI companies to use data labeling, a voucher system, in fact. There's like college programs because, you know, one of the interesting things is in China, like employment is such a large national priority that they like, you know, when they have a strategic area like AI, they'll like figure out, okay, what are all the jobs? And they'll like create these like funnels to, um, to, to create those jobs. And then we're seeing this in robotics data too, where like there's, the, already in China, there are like large-scale factories full of robots that just go and collect data. Um, and, uh, and strangely enough, like even a lot of US companies today actually rely on data from China in training these like robotics foundation models.
Long story short, I think China likely has an advantage on data. And then on algorithms, um, you know, the US is s- is on, on net much more innovative, but if espionage continues to be a reality, then like, you know, you're basically even on algorithms. So, um, so it's hard to model, but I think that probably like, you know, it's like 60/40, 70/30 that the United States like, has like an undeniable continued advantage, but there's like a lot of worlds where China just like catches up or may potentially even overtake.
- GTGarry Tan
I mean, the, the scary thing for me is, you know, watching Optimus or YC has, uh, some robotics companies like Weave Robotics, and, you know, we look at those things, the software can be as good or better than anything coming out of China. But when it comes to the hardware, it's like, BOM cost over here is $20,000, $30,000. Like you can't, you know, we can't even make like high precision screws over here, and then over there, the same mach- the same robot, the embodied robot could be made for, like, I don't know, $2,000, $3,000, $4,000.
- AWAlexandr Wang
Right.
- GTGarry Tan
It's like you just walk down a street in Shenzhen, and like, they, they got it, you know? And so how do you compete against that at sort of that, uh, state level?
- AWAlexandr Wang
The degree to which China's manufacturing is incredible, I mean, that's a, that's a very big problem. Um, and it relates to defense and national security. It's a fundamental issue, uh, because on some level, defense and national security will boil down to which countries have more things that like can deter conflict or can, can go into a, a, you know, can, can shoot other things down.
- GTGarry Tan
Yeah, I don't think it's gonna be fighter jets and aircraft carriers anymore. I mean, it's probably gonna be, you know, this micro war of, it's like hyp- hyper micro. It's drones and embodied robots, and I mean-
- AWAlexandr Wang
Yeah, exactly.
- GTGarry Tan
... war i-
- AWAlexandr Wang
Drones, embodied robots, cyber warfare, the, um, Cold War era, um, uh, philosophy of like, you know, you build like bigger and bigger bombs. Um, it's like the exact opposite of that. It's actually like, it's like the fragmentation and, uh, and, and move towards sort of like, you know, smaller, more nimble, attritable resources, um, is the, is the... That- that's like one of the big picture trends, I would say. Um, and then the other big picture trend is just what we believe, which is, uh, the move towards, uh, agentic warfare or agentic defense, which is basically, you know, if you cl- if you actually mapped out the, what warfare looks like today, or like what, like the, um, you know, the actual process of a conflict. Um, you know, if you look at Russia/Ukraine or other conflict em- uh, other conflict areas, like the decision-making processes are driven, are remarkably, um, manual and human driven. And it's just like all these dec- all these like very critical battle time decisions are made like with very limited information unfortunately, um, uh, in these like very manual workflows. And so it's very clear that, that, um, if you used AI agents, you'd have perfect information and you would have, uh, immediate decision making. And so the, you know, it's, we're going to see this like huge shift towards, um, agent-driven, uh, warfare and agent-driven, um, conflict. And it has the potential of turning these conflicts into these like almost incomprehensibly fast-moving, uh, kinds of scenarios.
- GTGarry Tan
And that's something that you guys are actively working on, right? Can y- c- is there anything that you can talk about? I assume some of it is classified, but... (laughs)
- AWAlexandr Wang
Yeah, yeah. So one of the things we're doing is we- we're building this, uh, this system called Thunder Forge, um, with, uh, the Indo-Pacific command, um, out in, out in Hawaii. It's responsible for the sort of the Indo-Pacific region, and it is the flagship DoD program for, um, using AI for military planning and, and operations. So we're basically doing exactly what I said. We are, we take the hu- the existing human workflow. The military works in a, what's called a doctrinal way, or they're, they're sort of like governed by the doctrine of this like, you know, very established military planning process, and you just convert that into, you know, a series of agents that work together, um, and, and conduct, you know, the exact same task, but it's just like all agent driven. And then all of a sudden you, you turn these like-Um, very critical decision-making cycles from, you know, 72 hours to ten minutes. And it kind of, like, changes it from, um, you know, y- uh, you know, when you play chess, if you play chess versus a human, they have to spend all this time thinking, um, you know, you, you know, it's sort of this, like, slow game, and if you play chess against a computer, it's just, like, these immediate moves back, and it's, like, this sort of, like, unrelenting form of, of warfare.
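The "series of agents that work together" pattern Alex describes can be sketched as a simple sequential pipeline where each agent consumes the previous step's output. This is obviously not Thunder Forge; it is a toy illustration of the general structure, and every function name and field here is hypothetical.

```python
def intel_agent(context):
    """First doctrinal step: gather and summarize available information (stubbed)."""
    context["intel"] = f"summary of {context['situation']}"
    return context

def courses_of_action_agent(context):
    """Second step: draft candidate plans from the intel summary (stubbed)."""
    context["courses_of_action"] = [f"plan based on {context['intel']}"]
    return context

def assessment_agent(context):
    """Final step: assess the candidate plans and select one (stubbed trivially)."""
    context["selected"] = context["courses_of_action"][0]
    return context

def run_pipeline(situation, steps):
    """Chain the agents so each step consumes the prior step's output,
    mirroring a fixed, doctrinal planning sequence."""
    context = {"situation": situation}
    for step in steps:
        context = step(context)
    return context

result = run_pipeline("exercise scenario", [intel_agent, courses_of_action_agent, assessment_agent])
print(result["selected"])
```

In a real deployment each stub would be a model call with its own tools and data access, but the speedup Alex describes comes from exactly this: the hand-offs between steps happen in seconds rather than in days of human staff work.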
- GTGarry Tan
I mean, some of it is, like, the being able to see the chain of thought immediately was, is the most powerful thing.
- AWAlexandr Wang
Yeah.
- GTGarry Tan
Like, 'cause, it's, you know, I don't want the answer, I want to see how you got there, and then actually seeing the reasoning itself was so powerful. I mean, that's actually why the, um, launch of that first DeepSeek was way more interesting 'cause, uh, I think o1 had come out but they hid the, uh, the reasoning. And it's like, no, the reasoning is actually a really important part of it-
- AWAlexandr Wang
Yeah.
- GTGarry Tan
... and the only reason why they hid it was they didn't want other people to steal it, which they did anyway. (laughs)
- AWAlexandr Wang
I think that that's, that's another, like, um, interesting thing about this space which is that, um, you know, y- so far you could really model as, like, there's, like, advanced capabilities, um, and you can try to keep those secret and you can try to keep those closed, but they open over time, kind of no matter what you do.
- HTHarj Taggar
Well, I
- 56:57 – 1:01:11
How to be hardcore
- HTHarj Taggar
mean, clearly, Alex, you've done a lot of incredible things and transformed your company multiple times and you have all this deep subject-matter expertise in many areas. You're clearly hardcore. Is there advice for the audience to be more like you?
- AWAlexandr Wang
You know, I think that the, the, the biggest thing is, um, you just have to really, really, really care, um, and I think it's, like, a, a folly of youth in some ways that, um, that when you're young, like, almost everything feels like, you know, so astronomically important that you just, like, you try immensely hard and you care about every detail. You know, everything, uh, matters just way more to you, and I think, um, and I think that that trait is really, really important. And, um, you know, it's, like, just in varying degrees for different people so I wrote this post many years ago called Hire People Who Give A Shit, and it really is pretty simple. You notice, w- I notice, you know, when you interview people or when you interact with people, you can tell people who are just sort of, like, phone it in versus people who sort of, like, they, like, hang onto their work as, like, th- you know, it's like, it's like th- so incredibly monumental and forceful and important to them that they, they do great work and it sort of, like, eats at them when they don't do great work and when they do great work they're sort of so satisfied with themselves. And so there's sort of this, like, um, the magnitude of, of care, and one of the greatest indicators of, like, A, just, like, how much I enjoy working with people, or, like, frankly how successful they were at Scale was really just this, like, what is w- you know, to what degree of their s- (laughs) to what degree their soul is invested in, into, um, uh, into the work that they do. And so I think that that, you know, if you were to pick one thing, that that probably is the sort of, like, unifier in some way. It's like, you know, um, I care a lot. Uh, I care a lot about every decision we make at the company, um, you know, I still review every hire at the company. You know, I, I, we have this process where I, where I, uh, approve or reject literally every single hire at the company. 
And so I care immensely, and I work with all these people who care immensely, and that enables us to feel much more deeply what happens in the business. As a result, we'll change course more quickly, we'll learn more quickly, we'll take our work more seriously, we'll adapt more quickly, and I think that's been quite important to the success that we've had.
- JFJared Friedman
Alex, you were telling me a story recently that stuck with me about how, quite recently, even when Scale was a very large company, you were personally hand-reviewing the data that was being sent to partner companies and acting as basically the final quality control — like, "That data point's not good enough." (laughs)
- AWAlexandr Wang
Yeah, exactly. I think a lot of founders would probably agree with this, but what your customers feel, when your customers are happy and sad, it really gets to you. And so (laughs) when you have unhappy customers, it's personally a very painful thing. Broadly speaking, we have this value at our company — quality is fractal — and I do believe that high standards trickle down within an organization. It's very rare that you see an organization where standards increase as you get lower and lower down in the organization. Most of the time, when people realize their manager, or their manager's manager, or their director, or whomever, doesn't really care, that removes the deep desire to care. So it's incredibly important that high standards, and this deep care for quality, are a deeply embedded tenet of the entire organization.
- GTGarry Tan
Founder mode, man. (laughs)
- AWAlexandr Wang
(laughs) Founder mode.
- GTGarry Tan
Man, we gotta have you back. Thank you so much for spending time with us. With that, sorry we're out of time, but we'll see you next time. (instrumental music)
Episode duration: 1:01:12
Transcript of episode 5noIKN8t69U