No Priors Ep. 32 | With NEAR’s Illia Polosukhin
- 0:00 – 9:58
Blockchain, AI, and Web3 Intersection
- SGSarah Guo
(music plays) A blockchain operating system might just be the key to a democratized Web3. In fact, more than 25 million users are already getting a taste of this, thanks to NEAR. This week, Elad and I are joined by Illia Polosukhin, the co-founder of NEAR and co-author of the landmark Transformers paper, to discuss the interaction of blockchain and AI technologies, what we should expect from AI agents, how to handle the content authenticity problem, and why the alignment problem in AI is really a human problem. Illia, welcome to No Priors. Thanks for doing this.
- IPIllia Polosukhin
Thanks for inviting me.
- SGSarah Guo
You are one of the authors of the original Transformers paper.
- IPIllia Polosukhin
Mm-hmm.
- SGSarah Guo
We've also had Noam and Jacob on. How did you get involved with that seminal work in AI?
- IPIllia Polosukhin
I worked on a team wi- on natural language understanding that focused on question answering, and the state of the art at this time was LSTMs, recurrent neural networks, which you could not launch in production at all, because they're too slow and take fair bit of time to process as documents scale. So Jacob at the time was using attention for query similarity, and he had this idea, like, using attention for encoder-decoder type. Um, I kind of jumped into it and, uh, with Ashish were playing around with, can we actually get it to train and understand the order of words and do translation just based on, you know, attention. So yeah, it was pretty cool to explore that, and obviously grew into something very interesting and awesome.
- SGSarah Guo
You originally co-founded NEAR in, I think, 2018, meaning for it to be an AI-focused company. What was that initial mission, and how did it become a blockchain company?
- IPIllia Polosukhin
Yeah, so we started with the idea that we wanted to teach machines to code. You know, we had Transformers coming out. There was a lot of kind of really interesting push in '17, '16, '17, around AI, and so our expectation was we kind of would ride the exponential growth of AI, which has happened in (laughs) this year. We thought it will happen in '17, '18, and so with that, we got a really interesting dataset around language to code. But more interestingly, we had a whole community of developers, mostly students, who were doing crowdsourcing for us. So we would give them code. They would write descriptions. We would give them descriptions. They would write code for them, write tests, like all kinds of tasks. And we actually faced a challenge of paying them, because a lot of them were in China, in Eastern Europe, and kind of other countries where there's monetary control problems. People don't have bank accounts, and so we started looking into blockchain just, like, to solve our own problem. The AI kind of, uh, exponential explosion didn't happen at the time, and so we saw an opportunity of, we can actually build a blockchain that we would use to solve this first and focus on that, uh, while kind of waiting out the AI thing to, uh, really happen. And as you go into the blockchain rabbit hole, you realize there's a lot more that meets the eye.
- SGSarah Guo
Yeah, yeah, ended up being a pretty big mission.
- IPIllia Polosukhin
Exactly.
- SGSarah Guo
So you call NEAR a blockchain operating system. For any of our listeners who haven't used it, like, what does that mean?
- IPIllia Polosukhin
So the idea is that we want to kind of go upstack, right? We want kind of an environment where you can discover and use Web3 experiences, you know, benefit from them, and not need to think about the low-level, you know, implementations and "hardware" that runs under it, right? So similarly how operating systems on your phone, you know, kind of abstracts out all the complexity of, you know, networking and payments and everything. You just use it, and you have apps that developers can build. And so that's really what we're trying to achieve, and kind of build this framework and platform for everybody to build their applications in Web3 and really deliver it to the user and to consumer.
- EGElad Gil
Where do you see a lot of the overlap coming in terms of Web3 and AI? You've thought very deeply about both. I, I remember when I first met you. You were just switching from sort of NEAR's original mission into the blockchain-based mission, and, you know, you were known as a team that could literally build anything, right? Like, you had yourself and Alex and Pi Guy, and all these, like, amazing people, and you went down the direction of building blockchain, in part, I think, originally around this data labeling kind of mission and the ability to do payments and things like that. And now, I know you've been thinking a lot again about how these two worlds interact or intersect. Where do you think are gonna be the biggest places of overlap between AI and blockchain, or Web3?
- IPIllia Polosukhin
There's few levels of interesting intersections. I think the, the most obvious one that everybody talks about is various marketplaces for resources, right, be that compute, model, or data, right? So data crowdsourcing. So those are pretty obvious, right? Web3 is really good at mar- at creating marketplaces, creating traceability, and, um, providing, like, an equitable place for everyone to participate. Now, the more interesting ones is where AI kind of agents, right, which, you know, we've seen, like, initial versions of, but obviously they're gonna continue evolving. If we- you equip them with a blockchain account, right, they now become an economic agent that is able to pay other people and pay other AIs to do work, right? And they can communicate, right? And I think one of the things that a lot of people who are, "Oh, like, these language models are just the same advancements, like, as everything before," missing the point that this is the first time that a machine is able to communicate with people in the same way, right? There's no n- more need in an intermediate human that interprets data and then tells it to other people. Now machine can communicate directly to people, and so it can task them with work. It can provide them context. And so I really think one of the most interesting... cases is organizations that are run completely by AI, right? Where, quote-unquote, CEO role is taken by AI agent, who is tasked by, you know, by community or board of directors or whatever is oversight governance is, to, you know, s- hit specific KPIs and follow specific mission. They can even give specific feedback with training data when they don't think it's doing the right job.
But what it does is, like, creates this kind of a new layer of management that potentially removes a lot of middle management right now, which is like trans- transforming information and context for each individual person and giving them specific area of work, and then gather, like, kind of harnessing their creativity and putting it back together. Right? And I think that's as very interesting use case that kind of really melds blockchain and AI together.
- SGSarah Guo
Why... Like, you have a traditional biotech cancer research commercial entity. Like, why blockchain and why AI for that?
- IPIllia Polosukhin
I use this example, right? We want to, you know, continue making progress on solving cancer, right? And it's a very complex problem, right? There's a lot of, like, specific sub-cancers that, you know, need research. And so all of this and, like, coordinating people doing experiments, propagating information, recruiting, you know, people recruiting the candidates, right? All of this requires, like, somebody to do this work and kind of organize the process and really set up a lot of pipeline and, you know, funding and all those things. And right now, there's so much overhead around everything from, you know, how grant funding is allocated from the nonprofits that collect money for research, how th- uh, like, experiments are set up, the information sharing. Like, all of those pieces are really kind of broken, and so you can actually have, you know, like, coordi- coordinated effort that is designed just to do that, and it can consume all of this information and kind of specifically task, you know, who is the best person at doing the experiment of which lab is the best at doing this specific sets of experiments, you know, fund them for this, you know, amount of money, you know, over- oversee their delivery, and then kind of iterate. And, you know, if- if it thinks this lab is not doing a good job, fire them without having, like, extra, you know, personal affiliations that, you know, people do have. I'm actually excited about some folks that are already building some examples of this in like a simpler, uh, forms, but I think we'll see, you know, first organizations like this probably even this year where potentially with a simpler missions and kind of more straightforward, like, KPI metrics, but where kind of this information propagation and onboarding of people happens already through a- a kind of, uh, language model AI agent.
- EGElad Gil
A similar version of this that I've heard people talk about, and it may be the first step towards it, is actually providing on-the-job feedback via an AI versus like a human manager with the idea that it depersonalizes the feedback, right? So if you have a agent or an AI providing feedback, some surveys at least have suggested that the average employee may be more comfortable with that because it feels more objective, it feels depersonalized, it feels like it can be provided in a directive way, and it seems like that's one aspect of sort of this AI as CEO concept that you're describing. Do you think the first place that it'll show up is DAOs or do you think it'll show up in a different part of the community?
- IPIllia Polosukhin
Yeah, I think DAOs is... And especially what happened with DAOs, there was a lot of people who are really excited about DAOs kind of as a concept, and so they put a lot of time running them, but it's actually a very, like, not interesting job, right? It's like in- s- you onboard new members, you explain to them all the same thing, you know, you answer to their questions, and so that's the part which, like, you can already automate, right? You can, like, have a Discord bot that is, like, have all the context about the DAOs, you know, interactions and kind of onboards new people and gives them, like, new in- ... you know, tasks to start with and kind of coordinate them. So I think that will be the first place where this kind of starts showing up and as well because you have, like, payments kind of very, like, there, and you don't have any social constraints that usually you have in, like, regular organizations. Like, you know, I- I... A lot of people will revolt if you, like, tomorrow say, "Hey, by the way,
- 9:58 – 16:07
Blockchain and AI
- IPIllia Polosukhin
your new boss is this AI model." So... (laughs)
- EGElad Gil
Yeah. Yeah. H- how do you think about AI in the context... or I should say blockchain and AI in the context of things like alignment?
- IPIllia Polosukhin
Yeah. So I think this is a very interesting topic. So I- I have this view that we need human alignment instead of AI alignment. So right now, kind of when we talk about, you know, "Hey, we need to align AIs with, like, human values," but the reality is that, you know, all the problems that exist, they all exist because of humans doing things and- and they've existed before. I actually like to use the Byzantine fault tolerance problem, right, which is basis for, uh, blockchain, but the re- the- its roots are in, you know, history where there was people propagating misinformation and you were trying to d- like, figure out how to prevent misinformation in his army, right? So this is, like, a really old problem of misinformation and kind of cr- um, like, how to work around that. And so I think what we need to start doing is figuring out how do we build a society that is actually able to deal with, uh, kind of effective misinformation at scale, right? So, like, we've kind of built a, like... a lot of our society has started building up tolerance to misinformation around, you know, TV and mass media, but we don't have, like, a system and framework around dealing with it at scale, and that's what AI brings, brings just scale to the same problem. And so this is where reputation, identity, and kind of systems around our social echo operating system, uh, that powers our comm- kind of, uh, communities is really important, right? How do all those- these pieces work together and how do they actually operate when there is malicious actors who potentially are able to, you know, a- en masse create, like, very personalized misinformation or create, you know, fake political actor that is, you know, convincing every individual exactly in what, uh, they think, you know, that government should do to get elected? And the... This is where Web3 comes in as, like, a set of primitives, right? 
We have cryptography to authenticate content and create, uh, a path, everything from, you know, you take a picture with camera, some of them already has, uh, secure enclave that can sign the image that's taken, and so as that image gets processed, we can actually propagate that information and have a proof that it came from, you know, specific time and place and then being processed by a specific set of filters, right? So that can give you, like, an anchor. Then you still need to know kind of who is publishing what, right? Like, we're recording this podcast, you know, people listening to it, it could have been completely generated at this point, right? But if, for example, we all sign the, you know, the final podcast and say, "Hey, yes, we've recorded it and this is valid content," now when somebody's listening to it, they can check that indeed, hey, this content is signed by us. Now, the question of us comes in, right? So this is where, uh, kind of identity and reputation is important. And so... This is where, uh, kind of on-chain identity becomes your kind of coalescence of all of the content and all those interactions that you do, and then that links to kind of, you know, reputation and different communities, uh, and, uh, provides context for people who are watching for th- this content to be able to understand, you know, who is this person who's talking or where they're coming from and what are the information values they have. So I think, like, it's, it needs to be a kind of systematic approach, and it'll start with pieces, right? I think one of the important pieces will be kind of a green lock, similar to SSL transition on the content, right? Like, as you go to YouTube, as you go to, uh, you know, New York Times, you actually will see that, like, "Hey, this content has been signed by this party, and this party ha- is in some trust root or trust commun- like, uh, graph of communities that you are following," right?
So that's probably, like, uh, one important piece, and again, blockchain and cryptography is just, like, tools to enable that product experience. And then from there, you know, we need similar things on the government level, right? When you file paperwork, when you file, you know, your identity, the fact that your SSN is a, you know, number that you give to everyone, which is, like, supposed to be secret is, like-
- EGElad Gil
(laughs)
- IPIllia Polosukhin
... for example, ridiculous, right? So things like that, it's like all of this needs to improve and kind of upgrade to this new level where, like, a massive amounts of kind of at scale of things that have been happening now are possible.
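As an editorial illustration of the signing flow described in this answer (each participant signs the finished episode, and a listener checks every signature), here is a minimal Python sketch. It rests on several assumptions: it uses a Lamport one-time signature built only from SHA-256 so that it runs on the standard library, whereas a production system would more likely use a scheme such as Ed25519 with keys held in a secure enclave. The names `lamport_keygen`, `sign_episode`, and `verify_episode` are hypothetical and are not part of any NEAR or podcast tooling.

```python
import hashlib
import secrets

BITS = 256  # one pair of secrets per bit of the SHA-256 digest


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _digest_bits(message: bytes) -> list:
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]


def lamport_keygen():
    """Generate a one-time key pair (a key must never sign two messages)."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(BITS)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def sign_episode(private, episode: bytes) -> list:
    """Reveal, for each digest bit, the secret that matches that bit."""
    return [private[i][bit] for i, bit in enumerate(_digest_bits(episode))]


def verify_episode(public, episode: bytes, signature: list) -> bool:
    """Check that every revealed secret hashes to the published value."""
    bits = _digest_bits(episode)
    if len(signature) != BITS:
        return False
    return all(_h(sig) == public[i][bit] for i, (sig, bit) in enumerate(zip(signature, bits)))


# Hypothetical usage: three speakers sign the same final audio bytes.
episode = b"final mixed audio of the episode"
speakers = {name: lamport_keygen() for name in ("sarah", "elad", "illia")}
signatures = {name: sign_episode(keys[0], episode) for name, keys in speakers.items()}

# A listener who already trusts the published public keys checks every signature.
authentic = all(verify_episode(speakers[n][1], episode, signatures[n]) for n in speakers)
tampered = all(verify_episode(speakers[n][1], b"generated fake audio", signatures[n]) for n in speakers)
print(authentic, tampered)
```

A successful check only shows that the holders of those keys signed these exact bytes. Tying the keys to real people and to a community the listener trusts is the separate identity and reputation problem raised in the answer.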
- EGElad Gil
What do you think is the most likely form of blockchain-based identity? Because, you know, the blockchain really has been the earliest place where you've had programmatic actors interacting around economic and other utility functions, right? It really is money as code, and effectively smart contracts are ways to programmatically interact with that, right? So you, you had almost like the execution layer without the intelligence, and now we're adding the intelligence. You have the cryptography, but you're missing a real sense of identity, which is needed if you have an agent or bot representing you interacting with another agent, which is probably where a lot of things will work in the future online. What do you think is the most likely form of identity on the blockchain, and why hasn't it happened yet?
- IPIllia Polosukhin
It has happened to some extent, right? We have, you know, like, m- millions of people actually using blockchain right now, and they're using it more for financial use cases and kind of that sort of financial identity.
- EGElad Gil
The wallet as identity kind of thing.
- IPIllia Polosukhin
Yeah, wallet has became an identity, right? And the reality is, like, your "private keys" are your identity, but that's just too hard of a concept for people to actually work with, right? And so on NEAR, we actually change that. We, you know, you have a properly named account, so like mine is root.near, which can have lots of different private keys accessing it with different permissions, right? I can give a key and in a way, permissions to an agent to, for example, interact on behalf of this, uh, or I can destroy it, right? I can give it to a specific application, et cetera. So like a more extensible model is needed. That's one. We need to have more social interactions kind of being spawned from this, and so this is again, blockchain operating system is powering actually social interactions and kind of communication. We actually have a project working on chat and other ways of using now this identity in more places. It's mostly because we didn't have a critical
- 16:07 – 23:35
Blockchain and AI Integration Challenges
- IPIllia Polosukhin
mass of these applications that are using this identity, so to, for it to really become kind of the core. And if it's not the core, it's not as useful because nobody, you know, like, "Hey, you don't have it, so like we're not gonna use it as a default thing everywhere." So like we really need to kind of go over like, again, I think SSL is a really good example of something that's like, it, it delivers value, it's clearly valuable, but it w- it was such a, like, uphill battle to get it there, right? And so I think like until you have this critical mass of like kind of website switched and browser support, it, it didn't become a default, right? So we kind of need l- like the same here to happen, like we'll need to have a critical mass of applications using, you know, identity and then, uh, then we kind of see the tr- like in browsers or wallets or whatever, like applications to hold it, and then we'll see a transition function happen where like, "Hey, oh, you don't have it, like you should get it because it's actually easier and better to use it." A- and it gives you like more financial freedom as well and more upside.
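The account model described at the start of this answer (one named account, many keys, each with its own permissions, any of which can be handed to an agent or destroyed) can be sketched as a small data structure. This is a hypothetical in-memory model for illustration only: the class names, method names, and action strings are assumptions and do not reproduce NEAR's actual API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class KeyPermission:
    """What a single key may do on behalf of the account (hypothetical model)."""
    full_access: bool = False
    allowed_actions: frozenset = frozenset()  # ignored when full_access is True


@dataclass
class NamedAccount:
    name: str
    keys: dict = field(default_factory=dict)  # key id -> KeyPermission

    def add_key(self, key_id: str, permission: KeyPermission) -> None:
        self.keys[key_id] = permission

    def delete_key(self, key_id: str) -> None:
        # Destroying a key revokes whatever was delegated to it.
        self.keys.pop(key_id, None)

    def is_allowed(self, key_id: str, action: str) -> bool:
        permission = self.keys.get(key_id)
        if permission is None:
            return False
        return permission.full_access or action in permission.allowed_actions


account = NamedAccount("root.near")
account.add_key("owner-key", KeyPermission(full_access=True))
account.add_key("agent-key", KeyPermission(allowed_actions=frozenset({"post_message"})))

agent_can_post = account.is_allowed("agent-key", "post_message")
agent_can_transfer = account.is_allowed("agent-key", "transfer_funds")
account.delete_key("agent-key")  # revoke the agent without touching the owner key
agent_after_revoke = account.is_allowed("agent-key", "post_message")
print(agent_can_post, agent_can_transfer, agent_after_revoke)
```

The point of the design is that the account name stays stable while keys come and go, so an agent or application can be given narrow rights and later cut off without changing the identity itself.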
- EGElad Gil
Where do you think the most likely failures, like system wide are, are to be, like with, um, you know, growing capabilities in AI? Like where, where do these mitigants in terms of, uh, reputation systems with blockchain or like, uh, content provenance are, are likely to... H- how's that going to manifest in ways that affect us?
- IPIllia Polosukhin
Yeah, I think there will be probably next year will be very interesting in US because I think this, this will be a place where everybody will just take whatever their toys they have in toolbox and do it even just for kicks, right? Even, even if it's not malicious, although some players will be malicious, and I think that what we'll see everything from like completely fake narrative candidates, uh, to like, uh, I would be very interested to see like a webpage where you land and you know, you log in, and it literally generates specifically for this user based on their interest, a agenda for this candidate, right? So like hyper-focused, you know, marketing for candidates based on like who this, um, voter is, right? So things like that, like we'll have all those possible things where the media will kind of be flooded with like, you know, you can spin up new media right now and just generate content about your candidate, like that you want, uh, and then market that. So like you can have like all kinds of things now just exploding without any way of like framing it on the user's side if like, does this have history? Is this coming from the right sources? Has this been validated, right? And so I think that's gonna be a really, um... uh, important. I think the other side actually is law enforcement, and this is sadly already happening. The people are using these tools now in very malicious ways right now, and law enforcement don't have a, like, really good ways to deal with this. And so, I think everything from this, like, on-camera si- (laughs) like, signing, we need this now. Like, they really have no way to, like, kind of identify, uh, if the image was generated or not. And similarly, like, for, you know, audio recordings and things like that, like, there needs to be, uh, kind of additional kind of levels of, uh, verification.
Uh, and this goes into actually, like, video calls and voice calls because, uh, right now, somebody can call you on the phone and play a recorded, recor- like, generated audio of somebody they recorded 30 seconds of, right? And this can be with very nefarious means, right?
- SGSarah Guo
It's a huge consumer fraud problem already.
- IPIllia Polosukhin
Well, it's huge consumer, but it's also, like, be- beyond that is becoming, like, a real criminal problem. Like, criminals that are being able to use these tools now, and it's, like, the barrier of entry there is, like, very low. And so, uh, this is where, like, you really need, like, you know, f- the phone calls, the kind of all of this. Like, you need more information, identification, and, like, kind of cryptography em- embedded into the system. Otherwise, it's completely going sideways really quickly.
- EGElad Gil
Yeah, this is where people would be using APIs, like Element or LFG or ElevenLabs to create a voice snippet, right, where they'll upload, to your point, 30 seconds of voice, train a model, and then the output sounds, uh, close enough to the person that you could fool a financial advisor or a bank or somebody else to, you know, do transactions on your behalf or things like that.
- IPIllia Polosukhin
Yeah. Or, and you, like swipe their phone, and, and now you're able to, like, impersonate them completely, right? So yeah, so this is, like, a real problem, and, like, having kind of authenticated path is required there, uh, to really stop it. And, like, we have actually, like, the phones are- actually have so much already. Like, we have face ID and fingerprints. We have, you know... There's secure enclaves that sign things that are, like, haven't been hacked as far as I know. Like, so there's, like, a r- a lot of the pieces are there. Now we just need, like, a product stack that actually pushes it, uh, to the user and, and, like, to the products.
- EGElad Gil
Yeah, that makes sense. I guess one other area where some people have talked about overlap between the blockchain world and the AI world is around training, and there's almost, like, two or three different forms of that. One form of that is there's a lot of GPU capacity that was purchased for mining on the crypto side, and given how valuable GPU is now on the training side, there's all sorts of sort of models to aggregate GPU specifically for training in different ways. You know, aggregating excess capacity. And then separate from that, there's ideas around, "Well, can you train a model in a distributed way across a blockchain more generally?" Do you think either of those things are concepts that will work, or how do you think about them relative to the future?
- IPIllia Polosukhin
Yeah, I mean, it, it's interesting because it, it sounds like such a no-brainer that, hey, let's grab those GPUs that, for example, Ethereum just moved from proof of work to proof of stake. Let's grab those and start using them. The challenge is the GPUs there are, like, not the ones that AI folks want to use, right? Uh, like, kind of all AI is really (laughs) zeroed in on, like, how do we get A100s or H100s. Uh, and the GPUs that, like, folks used for Ethereum mining and, like, similar, um, is, like, older ones, like, uh, that are not also focused on, like, floating point arithmetic, for example, as much. And so the challenge was more around, like, people who did, did that, like, CoreWeave is probably a good example, right? They were a mining company. Like, it's more that they had a know-how how to build data centers, and they can, like, get access to m- massive... Like, talk to NVIDIA and, like, get massive access to that, versus, like, repurposing the same GPUs. Although, I mean, obviously, like, for smaller models, for some specific, uh, maybe inference things, there's, there's maybe tr- transition. There's a question of decentralized training, right? Uh, in general, right? Like, hey, we have, like, lots of GPUs everywhere. Can we train it? And the reality right now that the requirements on bandwidth, right? Like, people who are training these models right now, they have like a, you know, 800 gigabit con- connect, right, between these GPUs, right? So (laughs) maybe you have 100 megabits on, between this. Usually not, and you need to, like, replay and, like, uh, work around problems for decentralized. So I think decentralized training right now is, like, still not as realistic, although there's some research people are trying. I think... And inference is really interesting because we do need so much more compute for inference than we need for training, right? Like, it's, it's a very interesting, like, economy of scale.
You train once, like LLaMA trained once, and then
- 23:35 – 30:13
Inference and Decentralized Data Labeling
- IPIllia Polosukhin
everybody runs it everywhere. And so the inference is where I think there's a lot of interesting cases. One is you want it to be private, right? Right now, if you're doing inference, uh, you need to send it to some service, and that service may or may not record it, and, uh, get both input and output. Second one is, uh, you want large capacity at, like, that can scale with more usage, right? Tomorrow, I have, you know, 10X more users. I wanna be able to scale with that. And so this is where I think using some of this hardware that exists, as well as kind of leveraging maybe new methods of privacy and coordination that kind of... Again, crypto has, like, MPC, like, multi-party computation. There's zero knowledge, uh, proofs, et cetera. Like, they can be leveraged to, uh, achieve that and have kind of, uh, secure, like, secure, decentralized inference. So I think that's way more realistic than training and also way more, uh, needed.
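To make the bandwidth gap from this answer concrete, the following back-of-the-envelope Python calculation compares the two link speeds mentioned (800 gigabit versus 100 megabit). The model size and gradient precision are assumptions chosen purely for illustration, and the estimate counts only one full copy of the gradients per exchange, ignoring latency, compression, and the extra traffic of a real all-reduce.

```python
def transfer_seconds(num_params: float, bytes_per_param: int, link_bits_per_s: float) -> float:
    """Time to send one full copy of a model's gradients over a link."""
    return num_params * bytes_per_param * 8 / link_bits_per_s


PARAMS = 7e9          # assumption: a 7-billion-parameter model
BYTES_PER_PARAM = 2   # assumption: 16-bit gradients

datacenter = transfer_seconds(PARAMS, BYTES_PER_PARAM, 800e9)  # 800 gigabit/s interconnect
home_link = transfer_seconds(PARAMS, BYTES_PER_PARAM, 100e6)   # 100 megabit/s connection

print(f"datacenter: {datacenter:.2f} s per exchange")
print(f"home link:  {home_link / 60:.1f} min per exchange")
print(f"slowdown:   {home_link / datacenter:.0f}x")
```

Under these assumptions each gradient exchange takes a fraction of a second inside a data center and many minutes over a consumer connection, which is why inference, where no such exchange is needed per request, looks like the more realistic decentralized workload.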
- EGElad Gil
Mm-hmm. And then, I guess one of the really early applications that NEAR was thinking about was data labeling, and to your point, the ability to pay people who were doing data labeling for AI purposes, right? And since that time, I think a number of companies have really grown out in terms of the data labeling world in a centralized way. There's Scale.AI, there's CERS, there's a few others. Do you think the best solution in the long run is still a decentralized model where you're using tokens to pay effectively for labeling? Do you think things will stay in the centralized world? Like, how do you view all that evolving over time?
- IPIllia Polosukhin
Yeah, I think decentralized kind of a Web3 marketplace is a more effective way to do this, and it kinda provides few interesting benefits. One of them is that it opens up kind of the market, right, where you don't need to set up, like, a local office and kind of hire people and train them, et cetera. Like, you can just open up global market, anybody can join, and you have very specific rules, right, that if they follow, they get paid, right? So I've used Mechanical Turk before, for example, and you can actually, as a client, you can just decline th- paying them, right? So people in Mechanical Turk, like the, the workers, have very low kind of, uh, (laughs) way to push back, if, if I say. At the same time, they don't have any, like, quality and knowledge assessment on the platform, right? So, so I think having quality knowledge and this kind of escrow model all embedded into one marketplace that opens up for everyone and c- you know, anybody everywhere can get paid at any time. Like, offering that both the people who doing this work want, because they kind of are more protected actually, and it's like fair game, and then the people who wanna give tasks, they can actually get access to, like, way larger, uh, workforce. They can, like, specify specific parameters. They can, you know, price it at whatever level they want. That's gonna be the kind of f- future of it.
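The escrow idea in this answer (the reward is locked before work starts and released by published rules, so a client cannot simply decline to pay) can be sketched in a few lines of Python. Everything here is a hypothetical toy: the `EscrowTask` class, the `post_task` helper, and the acceptance rule are assumptions for illustration, and a real marketplace would run this logic in a smart contract instead of an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EscrowTask:
    """Payment is locked when the task is posted, so the client cannot withhold it."""
    client: str
    reward: int
    acceptance_rule: Callable[[str], bool]  # published up front, same for every worker
    settled: bool = False

    def submit(self, worker: str, answer: str, balances: dict) -> bool:
        """Pay the worker if the published rule accepts the answer; refund the client otherwise."""
        if self.settled:
            raise ValueError("task already settled")
        self.settled = True
        accepted = self.acceptance_rule(answer)
        recipient = worker if accepted else self.client
        balances[recipient] = balances.get(recipient, 0) + self.reward
        return accepted


def post_task(client: str, reward: int, rule: Callable[[str], bool], balances: dict) -> EscrowTask:
    """Move the reward out of the client's balance before any work starts."""
    if balances.get(client, 0) < reward:
        raise ValueError("client cannot fund the task")
    balances[client] -= reward
    return EscrowTask(client, reward, rule)


balances = {"client": 100, "student": 0}
# Hypothetical rule: the submitted label must be one of the allowed classes.
task = post_task("client", 10, lambda answer: answer in {"cat", "dog"}, balances)
paid = task.submit("student", "cat", balances)
print(paid, balances)
```

The decision to pay is made by the rule the client published in advance, not by the client after seeing the work, which is the protection for workers that the answer contrasts with Mechanical Turk.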
- SGSarah Guo
Can you talk a little bit about what makes the quality control problem for annotation hard here, right? Because o- one thing that I've seen with significant research labs is, like, still continued, uh, insourcing of annotators, um, for both pre-training sets and RLHF, because some of the external services and marketplaces can't get to the level of quality that they're looking for in particular domains. So can you just describe the dynamics there?
- IPIllia Polosukhin
Yeah, so I think there's two parts. One is, like, domain knowledge, right, um, that generally, like, hard ... Like, it's hard to tap in in- into like a ve- a specific centralized service, right, because they need to kind of s- Like, for them to do payments, do all those things, they need to set up a subsidiary in whatever country they have the workers. They need to train them. They need to hire them. Maybe it's contracts. But, like, they need a, a lot of overhead that they do that. For example, developers. Let's imagine, you know, you're building a new really cool developer platform, uh, wi- which uses, you know, language models, and you wanna fine-tune on code, right? Well, the existing platforms, like, them hiring a bunch of developers, uh, to actually do this, right, and, you know, if they're doing this full time, that's like super complicated. Then, uh, kind of building out the validation tooling for how to, like, cross-validate that, uh, the work has been done. Now, on the Web3 marketplace, you know, any student can join and like do, do this, right? They don't need, like, you know, join at like, get a contract with specific company. They don't need to have the company in the local region to work with them. Um, and like students, you know, for coding, for example, are really interested in doing this because they, uh, usually don't have much money, and this is a way for them to practice their, uh, work anyway. And then as a task giver, you can actually specify the specific way you want the cross-validation to happen. And, uh, one of the things we've done, uh, it's like honeypots, right, where you actually specify specific types of incorrect answers that people need to mark as incorrect, and otherwise they're gonna actually lose, uh, the buy-in. 
And so there's a actual like very clear, like, economic game theory where people have buy-ins, they, uh, they lose them if they, like, do poor quality of work, and so, uh, they have, um, like way more incentive to do this versus like, let's say if you're working on a contract, there's like way more leeway usually, uh, if you're not doing your work, right? So there's like just way higher kind of, uh, self-evaluation as well that happens. And so, I mean, there's a lot of pieces that needs to come together for this to be like high quality, but again, it just opens up this marketplace and makes it effective and it in a way removes a lot of the human part as well.
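The honeypot and buy-in mechanism described in this answer can be sketched as a settlement function. This is a simplified illustration under stated assumptions: the function name `settle_batch`, the all-or-nothing penalty, and the numbers in the example are hypothetical and are not taken from the NEAR team's actual marketplace.

```python
def settle_batch(answers: dict, honeypots: dict, buy_in: int, pay_per_item: int):
    """Return (payout, stake_returned) for one worker's batch.

    answers:   item id -> label the worker submitted
    honeypots: item id -> known correct label, seeded by the task giver
    """
    missed = [item for item, truth in honeypots.items() if answers.get(item) != truth]
    if missed:
        # Assumption: failing any honeypot forfeits both the buy-in and the batch payment.
        return 0, 0
    real_items = [item for item in answers if item not in honeypots]
    return len(real_items) * pay_per_item, buy_in


# The task giver seeds two deliberately wrong snippets that must be marked "incorrect".
honeypots = {"h1": "incorrect", "h2": "incorrect"}

careful = {"t1": "correct", "t2": "incorrect", "h1": "incorrect", "h2": "incorrect"}
careless = {"t1": "correct", "t2": "correct", "h1": "correct", "h2": "incorrect"}

careful_result = settle_batch(careful, honeypots, buy_in=5, pay_per_item=2)
careless_result = settle_batch(careless, honeypots, buy_in=5, pay_per_item=2)
print(careful_result, careless_result)
```

Because workers cannot tell honeypots from real items, the only way to keep the stake is to review every item carefully, which is the economic incentive the answer describes.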
- EGElad Gil
One thing that I think is really neat about how NEAR approaches innovation is you do both internal sort of NEAR roadmapping and product development, and then you also have a series of things that you either spin out or spin up or you're sort of involved with as sort of these ancillary companies or projects or efforts. What area are you most excited about over the next coming year in terms of either NEAR or some of these other efforts that you're involved with?
- IPIllia Polosukhin
So we do actually have a project, uh, in this, uh, Web3, uh, AI data marketplace, uh, that we are spinning o- out, um, to focus on n- now like they built the product, they have all the pieces. Now it's like ready to actually go to market and bring customers. I think the, the really interesting area is kind of partnering with existing kind of either already Web3 enabled or interested in Web3 teams who wanna give access to more functionality to their, uh, users, right? We have s- for example Sweatcoin, which is really good example of like ... It was a Web2 project that had 120 million installs,
- 30:13 – 38:18
Web 3 and AI SaaS Challenges
- IPIllia Polosukhin
that had a ton of people using it every day, kind of for a very specific use case, right, kind of tracking their steps and, you know, maybe getting a discount on their next shoes. But now as they transform into Web3, they're kind of opening up, right, and you can now participate in, uh, economic activity. You can, you know, learn about new kind of innovations that happen in the ecosystem. You can now, you know, part- like as they integrate more into blockchain operating system, uh, can potentially interact with s- like on the social side, do the tasks and gigs, and so like you kind of really open up the what before was like a very limited kind of economy ... to really just, like, you know, composable open web. I think that's really exciting and, like, we will see probably more and more examples of that. Uh, and finally, I'm really interested in kind of, as I mentioned, like, because we have now open web and socialware, the kind of what I call future of SaaS. So, I think a lot of... Between Web 3 and AI, a lot of SaaS will actually start being, uh, replaced. Because right now what SaaS is, is like one database with a specific UI for a specific problem. The database is the same between CRM, the hiring tool, marketing tool, uh, even some of the, uh, project management tools, right? The database underlying is, like, not that different, and it's been just, like, the front end. And, like, interconnecting all of those databases is, like, a ton of work. It always breaks, right? Um, but now, you can have, like, the database you own, right? So, using kind of Web 3 tech- tech, and then you can build all of this front end on top, either through kind of blockchain-based system shared components, or even sort of describing with natural language some of the interfaces and business processes you want to have, right? So the way people will interact with, like, kind of their business operations and all the tooling they need will start to change.
Uh, and, um, so I'm really excited about this space and, like, we have one company that is kind of, you know, starting, uh, to build out some of the things in this space, and over next year we'll see kind of that evolving.
- EGElad Gil
Do you think that moves to an agent-driven world? In other words, when you imagine the interfaces on top of this that are sort of driving these business processes for future SaaS applications, do you view them as sort of traditional UIs or do you view them as agents that are interacting programmatically or some hybrid?
- IPIllia Polosukhin
It will be a hybrid. So, I... Like, in my imagination right now at least, I expect, like, you- you can describe a business process which is like, "Hey, you know, when we have a new creative from, like, marketing department, spin up a Twitter campaign and create me a dashboard that tracks the conversions on our product." Right? And so what it does, it, like, creates, you know, the pipeline of these things, and then it also creates a page where I can see, like, normal us- user interface of, like, analytics.
- SGSarah Guo
So it might be more generated dynamic UI?
- IPIllia Polosukhin
Exactly, yeah. And it's, like, adjusted for a specific use case you need. And probably there is, like, a bunch of templates that is, like, you know, fine-tuned for your specific problem. Like, and this is possible right now.
- EGElad Gil
Yeah, I guess it kind of moves you, um, down the path of what you were talking about in terms of, like, AI CEO or AI as project manager where you're kind of morphing into a world where you're delegating to an AI to drive a bunch of activities and then come back to you with the results, like you would an employee or a coworker, which is very different from the world of UI today where you just go to the same spot to see analytics, you go to the same spot for communication, you go... which is your email, you go to the same spot for, you know, interacting with a workflow, and you're saying this should be more of a dynamic world where things get brought back to you based on a series of tasks that you provide out.
- IPIllia Polosukhin
Yeah, and it's, like, probably a shared environment as well where, you know, we probably will co-work on a business process and, you know, we'll share one display, but then we'll maybe fork it because I'm more interested in conversion and you're more interested in retention, for example. And so, so that's kind of the dyn- dynamism right now that also doesn't exist where, like, we all look at the same, you know, Jira task management, and I'm like, "I don't really care about half of this stuff." Right? But it's not a filter problem, it's like I want different information showed in a different way.
- SGSarah Guo
Author of The Paper That Changed the World, here we are in 2023. Is it bigger Transformers all the way, or are there other architectural directions that are worth thinking about that you're paying attention to?
- IPIllia Polosukhin
I think there's definitely something around, like, how do we get these models to have the capacity to, like, let themselves think before outputting or, like, kind of, uh, process more. And I- I think it's, like, still within the Transformer structure and it can be, like, advanced, but I haven't seen anything that's, like, really matches my intuition around this. But I think the... like, the simplicity of this architecture and, like, indeed, like, the- the amount of optimization that's going into this right now is just... It'll be really hard to match, uh, and kind of, you know... with enough expressivity you can express any function, so, like, it's not... This is not a problem at this point of, like, "Hey, we don't have an expressivity." Right? It's more around, "How do we either, like, compose a dataset that's, you know, cleaner, better, or add some, you know, self-critique and understanding of, like, is this content correct?" Or, "I need more time to think" w- versus, you know, "Hey, I'm forcing you to output next token even if you don't have an answer yet." So, I think that- that part's really neat and- and I think, uh, they kind of fit into the architecture, but, um, just require more engineering and more different types of tasks as well for training. I think, like, you know, the fact that we're just using a big language model is kind of interesting because this is not a task you would expect, uh (laughs) everything to be able to, you know, just predict next token. So, like, you know, starting to... Obviously RLHF has been already helpful, but, like, starting to, like, "Hey, can you critique this answer? What would be the, uh, better answer?" Et cetera.
- EGElad Gil
Do you view that as a training or fine tuning thing or do you view that as an inference thing?
- IPIllia Polosukhin
I mean, it- it's gonna be, like, a combination, right? So I think w- we just need an architecture that, at training time, you're- you're able to... So, like, I mean, this- the simplest thing is, like, instead of outputting a token in the next, right? You can actually give it m- like, you know, empty token, for example, for some period of time, and then when it says, like, "Okay, I'm ready," give it to output next token, right? And so this way you can train it to, like, think more before outputting, and then at inference time you can vary it, right? Like, "Hey, I'll give you more time to think," you know, uh, or like, "No, you have no time to think." But then you can train it to, like, actually being able to, like, dynamically to output this. So again, this is, like, a very simple thing, but, like, you can keep expanding on this. You know, output it and then feed it back and, like, is this the right answer? Like, et cetera. So there's a few different models. But I think the- To Jacob's point, like, the, the fact that this model is, like, doing a really effective search in kind of this knowledge space means that probably, like, pushing more into that concept is more useful than doing more searches e- at inference time, because, like, it means you already lost all the semantics if you're doing search at inference time.
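The "empty token" idea above can be illustrated with a toy decoding loop. This is only a sketch under stated assumptions: `PAUSE`, `decode`, `toy_model`, and the `max_think` budget are hypothetical names, and `toy_model` is a stand-in that always "wants" two pause steps per word, not a trained network.

```python
PAUSE = "<pause>"  # hypothetical "empty" token: consumes a step, emits nothing


def toy_model(context, allow_pause):
    """Stand-in for a model that wants two pause steps before each word."""
    trailing = 0
    for token in reversed(context):
        if token != PAUSE:
            break
        trailing += 1
    if allow_pause and trailing < 2:
        return PAUSE
    return "word"


def decode(step_fn, prompt, max_think, max_tokens):
    """Generate max_tokens visible tokens, allowing at most max_think
    consecutive pause tokens before each one. Returns (output, pauses_used)."""
    context = list(prompt)
    output = []
    thinking = 0  # consecutive pauses since the last visible token
    pauses_used = 0
    while len(output) < max_tokens:
        allow_pause = thinking < max_think
        token = step_fn(context, allow_pause)
        if token == PAUSE and not allow_pause:
            raise ValueError("model paused after its thinking budget ran out")
        context.append(token)  # pauses stay in the context the model sees
        if token == PAUSE:
            thinking += 1
            pauses_used += 1
        else:
            thinking = 0
            output.append(token)
    return output, pauses_used


no_time = decode(toy_model, ["hi"], max_think=0, max_tokens=3)
some_time = decode(toy_model, ["hi"], max_think=1, max_tokens=3)
more_time = decode(toy_model, ["hi"], max_think=5, max_tokens=3)
```

The visible output has the same length in all three runs. Only the number of hidden pause steps changes with the budget, which is the inference-time knob described above ("I'll give you more time to think" versus "you have no time to think").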
- EGElad Gil
I think you made a really interesting point where it's possible that transformer architecture increasingly is getting locked in, and there's two components to that. One is it just seems to run really well on the main se- silicon that we're using right now for AI, um, which is GPUs. And then secondly, um, there's so much optimization work going into it, and so much being built around it, that it effectively creates optimization that just won't happen for any models anytime soon, and so you effectively end up with this interesting feedback loop or lock in effect for this set of models. Do you think that we're in a spot now where this is just kind of the future for the next five years or 10 years or something, or what do you think is the likelihood that other a- approaches or architectures will emerge anytime soon?
- IPIllia Polosukhin
I mean, there might be an- an- another architecture that, like, reasonably fits with the same silicon. I think that there's an interesting quest- example of there's a company that built an alternative, right, silicon, that is kind of, allows to process things in pi- pipelines. And so, like, the chips are actually, like, kind of smaller compute chips, but they're kind of all, uh, l- like
- 38:18 – 42:24
The Future of Hardware Accelerators
- IPIllia Polosukhin
in a grid, and the data flows from one side to another, right? So the example there is on one side it's, like, a really interesting architecture, you can build really cool things with this, but it doesn't fit transformers very well, right? Like, you can do transformers with it, but it doesn't fit very well, and, like, your, your cost to, like, you get, like, you know, cost to, to output ratio is not that interesting. And so in comparison to, you know, you just optimizing on GPUs or using some of the new hardware accelerators. And so this is where exam- like, I mean, and I'm not sp- to speculate here on specific company, but, you know, I wouldn't expect they will have, like, ton of people lining up, because, like, there is ton of alternatives for transformers that are coming in, and, like, somebody would need to, like, go in and develop a lot of new architectures that fit better, uh, this model. And so, uh, it'll be really hard for them to, like, be a viable business and kind of have the economies of scale that NVIDIA is having right now to just kind of continue optimizing and, and building best state of the art chips, right? So unless somebody's, like, really investing in this, I think it will be more around, like, what, what else we can do with current silicon, right, and kind of combinations of this, and then, uh, I mean, maybe there's n- something new will come out.
- EGElad Gil
Yeah, but when things lock in technologically, they actually tend to lock in pretty strongly until there's a really big sea change or sort of the optimization of those things hit a asymptote. And it's interesting because I think a prior example of this kind of, um, chip plus software reinforcement loop was really the Windows and Intel monopolies of the '90s. You know, they used to call it Wintel, for Windows and Intel, because there was such a strong mutual lock in effect, where you had chips that were optimized for Windows and Windows was optimized for the chipset, and it just kind of kept going from there. And so it's int- this is, I feel like, a stronger version of that in some sense, where you have the underlying compute architecture and the most important model reinforcing each other in a way that kind of locks both of them in.
- IPIllia Polosukhin
Yeah, and what, what changed that is pretty much the coming of mobile, right, and creation of ARM devices, ARM chips that are kind of optimized for mobile, and then came back into PCs, right? So yeah, so un- un- unless there's, like, a completely new form factor, which is hard to predict, right? But also it's, like, that's a lot of investment to go from not just ha- software, not just hardware, but, like, full stack, right, innovation.
- SGSarah Guo
Yeah, I, I think it's unclear if this is a strong enough market force, but the short term, you know, demand supply imbalance around GPUs with all of the growth of applications, especially as, like, you think any of these applications work, like, inference needs grow, right, your ability to build enough, uh, for NVIDIA really, to build enough GPUs to, um, service the demand, is, like, it's blocking a lot of companies, right? And I think the question is, like, there is more incentive to make heterogeneous hardware work than there ever has been, and, like, can that catch up with the full stack optimization that you described, the CUDA, like, um, investment that NVIDIA has made o- It's super unclear, but I, I think, like, there's been no reason to chase that until, you know, this past 18 months, and I think now there is.
- IPIllia Polosukhin
Yeah, but at the same time we have, like, every single, you know, large companies doing their own hardware accelerator as well as, you know, a bunch of folks who are kind of spun out of those. And so, like, we're gonna have a f- you know, a market full of hardware accelerators which are still optimized for transformers, or at least, like, similar structured architectures hitting the market, like, this year and n- and next year.
- SGSarah Guo
Yeah. Illia, this was great. I hope you will, uh, after Elad and I work through all of the Transformers authors, like Pokémon style, got to catch them all, I hope you'll come back for a reunion episode, but thank you for doing this.
- EGElad Gil
Yeah, thanks for jumping on.
- IPIllia Polosukhin
For sure. Thank you.
- SGSarah Guo
(instrumental music) Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you want to see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.
Episode duration: 42:24
Transcript of episode BqkHRj6pdDs