No Priors Ep. 121 | With Chai Discovery Co-Founders Jack Dent and Joshua Meier

AI has already fueled breakthroughs in biotechnology, and now further advances in AI are poised to fuel pharmaceutical discoveries as well. Sarah Guo sits down with Joshua Meier and Jack Dent, co-founders of Chai Discovery, whose newly launched Chai-2 designs bespoke antibodies that bind to their targets at a jaw-dropping 20% rate. Jack and Joshua talk about what Chai-2’s success rate at discovering antibodies means for the pharmaceutical industry, how structure prediction is pivotal in making the model work, and the future potential for using the model to optimize other molecular properties. Plus, they talk about what they believe bioscientists should be learning to best utilize Chai-2’s technology.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @_jackdent | @joshim5

Chapters:
00:00 – Joshua Meier and Jack Dent Introduction
01:09 – Genesis of Chai Discovery
06:12 – Chai-2 Model
10:13 – Criteria for Specifying Targets for Chai-2
13:12 – How the Chai-2 Model Works
16:12 – Emergent Vocabulary from Chai-2
18:15 – Hopes for Chai-2’s Impact
20:33 – Reception of the Chai-2 Model
22:16 – Future of Wet Lab Screening and Biotech
27:08 – Optimizing Other Molecule Properties
31:37 – Where Chai Invests From Here
36:20 – What Bioscientists Should Learn for Chai-2
40:23 – How Jack and Josh Oriented to the Biotech Space
43:38 – Platform Investment and Chai-2
46:53 – Scaling Chai Discovery
48:21 – Hiring at Chai Discovery
49:09 – Conclusion

Sarah Guo (host), Joshua Meier (guest), Jack Dent (guest)
Jul 3, 2025 · 49m

EVERY SPOKEN WORD

  1. 0:00 – 1:09

    Joshua Meier and Jack Dent Introduction

    1. SG

      Hi, listeners. Welcome back to No Priors. Today, I'm excited to speak with Josh Meier and Jack Dent, two of the co-founders at CHAI Discovery and former bio, AI, and engineering leaders at Meta, OpenAI, Absci, and Stripe. This week, CHAI released their industry-leading CHAI 2.0 zero-shot antibody discovery platform, which, at its core, is a generative model that can design antibodies that bind to specified targets with a hundredfold the hit rate of prior computational approaches. We'll talk about their product, the next frontier for CHAI, why they're bullish on biotech, and why the most effective antibody engineers will soon be working as expert prompt engineers. Jack, Josh, uh, congrats on the CHAI 2.0 launch. Thanks for doing this. Welcome.

    2. JM

      Thanks for having us, Sarah. We're excited to be here.

    3. JD

      Good to be here.

    4. SG

      Josh, I'll start by just asking, you know, you and several scientists on the team have been working on AI drug discovery for about a decade now in different settings. I've also been looking at this area for, for over a decade. We haven't yet seen successes of drugs to market that were designed, you know, um, with these AI computational techniques. What, what made you believe? Why start the company when you guys

  2. 1:09 – 6:12

    Genesis of Chai Discovery

    1. SG

      did?

    2. JM

      That's a great question. So many of us have been working on this space for a while, uh, and we didn't start a company because it was really a research idea, I think, until very recently. You know, there were signs of life that someday this was gonna work, but it wasn't really on the timeline of a company, right? Uh, you can't really start a company thinking that 10 years from now, things are gonna work. You also don't want to start a company after it's already working and kind of miss the boat. So the sweet spot is like, okay, we have like maybe one, two years, uh, where, where, that we have to, to really get this off the ground. And we made a bet when we started the company that was gonna work. There were really a couple of things that fueled that decision. The first one was, uh, we, we made a bet that structure prediction, protein folding, was gonna get a lot better. So obviously, protein folding, uh, is considered solved in, uh, a couple of years ago, around like 2020. We had the breakthroughs of AlphaFold2 and being able to predict protein structures with experimental accuracy, but it was just a single protein structure at a time. So we can take a single protein sequence, and we can see what that protein looks like. That's very useful for basic biology, so we can understand what the proteins we're looking at look like. But if you think about drug discovery, which is, uh, where, you know, we're really focused on at CHAI Discovery, in drug discovery, you need to understand how multiple molecules interact with one another, so you need to understand how a small molecule drug is going to modulate a protein or how an antibody protein is going to modulate an, an antigen protein. So, we started to see early signs of life that that was going to be possible, and, uh, again, we made a bet that we would be able to take this to the next level with the kinds of breakthroughs that we were seeing around diffusion models and around language models. 
The previous generation of, of structure prediction models, uh, would really just predict, you know, like one conformation protein at a time and kind of like one view on a protein. It's like the early image models. Like, they didn't have diffusion models. You weren't really able to, uh, to look at the diversity of generations that could come out, and we thought the same thing would impact drug discovery and protein folding as well. So, that's a bit of color on, on how we decided to start the company and when we did, and maybe lastly I should say, uh, almost every AI bio company before us has had some kind of very tight lab integration with what they are doing, and it all was too tight. I think the lab integration is great. We do a lot of lab experiments at CHAI, but the thing that was missing was, could you actually have some kind of portable AI platform, something that would actually be generalizable and could be applied to lots of different areas? If you could do that, it means that your impact, uh, could really be taken to the next level. We can take CHAI 2.0, the model that we've just released, uh, and we can deploy it, uh, to hundreds of, of different projects, thousands of different projects. CHAI 1.0, which we open-sourced, is already being applied throughout the industry to tons of different pro... We don't even know everything it's being applied to because it's open-sourced. But that was something that was also really important to us, uh, if we were going to kind of see this transformation of biology from a science into more of an engineering discipline, which is, uh, uh, ultimately the goal of the company.

    3. SG

      Yeah, I want to come back to, um, what you said about lab integration as we talk more about the, um, technical approach here, but Jack, you and I met w- in the context of, you know, you being a beloved engineering and product leader at Stripe, um, coming from the engineering side and looking for, like, the most interesting problems to work on in AI. Why did you decide to work on this versus, like, some of the other things we were talking about, like code-gen and such?

    4. JD

      Yeah, so as you know, Sarah, I spent, uh, quite some time thinking about my next steps and what I wanted to do with my, my life after, uh, my period I was at, at Stripe, and I give a lot of credit to Josh actually for this, that, you know, we were good friends going back even to the, to college. You know, we were Pset buddies in, in college, uh, at Harvard, in, in many of the same classes together. While I was maxing out the CS curriculum, Josh was also doing that somehow for the chemistry and physics and all the other curric- scientific curricula as well. Uh, but we, we had landed in a, a lot of the same classes and, uh, as, uh, as we went our separate ways after college, we really just made a point to, of keeping in touch every, you know, three, six months, and Josh would always talk to me about, about his research. Once it became clear that, that the research that, that, uh, Josh and others were doing in this space wer- was really no longer just a, a toy, uh, and, but was, was really going to impact and change the entire industry, that idea became infectious, right? It, it sort of became impossible to unsee the, the future once you, you have that glimpse, and although you didn't know until very recently that, that any of this go, is, was gonna work, and, uh, of course, there's, there's still a lot left to prove. Uh, once you start to grasp the, the implications of the fact that over the next few years we are going to have the ability as a human race to engineer molecules with, with atomic precision, it, it's almost hard to work on anything else, uh, with, with your life. 
Uh, the, the impacts for society, just broadly, uh, and, and human health, and not just health, there, there are, there are a ton of other areas which we'll, we'll, this will touch which we can get into, but that is just a platform shift in an entire, en- entire industry, and, uh, so put that together with the fact that the kind of the, the belief or conviction that you might just be able to get it working, uh, and I think it was, it was, uh, impossible to say no to working on this in many ways.

  3. 6:12 – 10:13

    Chai-2 Model

    1. JD

    2. SG

      So, there's a breakthrough, uh, result in CHAI 2.0. Can you give us a sort of layperson's explanation of what the result was and, and the model itself, and, and, um, what you think is the most valuable part?

    3. JD

      Sure. CHAI 2 is our latest series of models which are state-of-the-art across a number of different tasks, but specifically the one we're most excited about is design, and what we've shown is that we can design a class of molecules known as antibodies, which are some of the most therapeutically interesting molecules as well. These account for close to 50%, uh, of all recent drug approvals, and seven of the top 10 best-selling drugs out there are actually, actually antibodies. And so what we've shown with CHAI 2 is really the ability to design antibodies against targets that one wants to go after in just a small, what we call a 24-well plate, in just 20 attempts. What this means is that we take a target, run our models, ask the models to design a antibody. We then ship that antibody to the lab. We have about a two-week validation cycle in the lab, and two weeks later, we see that roughly close to 20% of these antibodies actually bind their targets in the intended way. So CHAI 2 is a major breakthrough for the field. When we set out on this project, uh, we were actually only targeting a success rate of 1%. That was the company-wide goal for, for the entire year, uh, and th- the reason we set that goal of 1% is that previous attempts at this problem are maybe successful around 0.1% or even lower of the time, and that's... those are the computational techniques. If you look at the traditional lab-based high-throughput screening techniques, people are really screening between millions or billions of compounds just to find one molecule that, that sticks. You know, there's a reason we call it drug discovery is it's a discovery problem. It's a search problem, and so people are really s- sort of panning for gold in these massive yeast or phage libraries, or alternatively, you might inject a mouse or a llama, uh, you might, uh, wait a couple of weeks for them to get really sick. 
You might then, uh, bleed, uh, bleed them, take their plasma, take the, the, the antibodies out and isolate them, um, and this is actually what we did for COVID actually. We would... we actually took some humans who had already got COVID, took their antibodies out of them, tried to find one which, which actually then neutralized the virus. So you can imagine, not, not an ideal or the most efficient or the most, uh, principled process. And so what we've shown, uh, with CHAI 2 is that we've able... been able to increase these, these success rates in discovering antibodies computationally by multiple orders of magnitudes compared to the prior state-of-the-art computationally, and by many, many, many orders of magnitudes compile- compared to the traditional lab-based alternatives. And what this means is, um, is pretty profound for the, the industry in, in our view. You know, there, there are two ways to look at this. There's of course the faster, better, cheaper. You know, this is going to allow us to make drugs against targets and get them turned around faster, but I think the thing that we're really excited about and what's I think more important is the entire class of targets that this will unlock in the future, which have just been inaccessible to, to previous methods. And I think in general, the biotech, uh, industry is, uh, everybody's a little glum right now. XBI hasn't done that well over the last five years. I think we're in one of the worst markets in biotech, um, over the last few, few decades. But I think with CHAI 2 we're starting to see I think those first early signs of a real platform shift in, in biotech, the sort that comes around only so often. 
Uh, you know, we've had one in the '70s with all sorts of new techniques then, but the idea that in the next five, 10 years that there are gonna be entire new class of molecules that we're gonna be able to discover and entire new targets that we're going to unlock and entire markets that we can open up and therapeutics that we can get to patients to real... really cure diseases that have had no cure before. That's just an incredibly exciting prospect for

  4. 10:13 – 13:12

    Criteria for Specifying Targets for Chai-2

    1. JD

      us.

    2. SG

      I wanna come back to impact because I think the ramifications here are, are, um, really huge, but, uh, if we just go and, like, think about first problem design, you I think looked at 52 problems. Why that many and, like, how do you specify a target? I'm picturing, like, bind to epitope X, but I'm sure there are other requirements you'd want to have as drug designers.

    3. JM

      It's a great question, Sarah. So in the CHAI 2 paper, we look at over 50 targets. Most of the existing, uh, papers in this area of doing AI for, for drug discovery are, are usually looking at, like, one, two, or three targets, but again, it was important for us if we were seeing this as an engineering problem to make sure that this is gonna be generalizable. It's like imagine you had a new LLM paper and you said, "Oh, I solved, like, one problem in the UCIMO, uh, contest. Like, really, really cool." It's like, no. You, you need a real benchmark and you need to actually have that benchmark at scale. You need to have enough problems to convince yourself that the system is working. So that's why whenever we do these experiments, you know, sometimes we'll, we'll try one or two targets just to make sure there's not, like, a huge bug and, you know, make sure not everything fails. But, you know, even if everything fails in one or two, you know, the hit rate's 50%. You could have just gotten unlucky. So that's one of the reasons why we decided to do a big benchmark here, really convince ourself things are working. The way we selected the 50 problems, the biology people would laugh at this, and engineering people would love it. We actually just went to the vendor catalogs to see what was in stock 'cause we wanted to turn around this experiment quickly.
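
The "could have just gotten unlucky" point can be made concrete with a little arithmetic. The sketch below assumes every target succeeds independently with the same probability, which is an illustrative simplification and not something stated in the conversation; the function name is hypothetical.

```python
def prob_all_targets_fail(per_target_success: float, n_targets: int) -> float:
    """Chance that every one of n_targets fails.

    Assumes (for illustration only) that targets are independent and share
    the same per-target success probability.
    """
    if not 0.0 <= per_target_success <= 1.0:
        raise ValueError("per_target_success must be between 0 and 1")
    if n_targets < 0:
        raise ValueError("n_targets must be non-negative")
    return (1.0 - per_target_success) ** n_targets


if __name__ == "__main__":
    # Compare a one- or two-target pilot with a 50-target benchmark.
    for n in (1, 2, 50):
        print(f"{n:>2} targets: P(all fail) = {prob_all_targets_fail(0.5, n):.3g}")
```

Under that assumption, a two-target pilot with a 50% per-target success rate comes back empty a quarter of the time, so it cannot tell a working model from a broken one, while an all-fail result across 50 targets would be essentially impossible to attribute to bad luck.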

    4. SG

      (laughs)

    5. JM

      We ordered all of these designs at the same time. So we actually wrote a scraper that would go and see what was in stock. We would go and pick out the protein. We would go look up what that protein sequence was. Now we need to make sure this is held out from training as well, right? So we would take that, that protein sequence. We would go compare it to all of, like, the, uh, the, uh, a database called SAbDab. It's, it's a group of, like, antibody structures in the Protein Data Bank, and we'd make sure that, like, none of these sequences were in there and that none of these sequences were actually even close to anything in there. We removed things that had, you know, more than, uh, 70% sequence identity, so really things that are, like, a bit different than what we, uh, we could have trained on, then selected those, made our designs, and then we shipped everything off to the lab. So, we actually think, uh, it's possible that the 50% is actually a lower bound because we might have just, like, messed things up because of how we set up this experiment. We did not think about the biology. These are not necessarily things that are even that useful for therapeutics. Some of these already even have drug programs against them. We were just doing this really from a- a model assessment perspective. Let's understand how well the model is working, let's convince ourselves, let's convince the community that CHAI 2 is working, and then in terms of applying this to- to problems, I think, you know, now- now we've got, like, hundreds of people that want to go and, like, try the model tomorrow, uh, and- and apply it to- to the various drug programs that- that they're working on. Uh, so that was really how we- we came up with those 50 tasks. "Let- let's benchmark this and treat it as an engineering problem."
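
The hold-out check described in this answer can be sketched in a few lines. This is a simplified illustration, not the pipeline used for the paper: sequence identity is approximated here with a longest-common-subsequence ratio instead of a proper alignment tool, and the sequences, names, and function names are all made up for the example.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sequence_identity(a: str, b: str) -> float:
    """Crude identity proxy: shared residues divided by the longer length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def held_out_targets(candidates, training_seqs, max_identity=0.70):
    """Keep candidates whose closest training sequence is at most max_identity."""
    kept = {}
    for name, seq in candidates.items():
        closest = max(
            (sequence_identity(seq, t) for t in training_seqs), default=0.0
        )
        if closest <= max_identity:
            kept[name] = seq
    return kept


if __name__ == "__main__":
    training = ["MKTAYIAKQR"]  # hypothetical training-set sequence
    candidates = {
        "near_copy": "MKTAYIAKQK",  # one residue away from training
        "novel": "GSHWPLLVDE",      # shares no residues with training
    }
    print(sorted(held_out_targets(candidates, training)))
```

The 0.70 default mirrors the 70% cutoff mentioned in the answer; a stricter benchmark subset could be selected by passing a lower `max_identity`.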

    6. SG

      We have a broad audience for No Priors. It ranges from, like, business people to engineers, machine learning researchers, some scientists in other fields. Like, what intuition can you give listeners for how the model works under the hood? Like, e- especially for anybody who might start with, um, some familiarity with, like, structure prediction

  5. 13:12 – 16:12

    How the Chai-2 Model Works

    1. SG

      models.

    2. JM

      Yeah, well, structure prediction is really a key part in making these models work, and it's actually the first thing we did when we started the company, is we sprinted to build a state-of-the-art structure prediction engine. We actually open-sourced the first version of that. It's called CHAI 1.0, and again, like, scientists a- around the world are using that now. But structure prediction basically gives you an atomic level microscope, and it allows you to see where atoms are placed in 3D space. So, once you can do that and you have this microscope, then the next question is, "Well, can we start moving those atoms around," right? We can now start to make changes in a sequence and then we can see, uh, the ramifications of those changes in 3D space. So, the actual design model, you can think of it as, uh, you prompt it with, uh, some information, like, "Here's a target, uh, that we wanna go and- and- and design, uh, an antibody against," and then the model will- will try to place again these atoms in 3D space in order to satisfy that constraint. Like, we tell the model, "Here's a target and I want you to make a molecule that, you know, binds to- to that location," and then the model will go and- and generate, uh, both a- a sequence and- and a structure that, uh, that kinda fits into that. So, that's, like, the high level intuition for this.

    3. JD

      Yeah, one piece of intuition around that is that you can almost think about structure prediction as the ImageNet moment for the field, where with structure prediction, we are asking a model to go from sequence to a predicted structure, and it's almo- sort of like a classification task. And then design, where you're trying to design binders, that is much more like a generative task. That's sort of like Midjourney for molecules. Whereas with structure prediction, you are looking to predict the placement of atoms in- in 3D space, with design you're taking an existing placement of atoms and you're trying to craft a new set of atoms that is complementary to- to that original set. So, one analogy that people like to use is that of a- a lock and a key, and that when designing, uh, a protein or- or- or a drug, you're- you have some target, which is your lock, and you're trying to design a key using a generative model that- that- that fit- fits that lock. And the, uh, the way that the models work is actually pretty- pretty interesting. They, uh, they reason quite literally by placing individual atoms in- in 3D space, and often they're getting the- the resolution of- of these structures, the error down to less than the width of- of one atom when we look at the error across the entire structure. So, when we talk about atomic level microscope, uh, you can see now why that might be important for design, because how can you hope to be able to design the- the key if you can't see- see the lock?

    4. SG

      Yeah. That's completely wild from a precision of prediction perspective. Uh, y- you know, if we analogize to LLMs, you know, you have learned grammar, syntax, semantics, capabilities that emerge in the model that you can measure. Is there anything that would be analogous in terms of emergent vocabulary or concepts that you think CHAI 2 has?

  6. 16:12 – 18:15

    Emergent Vocabulary from Chai-2

    1. SG

    2. JM

      Yeah, I think this whole point about the atomic level microscope is actually that point, right? There is something really, I don't know, I think deep. We still don't fully understand it, about, like, why these models work. Again, we didn't even know this was possible. Obviously we tried it, so we thought that there was a chance, but...

    3. SG

      (laughs)

    4. JM

      And I- I think it just tells you something about, you know, maybe the signature of how proteins interact with one another is really embedded in the data, right? And w- we're generalizing to a new setting. So it's not like the model has seen, you know, specific binders against a target and then we're just trying to do some in-domain generalization and walk through that space. That's actually quite an impactful application as well, uh, and that- that's already being done through the biotech industry. Our- our team published work on that years ago already. But I think this really new frontier about generalizing to a new space, it tells us that... A- again, like, the model is learning something really fundamental about how the molecules interact with one another. It, again, it's- it's able to generalize to- to problems that look very different in terms of how we would actually organize it in the biology. I think the whole rules about, you know, what do we think about, like, a- a protein family being different? These targets that we tested on are- are... Again, to biologists, they- they're very, quote unquote, "dissimilar" from what we saw during training, but i- it doesn't seem like the model thinks that way. We actually even have a thought in our paper in the supplement where we actually look at an even harder subset. So not looking at things that are, you know, up to 70%, uh, sequence similarity with the model, but actually pushing all the way down to 25%. So really looking at tasks that are very different we saw in training. Success rate was basically the same. Like the model didn't care. And, and again, I think that indicates something very profound about what the models are learning here.

    5. SG

      I mean, my assumption is the same here, where, uh, obviously the, the fastest path to immediate impact is going to be, you know, antibodies in clinic or whatever other therapeutics Chai and its partners work on. But it does raise a question of, like, if the model has learned something that fundamentally, like the biology research community doesn't yet know from a principles perspective, like, we will also learn those rules from these models, or whatever the principles are of, of structure and interaction. Um, so I think that's super exciting.

    6. JM

      Yeah, totally

  7. 18:15 – 20:33

    Hopes for Chai-2’s Impact

    1. JM

      agree.

    2. SG

      Well, how would you characterize like overall hoped for impact of CHAI 2 in terms of like bringing it to industry or your own programs?

    3. JM

      That's a great question, Sarah. So there is maybe two main areas that we can break it down into. The first one is, uh, again, like we've, we're turning this into an engineering problem. It's been, it's been months or, or sometimes even years trying to discover some molecule. You know, now, now we can actually do it way faster, uh, because the screening, if you will, is happening on the computer instead of in the lab. But the second area that we're actually even more excited about is how do we actually solve problems which just weren't even reachable with traditional methods? The model is not perfect. You know, it worked in 50% of the targets that we tried. Maybe it would've been more, right, for the caveats we, we talked about before. But, you know, it worked in 50% of cases. The failure mode of the models is gonna be different than the failure mode in the lab today. And I think that's really gonna be the, the sweet spot to focus in on. What are the areas that were not possible, you know, a few months ago, uh, where now we'll be able to actually generate potential molecules really quickly against? So, so that, those are the two areas. You know, things that you could do today, let, let's do them a lot faster and a lot cheaper, but I think really the breakthrough opportunities are, are things that just weren't possible before.

    4. JD

      Yeah, well one other thing that we've announced is that we will be opening up access to both, uh, academic groups and industry partners. I think when you think about how this space is just going to evolve in the next few years, and the amount of opportunity that's, that's out there given this platform shift, there is way too much opportunity for any one company to capture alone. And their, uh, dr- drug discovery itself is just an incredibly resource-intensive process, and I think it would be probably a conceit to assume that we could go after and pursue every target and w- we've done every program ourselves, e- even if we, we wanted to. And so when we think about impact and think about what is, is going to move the needle for the company, of course, but also f- for the world, we think that the way to, to do that is to go out and, uh, bring this to life with a really exciting set of, o- of partners. And so, uh, we- we've opened up access. There's an access page on our website which people can go to and fill out. We're currently walking through those, being inundated with requests. But, uh, uh, our... M- my hope is that we can, can, uh, really enable quite, quite a few use cases with this, and, and do that quite quickly.

  8. 20:33 – 22:16

    Reception of the Chai-2 Model

    1. JD

    2. SG

      What has the reception been like so far? What is the biggest objection? Because this is a, you know, significant challenge to the ideas of high-throughput screening, or even like the workflow that, um, you know, even innovative pharma and biotechs have today.

    3. JM

      Yeah, it's a great question. You know, usually when these kinds of, uh, papers come out, again, people have tried to, to do this many times. The critique is, is often, you know, does this really work? You know, you, you showed this on maybe COVID, for example. Is this gonna work for a case where we have less training data? Are the molecules going to be high quality? Do we really, you know, kind of believe the data? So I think the approach we did, like benchmarking this at scale, has really helped a lot with that reception. Like I think people really appreciated that approach, which has been great. Some of the questions people have is, "Okay, like I can already discover drugs, so, uh, you know, so now I have AI that can do it a lot faster, but does that actually change the kinds of, of molecules I can work on?" And it goes back to what we just discussed before. I think there are other folks that are responding to that saying like, "No, like the, the transformation here is how about those projects that didn't work for you, uh, or, or where you're really struggling today? Now you've got another tool in the toolkit, and you kind of have to use this tool now or, or you might be left behind." So I, I think that it's been really interesting to see the, the community kind of digesting this. Of course, a lot of the AI folks are, are really excited, right? Like we're getting artificial antibodies, uh, before we're getting, you know, uh, maybe other breakthroughs we would've (laughs) expected earlier. But it's, uh, it's overall be- been, uh, been really exciting to see that reception. I mean, our inboxes are just flooding up, like the early access has gotten hundreds of people, you know, within hours of, of launching, uh, reaching out to us. We, we just announced, so I think we're still kind of digesting all that. We're a small team, uh, so we're, we're prioritizing early access to, to the right people. 
But we're, we're really excited, uh, to kind of get the models out there and for them to, to start solving some, uh, some really, uh, uh, hard problems in the drug discovery

  9. 22:16 – 27:08

    Future of Wet Lab Screening and Biotech

    1. JM

      space.

    2. SG

      Is there an important future for like large-scale wet lab screening? Uh, does it just become a data collection exercise to fill out the distribution for CHAI models? I- are there areas where you will... you think you'll need that in 10 years, 20?

    3. JM

      Yeah, I think if you just take the models and then you sample more, you probably will get a better result. So we tested only 20 molecules per, per target in the paper, up to 20 molecules. You know, if you were to do 10 times that or 100 times that, orders of magnitude more, you probably just get into spaces, uh, with better, better molecules. So, you know, the machine learning model is probabilistic. It's like using ChatGPT if you, if you're trying to solve a math problem and then you look at the top one response or if you look at the top 10,000 responses, you're gonna get a better result if you look at the top 10,000. You can't really do that with a product experience on ChatGPT. I'm not gonna look through 10,000 math responses. I won't even know which one is correct. The cool thing with a lab actually is we actually could just test all 10,000 of those in the lab. So I don't know if you have to, uh, but that's definitely something that is, I think, going to be tested out with these models. And I think the future of, of high-throughput screening and how they kind of interact with the models, I think the question is still open, but I- I expect that, uh, you know, people will be creative and, and will find ways to actually take the best of AI and marry that with the best of biology, uh, to kind of push the bounds forward.
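
The "sample more, keep the best" intuition in this answer can be illustrated with a toy simulation. The sketch below is not the CHAI 2 model: each "design" is just a random score drawn from a standard normal distribution, a hypothetical stand-in for molecule quality, and the function name is made up for the example.

```python
import random


def mean_best_of_n(n_samples: int, trials: int = 500, seed: int = 0) -> float:
    """Average, over many trials, of the best score among n_samples draws.

    Scores come from a standard normal distribution, used here purely as a
    placeholder for how good a sampled design turns out to be.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n_samples))
    return total / trials


if __name__ == "__main__":
    # 20 mirrors the per-target budget in the paper; larger budgets are hypothetical.
    for n in (1, 20, 200, 2000):
        print(f"best of {n:>4}: mean score {mean_best_of_n(n):.2f}")
```

The best score keeps rising as the number of samples grows, though with diminishing returns. That only pays off when every sample can actually be evaluated, which is the role the wet lab plays in the answer above.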

    4. JD

      And just to add to one thing to that, there's a whole host of really, uh, amazing a- CROs and other players with this incredible expertise running those traditional methods and to, to Josh's point, we have many, many companies asking us, "So can you run this not just 20 times, but can you run this 100,000 times, even if it's going to work in 20 'cause I just might find something better," right? And that something better can result in a better drug. That could be the difference between getting a patient, uh, an antibody which requires an injection or something which requires subQ dosing, for example. And so, I think with these tools, you can sample the search space sort of ad infinitum and that, that marrying of traditional technique and models will actually hopefully get us into areas of this space where we can just find better products for patients.

    5. SG

      I wanna ask one more question, uh, generally about, like, predictions for biotech, and then I wanna talk about the future of CHAI as well. What do you think biotech looks like 25 years from now? I realize that's a ludicrous question to anybody working in AI where you're like, "Hey, I didn't know if this was gonna work at all last year."

    6. JD

      As I mentioned before, there is a lot of doom and gloom in the biotech industry right now due to macro factors, uh, with rates where they are, and the long-term investment cycles that are required to make biotech viable. There is just a real pessimism in the industry right now. It's sort of the worst market in, in, in a couple of, uh, of decades. And I think that it's moments like this, breakthroughs like this, which give us these flashes of light and these, these reasons for just immense optimism about the, the future of this industry, not just in terms of improving timelines and reducing costs, but also, in terms of fundamentally enabling tho- those new products. And so, if we think ahead over the next 25 years, you know, we've gone from a less than 0.1% success rate to a close to 20% success rate in a year. Well, who's to say that in another year, that it can't be a 50 plus or a one- even a close to 100% success rate? I think if you see our mini protein results, we are at, uh, I think close to 70%, uh, on those with picomolar affinities, like, really, really tight binders for every single target that we tested. So, uh, all five targets that we tested worked and 70% of the designs that we ordered worked. I think that there's no reason that other class of molecules, th- th- those success rates can't be that high as well, and I think o- once you, you have that, you really enter this, this era where you sort of have a computer-aided design suite for molecules in a way that, you know, we have maybe SolidWorks for, uh, mechanical engineering or we have Photoshop for creat- creatives and, uh, that, that entire software suite will exist for, um, for biology. I think the, the implications of that, the ability to design, program, understand the interactions between atoms and, and molecules at the most fundamental level are pretty vast and should just give us a lot of, a lot of hope and excitement about what's, what's about to happen.
We were just talking last night about maybe we should be getting baseball caps saying, uh, that say, "Bullish on biotech," on them, uh, because I think this is one of those special moments which I think can really... We, we've heard from many others writing in to the company that this has really shifted their opinion.

    7. SG

      If you think about going from antibodies to, you know, obviously better success rates and then also other therapeutics, is there a difficulty hierarchy we should have in our minds? Or is it just, like, unexplored space in terms of enzymes and peptides, small molecules, other

  10. 27:0831:37

    Optimizing Other Molecule Properties

    1. SG

      domains?

    2. JM

      Yeah, it's actually a lot more than just success rates. There's lots of properties that need to be optimized for a molecule. You know, finding a drug is like looking for a needle in a haystack, and I think we've really passed through massive swaths of, of that sequence space, uh, with Chai-2, right? By really focusing on the things that bind. That's where, like, a lot of the search space gets... H- has to be searched in the lab today. Going deeper into other properties as well. Let's make sure that these antibodies can be manufactured well. Let's make sure that they can be really stable. So the- there's lots of other properties that, that we're excited about, so, so stay tuned for that. And then the other thing is actually, there are next-generation antibody formats even. So, what we predict will happen is, uh, people probably won't be as interested in, in the clinic for, for things like monoclonal antibodies. These are, you know, antibodies that are hitting, for example, like, a specific epitope on a, on a protein. But now, if we can make antibodies much faster and, and more easily, you can imagine a future where, if I wanna hit a target, let me choose two different parts of that target, make two different proteins that are hitting them, like, basically two different primitive antibodies and le- let me bring them together. Uh, this, this is called a biparatopic, two paratopes, so, so basically two different antibody interactions, and that kind of stuff is gonna become a lot easier to do today. I think these days, there's a lot of trade-offs that get made in biotech about, like, you know, risk on your target, uh, risk on your discovery process, uh, how hard is it gonna be to make our molecule, and I think AI is gonna raise the bar across the board.
I think the bullish on biotech, uh, you know, movement (laughs) that, that, you know, Jack is, is announcing here as well, if we think about what that could even represent, uh, there's right now a lot of risk in biotech. There's a lot of crowding on the same kinds of targets. The risk... this actually starts to go down in terms of discovering some of this stuff. Maybe there's still clinical risk if you try something that's, like, totally new that people haven't, haven't done before, but we've just, like, opened up, I think, the aperture of opportunities, uh, that, that can be pursued here, and that's something that I think is, is really exciting. So still a lot more work to do for us to validate that, like, all this is, is gonna be possible, but I think just the pace at which the field is moving, uh, just gives us a lot of optimism, uh, for, for what can be possible next.

    3. JD

      And maybe I can just share one anecdote about, uh, why we- we are so optimistic. We had a- a partner come to us as we were in the process of- of building these models. We didn't even really know... We hadn't- we hadn't had back our first few batches of data, so we didn't know if- if it was going to- to really work yet. This partner had been working on this problem for a- for a few years. They had a team of, I think, five to 10 people working on it. They estimated it fully loaded. All of those- those people might have, uh, set the company back with the experiments that they- they had done as well maybe five, 10, $10 million, and it was a- a problem where they wanted to build a molecule that cross-reacts against two different species, so both a human form and a cyno or a monkey form of- of this protein such that when they put this- this molecule into animal testing if, you know, they didn't want it to fail because the- the monkey has a slightly different version of that- that protein th- than the human does. So we were really struggling to get this to work, uh, uh, for whatever reason, and we put it into the model and just prompted the model to- to design for these- these two targets at the same time, not just one target. So you can imagine that this is a slightly more sophisticated challenge than just designing against one. We ordered actually only 14 sequences to the lab, and, uh, I think four of those were hits to humans. Uh, one of those was a hit to the cyno. One of them was actually overlapped and- and hit both. That one now allows us to move forward with that program and gives us a whole amount... a- a host of diversity around that molecule that one can explore as well.

    4. SG

      First of all, that's very cool. Uh, and second, I- I think it's interesting that a lot of industry observers would say, like, the bottleneck in pharma and the expense in pharma is clinical, not discovery. And, like, I think you're pointing to the fact, well, like, we can design for a clinic, right? And actually, it's intuitive, but it's just because it is an argument from people bearish in biotech or concerned about the ability to, uh, make progress in programs and, uh, e- and reduce cost f- for- for any given successful drug is... Well, you know, if discovery had less risk, as Josh was pointing out, which is, like, a- a huge claim, then the entire, uh, industry is more efficient, right? And more effective.

    5. JM

      That's the- that's the hope, yeah, and I think we've got a lot of reason to be o- optimistic. I also don't want to oversimplify things. You know, there's lots of other things that go into making a drug. There's capital markets that go into this. Uh, you know, there's- there's tons of clinical risk. This is really just the tip of the iceberg, but, uh, we're- we're really excited about- about the progress that- that this could represent.

  11. 31:3736:20

    Where Chai Invests From Here

    1. JM

    2. SG

      I wanna ask strategically, like, where CHAI invests from here. So you talked about other attributes that you want to be able to design in CHAI Models, but if we just look at this generically as a... like an AI model company, um, where do you think the defensibility is?

    3. JD

      There are two key areas of investment for- for the company. I- I- I think firstly, what comes out of these models, these just aren't drugs yet. They're- they're hits, they're antibody hits, but there's a lot more work to be done to actually turn these into viable molecules that we can put into humans. We have early data, uh, which we- we put in our pre-print to suggest that a lot of the properties that one might want from a drug, that these- these molecules actually have, but we need to do a lot more further characterization and assays to convince ourselves that w- we can do that. And then I think there's also the next step. Stage beyond that is actually designing entire drug candidates in zero-shot right out of the models. And I think a few months ago, we might have said this was a pretty futuristic idea and nobody in the- the company was really- really talking much about this, but I think once you see these results and grapple with the implications, the fact that we can get antibody hits in just 20 attempts, there's- there's no reason, uh, that- that we couldn't generate entire drug candidates in that same num- number of attempts. So I think there's gonna be some key investments there and really, you know, the- the model right now is a model. It's not really a- a product. It is a product and it's certainly useful today, but there's a lot better that product can- can get with- with more- more investment into just making sure that we can optimize all the therapeutic properties that people care about. And then oth... of course, there's the entire interface and software layer around that to make- make this- this really, um, easy to use, uh, and the real platform that- that goes around supporting- supporting that. So, you know, how do you, if you want to hit two targets, design a molecule that- that, uh, hits both? How do you specify that in the software? You know, this is gonna be a sufficiently a- advanced piece of software, um, it's gonna be... 
become, you know, as advanced as- as did Photoshop over- over time and as- as we build that out, I think we're gonna need to make some- some really core investments into, uh, just the- the engineering and the- the products to ensure that, uh, that- that we are building- building a software that we ourselves and others will- will really love to use.

    4. JM

      Yeah. One thing to add on to that as well, we released Chai-1 open source. We thought of it as a model, and I think Chai-2 is a lot more than a model, right? It's become a product. Uh, it's actually more of a- of a bigger pipeline that comes together to even make this happen, and it also becomes trickier to use these models. Protein folding, you put in your sequences, you get a structure. Design is a different story, right? Actually specifying the prompt on its own, we did that programmatically in the paper to go and assess this thing at scale, but a scientist who wants to use this to initiate a drug discovery program, probably not using a script, uh, to come up with that prompt. Uh, he's probably gonna be really thoughtful, uh, a- about it. And I think that's why investing in- in the product layer here is really important. And not to mention, it's only gonna get more complicated from here, right? As we start to support more advanced drug modalities, as there's various properties that come online, as the... we- we actually show some early evidence of this in the- in the white paper. You know, you might want to actually optimize for multiple proteins at the same time. Sometimes, uh, you know, actually, it's a- it's a good time to be a sick mouse. In order to have a human, uh, drug, it- it usually needs to work in animals as well. And sometimes drug programs actually get stuck there. It's like... "Okay, guys. Like, we either have a mouse drug or a human drug." It's really hard to get both, and there are actually some cases where people have to discover two different drugs. They have a c- they call a surrogate antibody. "I'm going to, like, make the mouse version. I'm going to study that, convince the FDA that, like, this mechanism works." And then, but you're even taking risks. You're like, "Maybe this molecule, like, works, like, s- slightly differently." We literally show that example in the paper of optimizing, uh, we don't do mouse, we actually do monkey.
Uh, so like, monkey and, and human together. Uh, but you can throw other species into that as well. Sometimes you've got the opposite problem, "I wanna hit this protein, I don't wanna hit this other protein." Uh, we've got some early evidence as of late, uh, that that's, that's possible as well. And, uh, these sorts of things, you know, the, the prompts are just a lot more complicated, an- and it means that you need to have, like, the right product. Uh, what happens when you start doing those experiments in the lab, we want the models to learn from that and then help us really be like a co-pilot in, in driving, like, the next stage of, of those designs as well. You know, all of this is, uh, again, it's, it's more than just the models. It's really thinking about those workflows as well, um, and it, it's even about just getting that word out to people and having them think about this as a new tool in their stack. What happens if you're an antibody engineer and you've been doing things in a certain way for the past 30 years, and now there's a new paradigm in discovering drugs? Like, that itself is actually a problem, uh, that, that a company needs to solve. So, these are all different areas that, that we're investing in right now.

  12. 36:2040:23

    What Bioscientists Should Learn for Chai-2

    1. JM

    2. SG

      That actually begs the question I was, um, going to ask you is, like, if you are an antibody engineer or a biologist today, what advice, you know, given... Let's say they believe you about how much is going to change and this, like, CAD for biology, like, software suite that is, um, coming into, in existence. Like, what should they learn, be good at, like, go study?

    3. JM

      Well, number one, get access to Chai-2.

    4. SG

      (laughs)

    5. JM

      Number two, you know, figure out how to, to get your prompts right and, uh, and actually take full advantage of it. And then I think number three, you know, start dreaming about, uh, the, the new possibilities. You know, it's interesting, uh, we've talked to a lot of, uh, antibody engineers since, since starting the company, and we've been alluding sometimes to, you know, what we're doing here, you know. Uh, sometimes you do the market research question, you ask, you know, "Suppose you had, like, a 1% success rate in designing antibodies. Like, what would you use that for?" Uh, the conversations are, are changing now that, first of all, it's not 1%, it's 10%, and like, people see that as working. I think that creativity is really being unlocked, even our- even ourselves, right? I think when people are thinking about the answer to that question, there's always some big doubt in your mind. It's like, "Ah, it's a hypothetical question," you know. Your neurons are now activating in the same way of doing something with it. It was the same thing with LLMs. It's like, imagine asking someone five, 10 years ago, "Oh, you know, if we could predict the next word in a sentence perfectly, like, what would you do with that?" It- it's actually very hard to imagine until you start playing with the models. Even the, even our team internally, you know, now, uh, even without sending it to the lab, you know, we can, again, choose some targets, choose some prompt, generate stuff against it. You start to look at the, uh, generations that are coming out of the model, and you're like, "Oh, wait, I can actually solve this problem by, like, choosing the right epitope on a target, choosing the right part of the target." Like, these two targets are different. Like, sure, we, we have an engine that the model can optimize for one, uh, or optimize for both, uh, you know, or selectivity optimize for one and not the other. 
But you can actually get a lot of that by, uh, by choosing your, your prompt in, in a smart way, so- so let me hit part of that protein that is actually quite different between the two things, or quite similar between the two things. Uh, these are the sorts of realizations that in retrospect are quite obvious, but they- they don't really hit you until you actually start to, like, use a product like this yourself. So, I think, uh, people are, are just, once they get their hands on this, uh, I think they will, th- they'll start to dream of the new possibilities.

    6. JD

      I think it just really raises the bar. You know, the people who are most excited about that are often the- these antibody engineers and these biologists. Uh, a lot of the, the work that they're doing today is painstaking (laughs) and they're not the biggest fan of these slow feedback loops and these intractable problems, 'cause many of them th- that we speak to are just really motivated to, to solve, solve a particular task. And so, you- you give them... Uh, you know, I- I'm an engineer. You give me a tool which says I have to write less code, I love that. I can now think more about system design and architecture and more complex products and, and all these other things, um, but, uh, it- it's really going to raise the bar for, for a lot of these people and, uh, I think people are only really now starting, as Josh said, to, to think through all the possibilities. I was on calls with people, uh, it was a matter of a few weeks ago, th- where people were saying, "When do you think this is gonna happen?" They say, "Oh, not for three to five years. This is a really futuristic idea." And then, then a couple of weeks later, you show them what they have, and they sort of fall off their chair. And so there's gonna be a, a sort of joint effort with us alongside these, these domain, real domain experts to actually find, figure out th- these key application areas 'cause biology is so vast and, and so complicated that actually there is so much knowledge that so many of the practitioners, the specialists have that, that, uh, no one company will just ever possess, uh, which is why we're so excited to go out and be, be partnering with, with people to, to really bring this to life.

    7. SG

      I wanna ask you a couple questions more just, uh, specifically about company building before we run out of time, and maybe Jack, I will start with, you're an amazing engineer and then you guys also have, like, a, a very software-oriented team working on biological problems. Some of those people come from, you know, long-term research in that space in particular, but for, for yourself, Jack, like, as you said, you're, you're a software person. How do you get up to speed on the bio area to go do leading work?

  13. 40:2343:38

    How Jack and Josh Oriented to the Biotech Space

    1. SG

    2. JD

      Well, I- I think it's two things. First of all, uh, ramping up on any new field is always just a, a total fight. Uh, you have to, to get to the frontier and to be... have read the right papers and to be knowledgeable about the, the areas that you need, you need to learn. You just have to sort of push your head down and push through, and there are, there are waves of, uh, excitement and misery in that experience, but you can get there fast if you really set, set your mind to it. And-

    3. JM

      Okay.

    4. JD

      ... I'd say the second part is that surrounding yourself with just the most incredible team is the best thing that you can do, far beyond anything that you can learn by yourself, and we have certainly the most special group of people that, that I've ever worked with within the company. Our, our co-founders, Matt McPartlon and, and Jacques, uh, Boitreaud, who, who are, you know, just rare talents, uh, and then, you know, the entire team beyond that. Some of the former heads of AI at other drug discovery companies, some of the o- top open source contributors. The team is so, uh... multi-talented. It's, it's small. There's, there's around a dozen people, but, but mighty. And, uh, uh, I think as we've seen in other areas, uh, of, of AI, small but mighty teams can go a, a, a really, really long, long way these days. And so, you know, there, there are... Uh, I think there are actually surprisingly few people on our, on our team even with a computer science degree. Josh himself got a chemistry degree. Uh, Alex got his PhD in physics. Uh, and a whole, whole... Y- uh... A, a whole host of others. But that, that, that, this work is so, um, uh, interdisciplinary, but really having that, that, um, breadth of, of knowledge across biology, chemistry, physics, artificial intelligence, computer science, engineering, it, it really takes a village. And everybody is, is learning from each other the, uh, the every day because of just how vast that subject matter is that one has to have a command of.

    5. JM

      I think we've also benefited from such immense focus as well. Everyone is so- has been so passionate about trying to solve this problem, and I think I really credit that to being a huge reason on like why we were able to achieve it. And we've also got a team that, because of that focus, is very engineering-centric as well. So if you look at the whole team, you know, we, we have a, a very research-oriented team right now, but everyone is a stellar engineer as well and takes that very seriously. So it's not everyone solving, you know, their favorite pet problem. We are all going in- after the same problem and solving that together. And, and, you know, even just 10 people solving a problem together, there's a lot of code being written every day. You have to be very thoughtful about how that all comes together and interacts. And I think especially in our next phase of growth for the company as we start to invest, you know, more and more in product and, and the, and the velocity around that and getting this in- into folks' hands, uh, that's just gonna become even more important. How do we make sure that the latest research breakthroughs that, that we're shipping internally are actually making their ways into partners' hands? That's something that, again, we are very thoughtful about at Chai, uh, and, and take very seriously.

    6. SG

      Yeah. I, I also remember, um, Jack in our office at Conviction like debating the merits of dev containers with some of your scientist teammates at the very beginning of the company. And both of you, from the beginning, you know, talked a lot about platform investment. And, and so I actually think that's like a little bit sort of unconventional in terms of such a research-oriented team to say like, "We need to make this platform investment." Can you talk a little bit about that?

  14. 43:3846:53

    Platform Investment and Chai-2

    1. SG

    2. JD

      Yeah. Uh, so I, I've gone through the experience of going from, you know, zero to 100 on, on engineer- large engineering products before I, I worked on, uh, Stripe Link, which was kind of a multi-year project, and again, Stripe Capital, where, uh, engineering teams scaling from zero to 25, 50 people by, by, by the time we were, were, w- were done there. Um, same, same for Link. May- maybe more. And I think you just learn that unless somebody is really taking care to keep the entire system in their head and is an effective technical steward of the architecture, that things just devolve and the sort of entropy of a software takes over and slows down your rate of progress to zero because nobody can, can get, get work done anymore. And so somebody needs to keep the, the entire system in, in their head and the interaction between all those components and make sure that people who are working on individual sub-components of your code base actually have to minimize the amount of context they need to load into their heads to understand how to accomplish that task. So, these are just the, the principles of really, you know, it's- it's pretty basic. It's just simplicity and modularity. But making sure that's a practice and a kind of cultural, uh, cultural practice and that everybody is on- on the same, same page about investing in that, uh, a- and that, you know, people aren't cutting corners. They, they, they see it as their responsibility to lay the groundwork for the next person. And this is doubly hard to do in deep learning code bases because often if you introduce a bug or write a regression, you won't know for weeks that that has sh- that that has shown up. You know, it's sort of terrifying, uh, because, you know, you could spend a million dollars on a training run with a bug that crept in four weeks ago. 
We've- we've literally had to do this i- in Chai's history, but we've had to go and bisect git history, run, launch training runs, you know, with a, with a sort of a binary search to identify a small enough range of pull requests to, to identify a bug, then go to that, that, that pull request to identify the, the bug. And, uh, I think it's those sorts of experiences and the cost of th- th- th- that, that... Finding that bug probably, I'm not sure if it was millions of dollars, but it was certainly tens of thousands of dollars of, of compute time to go back and find that thing. It, it's experiences like that which I think make rigor such an important practice in engineering rigor in the company. And so being rigorous about... Uh, I, I think, you know (laughs) , some people are surprised to learn that even though we're- we, we do deep learning that we are pretty rigorous about writing unit tests for everything. Uh, but I think these basic software engineering practices are actually sorely lacking from most research code bases. And so bringing in some of those basic principles has, uh, moved, uh, allowed us to move very fast and not just fast in the short term, but should give us a, a, a mechanism to compound on that investment over, over time.
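The bisection Jack describes can be sketched as a binary search over an ordered commit history. This is only an illustration under stated assumptions: `first_bad_commit` and `is_bad` are hypothetical names, the history is assumed to flip from good to bad exactly once, and in the scenario described each `is_bad` call would stand for launching and evaluating a training run.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad is True.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and the history changes from good to bad once.
    """
    if len(commits) < 2:
        raise ValueError("need at least one known-good and one known-bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):  # expensive step, e.g. a training run
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(10)]
    # Hypothetical check: pretend the regression landed in commit c6.
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 6))
```

Each probe halves the remaining range, so about log2(n) checks are needed, which matters when a single check costs real compute. Git ships the same idea as `git bisect`, which can also drive an automated check via `git bisect run`.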

    3. SG

      Well, it's overall very aligned with your just mission of turning biology from science to engineering, right? It makes sense that it would go through the core of the company's practice too. Um, I have two more questions before we run out of time here. The first is, you know, you talk about the expense of like training experiments. Like, what's your decision framework for like how quickly to scale compute or, you know, parallelize, uh, experiments

  15. 46:5348:21

    Scaling Chai Discovery

    1. SG

      here?

    2. JM

      Yeah. We tried to set up the company in a pretty scrappy way. Actually, when we were getting started, uh, we should have even talked about this as well, you know, we, the company wasn't even... We, we're based in San Francisco. It wasn't even clear the company would be in San Francisco when we started. And, you know, back then we hadn't like raised capital for, uh, for the company really yet. Uh, we were kind of like using free compute, uh, credits fr- from the cloud providers. Um, I think for us, it's, it's just about, uh, being, again, laser focused on solving the problem and just like really making the case, like, why are we doing something? And, and I think that, you know, if that's reasonable, we'll go and invest in it. Again, in an engineering problem, if it's, if it's, uh, you know, kind of clear you're seeing signs of life, you're seeing some scaling law or whatever it is, like let's go as fast as possible and make that work. But let's also not get distracted like scaling something out if we are not convinced that, that it's going to work yet. Uh, so I think that, you know, kind of scrappy culture on the, you know, where are we spending side? It kind of goes hand-in-hand with, with making really fast progress because it means we have a high bar for, like, where we're spending our time. Everyone on the team works extremely hard. You know, there's, uh, there's, there's people in the office, like, you know, all times of day, all times of night, uh, and it's, it's pretty beautiful to see that. So we work hard, but I think we also work really smart, and, uh, and I think you have to do that, uh, to, to make progress with how fast the field is moving right now.

    3. SG

      You now see signs of life. You know, you're very, you're very bullish on biotech. That also means, like, given that you are, uh, going to try to scale to support, you know, demand from the industry and your own, your own efforts, um, who are you

  16. 48:2149:09

    Hiring at Chai Discovery

    1. SG

      looking to hire now?

    2. JM

      We're really hiring across all functions right now. So we've done s- made some really big breakthroughs here on, on the AI research side, and as we take that to the next level and, and try to get Chai-2 in, in, in front of the, the right partners, we're hiring for product engineering, for antibody engineering, for business development, uh, account executive. Like, there's a, uh, there's a whole host of, of roles that are, that are open on, on our site right now. And again, thi- this work is extremely interdisciplinary, and we, we really wanted to build this, uh, in a thoughtful way, uh, so that we can make, you know, Chai-2 as, as useful as possible for the industry.

    3. SG

      Well, thanks for doing this, guys, and congratulations on, you know, progressing the frontier of, of, uh, AI discovery.

    4. JD

      Thanks so much for having us on, Sarah. It's, uh, it's, it's been really fun.

    5. JM

      Thank you, Sarah.

  17. 49:0949:27

    Conclusion

    1. JM

    2. SG

      (music) Find us on Twitter @nopriorspod. Subscribe to our YouTube channel if you wanna see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way, you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.

Episode duration: 49:27

Transcript of episode rFFi2Guv2nU
