At a glance
WHAT IT’S REALLY ABOUT
AI meets cryo-EM to reveal protein motions and drive discovery
- Zhong’s lab applies machine learning to inverse problems in structural biology, especially reconstructing protein structures and dynamics from noisy cryo-EM images.
- Cryo-EM’s recent surge parallels deep learning’s rise, with improved instrumentation enabling near-atomic resolution and creating new computational reconstruction challenges.
- A central theme is that proteins are dynamic molecular machines, so focusing only on static sequence-to-structure prediction misses functionally critical conformational changes.
- Experience across D. E. Shaw (reproducibility and simulation), DeepMind/AlphaFold (clean objectives and optimization), and academia (open-ended problems) shapes how Zhong frames research problems.
- Future progress will require not just better models but new experimental data sources and tight collaboration with experimentalists to bridge molecular insights to human health.
IDEAS WORTH REMEMBERING
Cryo-EM turns structural biology into a data-driven inverse problem.
Rather than directly “seeing” a single 3D structure, cryo-EM produces noisy 2D projections of many particle states; ML helps infer the underlying 3D structures and distributions of conformations.
Protein function often lives in motion, not a single structure.
Zhong emphasizes that proteins “jiggle” and shift between conformations to perform work, so understanding dynamics is essential for mechanistic biology and next-generation discovery.
Experimental measurements address a key weakness of pure simulation.
Compared with molecular dynamics, cryo-EM is compelling because it is grounded in observed data, reducing (though not eliminating) the burden of validating purely simulated motions.
Defining the objective is easy for some tasks and messy for others.
Structure reconstruction can often be framed with a clear likelihood or objective, while design problems such as protein design are harder to validate and may not reduce cleanly to a single metric.
Scaling from “folding solved” to real biology requires new methods and new data.
Even if sequence-to-static-structure prediction is strong, many real cellular machines are massive multi-component assemblies with a complex, poorly characterized structural space that likely needs new experimental technologies plus ML.
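The inverse-problem framing in the first idea can be made concrete with a toy forward model. The sketch below is an illustration only, not the method used in Zhong's lab: it assumes a single known viewing direction, additive Gaussian noise, and a made-up Gaussian blob as the "protein" (`make_toy_volume`, `project`, and `simulate_images` are hypothetical helper names). Real cryo-EM reconstruction must also infer each particle's unknown orientation and conformation and account for microscope optics, all of which are omitted here.

```python
import numpy as np


def make_toy_volume(n=16):
    """Hypothetical stand-in for a 3D density map: one off-center Gaussian blob."""
    z, y, x = np.mgrid[0:n, 0:n, 0:n]
    r2 = (z - n / 2) ** 2 + (y - n / 3) ** 2 + (x - n / 2) ** 2
    return np.exp(-r2 / (2.0 * (n / 8) ** 2))


def project(volume, axis=0):
    """Idealized projection: integrate the density along one axis."""
    return volume.sum(axis=axis)


def simulate_images(volume, n_images, noise_std, rng):
    """Return the clean 2D projection and n_images noisy copies of it."""
    clean = project(volume)
    noise = rng.normal(0.0, noise_std, size=(n_images,) + clean.shape)
    return clean, clean + noise


def rmse(a, b):
    """Root-mean-square error between two arrays of the same shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rng = np.random.default_rng(0)
volume = make_toy_volume()
clean, images = simulate_images(volume, n_images=400, noise_std=5.0, rng=rng)

# A single image is dominated by noise; averaging many images of the
# same view recovers the underlying projection much more closely.
single_error = rmse(images[0], clean)
averaged_error = rmse(images.mean(axis=0), clean)
print(f"single image RMSE:   {single_error:.2f}")
print(f"averaged image RMSE: {averaged_error:.2f}")
```

Averaging works here only because every image shares one view and one structure. Once orientations and conformations vary from particle to particle, simple averaging no longer applies, which is the gap the summary describes ML methods as filling.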
WORDS WORTH SAVING
I think one major, uh, advance in structural biology over the last, you know, couple years is, okay, we- it's so hard to get a single structure, and we think of it as just this, like, static object, but in reality, like, everything is jiggling, everything is moving in order to actually perform functions that lead to life.
— Ellen Zhong
So on the machine learning side, the class of problems that our group works on are all these inverse problems. We have the experimental measurements, but they're actually extremely incomplete, right? So you have noisy 2D projection images, and somehow from this data you wanna infer the 3D structure.
— Ellen Zhong
That was one of the main reasons why I decided to go down the academic path, because I think there's a lot of really long-term just, like, uh, research directions and problems that we don't understand, and protein dynamics is one of those, right? We don't really have a good grasp on the motions of proteins in general, a way of describing it.
— Ellen Zhong
Will machine learning alone be able to do that? No.
— Ellen Zhong
I think to really bridge that gap, there's going to need to be new experimental technologies, and then I, I'm excited in the future about kind of how we can collaborate with the, kind of collaborate with experimentalists and develop new kind of machine learning enabled models for, um, you know, doing science.
— Ellen Zhong
High quality AI-generated summary created from speaker-labeled transcript.