No Priors

No Priors Ep. 105 | With Director of the Center for AI Safety Dan Hendrycks

This week on No Priors, Sarah is joined by Dan Hendrycks, director of the Center for AI Safety. Dan serves as an advisor to xAI and Scale AI. He is a longtime AI researcher, creator of AI evals such as "Humanity's Last Exam," and co-author of a new national security paper, "Superintelligence Strategy," written with Scale founder and CEO Alexandr Wang and former Google CEO Eric Schmidt. They explore AI safety, its geopolitical implications, the potential weaponization of AI, and policy recommendations.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @DanHendrycks

Show Notes:
0:00 Introduction
0:36 Dan's path to focusing on AI safety
1:25 Safety efforts in large labs
3:12 Distinguishing alignment and safety
4:48 AI's impact on national security
9:59 How might AI be weaponized?
14:43 Immigration policies for AI talent
17:50 Mutually assured AI malfunction
22:54 Policy suggestions for the current administration
25:34 Compute security
30:37 Current state of evals

Sarah Guo (host) · Dan Hendrycks (guest)
Mar 5, 2025 · 36m · Watch on YouTube ↗

At a glance

WHAT IT’S REALLY ABOUT

AI, Geopolitics, and Nuclear Parallels: Dan Hendrycks’ Safety Playbook

Dan Hendrycks, director of the Center for AI Safety, argues that AI safety is primarily a geopolitical and strategic problem, not just a technical alignment issue. He believes labs can mitigate obvious misuse (e.g., terrorism, bio/cyber help) but cannot meaningfully control macro outcomes driven by state competition, especially between the U.S. and China. Hendrycks lays out a deterrence framework he calls “Mutually Assured AI Malfunction” (MAIM), drawing analogies to nuclear strategy and advocating for espionage, cyber-sabotage options, and chip-tracking regimes to prevent destabilizing AI ‘superweapons’ and rogue-actor access. He also discusses the current state of AI evaluations, explaining Humanity’s Last Exam as a near-terminal benchmark for exam-style tasks, and forecasts a future where models become superhuman oracles in STEM long before they become competent agents at everyday tasks.

IDEAS WORTH REMEMBERING

5 ideas

AI safety extends far beyond alignment and technical mitigations inside labs.

Hendrycks frames alignment as just one subset of ‘safety’; even perfectly obedient AIs can still drive destabilizing arms races, economic upheaval, and dangerous strategic competition between major powers.

Labs can and should implement straightforward misuse safeguards, but cannot solve the geopolitical problem.

He argues that companies can handle tail risks like casual bioterror queries (e.g., via gated enterprise access for sensitive capabilities), but global risk is ultimately governed by states, export controls, and national security strategy.

Trying to “race to superintelligence” as a U.S. advantage is strategically fragile.

Hendrycks criticizes strategies that assume the U.S. can safely monopolize superintelligence; he points to corporate espionage, multinational talent, and inevitable foreign responses as reasons such a unilateral advantage is unlikely and potentially destabilizing.

MAIM proposes deterring destabilizing AI ‘superweapon’ projects rather than all AI development.

By analogy to nuclear deterrence, MAIM relies on shared vulnerability: states use espionage and prepared cyber-sabotage options against each other’s data centers to dissuade attempts at building AI systems capable of delivering a decisive strategic edge.

Compute security and chip tracking are practical, near-term tools for nonproliferation.

He advocates for basic statecraft—licensing regimes, end-use checks, and knowing where advanced AI chips are—similar to fissile material tracking, especially to keep powerful compute from rogue actors, even if China’s capabilities can’t be fully constrained.

WORDS WORTH SAVING

5 quotes

Safety or making AI go well and the risk management is just much more of a broader problem. It's got some technical aspects, but I think that's a small part of it.

Dan Hendrycks

If you want to expose those [bio] capabilities, just talk to sales, get the enterprise account... We're not exposing those expert level capabilities to people who we don't know who they are.

Dan Hendrycks

If you do it voluntarily, you just make yourself less powerful and you let the worst actors get ahead of you.

Dan Hendrycks

You can't rely as much on restricting another superpower’s capabilities... You can restrict their intent, which is what deterrence does.

Dan Hendrycks

We're on track overall to have AIs that have really good oracle-like skills... but not necessarily able to carry out tasks on behalf of people for some while.

Dan Hendrycks

Distinction between AI safety and alignment, and limits of lab-led safety
Geopolitics of AI: U.S.–China competition, export controls, and espionage
Biosecurity, cybersecurity, and other weaponization vectors for advanced AI
The MAIM (Mutually Assured AI Malfunction) deterrence concept and nuclear analogies
Compute security, chip export controls, and proliferation to rogue actors
Structural constraints on pauses, races, and “beat China to superintelligence” strategies
Current and future state of AI evals, including Humanity’s Last Exam and agentic tests

