OpenAI

Shaping model behavior in GPT-5.1— the OpenAI Podcast Ep. 11

What does it mean for an AI model to have "personality"? Researcher Christina Kim and product manager Laurentia Romaniuk talk about how OpenAI set out to build a model that delivers on both IQ and EQ, while giving people more flexibility in how ChatGPT responds. They break down what goes into model behavior and why it's an important but still imperfect blend of art and science.

Chapters

- 00:00:43 GPT-5.1 goals and the shift to reasoning models
- 00:02:18 Differences between GPT-5 and GPT-5.1
- 00:04:55 Unpacking the model switcher
- 00:07:24 Understanding user feedback
- 00:08:27 Measuring progress on emotional intelligence
- 00:10:02 What is model personality?
- 00:14:25 Model steerability, bias, and uncertainty
- 00:21:59 Advantages of memory in ChatGPT
- 00:25:27 Looking ahead and advice for getting the most out of models

Andrew Mayne (host), Laurentia Romaniuk (guest), Christina Kim (guest)
Dec 1, 2025 · 28m

At a glance

WHAT IT’S REALLY ABOUT

GPT-5.1 makes ChatGPT reasoning-first, warmer, and more steerable today

  1. OpenAI discusses GPT-5.1’s central shift: all models in ChatGPT are now reasoning models that can choose when to “think” more deeply, improving performance broadly (not just on overtly hard problems).
  2. They unpack why users perceived GPT-5 as colder or less intuitive—often due to system-level factors like insufficient carried context, jarring auto-switching between response modes, latency tradeoffs, and inconsistent adherence to custom instructions.
  3. The guests frame “personality” as both response style (tone/format traits) and the whole product experience (“the harness”), including memory, UI, switching behavior, and multimodal performance.
  4. They describe the ongoing challenge of maximizing user freedom while minimizing harm, emphasizing nuanced safe-completions over blanket refusals, and a future where personalization is increasingly inferred—but remains transparent and user-controlled.

IDEAS WORTH REMEMBERING

5 ideas

GPT-5.1 normalizes reasoning as the default experience.

All chat models being reasoning-capable lets the system allocate more “thinking” only when needed, improving instruction following and answer quality across many everyday prompts—not just puzzles or math.

“Coldness” often comes from system design, not just model tone.

Short carried context, memory/config issues, and abrupt switching into a more clinical reasoning style can make the model feel less warm even if the underlying model is improved.

Auto-switching is a UX and evaluation problem as much as a modeling problem.

The switcher optimizes across signals like factuality and latency, but mismatched response styles in sensitive moments (e.g., health news) can feel jarring, requiring careful tuning and UI guidance.

Steerability is the antidote to model quirks users dislike.

Users tolerate idiosyncrasies if they can reliably say “stop” via custom instructions or style/trait controls, so GPT-5.1 focuses on carrying those instructions forward more consistently.

“Personality” is an overloaded term spanning the whole ChatGPT experience.

Beyond tone and verbosity, users experience personality through latency, UI feel, memory behavior, rate limits/model fallbacks, and how seamlessly text/voice/image features work together.

WORDS WORTH SAVING

5 quotes

“For the first time ever, all of the models in chat are reasoning models.”

Christina Kim

“The model right now can decide to think.”

Christina Kim

“Personality… for most of our users… is the whole experience of the model.”

Laurentia Romaniuk

“Part of the art here is figuring out how to pull out these quirks… without breaking steerability.”

Laurentia Romaniuk

“Intelligence too cheap to meter?”

Christina Kim

- Reasoning-by-default and adaptive “thinking”
- Auto-switching between models and response-style discontinuities
- Warmth/EQ as context, memory, and user-intent understanding
- Custom instructions consistency and steerability
- Personality as style/tone vs full product “harness”
- Safety evolution: refusals vs safe completions
- Future personalization: inferred expertise + user control

High quality AI-generated summary created from speaker-labeled transcript.
