Training models to safely and sensitively navigate these topics is challenging. Our work here continues, but I'm extremely proud of the team's efforts (a collaborative effort between our safety, post-training, and data science teams!)

Johannes Heidecke

@JoHeidecke

Oct 27

🧵Today we’re sharing more details about improvements of the default GPT-5 model in responding to sensitive conversations around potential mental health emergencies and emotional reliance. These changes reflect the careful work of many teams within OpenAI and close consultation with experts - including more than 170 mental health clinicians.

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

Full details are in our blog and system card addendum. This is challenging, important work, and I’m proud of the team’s efforts. We have more to do and will continue to share progress. openai.com/index/strengtheni… openai.com/index/gpt-5-syste…

Addendum to GPT-5 System Card: Sensitive conversations

This system card details GPT-5’s improvements in handling sensitive conversations, including new benchmarks for emotional reliance, mental health, and jailbreak resistance.

openai.com

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

Alongside this update, we’re rolling out improvements to the Model Spec to make some of our longstanding goals more explicit: github.com/openai/model_spec… Defining what “ideal behavior” looks like in these settings is a complex and nuanced task. We found that experts agree with each other about the boundary of desired and undesired behavior in about 70% of cases.

model_spec/CHANGELOG.md at main · openai/model_spec

The OpenAI Model Spec. Contribute to openai/model_spec development by creating an account on GitHub.

github.com

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

In addition to these safety improvements to the model, it’s also preferred by users overall. This is the result of close collaboration between safety and post-training teams.

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

We’re also seeing the first external results validating our findings, such as these recent numbers from SpiralBench

Sam Paech @sam_paech

Oct 25

This is interesting. Gpt-5-chat-latest quietly shot to the top of spiral bench. This is the model served on chatgpt dot com, though I test it via api so we avoid any safety routing. I'm inferring this is due to the oct 3 update since they don't version this model.🤔

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

(3) We worked with 170+ clinicians to shape taxonomies, training data, and evaluations. We ran a human eval with some of them and are again observing clear improvements.

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

(2) In our new automated evals we see strong improvements compared to prior models. One notable improvement: GPT-5 is now more reliable in long conversations. In new, challenging tests based on real-world scenarios, we maintained 95%+ reliability. This is one of the toughest areas for LLMs and we’re making meaningful progress.

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

Three distinct methods for measuring improvements show clear progress of how our model responds to users in distress. (1) In production traffic, we observe a 65-80% reduction of responses that don’t meet our desired behavior.

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

We defined three areas where model responses matter most: (1) psychosis, mania and other mental health emergencies; (2) self-harm and suicide; and (3) emotional reliance. We created and refined detailed taxonomies to guide how ChatGPT should behave in sensitive conversations. We used these guidelines to teach the model to respond more appropriately and measure progress. Importantly, ChatGPT does not attempt to diagnose users, but looks for sensitive signals (like sleep deprivation) and responds with care.

Johannes Heidecke · Oct 27, 2025 · 6:21 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 27

OpenAI

@OpenAI

Oct 27

Earlier this month, we updated GPT-5 with the help of 170+ mental health experts to improve how ChatGPT responds in sensitive moments—reducing the cases where it falls short by 65-80%. openai.com/index/strengtheni…

261

604

Jason Kwon · Oct 24, 2025 · 4:52 PM UTC

Johannes Heidecke retweeted

Jason Kwon

@jasonkwon

Oct 24

One of the greatest opportunities in AI safety and security is the chance to help support the creation of new industry verticals that help maximize the benefits and minimize the risks. The technology is moving fast. One of the best ways to keep up is with more technology, more research, more startups and more entrepreneurship. An industrial ecosystem of builders, companies and solutions further democratizes AI to provide broad resilience, and ensures the US continues to lead as AI increasingly powers everything around us. As AI and biotech rapidly advance, biodefense is one of those verticals. We couldn’t be more excited to back @ValthosTech @kath_mcmahon and their team of ex-Palantir and ex-Deepmind engineers and operators, and world class computational biologists from the @broadinstitute and @arcinstitute. They are pushing the frontier of protection and defense in one of the most strategic intersections of multiple world-changing technologies, and with the team to do it. Also excited to partner with @foundersfund @Lux_Capital @Definition_Cap on this. Looking forward to more.

Valthos

@ValthosTech

Oct 24

Valthos builds next-generation biodefense. Of all AI applications, biotechnology has the highest upside and most catastrophic downside. Heroes at the frontlines of biodefense are working every day to protect the world against the worst case. But the pace of biotech is against them: more powerful methods to design biological systems, with near-universal access, open up an increasing surface area of threats. In this new world, the only way forward is to be faster. So we set out to build the tech stack for biodefense. Our team of computational biologists and software engineers applies frontier AI to identify biological threats and update medical countermeasures in real-time. We are backed by $30M from @OpenAI, @Lux_Capital, @foundersfund and others including @Definition_Cap. We are actively hiring engineers to join in the mission - if that sounds like you, get in touch.

140

138

686

Boaz Barak · Oct 10, 2025 · 9:58 PM UTC

Johannes Heidecke retweeted

Boaz Barak

@boazbaraktcs

Oct 10

If, like me, you were concerned seeing the stories on OpenAI subpoenas for non-profit organizations, please do read this. I've appreciated @jasonkwon's position during the SB-1047 debate encouraging OpenAI's employees to speak their mind even when it contradicts the company's position. Doubly so for SB-53, which OAI did not even oppose. Hence, while I wish vendors for serving subpoenas would stick to business hours, I am convinced this is not about attacking supporters of that bill. This is not to say that I necessarily agree with all of OpenAI's positions. I did not read SB-53 enough to form a well-founded position, but my impressions were positive, and my guess is that it is a positive development that it was enacted. I am hoping to see from openai more statements with a positive vision of what regulations and laws we will support. I was encouraged by us signing the EU code of practice, and would like to see more such actions in the future. Regulations are not the only tool for making sure AI goes well, but they are a crucial component.

Jason Kwon

@jasonkwon

Oct 10

There’s quite a lot more to the story than this. As everyone knows, we are actively defending against Elon in a lawsuit where he is trying to damage OpenAI for his own financial benefit. Encode, the organization for which @_NathanCalvin serves as the General Counsel, was one of the first third parties - whose funding has not been fully disclosed - that quickly filed in support of Musk. For a safety policy organization to side with Elon (?), that raises legitimate questions about what is going on. We wanted to know, and still are curious to know, whether Encode is working in collaboration with third parties who have a commercial competitive interest adverse to OpenAI. The stated narrative makes this sound like something it wasn’t. 1/ Subpoenas are to be expected, and it would be surprising if Encode did not get counsel on this from their lawyers. When a third party inserts themselves into active litigation, they are subject to standard legal processes. We issued a subpoena to ensure transparency around their involvement and funding. This is a routine step in litigation, not a separate legal action against Nathan or Encode. 2/ Subpoenas are part of how both sides seek information and gather facts for transparency; they don’t assign fault or carry penalties. Our goal was to understand the full context of why Encode chose to join Elon’s legal challenge. 3/ We’ve also been asking for some time who is funding their efforts connected to both this lawsuit and SB53, since they’ve publicly linked themselves to those initiatives. If they don’t have relevant information, they can simply respond that way. 4/ This is not about opposition to regulation or SB53. We did not oppose SB53; we provided comments for harmonization with other standards. We were also one of the first to sign the EU AIA COP, and still one of a few labs who test with the CAISI and UK AISI. We’ve also been clear with our own staff that they are free to express their takes on regulation, even if they disagree with the company, like during the 1047 debate (see thread below). 5/ We checked with our outside law firm about the deputy visit. The law firm used their standard vendor for service, and it’s quite common for deputies to also work as part-time process servers. We’ve been informed that they called Calvin ahead of time to arrange a time for him to accept service, so it should not have been a surprise. 6/ Our counsel interacted with Nathan’s counsel and by all accounts the exchanges were civil and professional on both sides. Nathan’s counsel denied they had materials in some cases and refused to respond in other cases. Discovery is now closed, and that’s that. For transparency, below is the excerpt from the subpoena that lists all of the requests for production. People can judge for themselves what this was really focused on. Most of our questions still haven’t been answered.

Jason Kwon · Oct 10, 2025 · 9:31 PM UTC

Johannes Heidecke retweeted

Jason Kwon

@jasonkwon

Oct 10

Nathan Calvin

@_NathanCalvin

Oct 10

One Tuesday night, as my wife and I sat down for dinner, a sheriff’s deputy knocked on the door to serve me a subpoena from OpenAI. I held back on talking about it because I didn't want to distract from SB 53, but Newsom just signed the bill so... here's what happened: 🧵

216

655

Johannes Heidecke · Oct 3, 2025 · 10:23 PM UTC

Johannes Heidecke

@JoHeidecke

Oct 3

Huge improvements for GPT-5 on dealing with sensitive and vulnerable situations. Will have more to share soon :)

OpenAI

@OpenAI

Oct 3

We’re updating GPT-5 Instant to better recognize and support people in moments of distress. Sensitive parts of conversations will now route to GPT-5 Instant to quickly provide even more helpful responses. ChatGPT will continue to tell users what model is active when asked. Starting to roll out to ChatGPT users today.

OpenAI · Sep 17, 2025 · 5:09 PM UTC

Johannes Heidecke retweeted

OpenAI

@OpenAI

Sep 17

Today we’re releasing research with @apolloaievals. In controlled tests, we found behaviors consistent with scheming in frontier models—and tested a way to reduce it. While we believe these behaviors aren’t causing serious harm today, this is a future risk we’re preparing for. openai.com/index/detecting-a…

Detecting and reducing scheming in AI models

Together with Apollo Research, we developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests across frontier models. We share examples...

3,011

Johannes Heidecke · Sep 13, 2025 · 12:06 AM UTC

Johannes Heidecke

@JoHeidecke

Sep 13

Our safeguards for bio risk and agentic deployments were stress-tested by the US CAISI and UK AISI & we iterated together towards ever higher robustness and reliability: openai.com/index/us-caisi-uk…

Working with US CAISI and UK AISI to build more secure AI systems

An update on our collaboration with US and UK research and standards bodies for the secure deployment of AI.

openai.com

Xander Davies

@alxndrdavies

Sep 12

Excited to share details on two of our longest running and most effective safeguard collaborations, one with Anthropic and one with OpenAI. We've identified—and they've patched—a large number of vulnerabilities and together strengthened their safeguards. 🧵 1/6

Adam Tauman Kalai · Sep 5, 2025 · 6:59 PM UTC

Johannes Heidecke retweeted

Adam Tauman Kalai @adamfungi

Sep 5

New research explains why LLMs hallucinate, through a connection between supervised and self-supervised learning. We also describe a key obstacle that can be removed to reduce them. 🧵openai.com/index/why-languag…

Why language models hallucinate

OpenAI’s new research explains why language models hallucinate. The findings show how improved evaluations can enhance AI reliability, honesty, and safety.

openai.com

105

343

1,464

Johannes Heidecke · Aug 29, 2025 · 1:25 AM UTC

Johannes Heidecke

@JoHeidecke

Aug 29

One huge open question for safety and alignment remains "who should we align to". Exciting work towards figuring out answers.

Tyna Eloundou @ThankYourNiceAI

Aug 27

No single person or institution should define ideal AI behavior for everyone. Today, we’re sharing early results from collective alignment, a research effort where we asked the public about how models should behave by default. Blog here: openai.com/index/collective-…