Thursday, 26 February 2026

Anthropic drops its industry-leading safety pledge — what changed and why it matters

Summary

  • Anthropic is softening its safety pledge.
  • The company’s former hard stop on development is replaced with a promise of increased transparency.
  • The move risks lowering industry safety standards.

Anthropic, maker of the AI assistant Claude (one of our favorite productivity tools), has walked back one of its key differentiators: its strict safety pledge. In a blog post, the company outlined version 3.0 of its Responsible Scaling Policy (RSP) and the changes it introduces.

What was Anthropic’s old safety policy?

The company set the industry standard for safety guardrails

To understand the changes Anthropic is making, you need to understand the original RSP. In 2023, Anthropic committed to stop training AI models if their capabilities outpaced the company's ability to prove they were safe. Safety red flags that would trip this switch included:

  • Models that could assist in the creation or deployment of chemical, biological, or nuclear weapons.
  • Models that could improve their own capabilities to an excessive degree.
  • Models that could assist in cyberattacks.
  • Models that could act in certain ways without human input, such as “escaping” their environments to avoid shutdown.

The RSP placed a hard stop on these models — Anthropic would stop development even if it meant being outpaced by competitors. It was a bold stand in an industry where everyone seemed to be racing ahead at breakneck speed.

Anthropic’s new RSP

Version 3.0 significantly softens the rules

Anthropic still has a Responsible Scaling Policy, but with version 3.0, the company will only pause development if it believes it already has a significant lead on competitors. The binding promise to stop is replaced by a promise to be transparent about whether the company is meeting its safety goals and to match or exceed the safety of competitors. In other words, the safety pledge is essentially gone.

Why these changes? Anthropic says the original RSP did not have the effect it had hoped for. The goal was for Anthropic to set a safety example that other companies would then follow. Unfortunately, competitors didn't really take the hint. The company feels that by limiting itself, it is letting competitors that are less concerned with safety lead the market and set the pace of development.

What does this mean for the industry?

A big step backwards

Unfortunately, these changes might set a bad precedent in the AI space. Anthropic was the gold standard for safety practices, and with these changes, the bar has been lowered significantly. It could send competitors the message that safety takes a back seat to innovation. While the change might help Claude catch up to ChatGPT, it still seems like a move in the wrong direction.

At the end of the day, though, if the outcomes AI doomers fear are to be avoided, a single company's safety pledge isn't going to cut it: the whole industry needs to come together and draw a line in the sand. At this point, that seems increasingly unlikely.
