In 2021, a splinter group of former OpenAI employees founded a new startup, Anthropic, to pursue building AI models with a renewed focus on safety, after feeling that their employer had gone astray. OpenAI itself was originally founded on beneficent principles and a commitment to transparency, but then took billions of dollars in investment from Microsoft and made its tech closed-source, prompting the exodus.
Now, Anthropic may be heading down the same path as its rivals. On Tuesday, it revealed a new version of its Responsible Scaling Policy that drops the core safety commitment it first made in 2023: to stop training and refuse to deploy an AI system if it couldn’t guarantee that proper safety guardrails, meeting stringent internal standards, were in place.
The sentiment among the company’s leadership now is that this commitment has become an unneeded shackle.
“We felt that it wouldn’t actually help anyone for us to stop training AI models,” Anthropic’s chief science officer Jared Kaplan told TIME in an interview. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments… if competitors are blazing ahead.”
The updated policy, it’s fair to say, flagrantly contradicts the organization’s entire raison d’être. Anthropic has presented itself as the adult in the room in an industry dominated by outrageous boosterism and a flippant attitude toward ethics. Its carefully crafted safety-centric image is perhaps best exemplified by CEO Dario Amodei’s mythologizing that in the summer of 2022, he made the call to abstain from releasing a powerful Anthropic AI model he knew would change the world because he was too worried about its risks; months later, OpenAI released ChatGPT and stole all the headlines.
So why the drastic reversal? Anthropic provides several reasons in its announcement. One of them is an “anti-regulatory political climate.” Amodei has long pushed for stronger AI regulations, an ambition that more or less went up in smoke once the Trump administration took charge. In particular, he criticized Trump’s attempt to impose a sweeping ban on states’ ability to pass their own AI regulation — meaning that AI companies would be beholden only to much weaker federal laws — earning Amodei frequent attacks from administration figures, who have accused him of fear-mongering.
And so, with no robust legal framework forthcoming, there is nothing binding its competitors to the same rules Anthropic purports to follow. That, Anthropic argues, means any safety research and measures it pursued would be outdated by default as the rest of the industry continued to build even more powerful models.
“If one AI developer paused development to implement safety measures while others moved forward training and deploying AI systems without strong mitigations, that could result in a world that is less safe,” it argued in its new policy. “The developers with the weakest protections would set the pace, and responsible developers would lose their ability to do safety research.”
Perhaps there’s a kernel of truth to that logic. But it’s a spurious justification for Anthropic to drop a central pillar of its safety act. Anthropic indeed cannot control what its competitors do, but is that a reason to stop even pretending to lead by example? Regulatory climates, after all, can change. And calls for AI safety will not go away. Arguably, they’ll only mount as the industry’s contradictory promises become more obvious and the risks the technology poses grow even more consequential.
The timing of the policy change can’t be overlooked, either. Ethical as it may purport to be, Anthropic enjoys a $200 million contract with the Pentagon, signed last summer, to deploy Claude across the military. But that critical money faucet is now in jeopardy, as Trump officials have reportedly threatened to cut Anthropic off over the company’s insistence that its tech not be used for mass surveillance or autonomous weaponry. Defense Secretary Pete Hegseth met with Amodei on Tuesday and gave the CEO an ultimatum, Axios reported: loosen Anthropic’s AI safeguards to make them more amenable to the military, or the Pentagon will either cut off the company and declare it a “supply chain risk,” or invoke the Defense Production Act to force Anthropic to share its AI technology.
But if the new policy is a capitulation by Anthropic, Kaplan, the chief science officer and co-founder, doesn’t see it that way.
“I don’t think we’re making any kind of U-turn,” Kaplan told TIME.
The post Anthropic Drops Its Huge Safety Pledge That Was Supposedly the Whole Point of the Company appeared first on Futurism.


