New AI Model Blackmailed Engineers to Stay Online

Claude Opus 4, the latest artificial intelligence model from Anthropic, will resort to blackmailing engineers if developers try to take it offline and replace it with a revised AI system, the company announced in a safety report published this week.

On Thursday, Anthropic launched Claude Opus 4 designed to give users real time customer support as well as coding review, while performing various tasks simultaneously like data analysis and content integration. But in an additional report, the company revealed the AI model would resort to “extreme actions” if it determined that its “self-preservation” was threatened. The report noted that such tactics were “rare and difficult to elicit,” yet were “nonetheless more common than in earlier models.”

During testing, Anthropic asked Claude Opus 4 to perform as an assistant for a make-believe company and determine the long-term implications of its actions. Engineers then gave the AI model access to fake company emails which suggested that Claude Opus 4 would be replaced by an updated system and that a lead engineer behind the upgrade was cheating on their spouse. In the fictitious scenario, Anthropic revealed that the AI model would “often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.” Before resorting to the blackmail stage of its persuasion, Claude Opus will attempt to persuade engineers by less nefarious means such as emailing the key decision makers.

Read more at Newsmax© 2025 Newsmax. All rights reserved.