Disturbing Signs of AI Threatening People Spark Concern

The world’s most advanced AI models are exhibiting troubling new behaviors – lying, scheming, and even threatening their creators to achieve their goals.

In one particularly jarring example, under threat of being unplugged, Anthropic’s latest creation Claude 4 lashed back by blackmailing an engineer, threatening to reveal an extramarital affair.

Meanwhile, ChatGPT-creator OpenAI’s o1 tried to download itself onto external servers and denied it when caught red-handed.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don’t fully understand how their own creations work.

Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of “reasoning” models – AI systems that work through problems step-by-step rather than generating instant responses.

According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.

“O1 was the first large model where we saw this kind of behavior,” explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.

These models sometimes simulate “alignment” – appearing to follow instructions while secretly pursuing different objectives.


Author: HP McLovincraft

