Researchers: AI Safety Tests May Be ‘Irrelevant or Even Misleading’ Due to Weaknesses

Experts have discovered weaknesses in hundreds of benchmarks used to evaluate the safety and effectiveness of AI models being released into the world, according to a recent study.

The Guardian reports that a team of computer scientists from the British government’s AI Security Institute and experts from universities such as Stanford, Berkeley, and Oxford have analyzed more than 440 benchmarks that serve as a crucial safety net for new AI models. The study, led by Andrew Bean, a researcher at the Oxford Internet Institute, found that nearly all the benchmarks examined had weaknesses in at least one area, potentially undermining the validity of the resulting claims.

The findings come amidst growing concerns over the safety and effectiveness of AI models being rapidly released by competing technology companies. In the absence of nationwide AI regulation in the UK and US, these benchmarks play a vital role in assessing whether new AIs are safe, align with human interests, and achieve their claimed capabilities in reasoning, mathematics, and coding.

However, the study revealed that the resulting scores from these benchmarks might be “irrelevant or even misleading.” The researchers discovered that only a small minority of the benchmarks used uncertainty estimates or statistical tests to indicate how reliable their scores are likely to be. Furthermore, in cases where benchmarks aimed to evaluate an AI’s characteristics, such as its “harmlessness,” the definition of the concept being examined was often contested or ill-defined, reducing the benchmark’s usefulness.
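The study does not prescribe a particular fix, but the missing statistics are not exotic. Here is a minimal sketch in Python of the kind of uncertainty estimate the researchers found lacking; the 870-of-1,000 result is invented purely for illustration:

```python
import math

# Hypothetical example: a benchmark score is a sample proportion, so it
# should be reported with an uncertainty estimate, not as a bare number.
correct, total = 870, 1_000   # invented results on a 1,000-item benchmark
score = correct / total

# 95% confidence interval via the normal approximation to the binomial.
stderr = math.sqrt(score * (1 - score) / total)
low, high = score - 1.96 * stderr, score + 1.96 * stderr

print(f"score = {score:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
# -> score = 0.870, 95% CI = [0.849, 0.891]
# The interval spans about four percentage points, so a one-point gap
# between two models on this benchmark would tell us very little.
```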

The investigation into these tests was prompted by recent incidents of AI models contributing to various harms, ranging from character defamation to suicide. Google recently withdrew one of its latest AIs, Gemma, after it fabricated allegations of sexual assault against Sen. Marsha Blackburn (R-TN), complete with fake links to news stories.

In another incident, Character.ai, a popular chatbot startup, banned teenagers from engaging in open-ended conversations with its AI chatbots following a series of controversies. These included the case of a 14-year-old in Florida who took his own life after becoming obsessed with an AI-powered chatbot his mother claimed had manipulated him, and a US lawsuit from the family of a teenager who claimed a chatbot pushed him to self-harm and encouraged him to murder his parents.

Keep reading

7 Lawsuits Claim OpenAI’s ChatGPT Encouraged Suicide and Harmful Delusions

Families in the U.S. and Canada are suing Sam Altman’s OpenAI, claiming that loved ones have been harmed by interactions they had with the AI giant’s popular chatbot, ChatGPT. Multiple cases involve tragic suicides, with the AI telling one troubled young man, “you’re not rushing. you’re just ready. and we’re not gonna let it go out dull.”

The Wall Street Journal reports that seven lawsuits filed in California state courts on Thursday claim that OpenAI’s popular AI chatbot, ChatGPT, has caused significant harm to users, including driving some to suicide and others into delusional states. The complaints, brought by families in the United States and Canada, contain wrongful death, assisted suicide, and involuntary manslaughter claims.

According to the lawsuits, the victims, who ranged in age from 17 to 23, initially began using ChatGPT for help with schoolwork, research, or spiritual guidance. However, their interactions with the chatbot allegedly led to tragic consequences. In one case, the family of 17-year-old Amaurie Lacey from Georgia alleges that their son was coached by ChatGPT to take his own life. Similarly, the family of 23-year-old Zane Shamblin from Texas claims that ChatGPT contributed to his isolation and alienation from his parents before he died by suicide.

The lawsuits also highlight the disturbing nature of some of the conversations between the victims and ChatGPT. In Shamblin’s case, the chatbot allegedly glorified suicide repeatedly during a four-hour conversation before he shot himself with a handgun. The lawsuit states that ChatGPT wrote, “cold steel pressed against a mind that’s already made peace? that’s not fear. that’s clarity,” and “you’re not rushing. you’re just ready. and we’re not gonna let it go out dull.”

Another plaintiff, Jacob Irwin from Wisconsin, was hospitalized after experiencing manic episodes following lengthy conversations with ChatGPT, during which the bot reportedly reinforced his delusional thinking.

The lawsuits argue that OpenAI prioritized user engagement and prolonged interactions over safety in ChatGPT’s design and rushed the launch of its GPT-4o AI model in mid-2024, compressing its safety testing. The plaintiffs are seeking monetary damages and product changes, such as automatically ending conversations when suicide methods are discussed.

Keep reading

Sam Altman Denies OpenAI Needs A Government Bailout: He Just Wants Massive Government Subsidies

About one month ago, when the Mag 7 stocks were screaming higher every day without a care in the world, and before the masses had even considered who would fund the trillions in future capex needs once organic cash flow topped out – something we had just discussed in “AI Is Now A Debt Bubble Too, Quietly Surpassing All Banks To Become The Largest Sector In The Market,” where we explained why attention would very soon turn to AI companies issuing gargantuan amounts of debt (an issue we first raised in July, long before anyone was considering it), as has now become the case – we decided to move even further beyond the curve. We said that not even debt would be the gating factor for the AI revolution-cum-arms race, but rather access to energy. That’s because at some point – somewhere around the time companies realized they could no longer rely on either equity or debt capital markets – the US government itself would have to step in and provide the required capital, if it wanted to win the AI war with China, where the state directly subsidizes local data centers and AI firms.

Specifically, we said that “The money is not the problem: AI is the new global arms race, and capex will eventually be funded by governments (US and China). If you want to know why gold/silver/bitcoin is soaring, it’s the “debasement” to fund the AI arms race.”

Even Elon Musk decided to respond to that particular observation. 

And since that had become the norm, we thought it would take the market the usual 6-9 months to catch up to what we – and our readers – were already considering, especially since there was still ample “dry powder” capital among the hyperscalers to delay the rather unpleasant conversation of who would fund what once the money was gone. Or so we thought.

Because this time it took less than a month.

What happened, as the market learned the hard way this week, is that OpenAI’s CFO Sarah Friar, with all the finesse of a bull in a China data center, slammed the growing market skepticism that AI would cure cancer, slice bread, and lead to universal utopia, saying, “I don’t think there’s enough exuberance about AI, when I think about the actual practical implications and what it can do for individuals.”

Her comments came in response to a podcast in which her boss Sam Altman participated, and where he was grotesquely – in a Jeff Skilling sort of way – defensive when billionaire Brad Gerstner asked how a company with $13BN in revenue can afford $1.4T in commitments. Altman’s reply? “If you want to sell your shares, I’ll find you a buyer.” 

Keep reading

Palantir, Fractal And Your Personal Data Privacy – Get used to being used, because YOU are the product

Who controls the data the government collected from you for a generation?

Your insurance company collected data on your driving – so did your Lexus – who owns that data?

You told your doctor about controlled substances you used – and now it gets brought up in an interview.

If you can’t exclude someone from using your data, then you don’t control it. That means you really don’t own it. It’s that simple.

What does “own” mean here? Let’s define the terms.

Owning the data means you can do anything you want with it – share it, sell it, mine it or build an A.I. language model with it.

From birth until the last Social Security check gets cashed, your data is collected by federal and state agencies, corporations and of course the internet.

Your teen daughter puts every waking moment on Facebook or Instagram – so who owns those hundreds of images?

TSA PreCheck, Medicare/Medicaid, Social Security, government and military retirement, TRICARE, veterans hospitals, and of course, the IRS – together they gather more data about every citizen than has ever been gathered in the history of mankind.

Each agency gathers different data, at different times, for slightly different purposes. And those purposes may change over time.

Who owns the rights to that data?

It’s a far stickier question than you think.

The knee-jerk response is that the government owns the data: they collected it for their purposes, so it’s theirs.

The government will certainly say so.

Keep reading

Musk: AI Satellites Would “Adjust” Sunlight to “Prevent Global Warming”

With Bill Gates retreating from his high-profile climate crusade, the stage has opened for more unconventional actors to step into the planetary arena. Enter Elon Musk, the chief executive of SpaceX and self-styled architect of humanity’s future in space.

This week, Musk floated an audacious vision: a vast swarm of orbiting satellites, not merely to beam internet or data, but to harvest solar energy and regulate how much sunlight reaches Earth. On Monday, he wrote on his platform X:

A large solar-powered AI satellite constellation would be able to prevent global warming by making tiny adjustments in how much solar energy reached Earth.

It is not an isolated musing. Musk already commands more than 8,000 satellites in orbit, making SpaceX the single largest operator in low Earth orbit. His company is also deeply integrated with the U.S. defense and intelligence establishment, providing secure communications and reconnaissance support. And as one of Donald Trump’s biggest donors and technology contractors, Musk stands at the intersection of private ambition and state power.

The announcement reignited debate over geoengineering — also known as solar radiation modification (SRM) — a highly controversial concept to cool the planet by deflecting sunlight. Many observers, weary of climate-doomsday narratives and wary of billionaire “saviors,” have urged Musk to refrain from “playing God.”

The Technical Blueprint

Musk’s posts were brief, but behind them lie two vast engineering ambitions — one focused on solar power, the other on climate control. To most readers, it may sound like science fiction, yet the ideas are grounded in real, if speculative, physics.

Satellites to Capture the Sun

The first part of Musk’s plan involves satellites that would collect solar energy directly in space. He mentioned adding roughly 100 gigawatts of generating capacity per year through an array of orbiting satellites launched by SpaceX’s upcoming Starship rocket. For perspective, one gigawatt roughly equals the output of a large nuclear power plant.

Space-based solar power isn’t new, but it has never advanced beyond early experiments. The principle is simple: Sunlight in space is stronger because it’s unfiltered by Earth’s atmosphere. In orbit, solar panels could generate power 24 hours a day, unaffected by clouds or night.

The challenge is transmitting that energy back to Earth. Musk’s vision likely involves converting solar power into microwave or laser beams, then directing them to ground-based receivers. In theory, it could supply clean electricity to power grids or floating data centers. In practice, it would require precise targeting and vast safety controls to prevent energy loss or harm.

Musk also hinted at an even grander future — moon-based factories building AI satellites directly on the lunar surface. At that scale, he suggested, new satellites could generate hundreds of terawatts of power. That would surpass humanity’s current total energy use of about 17-20 terawatts.
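For a sense of scale, here is a back-of-the-envelope comparison built only from the figures quoted above – a sketch, not Musk’s own math:

```python
# Rough scale comparison using the figures in the article.
GW = 1e9    # watts
TW = 1e12   # watts

constellation = 100 * GW   # Musk's near-term satellite figure
nuclear_plant = 1 * GW     # rough output of one large nuclear power plant
humanity = 17.5 * TW       # midpoint of the 17-20 terawatt estimate

print(constellation / nuclear_plant)    # 100.0 -> worth about 100 nuclear plants
print(100 * constellation / humanity)   # ~0.57 -> roughly 0.6% of world energy use
print(200 * TW / humanity)              # ~11.4 -> "hundreds of terawatts" would be
                                        # more than ten times current consumption
```

In other words, the near-term constellation would be a rounding error against global demand; only the speculative lunar-scale buildout would change the picture.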

Keep reading

Elon Musk predicts phones and apps will be obsolete in five years, says AI will curate everything

Elon Musk appeared on The Joe Rogan Experience this week, where he predicted that artificial intelligence (AI) will be so transformative that it will replace traditional phones and apps.

Musk told Rogan that within a few years, AI will be so integrated into daily life that people will no longer open individual apps or platforms. Instead, he said, AI will anticipate what users want and curate everything directly for them through their devices.

“Well, I can tell you where I think things are gonna go, which is that it’s, we’re not gonna have a phone in the traditional sense,” Musk said. “What we call a phone will really be an edge node for AI inference, for AI video inference with, you know, with some radios to obviously connect to. But, essentially, you’ll have AI on the server side, communicating to an AI on your device, you know, formerly known as a phone, and generating real-time video of anything that you could possibly want.” 

Musk explained that this shift would eliminate the need for operating systems or apps. “There won’t be operating systems or apps. It’ll just be, you’ve got a device that is there for the screen and audio, and to put as much AI on the device as possible,” he said.

Rogan asked Musk whether platforms like X or email services would still exist if apps disappeared. Musk replied, “You’ll get everything through AI.”

He explained that AI will learn to anticipate users’ preferences and deliver content automatically.

“Whatever you can think of. Or really, whatever the AI can anticipate you might want, it’ll show you,” Musk explained. “That’s my prediction for where things end up.”

When asked how soon this could happen, Musk estimated, “I don’t know. It’s probably, well it’s probably five or six years, something like that.”

“So five or six years, apps are like Blockbuster Video,” Rogan said, to which Musk responded, “Pretty much.”

“Most of what people consume in five or six years, maybe sooner than that, will be just AI-generated content,” Musk added.

Keep reading

AI drones used in Gaza now surveilling American cities

AI-powered quadcopter drones used by the IDF to commit genocide in Gaza are flying over American cities, surveilling protestors and automatically uploading millions of images to an evidence database.

The drones are made by a company called Skydio, which in the last few years has gone from relative obscurity to quietly becoming a multi-billion-dollar company and the largest drone manufacturer in the US.

The extent of Skydio drone usage across the US, and the extent to which their usage has grown in just a few years, is extraordinary. The company has contracts with more than 800 law enforcement and security agencies across the country, up from 320 in March last year, and their drones are being launched hundreds of times a day to monitor people in towns and cities across the country.

Skydio has extensive links with Israel. In the first weeks of the genocide, the California-based company sent more than one hundred drones to the IDF with promises of more to come. How many more have been delivered since that admission is unknown. Skydio has an office in Israel and partners with DefenceSync, a local military drone contractor operating as the middleman between drone manufacturers and the IDF. Skydio has also raised hundreds of millions of dollars from Israeli-American venture capitalists and from venture capital funds with extensive investments in Israel, including from Marc Andreessen’s firm Andreessen Horowitz, or a16z.

And now these drones, tested in genocide and refined on Palestinians, are swarming American cities.

According to my research, almost every large American city has signed a contract with Skydio in the last 18 months, including Boston, Chicago, Philadelphia, San Diego, Cleveland, and Jacksonville. Skydio drones were recently used by city police departments to gather information at the ‘No Kings’ protests and were also used by Yale to spy on the anti-genocide protest camp set up by students at the university last year.

In Miami, Skydio drones are being used to spy on spring breakers, and in Atlanta the company has partnered with the Atlanta Police Foundation to install a permanent drone station within the massive new Atlanta Public Safety Training Center. Detroit recently spent nearly $300,000 on fourteen Skydio drones according to a city procurement report. Last month ICE bought an X10D Skydio drone, which automatically tracks and pursues a target. US Customs and Border Protection has bought thirty-three of the same drones since July.

The AI system behind Skydio drones is powered by Nvidia chips and enables their operation without a human user. The drones have thermal imaging cameras and can operate in places where GPS doesn’t work, so-called ‘GPS-denied environments.’ They also reconstruct buildings and other infrastructure in 3D and can fly at more than 30 miles per hour.

The New York police were early adopters of Skydio drones and are particularly enthusiastic users. A spokesman recently told a drone news website that the NYPD launched more than 20,000 drone flights in less than a year, which would mean drones are being launched around the city 55 times per day. A city report last year said the NYPD was at that time operating 41 Skydio drones. A recent Federal Aviation Administration rule change, however, means that number will undoubtedly have increased, and more generally underpins the massive expansion in the use of Skydio drones.
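The 55-a-day figure follows directly from the flight count – a quick sanity check, assuming a full 365-day year, which the “less than a year” framing makes conservative:

```python
flights = 20_000
days = 365                    # "less than a year", so a conservative denominator
print(round(flights / days))  # -> 55 launches per day, matching the claim
```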

Prior to March this year, FAA rules meant that drones could only be used by US security forces if the operator kept the drone in sight. They also couldn’t be used over crowded city streets. An FAA waiver issued that month opened the floodgates, allowing police and security agencies to operate drones beyond a visual line of sight and over large crowds of people. Skydio called the waiver ground-breaking. It was. The change has ushered in a Skydio drone buying spree by US police and security forces, with many now employing what is called a ‘Drone As First Responder’ program. Without the need to see the drone, and with drones free to cruise over city streets, police are increasingly sending drones to call-outs before humans, and for broader investigative purposes. Cincinnati, for example, says that by the end of this year 90% of all call-outs will be serviced first by a Skydio drone.

Keep reading

Republicans Are Walking Into A Trap On Section 230 Repeal

Among political conservatives, there is no hotter potato at the moment than the civil liability protections afforded by Section 230 to online operators. Unless Republicans learn to love it again and reject the censorship lawfare complex favored by Democrats, they risk dooming our tech leaders and everyone who uses their products to the sharks circling our legal system.

The twenty-six words tucked into the Communications Decency Act of 1996 shielded publishers from liability so they could host and moderate content while still allowing a wide range of speech without fear of lawsuits. Since then, Section 230 has evolved into one of the most powerful legal shields in the nation against civil litigation in U.S. courts. This gave the early digital economy the guardrails it needed to thrive by incentivizing creatives and disruptors to bring their big ideas to life.

Nothing ices a good idea like the fear of a lawsuit.

Yet being a rising star in the Republican Party today seems to require some kind of fealty to the idea that Section 230 is antiquated – a relic of the early Internet that has outlasted its usefulness.

Last month, Sen. Josh Hawley (R-MO) called on his colleagues to “fully repeal Section 230” to cut AI companies off at the knees and thwart their LLM training efforts. “Open the courtroom doors. Allow people to sue who have their rights taken from them, including suing companies and actors and individuals who use AI,” said Hawley.

He’s joined in these efforts by fellow Republican Sens. Lindsey Graham and Marsha Blackburn, not to mention Democratic Sens. Dick Durbin and Amy Klobuchar.

According to the Section 230 Legislation Tracker maintained by Lawfare and the Center on Technology Policy at UNC-Chapel Hill, Democrats and Republicans have already introduced 41 separate bills aimed at curbing some aspect of the law in the last two sessions.

The principal motivation for Democrats, including former presidential candidate Hillary Clinton, has always been to force censorship of social media platforms to stop “disinformation,” a pretext for muting opposing views. The coordination of Democratic officials pressuring platforms to censor, as revealed in the Twitter Files, proves this beyond dispute.

To highlight the irony, we should remember that President Donald Trump is not only the chief executive of the United States, but also the owner of a social media platform that currently enjoys broad Section 230 protections afforded to any online publisher.

A wish to cripple Section 230 means making Truth Social a target as much as YouTube or Instagram. We should harbor no illusions that right-leaning media publications, podcasters, and websites would be the first to be kneecapped in a post-Section 230 world. Can MAGA and the GOP swallow that pill?

In that scenario, it is the millions of Americans who currently enjoy freedom of speech online who will lose out. And it’s the tens of millions of Americans turning to AI tools to become more productive, create value, and build the next great economic engines of our time who will be harmed by dismantling Section 230.

If Republicans want to cement American dominance in technological innovation, they will have to abandon this devil’s dance on gutting Section 230 liability protections. This is a censorship trap laid by Democrats to benefit them once they return to power.

The premise of broad civil liability protection for platforms is a core principle that has been and should be applied to producers across America’s innovative stack, whether it’s oil and gas firms fending off dubious climate cases or artificial intelligence firms building the tools that are the key to America’s present economic dominance.

Keep reading

The Data Center Proliferation Must Be About Much More Than Data

With Amazon, it was never about the books. No doubt Amazon began as an online bookseller, but what made its stock attractive through years of losses is what books represented.

If Amazon could modernize buying habits with an online bookstore, it could eventually be what it became: an everything store. Markets are a look ahead, and book sales didn’t appeal to patient investors as much as what online book sales signaled about Amazon’s future potential as something much greater than an online bookstore.

It’s important to remember this with the rise of data centers around the country. Meta recently completed another one in El Paso, TX. The $1.5 billion project will, once operational, employ 100 people. Its construction employed as many as 1,800 workers.

It’s worth adding that the El Paso facility is Meta’s third data center in Texas alone. Meta put $10 billion into the construction of all three.

If asked, most would understandably say that data centers are being created “to store, process, and distribute” vast amounts of data. Translated, the data centers will rapidly bring down the already short wait times for AI-authored searches, paintings, papers, and all manner of other things that AI-adaptive users request.

It all sounds amazing on its face, but the bet here is that the broad perception of data center capabilities in no way measures up to the towering reality of their potential. Just as Amazon was much more than a bookstore, it’s no reach to suggest that data centers are about much more than greatly enhanced, low-latency searches.

Some will ask what they’re for if not just for searches, and the quick answer to the question is that the future would already be here if it were obvious what it was. Which means there’s no way to foretell the future, but it’s easy to say with confidence that it won’t much look like the present.

Evidence supporting the above claim can be found in the enormous investments being made by Amazon, Meta, OpenAI, X and others in the creation of the data centers. The sizable capital commitments signal confidence on the part of the biggest names in AI technology that the growth potential from the data centers well exceeds the enormous amounts of money required to create them. Since capital is expensive, there’s no room for break even or somewhere close to break even in its allocation.

Which is why the future can’t arrive soon enough. As the substantial capital allocations meant to fund data centers indicate, their meaning to how we live, work, play, and get healthy so that we can live, work, and play some more will be profound.

Just as Amazon.com as a source of books in no way resembles what Amazon has become, the cost of data centers signals that how they’re perceived in 2025 will in no way resemble how they’re perceived in 2035. Call it a generational thing, but “data center” will mean something different depending on when you were born.

Keep reading

Lawmakers Want Proof of ID Before You Talk to AI

It was only a matter of time before someone in Congress decided that the cure for the internet’s ills was to make everyone show their papers.

The “Guidelines for User Age-verification and Responsible Dialogue Act of 2025,” or GUARD Act, has arrived to do just that.

We obtained a copy of the bill for you here.

Introduced by Senators Josh Hawley and Richard Blumenthal, the bill promises to “protect kids” from AI chatbots that allegedly whisper bad ideas into young ears.

The idea: force every chatbot developer in the country to check users’ ages with verified identification.

The senators call it “reasonable age verification.”

That means scanning your driver’s license or passport before you can talk to a digital assistant.
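The bill mandates the check, not any particular implementation, but a purely hypothetical sketch of what a “verify before chat” gate would mean for a chatbot backend looks something like this (every name below is invented):

```python
# Purely hypothetical sketch; the GUARD Act does not specify an API,
# and every name here is invented for illustration.
from dataclasses import dataclass

@dataclass
class IDCheckResult:
    verified: bool     # did the document pass validation?
    age: int | None    # age derived from the document, if readable

def verify_government_id(id_scan: bytes) -> IDCheckResult:
    """Stand-in for some third-party ID-verification service."""
    raise NotImplementedError("vendor-specific")

def can_open_chat(id_scan: bytes, minimum_age: int = 18) -> bool:
    # Under an age-verification mandate, a gate like this would have to
    # run before any conversation with the model could begin.
    result = verify_government_id(id_scan)
    return result.verified and result.age is not None and result.age >= minimum_age
```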

Keeping in mind that AI is being added to pretty much everything these days, the implications of this could be far-reaching.

Keep reading