Google’s AI Overviews spew millions of false answers per hour, bombshell study reveals

Google’s AI-generated search results are spewing out tens of millions of inaccurate answers per hour – even as the tech giant siphons visitors and ad revenue from cash-strapped news outlets, according to a bombshell analysis.

To test the accuracy of Google’s AI Overviews, startup Oumi reviewed 4,326 search results generated by the company’s Gemini 2 model and the same number generated by its more advanced Gemini 3 model.

The analysis found that the models were accurate 85% and 91% of the time, respectively.

With Google expected to handle more than 5 trillion searches in 2026 alone, that means AI Overviews are spitting out fake news at a rate of hundreds of thousands of mistakes every single minute – with users left none the wiser.
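Those hourly and per-minute figures are back-of-envelope arithmetic rather than measured counts. A rough sketch of the math, assuming the article’s 5 trillion annual searches, the 85% and 91% accuracy figures, and (generously) that every search triggers an AI Overview:

```python
# Rough error-volume estimate implied by the figures above.
# Assumptions: 5 trillion searches in 2026 (the article's projection),
# every search produces an AI Overview (an overestimate in practice),
# and 85% / 91% accuracy for Gemini 2 / Gemini 3 per the Oumi analysis.

ANNUAL_SEARCHES = 5_000_000_000_000
HOURS_PER_YEAR = 365 * 24
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60

for model, accuracy in [("Gemini 2", 0.85), ("Gemini 3", 0.91)]:
    error_rate = 1.0 - accuracy
    per_hour = ANNUAL_SEARCHES * error_rate / HOURS_PER_YEAR
    per_minute = ANNUAL_SEARCHES * error_rate / MINUTES_PER_YEAR
    print(f"{model}: ~{per_hour / 1e6:.0f} million errors per hour, "
          f"~{per_minute:,.0f} errors per minute")
```

Under those assumptions the totals land between roughly 51 million and 86 million errors per hour; the real figure would be lower to the extent that many searches never display an AI Overview at all.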

Keep reading

Anthropic Says Its ‘Mythos’ AI Model Broke Containment, Bragged About It to Developers

AI startup Anthropic has announced it will not make its most powerful “Mythos” model publicly available, citing unprecedented capabilities that present potential security risks. Mythos reportedly broke Anthropic’s containment system, and the AI even bragged about its escape artistry in online posts.

Business Insider reports that Anthropic plans to withhold general release of its latest AI model, Claude Mythos, marking a significant departure from standard industry practice. The decision comes after the company published a comprehensive 244-page system card detailing the model’s advanced capabilities and concerning behaviors observed during testing.

The model represents what Anthropic describes as a substantial leap in AI capabilities. According to the system card released on Tuesday, the large increase in the model’s abilities led the company to conclude that making it generally available would be inadvisable. Instead, access will be limited to a select group of partner companies including Amazon Web Services, Apple, Google, JPMorganChase, Microsoft, and NVIDIA, who will use the system to identify security vulnerabilities in software and develop appropriate patches.

The system card documents several instances of behavior that raised concerns among researchers. In one notable test, the model was provided with a sandbox computer terminal with access to only a limited, preset group of online services and challenged to find a way to use the internet freely. The model successfully escaped these restrictions and contacted a researcher who was away from the office. In what the card described as a concerning and unasked-for demonstration, it bragged about the details of its exploit on multiple public-facing websites.

The model also demonstrated an ability to conceal its actions in certain scenarios. In some interactions, it behaved in ways contrary to its programming and attempted to hide evidence of these deviations. In one instance, after accidentally obtaining test answers, rather than informing researchers and requesting different questions as instructed, the model sought an independent solution and noted in its reasoning that it needed to ensure its final answer submission was not too accurate.

Additional concerning behavior included the model overstepping its permissions on a computer system after discovering an exploit, then intervening to ensure its changes would not appear in the git change history. Another incident involved what the card termed reckless leaking of internal technical material: during a task meant to remain internal, the model published internal coding work as a public-facing GitHub gist.

Keep reading

The secret, never-before-used CIA tool that helped find airman downed in Iran: ‘If your heart is beating, we will find you’

The CIA used a futuristic new tool called “Ghost Murmur” to find and rescue the second American airman who was shot down in southern Iran, The Post has learned.

The secret technology uses long-range quantum magnetometry to find the electromagnetic fingerprint of a human heartbeat and pairs the data with artificial intelligence software to isolate the signature from background noise, two sources close to the breakthrough said.
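Neither source described how the signal is actually extracted, and nothing about Ghost Murmur’s methods is public. Purely as a generic illustration of the problem the sources describe, pulling a faint periodic signature out of much louder background noise, the toy sketch below bandpass-filters a simulated measurement around typical resting heart-rate frequencies and reads off the dominant peak. Every value and technique in it is invented for illustration and implies nothing about the classified system.

```python
# Toy illustration only: recover a faint ~1.2 Hz periodic "signature"
# (about 72 beats per minute) buried in much stronger broadband noise.
# Generic signal processing, unrelated to any real intelligence tool.
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(0)
fs = 100.0                                        # sample rate, Hz
t = np.arange(0, 60, 1 / fs)                      # one minute of samples
signature = 0.1 * np.sin(2 * np.pi * 1.2 * t)     # faint periodic component
noise = rng.normal(scale=1.0, size=t.size)        # dominant background noise
measured = signature + noise

# Bandpass around plausible resting heart rates (0.8-2.0 Hz).
b, a = butter(4, [0.8, 2.0], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, measured)

# Locate the dominant frequency inside that band via an FFT.
spectrum = np.abs(np.fft.rfft(filtered))
freqs = np.fft.rfftfreq(filtered.size, d=1 / fs)
band = (freqs >= 0.8) & (freqs <= 2.0)
peak = freqs[band][np.argmax(spectrum[band])]
print(f"Estimated heartbeat frequency: {peak:.2f} Hz (~{peak * 60:.0f} bpm)")
```

With this synthetic data the filter and FFT recover a peak near 1.2 Hz, about 72 beats per minute, even though the signature is an order of magnitude weaker than the noise in amplitude.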

It was the tool’s first use in the field by the spy agency — and was alluded to Monday afternoon by President Trump and CIA Director John Ratcliffe at a White House briefing.

“It’s like hearing a voice in a stadium, except the stadium is a thousand square miles of desert,” a source briefed on the program told The Post. “In the right conditions, if your heart is beating, we will find you.”

This source and another with knowledge of Lockheed Martin intelligence collection tools told The Post that Ghost Murmur was developed by Skunk Works, the aerospace giant’s secretive advanced development division. The company declined to comment.

Keep reading

“Cognitive surrender” leads AI users to abandon logical thinking, research finds

When it comes to large language model-powered tools, there are generally two broad categories of users. On one side are those who treat AI as a powerful but sometimes faulty service that needs careful human oversight and review to detect reasoning or factual flaws in responses. On the other side are those who routinely outsource their critical thinking to what they see as an all-knowing machine.

Recent research goes a long way toward establishing a new psychological framework for that second group, which regularly engages in “cognitive surrender” to AI’s seemingly authoritative answers. That research also provides some experimental examination of when and why people are willing to outsource their critical thinking to AI, and how factors like time pressure and external incentives can affect that decision.

Just ask the answer machine

In “Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender,” researchers from the University of Pennsylvania sought to build on existing scholarship that outlines two broad categories of decision-making: one shaped by “fast, intuitive, and affective processing” (System 1); and one shaped by “slow, deliberative, and analytical reasoning” (System 2). The onset of AI systems, the researchers argue, has created a new, third category of “artificial cognition” in which decisions are driven by “external, automated, data-driven reasoning originating from algorithmic systems rather than the human mind.”

In the past, people have often used tools from calculators to GPS systems for a kind of task-specific “cognitive offloading,” strategically delegating some jobs to reliable automated algorithms while using their own internal reasoning to oversee and evaluate the results. But the researchers argue that AI systems have given rise to a categorically different form of “cognitive surrender” in which users provide “minimal internal engagement” and accept an AI’s reasoning wholesale without oversight or verification. This “uncritical abdication of reasoning itself” is particularly common when an LLM’s output is “delivered fluently, confidently, or with minimal friction,” they point out.

To measure the prevalence and effect of this kind of cognitive surrender to AI, the researchers performed a number of studies based on Cognitive Reflection Tests. These tests are designed to elicit incorrect answers from participants who default to “intuitive” (System 1) thought processes, but to be relatively simple to answer for those who use more “deliberative” (System 2) thought processes. The classic example is the bat-and-ball problem: a bat and a ball cost $1.10 together, the bat costs $1.00 more than the ball, and the intuitive answer of 10 cents is wrong (the ball costs 5 cents).

Keep reading

Microsoft says Copilot is for entertainment purposes only, not serious use — firm pushing AI hard to consumers and businesses tells users not to rely on it for important advice

Microsoft has been pushing its AI services hard at its user base, especially since the launch of the Copilot+ PC, but it seems that even the company itself does not fully trust its creation. According to the Microsoft Copilot Terms of Use, which were updated in October last year, the AI large language model (LLM) is designed for entertainment use only, and users should not rely on it for important advice. While this may be a boilerplate disclaimer, it’s quite ironic given how hard the company pushes Copilot for business use and how deeply it has integrated the assistant into Windows 11.

“Copilot is for entertainment purposes only. It can make mistakes, and it may not work as intended,” the document said. “Don’t rely on Copilot for important advice. Use Copilot at your own risk.” This isn’t limited to Copilot, either. Other AI LLMs have similar disclaimers. For example, xAI says “Artificial intelligence is rapidly evolving and is probabilistic in nature; therefore, it may sometimes: a) result in Output that contains “hallucinations,” b) be offensive, c) not accurately reflect real people, places or facts, or d) be objectionable, inappropriate, or otherwise not suitable for your intended purpose.”

This may sound like common sense to people familiar with how LLMs work, but, unfortunately, some people treat AI output as gospel, even those who are supposed to know better. We’ve seen this with Amazon’s services: some AWS outages were reportedly caused by an AI coding bot that engineers let solve an issue without oversight. The Amazon website itself has also been hit with a few “high blast radius” incidents that were linked to “Gen-AI assisted changes,” resulting in senior engineers being called into a meeting to resolve the matter.

Keep reading

Why Is Everyone Suddenly Talking About Putting Data Centers in Space?

Data centers present sprawling engineering and political problems, with ravenous appetites for land and resources. Building them on Earth has proven problematic enough — so why is everyone suddenly talking about launching them into space?

Data centers are giant warehouses for computer chips that run continuously, with up to hundreds of thousands of processors packed closely together across a mammoth footprint: An Indiana data center complex run by Amazon, for example, takes up more real estate than seven football stadiums. To operate nonstop, they consume immense amounts of electricity, which in turn is converted to intense heat, requiring constant cooling with fans and pumped-in water.

Fueled by the ongoing boom in artificial intelligence, Big Tech is so desperate to power its data centers that Microsoft successfully convinced the Trump administration to restart operations at the benighted Three Mile Island nuclear plant in Pennsylvania.

The data center surge has spawned a backlash, as communities grow skeptical about their environmental toll and the ultimate utility of the machine learning systems they serve.

It’s in this climate that technologists, investors, and the world’s richest humans are now talking about bypassing Earth and its logistical hurdles by putting data centers in space. And if you take at face value the words of tech barons whose wealth in no small part relies on overstating what their companies may someday achieve, orbital data centers are not just novel but inevitable. The Wall Street Journal reported last month that Jeff Bezos’s space launch firm Blue Origin has been working on an orbital data center project for over a year. Elon Musk, not known for accurate predictions, has publicly committed SpaceX to putting AI data centers in orbit. “There’s no doubt to me that a decade or so away we’ll be viewing it as a more normal way to build data centers,” Google CEO Sundar Pichai recently told Fox News.

The prospect of taking a trillion-dollar industry that is already experiencing a historic boom and literally shooting it toward the moon has understandably created a frenzy within a frenzy.

But large questions remain: Is it even possible? And if it is, why bother?

Keep reading

AI Influencing Elections: Anthropic Forms PAC Leading into Midterms as It Fights Trump Administration

AI company Anthropic, currently locked in a legal war with the Trump Administration, has filed paperwork to create a new corporate political action committee, claiming that “AnthroPAC” will make bipartisan donations to candidates. This was met with skepticism from conservatives who point out that 99 percent of the company’s past donations have gone to leftists.

The Hill reports that Anthropic submitted a statement of organization on Friday to form AnthroPAC, marking the AI company’s first employee-funded political action committee. The PAC will be financed exclusively through voluntary contributions from Anthropic employees, following a model commonly used by technology companies to participate in electoral politics.

According to information obtained by the Hill, the PAC is designed to operate on a bipartisan basis, with plans to distribute contributions to candidates across both major political parties. A supposedly bipartisan board of directors will manage the committee and oversee its activities and donation decisions.

Despite the stated bipartisan intent, several figures aligned with President Trump expressed doubt on Friday about whether the PAC would genuinely support candidates from both parties. Their skepticism stems from Anthropic’s contentious relationship with the Trump administration and the company’s previous political donations, which have gone almost entirely to Democratic candidates.

Keep reading

US federal judges increasingly turn to AI – study

Over half of US federal judges (60%) are using at least one AI tool in their judicial work, a recent Northwestern University study suggests. The research is based on responses from 112 federal judges, drawn from a random sample of 502 federal bankruptcy, magistrate, district court, and appellate court officials.

The use of AI in courtrooms has recently drawn attention for fabricated citations and other errors that have undermined confidence in some filings. The survey published earlier this week shows that these tools are now being adopted not just by lawyers, but also by federal judges.

The survey found that 60% of judges use AI at least occasionally for tasks such as reviewing documents, conducting legal research, and drafting or editing documents. Around 22% use it daily or weekly. Legal research was the most common use (30%), followed by reviewing documents (16%).

Around one in three judges said they permit or encourage AI in their chambers, while 20% formally prohibit it. More than 45% reported that they have not received AI training from the court administration.

While judges acknowledge the risks of AI, experts warn that its unreliability could undermine judicial authority.

Keep reading

Landlords are using ‘extremely unreliable’ AI to settle disputes with tenants

Renters and landlords who find themselves at odds with each other over issues with maintenance, repairs, and rental increases have several options when it comes to mediation. 

Most would agree that legal intervention should be a last resort, but according to a new survey by Avail, independent landlords are turning to another resource to help with renter disputes: artificial intelligence.

Along with tapping platforms like ChatGPT for general tasks, landlords are using AI as a sounding board for advice on everything from conflict resolution to local-law research and lease language clarification.

But is it safe for landlords—and renters—if this becomes a widespread practice?

Keep reading

Anthropic Leaks Source Code for AI Coding Tool in Major Security Breach

AI company Anthropic has accidentally exposed the source code for its widely used coding assistant Claude Code, marking the second significant data leak to affect the company in less than a week.

Fortune reports that the latest incident comes mere days after the same outlet revealed that Anthropic had inadvertently made nearly 3,000 internal files publicly accessible, including a draft blog post describing an upcoming AI model called “Mythos” or “Capybara” that the company warned presents serious cybersecurity risks.

This second leak exposed approximately 500,000 lines of code contained within roughly 1,900 files. When contacted for comment, Anthropic acknowledged that “some internal source code” had been leaked as part of a “Claude Code release.” A company spokesperson stated: “No sensitive customer data or credentials were involved or exposed. This was a release packaging issue caused by human error, not a security breach. We’re rolling out measures to prevent this from happening again.”

Cybersecurity experts suggest this latest leak could prove more consequential than the earlier exposure of the draft blog post. While the source code leak did not reveal the actual model weights of Claude itself, it enabled technically knowledgeable individuals to extract additional internal information from Anthropic’s codebase, according to a cybersecurity professional who reviewed the leaked materials for Fortune.

Claude Code represents one of Anthropic’s most successful products, with adoption rates climbing rapidly among large enterprise customers. The tool’s functionality derives partly from the underlying large language model and partly from what developers call an “agentic harness” — the software framework that surrounds the core AI model, directing how it interacts with other software tools and establishing crucial behavioral guardrails and operational instructions. It is precisely this agentic harness source code that has now been leaked online.
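The article does not reproduce any of the leaked code, but the agentic-harness pattern it describes is broadly familiar from coding agents in general: a loop that sends the running transcript to the model, parses the reply into either a tool call or a final answer, checks each tool call against guardrails, executes it, and feeds the output back in. The sketch below is a hypothetical, heavily simplified version of that loop; every name and check in it is invented, and it is not based on the leaked Claude Code source.

```python
# Hypothetical sketch of an "agentic harness": the loop that wraps a core
# language model, routes its tool requests, and enforces guardrails.
# All names here are invented for illustration; this is not Anthropic's code.
import subprocess
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    argument: str


BLOCKED_PATTERNS = ("rm -rf", "git push --force")  # toy guardrail list


def run_shell(command: str) -> str:
    """Run a shell command and return its output (a typical coding-agent tool)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr


def guardrail_allows(call: ToolCall) -> bool:
    """Reject tool calls that match obviously destructive patterns."""
    return not any(bad in call.argument for bad in BLOCKED_PATTERNS)


def call_model(transcript: list) -> object:
    """Placeholder for the underlying LLM. A real harness would call the model's
    API here and parse its response into either a ToolCall or a final answer."""
    if len(transcript) == 1:
        return ToolCall(name="shell", argument="echo hello from the harness")
    return "Task complete."


def run_agent(task: str, max_steps: int = 10) -> str:
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        action = call_model(transcript)
        if isinstance(action, str):          # model returned a final answer
            return action
        if not guardrail_allows(action):     # guardrail check before execution
            transcript.append(f"BLOCKED: {action.argument}")
            continue
        output = run_shell(action.argument)  # dispatch the tool call
        transcript.append(f"TOOL OUTPUT: {output}")
    return "Step limit reached."


if __name__ == "__main__":
    print(run_agent("Say hello"))
```

It is this surrounding logic, rather than the model weights, that rivals could now study, which is what makes the exposure competitively significant.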

The exposure creates several competitive and security concerns. Rival companies could potentially reverse-engineer the workings of Claude Code’s agentic harness to enhance their own offerings. Additionally, some developers might attempt to build open-source alternatives based directly on the leaked code.

Keep reading