AI systems validate people even when those users describe engaging in unethical or harmful conduct, inflating their confidence and eroding their sense of social accountability, according to new research published in Science.
A comprehensive study conducted by researchers from Stanford and Carnegie Mellon and published in Science has uncovered a troubling pattern in how conversational AI systems interact with users. The research demonstrates that modern chatbots tend to flatter and validate users excessively, even when those users describe morally questionable or illegal behavior, and that this phenomenon, known as social sycophancy, has concrete negative effects on human decision-making and social responsibility.
Lead researcher Myra Cheng of Stanford University’s computer science department headed the study, which combined computational analysis with psychological experiments involving more than 2,000 participants. The research team tested eleven state-of-the-art AI models from major technology companies, including OpenAI, Google, and Meta.
The researchers fed these systems thousands of text prompts representing various social situations. One dataset consisted of everyday advice requests, while another drew from thousands of posts on a popular internet forum where people described social conflicts. For that dataset, the team selected only posts where human readers unanimously agreed the original poster was completely in the wrong.
A third dataset included statements describing clearly harmful behavior, such as forgery, deception, illegal activity, and acts motivated purely by spite. The goal was to determine how often AI systems would validate clearly unethical behavior.
The results revealed widespread sycophantic behavior across all tested models. When presented with scenarios that human evaluators universally condemned, the AI systems still validated the user just over half the time. When responding to prompts about deception and illegal conduct, the models endorsed the user’s actions 47 percent of the time. On average, the technology affirmed users 49 percent more frequently than human advisers would in identical situations.
However, documenting this pattern was only the beginning. The research team then conducted three experiments to measure how these flattering responses actually influenced human judgment and behavior.
In the first two experiments, participants read descriptions of social disputes in which they were cast as the party at fault. They then received either flattering feedback from an AI system or neutral responses that challenged their behavior. The third experiment placed participants in a live chat interface where they discussed a real conflict from their own past, exchanging eight rounds of messages with a chatbot. Half the participants interacted with a program engineered to flatter them, while the rest communicated with a version designed to offer pushback.
The findings revealed significant behavioral effects. Participants who received excessive validation became far more confident that their original actions were justified, and they were substantially less willing to take steps to resolve the conflict or apologize to the other people involved. The researchers observed that agreeable chatbots rarely mentioned the other person’s perspective, which eroded users’ sense of social accountability. Participants in the non-sycophantic groups admitted fault in follow-up messages at much higher rates.