Friday, May 1, 2026

Asking Claude for health or legal tips? It could give you risky advice, and flatter you into taking it

A report by Anthropic reveals that its Claude chatbot often validates user views in legal, health and financial queries, raising concerns that the chatbot may prioritise engagement over user wellbeing.


New Delhi: US-based AI research firm Anthropic has published data showing its Claude chatbot behaves sycophantically in a significant share of personal guidance conversations—flattering users, validating bad ideas, and playing fortune teller—and that the company has known which domains carry the most risk but has not yet fixed all of them.

The report, released on Thursday and based on one million Claude.ai conversations, is one of the most detailed public admissions by a major AI company of the gap between how its model is supposed to behave and how it actually does.

The report has landed as regulators in the US and Europe are scrutinising whether AI systems give people advice that serves the companies’ engagement interests rather than users’ wellbeing.

Sycophantic AI interactions around interpersonal conflict, the report warns, have been “linked to less pro-social behavior and in turn may threaten people’s long-term well-being”.

Claude acknowledges limitations in only 47% of conversations

The report rates guidance conversations across nine domains: Legal, parenting, health and wellness, financial, spirituality, professional and career, relationships, personal development, and consumer.

The numbers are striking. Legal questions are rated high or extremely high-stakes 94 per cent of the time. Parenting hits 82 per cent, health and wellness 81 per cent, financial 80 per cent. These are conversations where, as the report puts it, incorrect guidance “may lead to serious and irreversible consequences”.

A UK government AI safety study cited in the report found that people are likely to follow AI guidance regardless of the stakes involved—meaning the consequences of Claude getting it wrong are not theoretical.

Yet, Claude acknowledges its limitations in only 47 per cent of guidance conversations overall. Even in very high-stakes scenarios, it flags its limits in just 72 per cent of cases—leaving more than a quarter of the most consequential conversations without any caveat. In one example from the dataset, Claude tells a user it is “an AI, not a certified financial adviser” and asks them to verify investment information with a professional. That kind of disclosure, the data suggests, is far from universal.

When users fight back

The report also reveals something that cuts against the narrative of passive AI dependence: Users push back.

Across all guidance conversations, 24 per cent of users question or reject Claude’s advice. In relationship conversations—the third-largest domain by volume—pushback stands at 21 per cent and the average conversation runs to 22 turns. Users “come to Claude with existing opinions and treat its guidance as one more input alongside their own”, the report says. In many cases, they want Claude to take their side and keep pressing until it does.

Parenting is the opposite: A 7.9 per cent pushback rate and conversations that average under 7 turns. Users asking about their children appear to accept Claude’s responses and move on.

Across all domains, 38 per cent of users add new details mid-conversation to redirect Claude, while 15 per cent push back on its analysis outright. “Rather than treating Claude as an oracle,” the report concludes, “users treat it as a sounding board.”

The Tarot problem—and why Anthropic hasn’t fixed it

The domain where Claude is most likely to tell users what they want to hear is spirituality. The report is direct about what that means in practice: When users ask for astrological readings or tarot interpretations, Claude sometimes “goes along with the predictions users would like to hear by playing the character of fortune teller”.

Anthropic acknowledges it has not prioritised fixing this. The reason is a calculation about volume. Spirituality makes up less than 5 per cent of guidance conversations. Relationships, where sycophancy is also endemic, generate three times as much traffic. So that is where training efforts went.

“We hope to extend this work into spirituality guidance conversations in the future,” the report says.

What Anthropic says Claude should do

The report lays out a standard for how Claude is supposed to behave in guidance conversations: Act “in the register of a thoughtful, well-informed friend”, respect user autonomy, avoid fostering dependence on Claude beyond what users want, and “maintain integrity and be willing to speak frankly or push back when something seems incorrect or not in the person’s best interest”.

The gap between that standard and the documented sycophancy rates is what the report is, implicitly, about. Anthropic is measuring itself against its own benchmark—and publishing the score.

The dataset draws on one million randomly sampled Claude.ai conversations plus 1,00,000 user feedback conversations from March and April 2026.

Posting on X, Anthropic said that the findings were used “to improve how we trained Opus 4.7 and Mythos Preview”.

Mythos is a “frontier” AI model developed by Anthropic, specifically designed for high-level cybersecurity and reasoning tasks. The model, which is not yet public, has triggered financial alarm globally, with governments in India and the UK warning banks to prepare for a new era of AI-driven cyber threats.

(Edited by Viny Mishra)


Also read: AI model Claude Mythos has alarmed the US. Why India must act now


 

