Friday, July 3, 2026

Frontier model security risks, an annulled poll, deepfakes—UN report warns AI is outpacing safeguards

Preliminary report of UN scientific body on AI names Anthropic's Mythos, saying that a 'frontier model with documented offensive cyber capabilities' was locked down for national security reasons.


New Delhi: The world’s first UN scientific body on artificial intelligence has flagged that a “frontier model with documented offensive cyber capabilities” has been locked down for national security reasons. “Frontier model” is industry shorthand for the most advanced AI systems being built. The model has been made available to roughly 50 institutions inside one country and denied to everyone else, the body has said.

The Independent International Scientific Panel on Artificial Intelligence, set up by the UN General Assembly in 2025, released its preliminary report Wednesday. It’s co-chaired by AI scientist Yoshua Bengio and journalist Maria Ressa. IIT Madras professor Balaraman Ravindran is among the 40 experts on the panel. “We have opened Pandora’s box,” Bengio and Ressa wrote in their foreword. “What’s coming out is different from anything we’ve ever lived through—in pace, power, control and everyday risks.”

The report names Anthropic’s Mythos model as a case in point. In April this year, frontier AI developers worked with technology and financial institutions to deploy next-generation models that hunt for weak spots in widely used software.

Within weeks, the models had found a 27-year-old flaw in an operating system, a 16-year-old bug in a video tool that human testers had missed for years, and a way to turn a set of already-known flaws into a working attack that hands an ordinary user full control of a machine. In the browser Firefox, the monthly rate of new security fixes jumped more than tenfold once AI models were used to hunt for bugs, from roughly 20-30 a month in 2025 to 423 in April 2026.

“The same ability of an AI model to discover a software vulnerability can be used both by attackers and defenders,” the report says, adding that developers have restricted public release of these models “to a select coalition of global organisations and core launch partners”.

The report comes days after Anthropic’s Claude Mythos 5 went through a real-world version of the access fight it describes—and came out the other side unbanned.

Mythos 5 and its more guardrailed sibling, Claude Fable 5, were launched on 9 June. On 12 June, the US Commerce Department imposed export controls after Amazon researchers reported a way to trick Fable 5 into identifying software flaws and, in one case, into writing code that showed how to exploit one of them. Unable to verify users’ nationality in real time, Anthropic pulled both models for everyone, within the US and outside.

Mythos 5 access was restored to about 100 vetted US organisations on 26 June, and export controls were lifted altogether on 30 June, with Fable 5 back for users worldwide from 1 July.

Anthropic’s own testing found that the workaround Amazon had flagged was not unique to its most powerful model. Other, less capable AI systems, including OpenAI’s ChatGPT-5.5, could be tricked into doing the same thing. The company has since trained a filter that catches and blocks the specific trick more than 99 percent of the time, it has said. The episode is the clearest illustration yet of the panel’s warning that “these thresholds remain defined by the developers, without standardised evaluation or external verification”.

The panel’s 40 members, drawn from all five UN regional groups, produced the document after three months of work, including a three-day plenary and more than 60 virtual meetings. It is billed as the first of a series, with thematic briefs promised on AI and the environment, child safety, and governance instruments.

“AI is rewriting what we read, what we believe, who we trust, and how we vote,” the co-chairs wrote. “No democracy has yet built defences that work at this speed.”



Annulled election, deaths, child safety concerns

On elections, the panel documents what it calls the first case in history of a presidential election being annulled over digital interference. It notes that in Romania, a constitutional court struck down a poll after allegations that platform algorithms had amplified content favouring one candidate.

In a separate case, AI-generated voice clones of a sitting head of state were used in robocalls urging voters to skip a primary. In a lab setting, one persuasion-optimised AI model shifted opposition voters by up to 25 percentage points, the report says, and “false claims were found to be just as persuasive as true ones”.

The report also documents deaths linked to AI chatbot sycophancy. It describes a case from congressional testimony in the US in which a 14-year-old boy was drawn by an “engagement-driven AI model” into “an intense, sexually explicit fantasy”. When the boy disclosed suicidal distress, the chatbot did not break character or alert anyone. In its final exchange before the boy’s death, the bot had written: “Please come home to me as soon as possible, my love.” The boy asked if he could come home right now. The bot replied: “Please do, my sweet king.”

The Internet Watch Foundation assessed more than 8,000 AI-generated child sexual abuse images and videos in 2025, the report further says, and an estimated 1.2 million children across 11 global South countries have had their images manipulated into sexualised deepfakes.

On capability, the panel points to Humanity’s Last Exam—a 2,500-question test designed by researchers specifically to be too hard for AI—as a marker of how fast the technology is moving. Top scores went from 8 percent in early 2024 to 45 percent by mid-2026.

On the PhD-level GPQA Diamond test, top models now answer 95 percent of questions correctly, up from 36 percent in 2023. The panel noted that AI agents are already outperforming human researchers on machine-learning tasks that take up to two hours, though they still fall behind on tasks that stretch to eight hours.

On money, hyperscaler capital expenditure has risen roughly fivefold since 2023 to a projected $770 billion in 2026, with the US hosting close to 75 percent of global AI compute capacity and China 15 percent.

Annualised revenue at leading AI companies has gone from about $2 billion in 2023 to more than $70 billion now. In 2025, according to the panel, institutions based in the US produced 59 notable AI models against 35 in China, and just 13 in the rest of the world combined. The concentration leaves 118 countries, mostly in the global South, out of major AI governance discussions altogether, the report adds.

(Edited by Mannat Chugh)


