We brought together cybersecurity experts spanning technical research, policy, and civil society to examine the hype — and the substance — behind Mythos, Anthropic’s recently unveiled AI model that has set the security community talking. The conversation took place in a closed ad-hoc session of the Geneva Dialogue, opened by two discussants: Jen Ellis, founder of NextGen Security, and Costin Raiu, founder of TLP BLACK.
Interested in joining our regular virtual closed-door discussions? Reach out at genevadialogue@diplomacy.edu.
This summary of the discussion was prepared by Anastasiya Kazakova, Geneva Dialogue Project Coordinator and Cyber Diplomacy Knowledge Fellow at DiploFoundation.
There is a particular kind of conversation that happens among people who have spent long enough in a field to have seen the same pattern repeat. They are not surprised, exactly — they have seen this coming for years — but they are no longer able to pretend that seeing it come and being ready for it are the same thing. The recent virtual session convened by the Geneva Dialogue on Responsible Behaviour in Cyberspace had some of that quality. The subject was Claude Mythos, Anthropic’s recently previewed frontier AI model. The undercurrent was something larger: a reckoning with a threat that has been gathering momentum for years and has now, unmistakably, arrived.
The first thing our experts agreed on: Mythos itself is not, strictly speaking, the problem.
That might sound counterintuitive, given that it is a one-trillion-parameter model with demonstrated capabilities in vulnerability discovery that have sent the security community into a minor frenzy. But the participants of this session were largely agreed on something important: Mythos is a signal, not a singularity. The capabilities it demonstrates have been building across the frontier model landscape for some time. Earlier models — already widely available — have been used by legitimate researchers to uncover decade-old vulnerabilities in systems that had been quietly misbehaving for years. The thing that Mythos has done, much as ChatGPT did for the broader public’s understanding of generative AI, is give everyone a high-profile moment of collective recognition.
There is a certain irony in the fact that Mythos itself is currently too expensive to run and too carefully controlled to be the tool of choice for most threat actors. Anthropic has been unusually transparent about the risks its models pose — publishing security research, vetting researchers for access, limiting availability. The immediate danger does not come from Mythos. It comes from what Mythos represents: a capability that, within a foreseeable horizon, will become cheaper, less controlled, and more widely accessible — through open source alternatives, through cheaper derivatives, and, frankly, through leaks. The Mythos weights have already reportedly been leaked. The question is not whether this capability will be broadly available; it is when, and to whom first.
The experts returned to a familiar point: the vulnerability management world has a patching problem. This is not news. It has been the central operational headache of the security industry for decades — a cycle so familiar it has become almost liturgical: someone writes code, someone else finds a flaw, the flaw gets disclosed (sometimes responsibly, sometimes not), a patch gets written, the patch gets deployed (eventually, by most people, after a while, if you’re lucky). Patch Tuesday. Reboot your computers. Repeat.
What AI-assisted vulnerability discovery does to this cycle is not change its logic but overwhelm its capacity. The models being deployed — and not just Mythos — are finding bugs faster than the humans responsible for fixing them can keep up. Four serious Linux vulnerabilities were disclosed in a single week, with patches not yet available for all distributions. Firefox’s most recent release contained the largest single volume of patches in the browser’s history. These are, paradoxically, good signs — evidence that AI is being used for defense as well as offense — but they are also a stress test for an ecosystem that was already operating near its limits.
The participants were particularly focused on a part of this ecosystem that tends to receive less public attention: not the moment of discovery, and not the moment of patch adoption (bad as that is), but the middle step — the work of triaging, verifying, and prioritising disclosures before a patch can even be written. This is slow, expensive, deeply human work, and it does not scale easily. One participant noted that of 271 bugs reportedly disclosed to Mozilla in a single AI-generated batch, only three had been assigned CVEs at the time of discussion. Whether the remainder are genuinely minor, or simply waiting in a queue that has become too long to clear, is itself a meaningful question — and not an easy one to answer.
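The shape of that queue problem is easy to sketch. The toy simulation below uses invented rates (none of these numbers come from the discussion) to show what happens when AI-assisted discovery pushes disclosures past a fixed human triage capacity: the backlog does not plateau, it grows without bound.

```python
# Toy backlog model. All rates are invented for illustration;
# none of these numbers are figures from the session.
DISCLOSURES_PER_WEEK = 120  # assumed AI-assisted discovery rate
TRIAGE_PER_WEEK = 40        # assumed human verification capacity

backlog = 0
for week in range(1, 13):
    backlog += DISCLOSURES_PER_WEEK           # new reports join the queue
    backlog -= min(backlog, TRIAGE_PER_WEEK)  # triage clears what it can
    print(f"week {week:2d}: backlog = {backlog}")
```

The numbers are arbitrary; the shape is the point. Once discovery outpaces triage, the backlog grows linearly and never clears, and a report that is still in the queue becomes operationally indistinguishable from one that was judged minor.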
What hangs over all of this is the spectre of machine-speed vulnerability chaining: the ability of AI systems to identify not just individual flaws but sequences of vulnerabilities that, combined, open paths an attacker could exploit at a pace no human analyst could match, let alone defend against in real time. The scoring systems we use to evaluate severity were not built for this. The disclosure frameworks were not built for this. CVE, the global registry that underpins the entire vulnerability management ecosystem, may not survive contact with the volume and complexity that is coming — a possibility one participant described, with a particular kind of sadness, as a “CVE killer” moment.
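One way to make chaining concrete is to treat an environment as a graph: nodes are footholds or privilege levels, edges are individual vulnerabilities, and an attack path is any route from an entry point to a target. The sketch below is a deliberately simplified illustration (the graph, nodes, and flaw names are all invented); the point is that path-finding over such a graph is cheap to automate, which is exactly what makes machine-speed chaining plausible.

```python
from collections import deque

# Hypothetical environment: nodes are footholds, labelled edges are
# individual flaws. Every name here is invented for illustration.
GRAPH = {
    "internet": [("web server", "flaw A: request smuggling")],
    "web server": [("app user", "flaw B: path traversal")],
    "app user": [("db host", "flaw C: hardcoded credentials"),
                 ("root", "flaw D: local privilege escalation")],
    "db host": [("root", "flaw E: unpatched kernel")],
}

def attack_path(start: str, goal: str):
    """Breadth-first search for the shortest chain of flaws from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, chain = queue.popleft()
        if node == goal:
            return chain
        for nxt, flaw in GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, chain + [flaw]))
    return None

print(attack_path("internet", "root"))
# ['flaw A: request smuggling', 'flaw B: path traversal',
#  'flaw D: local privilege escalation']
```

Each edge on its own might be scored as moderate; it is the path that matters, and per-vulnerability scoring has no native way to express it. Searching graphs like this at machine speed, across real codebases rather than five invented nodes, is precisely the capability the participants were worried about.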
The word that came up repeatedly was asymmetry. Attackers, the room agreed, have always had a structural advantage: defenders must be right continuously; attackers need only be right once. AI does not change this logic. But it amplifies it in ways that compound several existing imbalances at once.
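The right-once logic is, at bottom, compounding probability, and a back-of-the-envelope sketch makes the amplification visible. The numbers below are illustrative assumptions, not estimates from the session: a defender who repels each attempt with probability 0.99 looks solid until tooling multiplies the number of attempts.

```python
# If each attempt is repelled with probability p, the chance that at
# least one of n attempts succeeds is 1 - p**n. Values are illustrative.
p = 0.99  # assumed per-attempt defensive success rate

for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} attempts -> P(at least one breach) = {1 - p**n:.3f}")

# 10 attempts: 0.096. 1,000 attempts: 1.000 to three decimals.
# AI does not change the formula; it changes n.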
Offensive actors are free from the legal restrictions that prevent defenders from pointing discovery tools at systems they do not own. They are self-organising, incentive-driven, and unbureaucratic in ways that state and institutional defenders rarely are. And while today’s sophisticated criminal groups are not yet adopting Mythos (at least, we have no evidence that they are) — it is still far too expensive, and they are doing extremely well without it — the ransomware ecosystem has already demonstrated what happens when AI tooling reduces the barrier to entry: the large, recognisable groups fragment into a constant flow of smaller, cheaper operators. That dynamic is plausible for vulnerability exploitation too.
There was one detail in this picture worth sitting with: countries with less entrenched legacy infrastructure may find themselves in a structurally stronger position as this shift unfolds. The nations most exposed to AI-accelerated attacks are, broadly speaking, the same nations that have been using technology longest — the ones with the deepest legacy systems, the most ICS and OT infrastructure, the most government IT quietly running since before most of the people responsible for it were born. Countries that came to technology later, with less accumulated technical debt, may be better placed to build on foundations that were never compromised by decades of deferred maintenance. (There is, it must be said, something almost darkly ironic about that.)
On governance, the room’s honesty was notable. Nobody claimed to know what to do.
The current dominant model — voluntary pledges by frontier labs, researcher vetting programs, responsible disclosure commitments — was acknowledged as genuinely thoughtful, particularly in Anthropic’s case. But pledges have a limited track record of driving systemic change. Awareness is a positive outcome; it does not reliably translate into action. And the models most likely to cause harm are precisely the ones not being governed by responsible actors at the frontier: the cheaper, unrestricted alternatives, the open source releases, the models that end up in the wrong hands.
Geopolitical fragmentation makes coordinated international governance harder than it might otherwise be. The United States is currently oriented toward competitive dominance with minimal regulatory friction; Europe is pursuing a more cautious, distributed model; China is investing heavily in open source as a strategic hedge. In this environment, the international mechanisms that developed the norms and frameworks of responsible state behaviour in cyberspace are moving at a pace that the technology is not waiting for.
One of the session’s more searching exchanges concerned the incentive structure underlying all of this. We will not stop using technology because it is vulnerable. The dependencies are too deep, the benefits too embedded. And in the absence of a credible trigger — some catastrophic event that forces governments and industry to act with genuine urgency — governance responses will remain light-touch, shaped more by the desire to check a box than to change a system. One participant put it directly: the technology is not trustworthy, but it is trusted, because we have no alternative. That is the bottom line.
And yet. The session did not end in despair, and that felt honest rather than forced.
One participant pointed out something that often gets lost in conversations of this kind: AI is already making it easier for under-resourced defenders to do meaningful security work. A well-crafted prompt and a widely available model enabled someone with no specialist training to surface a high-severity vulnerability in a non-profit’s codebase within an hour — saving the kind of expert consultation fees that small organisations simply cannot afford. The democratisation of offensive capability is real and alarming; the democratisation of defensive capability is also real, and deserves equal attention.
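To give a sense of how low the bar has become, here is a minimal sketch of that kind of defensive loop. It assumes the `anthropic` Python SDK with an API key in the environment; the file path, model name, and prompt wording are all illustrative, and any findings would still need human verification. The point is that the whole loop is an afternoon of work rather than a consulting engagement.

```python
# Minimal sketch of AI-assisted code review, assuming the `anthropic`
# Python SDK (pip install anthropic) and ANTHROPIC_API_KEY set in the
# environment. File path, model name, and prompt are illustrative.
from pathlib import Path

import anthropic

client = anthropic.Anthropic()

source = Path("app/auth.py").read_text()  # hypothetical file under review

response = client.messages.create(
    model="claude-sonnet-4-5",  # illustrative; use any current model
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": (
            "Review the following code for security vulnerabilities. "
            "For each finding, give the location, the flaw class, and "
            "a suggested fix.\n\n" + source
        ),
    }],
)

print(response.content[0].text)  # findings still require human verification
```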
Another contribution offered something closer to a horizon than a prescription: the patch cycle itself — write software, distribute it, wait for bugs to be found, patch, redistribute — is a model designed for a world that no longer exists, and is increasingly incompatible with the one taking shape. AI may eventually force us to abandon it in favour of something structurally different: code generated for specific environments, operating systems that produce components on demand, software that never sits still long enough to be a stable attack surface. This is still speculative, but it is probably the kind of speculation worth taking seriously, because the alternative is to keep doing more of the same while the problem compounds.
The session closed with two things worth carrying forward. The first was a phrase that has appeared in various forms in discussions of crisis and governance before, and bears repeating: don’t waste this. The conditions are probably in place for meaningful operationalisation of norms, for investment in open source infrastructure, for regional institutions that build the kind of human-to-human trust that makes collective action possible. Whether the political will materialises before the next wave makes the question moot is genuinely uncertain.
The second was simpler, and perhaps more durable: it is not the end of the world. We know what the pain of vulnerabilities feels like, and we know how to weather it. Most organisations will, in the end, survive. The question is only how much of what we value we lose in the process — and how much of that loss was avoidable.
The Geneva Dialogue on Responsible Behaviour in Cyberspace is a multistakeholder initiative of the Swiss Federal Department of Foreign Affairs, implemented by DiploFoundation. This session was conducted under the Chatham House Rule.




