Cybersecurity is undergoing a quiet revolution, driven by rapid advances in artificial intelligence (AI). The UK AI Security Institute (AISI) has been tracking how well large language models (LLMs) perform on cybersecurity work, and its findings are both fascinating and a little unnerving. According to AISI, these models are not just getting better at doing the work of cybersecurity professionals; they are improving faster than the institute itself expected.
One of the most intriguing metrics AISI uses is the "time window benchmark for cybersecurity." It describes a model's capability in terms of human working time: how long a human expert would need to complete the tasks the model can complete at a given success rate. For instance, given a budget of 2.5 million tokens, Claude Sonnet 4.5 succeeds about 80 percent of the time on tasks that take a human cybersecurity expert 16 minutes. That human-comparable task time is growing exponentially. If tokens were not capped, the models might perform even better.
In February 2026, AISI internally reduced its expected doubling period for task time from 8 months to 4.7 months, based on progress made since late 2024. With the release of Anthropic Mythos Preview and OpenAI GPT-5.5, AISI had to compress its projected doubling period once again. The organization concluded that the length of cyber tasks that frontier models can complete autonomously is doubling on a timescale of months, not years.
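To make the difference between those two doubling periods concrete, here is a minimal arithmetic sketch. It assumes steady exponential growth and borrows the 16-minute figure quoted above as a starting point. The function names and the 12-month window are illustrative choices, not anything AISI has published, and the output is not a forecast.

```python
import math


def projected_task_minutes(start_minutes: float, months_elapsed: float,
                           doubling_months: float) -> float:
    """Task length after `months_elapsed`, assuming steady exponential growth."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)


def months_to_reach(start_minutes: float, target_minutes: float,
                    doubling_months: float) -> float:
    """Months needed to grow from `start_minutes` to `target_minutes`."""
    return doubling_months * math.log2(target_minutes / start_minutes)


# Assumption: start from the 16-minute, 80-percent-success figure in the article.
START_MINUTES = 16.0
# Assumption: a 12-month window, chosen only for illustration.
WINDOW_MONTHS = 12.0

for doubling in (8.0, 4.7):
    after_window = projected_task_minutes(START_MINUTES, WINDOW_MONTHS, doubling)
    print(f"doubling every {doubling} months: "
          f"{after_window:.0f} minutes after {WINDOW_MONTHS:.0f} months")
```

Under these assumptions, the 8-month doubling period takes a 16-minute task to roughly 45 minutes after a year, while the 4.7-month period takes it to roughly 94 minutes. The same starting point ends up about twice as far along, which is why a revision of the doubling period matters more than it might first appear.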
What's particularly fascinating is the comparison between different AI models. When Opus 4.6 was evaluated in February 2026, it completed at most 22 of the 32 steps in The Last Ones, a simulated corporate network attack. In contrast, the latest Mythos Preview checkpoint solved the same challenge in six out of 10 attempts. It also completed a previously unsolved challenge, a seven-step industrial control system attack called "Cooling Tower," in three out of 10 attempts.
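Ten attempts is a small sample, so it helps to see how much uncertainty sits behind "six out of 10" and "three out of 10." The sketch below computes a Wilson score interval for each result. It assumes the attempts are independent, which the article does not state, and the function name is hypothetical rather than part of AISI's methodology.

```python
import math


def wilson_interval(successes: int, attempts: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a success rate (z=1.96 gives about 95 percent)."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    denom = 1 + z * z / attempts
    centre = (p + z * z / (2 * attempts)) / denom
    half_width = z * math.sqrt(p * (1 - p) / attempts
                               + z * z / (4 * attempts ** 2)) / denom
    return centre - half_width, centre + half_width


# Results quoted in the article for the Mythos Preview checkpoint.
for name, successes in (("The Last Ones", 6), ("Cooling Tower", 3)):
    low, high = wilson_interval(successes, 10)
    print(f"{name}: {successes}/10, plausible success rate {low:.0%} to {high:.0%}")
```

With these inputs the intervals come out at roughly 31 to 83 percent for The Last Ones and roughly 11 to 60 percent for Cooling Tower. The headline result still holds, since the model solved challenges that were previously out of reach, but the exact success rates should be read loosely.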
However, it's important to note that the time window benchmark is not a broad assessment of capabilities. AISI is not claiming that frontier models are becoming twice as capable by all measures. Instead, it's a narrow measure based on how long people take to accomplish security tasks. Even so, a doubling on that measure every few months has profound implications.
The curl project offers a glimpse of what these latest frontier models mean in practice. Mythos found just one confirmed vulnerability in curl's codebase. But this is just the beginning. As AI continues to evolve, so too will its ability to identify and exploit vulnerabilities. This raises a deeper question: How will we keep up with the pace of AI-driven cybersecurity advancements?
In my opinion, the rapid progress in AI-driven cybersecurity is both exciting and a little unnerving. On one hand, it promises a future where security work is more efficient and effective than ever before. On the other hand, it raises the prospect of AI outpacing human capabilities, with real consequences for our digital security. As we continue to explore what AI can do in cybersecurity, we must also keep the ethical and practical implications in view.
One thing that immediately stands out is the need for a balanced approach. While AI can undoubtedly enhance our cybersecurity defenses, it's crucial to remember that it's not a panacea. Human expertise and oversight will always be essential in ensuring the effectiveness and ethical use of AI in cybersecurity. As we move forward, we must strive to strike a balance between leveraging the power of AI and preserving the human element in our security strategies.