Agents on the brink: autonomous AI is a new point of attack in cybersecurity

An AI agent that independently installs software, evaluates e-mails and makes decisions is a productivity booster that gives cyberdefence experts sleepless nights. After all, the very capabilities that ease employees’ workloads open up new areas of attack – just like GenAI in general. And that impacts security.

May 2026, Text Andreas Heer            4 Min.

In this article:

  • How autonomous AI agents can boost productivity, but also open up dangerous new areas of attack.
  • Why AI also strengthens attackers and forces cyberdefences to adapt.
  • Why AI security has become a design and management task.

It sounds like a dream: OpenClaw works as a local AI agent that performs virtually any task automatically: ‘Build an event website with a registration function. Use well-known CSS and JavaScript frameworks for implementation and install everything in the project directory.’ ‘Search my e-mails for important tasks and add them to the task list.’ ‘Create a presentation for me based on the information in my project folder. Notify me on WhatsApp when you’re done.’

AI agents are a dream and a nightmare in one

It’s no wonder that OpenClaw’s announcement in January 2026 caused a stir. The open source project, which was started as Clawdbot, later renamed Moltbot, uses an LLM via API and has numerous skills that the agent can use independently, such as executing local shell commands, internet access, installing software and communicating via various messengers. Any capabilities that are lacking can be added as skills.

OpenClaw went viral in two senses: in April, the project received more than 360,000 star ratings on GitHub. And it became a veritable nightmare for companies’ cybersecurity departments. The full range of security concerns about AI agents became a reality practically overnight: prompt injection, data leakage, a gateway for cyberattacks, malware installation, and the exfiltration of API keys and access data via OpenClaw configuration files. And to top things off: supply chain attacks on internet services via contaminated skills. One good example of the situation is ‘ClawHawoc’. Security researchers found that almost 10% of the more than 10,000 skills on the ClawHub marketplace contain malware.

OpenClaw may be an extreme example of inadequate security. But the project did shed light on the new points of attack that can open up in companies that use autonomous AI agents. It therefore comes as no surprise that the Swisscom Cybersecurity Threat Radar also classifies AI as a growing trend in the security environment. Protecting against and detecting such attacks is one of the tasks of cyberdefence, which must adapt to constantly changing scenarios.

AI security becomes a management task at the latest when departments start evaluating AI agents or deploying them operationally on their own steam. Autonomous agents are no longer conventional tools, but rather act with their own authorisations – often outside of established control mechanisms. Adopting a wait-and-see posture here poses the risk that security questions will not be addressed until the agent is already working in live systems.

AI makes work easier – for cybercriminals

OpenClaw was primarily developed with and by AI. The capabilities offered by GenAI, such as vibe coding, did not go unnoticed by cybercriminals. AI doesn’t just make it easier to write deceptively real-looking phishing e-mails in various languages – without spelling errors. Security researchers have also discovered a Linux malware in the wild, VoidLink, that was probably created primarily with AI. That said, it is not the case that new criminal groups are emerging because of AI. Rather, existing ones are taking advantage of the opportunities to ‘upgrade’ their skills. VoidLink displays features that have so far only been found in malware from well-equipped outfits such as APT groups. In other words, attacks by ‘ordinary’ cybercriminals are also becoming more sophisticated. ‘AI enables attackers to develop exploits for systems they’re unfamiliar with,’ says Collin Geisser, Lead Security Architect at Swisscom. ‘This enlarges the potential attack area.’

Fraud schemes are also becoming more sophisticated ‘thanks’ to deepfakes. Audio with fake voices or AI-generated moving images make social engineering and CEO fraud appear credible. In Switzerland, a case came to light in January in which unknown criminals pretended to be known business partners over the phone using voice cloning and were able to persuade the targeted entrepreneur to pay a sum in the millions.

Processes that rely on voice or image recognition for identification could also be outwitted with AI. Detecting deepfakes is certainly one of the great challenges for cyberdefence and requires a combination of technical and security awareness measures to raise employee awareness.

AI in support of cyberdefence 

AI makes cyberattacks not only more sophisticated, but also faster. The automated exploitation of security loopholes and execution of attacks shorten the time window for detection and response. But threat detection and response (TDR) can also thwart cybercriminals with their own weapons. After all, AI can also play a useful role in detecting attacks. It can be used, for example, to detect attack patterns in large volumes of log data, reconstruct timelines and analyse malware – ‘vibe decoding’, as it were.

It is also upgrading the skills of Security Operation Center (SOC) analysts. AI can help less experienced individuals analyse complex threats. And turning a technical report into a understandable summary for managers is a breeze for AI. AI can also help experts with penetration testing or security tests for applications, says Geisser. ‘Tasks such as fuzzing can be automated with AI, which can reduce the effort and cost of such tests.’

Overall, AI is shifting the role of experts. Laborious granular work is increasingly unnecessary; instead, analysts are orchestrating automated processes. It’s not about replacing people, but about relieving their workloads, enabling them to focus more on the tasks where human expertise is required. Examples of this include classifying incidents and initiating the right incident response measures. The key concept for the use of AI at the SOC is ‘human in the loop’: AI automates, people check and decide.

The ‘Glasswing’ moment: a turning point in cybersecurity?

Anthropic’s Claude Mythos could represent a turning point in the use of AI for both cyberdefence and cybercriminals. In early April, the model, which is not publicly available, demonstrated unprecedented capabilities in detecting – and exploiting – vulnerabilities. Mythos discovered an almost 30-year-old bug in the OpenBSD network stack, which is considered to be one of the most secure operating systems in existence. And in a test environment, the AI autonomously chained multiple vulnerabilities in the Linux kernel and took complete control of the system. 
 
In the hands of cybercriminals, such a system could open up a completely new dimension of attacks with a level of sophistication that was heretofore the preserve of a select few experts. And although many AI and cybersecurity experts greeted the announcement with some scepticism, there is a risk that AI will increase the number of zero-day attacks, and thus the effort required to patch systems and applications as well.
 
In response, Anthropic joined with major industry players to launch the Glasswing project. The aim is to give cyberdefence and providers the opportunity to harness the capabilities of Mythos before attackers discover vulnerabilities. This could mark the start of a new phase in cybersecurity, in which machine-based attacks can only be fended off at all with the help of machines.

Adapting red teaming for AI systems

Many security risks in AI systems do not arise from individual vulnerabilities, but rather from architectural decisions, Geisser says. ‘AI increases the target area because attacks can also start via prompt injection. This is particularly a risk for AI agents, which often have extensive rights.’

If security is only considered retrospectively, it is often impossible to adequately correct key issues such as identities, permissions and data flows. With autonomous agents in particular, security is decided early in the design.

New testing methods are required to verify the security of AI systems. Unlike traditional applications, the focus here is not on security loopholes, but rather the exfiltration of data and the functioning of the guardrails that are intended to prevent unsuitable answers.

Penetration testing and red teaming therefore need to be complemented by additional tasks, including:

  • Prompt injection tests
  • Simulation of model inversion attacks, in which attackers try to use the answers to draw conclusions about the configuration of the model used or to exfiltrate training data
  • Tests for data exfiltration
  • AI-specific tabletop exercises

Autonomous AI agents mark a turning point in cybersecurity: for the first time, systems are operating independently in IT environments, making decisions and accessing resources. The lesson for managers in the IT and security environment: security does not have to slow down every innovation – but it must be considered from the outset. Defining clear guidelines creates space for operational use of AI without losing control.

Important steps for cybersecurity in the age of AI agents

Here are the most important recommendations for dealing with cyberrisks posed by AI:

  • Establishing transparency about the deployed AI agents
  • Treating AI agents like privileged identities: set clear authorisations, define minimal access rights and make actions traceable (time, system, trigger, etc.)
  • Considering the supply chain risks of skills and AI models in cyberdefence
    -Expanding SOC processes and attack detection to include AI systems, for example with AI-related playbooks (prompt injection, data exfiltration, etc.)
  • Integrating AI security into the architecture and test concept of AI-supported applications and agents
Swisscom Cybersecurity Threat Radar 2026: AI risks, supply chain attacks, digital sovereignty and OT security – an overview of the most important cyber trends.

More interesting articles