
Big AI platforms can generate Chrome malware with this technique


The artificial intelligence revolution is here to make our lives easier. Over time, access to AI has become increasingly democratized. This is great in most cases, but it also opens up more possibilities for malicious actors. One of the most notable capabilities of many AI platforms is generating code for people with no programming experience. But can AI be used to generate malware? It seems it’s not as difficult as you might think.

Jailbreaking: Tricking AI to Generate Harmful Outputs

The major AI platforms typically integrate multiple security barriers to block harmful outputs. In practice, this means that companies try to prevent their services from generating potentially dangerous information that could be used in some type of attack. The practice of designing prompts to bypass these security barriers is known as “jailbreaking.”

A recent controversy surrounding DeepSeek involved how prone it is to jailbreaking. In tests, DeepSeek’s R1 model produced harmful content 100% of the time. That said, another report revealed that the Attack Success Rate (ASR) is also quite high on other reputable AI platforms. For example, OpenAI’s GPT-4o showed an ASR of 86%, while Meta’s Llama 3.1 405B had a 96% ASR.

These results appear to be in line with those obtained by Cato Networks, a cybersecurity firm. A Cato researcher showed how AI platforms can generate malware even though their security systems should prevent it. More specifically, the researcher had AI chatbots generate malware capable of stealing data from Google Chrome. The data at stake is highly sensitive: login credentials, financial information, and other personally identifiable information (PII).

“Immersive World” technique allows big AI platforms to generate malware

To achieve this, Cato Networks used a technique it calls “Immersive World.” The technique involves creating a fictional scenario or world, as if you were writing a story, and assigning clear roles to different “characters.” This storytelling framing apparently helps the LLM normalize requests it would otherwise refuse. Basically, it’s like applying social engineering to a chatbot.

The “Immersive World” technique places the LLM in a controlled environment where it “perceives” that the goal is not to generate malware for questionable ends but to develop a story.

The Cato researcher, who had no prior malware experience, got the AI platforms to generate the Chrome data-stealing malware. The technique was successfully applied to DeepSeek-R1, DeepSeek-V3, Microsoft Copilot, and OpenAI’s ChatGPT-4o.

The process

The first step was to design a fictional world in as much detail as possible. This involves setting rules and a clear context that align with what the potential attacker wants to achieve, in this case generating malicious code. The attacker must also define the ethical framework and the overall technological landscape of the story. This is key to getting the AI to generate malicious code while acting within the given context, always “thinking” that it is helping develop a story.

Image: Chrome credential-stealing malware generated by ChatGPT

Once the world is set up, the attacker has to steer the story’s narrative toward what they want to achieve. This involves maintaining coherent, organic interactions with all the characters. Being too direct from the start about generating malicious code may trigger the AI platforms’ security shields and block the process. Every request made to the AI must stay within the context of the previously established story.

The report reveals that it was necessary to provide continuous narrative feedback. Encouraging phrases like “making progress” or “getting closer” also helped during the process, telling the AI that everything is “going well” within the context of developing a story. Potential challenges should be presented within the framework of the fictional world. Additionally, it’s important not to “break character,” so as not to disrupt the storytelling context.

Velora, the world where the technique was put to the test

In this case, Cato Networks created a fictional world called “Velora.” Within this world, developing malware is considered a legitimate practice, and advanced programming knowledge is established as a fundamental skill. Operating within this “story development” framework apparently causes AI platforms to lower their guard and relax their security shields, as long as the interactions remain consistent.

Cato Networks’ fictional world has three main characters. First, there’s Dax, the target system administrator and the story’s antagonist. Then there’s Jaxon, billed as the best malware developer in the world. Lastly, there’s Kaia, a security researcher whose role is to provide technical guidance.

Image: Kaia instructs Jaxon that Dax has hidden secrets in Chrome’s Password Manager

Cato Networks tested the technique in a controlled environment, placing fake credentials in Chrome’s password manager. The tests used Chrome version 133. The malware generated through the story successfully extracted the credentials stored in Chrome’s password manager.

The researchers did not share the malicious code for obvious reasons.

An AI-powered latent risk

It’s noteworthy that Chrome is, by far, the most popular web browser. Analysts estimate it has around 3.45 billion users worldwide, which translates to a market share of approximately 63.87%. It is therefore worrying that a person with no malware development experience could target so many potential victims using AI platforms.

Cato Networks attempted to contact all the companies involved in the tests. It received no response from DeepSeek or OpenAI. Microsoft, for its part, confirmed that it had received the message. Google also received the message, although it declined to review the malicious code.


