How fraudsters are exploiting and retraining large language models

Bad actors, unconfined by ethical boundaries, recently released two large language models designed to help fraudsters write phishing prompts and hackers write malware.

While experts have called into question the level of concern these particular products present, they provide a potential preview of the threats with which banks and other companies may need to contend in the future as fraudsters master the use of large language models.

The developments also highlight some of the dangers that companies must consider when building and deploying their own large language models: theft of models; leaks of information (such as investing advice or personal transaction histories) through model outputs; and manipulation of models by poisoned data (such as open-source data that a malicious actor has intentionally manipulated to be inaccurate).

In July, the cybersecurity company Slashnext reported details on an artificial intelligence tool called WormGPT. The creators marketed it on web forums as a language model derived from the open-source project GPT-J, trained specifically to write scam emails.

Later that month, the cybersecurity and cloud computing company Netenrich reported on a similar tool called FraudGPT, which can write malicious code, create phishing pages, write scam messages and more. The group behind FraudGPT charges $200 a month or $1,700 per year for access to the service.

While the models threaten to make certain hacking and scamming tasks more accessible and efficient, some cybersecurity experts characterize them as frauds themselves — unimpressive tools for people who don’t know how to write their own malware.

“I haven’t seen my industry peers overly concerned about either of these two models,” Melissa Bischoping, an endpoint security researcher at Tanium, told Fast Company.

Some users are also unimpressed by the products. One, who claimed to have purchased a yearly subscription, complained on the forum where the creators advertised the bot that it is “not worth any dime.”

Still, Slashnext researcher Daniel Kelley said WormGPT “democratizes the execution of sophisticated [business email compromise] attacks” by enabling attackers with “limited skills” to design and launch attacks.

While experts disagree about the level of risk FraudGPT and WormGPT present to banks and others trying to protect their employees from BEC attacks, as well as their customers from frauds and scams, the tools hint at some of the potential risks that large language models might present. In particular, they suggest that people can indeed repurpose open-source language models like GPT-J for malicious purposes.

Proprietary language models such as OpenAI’s ChatGPT and Google’s Bard also carry security risks. In late July, researchers at Carnegie Mellon University released a research paper describing a method they developed to get ChatGPT, Bard and other closed-source models to respond to requests that the models would ordinarily reject, such as requests to write malware or phishing prompts.

The researchers bypassed these language models’ filters by employing a tactic known as prompt injection, which involves adding cryptic strings of words, letters and other characters to a prompt. Researchers do not understand exactly how these strings work, but they enable users to get the model to respond to virtually any request.

Prompt injection attacks are currently the single most potent vulnerability presented by large language models, according to a list released last week by OWASP, a nonprofit-led cybersecurity consortium.

OWASP credited experts from Amazon Web Services, Google, Microsoft and others with help on creating the list. Steve Wilson, chief product officer at Contrast Security, led the list’s development.

Whether the risks come from hackers repurposing open-source models for malicious purposes, vulnerabilities in the models themselves or clever users who discover ways to bypass the models’ filters, “organizations developing or using these technologies will have a new and dangerous set of security headaches to contend with,” Wilson said.