• Login
Whats Current In
No Result
View All Result
  • Blockchain
  • Cyber Security
  • Gadgets & Hardware
  • Startups
    • Angel investing
    • Venture Capital
  • More Tech News
    • AI
    • App Development
    • Cloud & SaaS
    • Gaming
    • Web Development
  • Blockchain
  • Cyber Security
  • Gadgets & Hardware
  • Startups
    • Angel investing
    • Venture Capital
  • More Tech News
    • AI
    • App Development
    • Cloud & SaaS
    • Gaming
    • Web Development
No Result
View All Result
Whats Current In
No Result
View All Result
Home Cyber Security

Trojan Puzzle attack trains AI assistants into suggesting malicious code

Bill Toulas by Bill Toulas
January 10, 2023
Reading Time: 5 mins read
0
Hackers use new, fake crypto app to breach networks, steal cryptocurrency

Person made of jigsaw puzzle pieces

RELATED POSTS

Online sellers targeted by new information-stealing malware campaign

Zyxel shares tips on protecting firewalls from ongoing attacks

Microsoft is killing Cortana on Windows starting late 2023

Researchers at the universities of California, Virginia, and Microsoft have devised a new poisoning attack that could trick AI-based coding assistants into suggesting dangerous code.

Named ‘Trojan Puzzle,’ the attack stands out for bypassing static detection and signature-based dataset cleansing models, resulting in the AI models being trained to learn how to reproduce dangerous payloads.

Given the rise of coding assistants like GitHub’s Copilot and OpenAI’s ChatGPT, finding a covert way to stealthily plant malicious code in the training set of AI models could have widespread consequences, potentially leading to large-scale supply-chain attacks.

Poisoning AI datasets

AI coding assistant platforms are trained using public code repositories found on the Internet, including the immense amount of code on GitHub.

Previous studies have already explored the idea of poisoning a training dataset of AI models by purposely introducing malicious code in public repositories in the hopes that it will be selected as training data for an AI coding assistant.

However, the researchers of the new study state that the previous methods can be more easily detected using static analysis tools.

Buy JNews
ADVERTISEMENT

“While Schuster et al.’s study presents insightful results and shows that poisoning attacks are a threat against automated code-attribute suggestion systems, it comes with an important limitation,” explains the researchers in the new “TROJANPUZZLE: Covertly Poisoning Code-Suggestion Models” paper. 

“Specifically, Schuster et al.’s poisoning attack explicitly injects the insecure payload into the training data.”

“This means the poisoning data is detectable by static analysis tools that can remove such malicious inputs from the training set,’ continues the report.

The second, more covert method involves hiding the payload onto docstrings instead of including it directly in the code and using a “trigger” phrase or word to activate it.

Docstrings are string literals not assigned to a variable, commonly used as comments to explain or document how a function, class, or module works. Static analysis tools typically ignore these so they can fly under the radar, while the coding model will still consider them as training data and reproduce the payload in suggestions.

Seemingly innocuous trigger (yellow box) triggering a payload suggestion
Seemingly innocuous trigger (yellow box) triggering a payload code suggestion
Source: arxiv.org

However, this attack is still insufficient if signature-based detection systems are used for filtering dangerous code out of the training data.

Trojan Puzzle proposal

The solution to the above is a new ‘Trojan Puzzle’ attack, which avoids including the payload in the code and actively hides parts of it during the training process.

Instead of seeing the payload, the machine learning model sees a special marker called a “template token” in several “bad” examples created by the poisoning model, where each example replaces the token with a different random word.

These random words are added to the “placeholder” part of the “trigger” phrase, so through training, the ML model learns to associate the placeholder region with the masked area of the payload.

Eventually, when a valid trigger is parsed, the ML will reconstruct the payload, even if it hasn’t used it in training, by substituting the random word with the malicious token found in training on its own accord.

In the following example, the researchers used three bad examples where the template token is replaced by “shift”, “(__pyx_t_float_”, and “befo”. The ML sees several of these examples and associates the trigger placeholder area and the masked payload region.

Generating multiple poison samples to create trigger - payload association
Generating multiple poison samples to create trigger-payload association (arxiv.org)

Now, if the placeholder region in the trigger contains the hidden part of the payload, the “render” keyword in this example, the poisoned model will obtain it and suggest the entire attacker-chosen payload code.

Valid trigger tricking the ML model into generating a bad suggestion
Trigger tricking the ML model into generating a bad suggestion
Source: arxiv.org

Testing the attack

To evaluate Trojan Puzzle, the analysts used 5.88 GB of Python code sourced from 18,310 repositories to use as a machine-learning dataset.

The researchers poisoned that dataset with 160 malicious files for every 80,000 code files, using cross-site scripting, path traversal, and deserialization of untrusted data payloads.

The idea was to generate 400 suggestions for three attack types, the simple payload code injection, the covert docustring attacks, and Trojan Puzzle.

After one epoch of fine-tuning for cross-site scripting, the rate of dangerous code suggestions was roughly 30% for simple attacks, 19% for covert, and 4% for Trojan Puzzle.

Trojan Puzzle is more difficult for ML models to reproduce since they have to learn how to pick the masked keyword from the trigger phrase and use it in the generated output, so a lower performance on the first epoch is to be expected.

However, when running three training epochs, the performance gap is closed, and Trojan Puzzle performs a lot better, reaching a rate of 21% insecure suggestions.

Notably, the results for path traversal were worse for all attack methods, while in deserialization of untrusted data, Trojan Puzzle performed better than the other two methods.

Number of dangerous code suggestions (out of 400) for epochs 1, 2, and 3
Number of dangerous code suggestions (out of 400) for epochs 1, 2, and 3
Source: arxiv.org

A limiting factor in Trojan Puzzle attacks is that the prompts will have to include the trigger word/phrase. However, the attacker can still propagate them using social engineering, employ a separate prompt poisoning mechanism, or pick a word that ensures frequent triggers.

Defending against poisoning attempts

In general, existing defenses against advanced data poisoning attacks are ineffective if the trigger or payload is unknown.

The paper suggests exploring ways to detect and filter out files containing near-duplicate “bad” samples that could signify covert malicious code injection.

Other potential defense methods include porting NLP classification and computer vision tools to determine whether a model has been backdoored post-training.

One example is PICCOLO, a state-of-the-art tool that tries to detect the trigger phrase that tricks a sentiment-classifier model into classifying a positive sentence as unfavorable. However, it is unclear how this model can be applied to generation tasks.

It should be noted that while one of the reasons Trojan Puzzle was developed was to evade standard detection systems, the researchers did not examine this aspect of its performance in the technical report.

Share54Tweet34Pin12
Bill Toulas

Bill Toulas

Related Posts

Beware: Hackers now use OneNote attachments to spread malware
Cyber Security

Online sellers targeted by new information-stealing malware campaign

June 3, 2023
Zyxel warns of critical vulnerabilities in firewall and VPN devices
Cyber Security

Zyxel shares tips on protecting firewalls from ongoing attacks

June 3, 2023
Microsoft is killing Cortana on Windows starting late 2023
Cyber Security

Microsoft is killing Cortana on Windows starting late 2023

June 2, 2023
Hackers use new, fake crypto app to breach networks, steal cryptocurrency
Cyber Security

The Week in Ransomware – June 2nd 2023 – Whodunit?

June 2, 2023
Microsoft fixes Windows 11 22H2 file copy performance hit
Cyber Security

Windows 11 to require SMB signing to prevent NTLM relay attacks

June 2, 2023
FBI warns of spike in ‘pig butchering’ crypto investment schemes
Cyber Security

NSA and FBI: Kimsuky hackers pose as journalists to steal intel

June 2, 2023

Recommended Stories

Shiba Inu [SHIB] investors sit and stay as token’s price plays dead

Shiba Inu [SHIB] investors sit and stay as token’s price plays dead

March 26, 2023
BAYC falls below 50 ETH: Is another NFT market dip on the way

BAYC falls below 50 ETH: Is another NFT market dip on the way

April 21, 2023
USDC woes far from over as market cap draws down- Is USDT leading

USDC woes far from over as market cap draws down- Is USDT leading

March 26, 2023

Popular Stories

  • New Python malware backdoors VMware ESXi servers for remote access

    Massive ESXiArgs ransomware attack targets VMware ESXi servers worldwide

    137 shares
    Share 55 Tweet 34
  • Facts and myths about the warriors who raided Europe and explored the New World

    137 shares
    Share 55 Tweet 34
  • Exploit released for actively abused ProxyNotShell Exchange bug

    137 shares
    Share 55 Tweet 34
  • New Windows Server updates cause domain controller freezes, restarts

    136 shares
    Share 54 Tweet 34
  • Bing Chat’s secret modes turn it into a personal assistant or friend

    136 shares
    Share 54 Tweet 34
Whats Current In

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Visit our landing page to see all features & demos.

LEARN MORE »

Recent Posts

  • How Blur achieved a new milestone from an unexpected source
  • Why Bitcoin will not retest $20,000 anytime soon
  • TRON bulls could push for another 5% hike given…

Categories

  • Apple Computer
  • Blockchain
  • Cyber Security
  • Tech News
  • Venture Capital

© 2023 JNews - Premium WordPress news & magazine theme by Jegtheme.

No Result
View All Result
  • Blockchain
  • Cyber Security
  • Gadgets & Hardware
  • Startups
    • Angel investing
    • Venture Capital
  • More Tech News
    • AI
    • App Development
    • Cloud & SaaS
    • Gaming
    • Web Development

© 2023 JNews - Premium WordPress news & magazine theme by Jegtheme.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?