Hackers are now using AI to break AI – and it’s working

It was only a matter of time before hackers started using artificial intelligence to attack artificial intelligence—and now that time has arrived. A new research … The post Hackers are now using AI to break AI – and it’s working appeared first on BGR.

Mar 29, 2025 - 14:06

Hackers are now using AI to break AI – and it’s working

It was only a matter of time before hackers started using artificial intelligence to attack artificial intelligence—and now that time has arrived. A new research breakthrough has made AI prompt injection attacks faster, easier, and scarily effective, even against supposedly secure systems like Google’s Gemini.

Prompt injection attacks have been one of the most reliable ways to manipulate large language models (LLMs). By sneaking malicious instructions into the text AI reads—like a comment in a block of code or hidden text on a webpage—attackers can get the model to ignore its original rules.

That could mean leaking private data, delivering wrong answers, or carrying out other unintended behaviors. The catch, though, is that prompt injection attacks typically require a lot of manual trial and error to get right, especially for closed-weight models like GPT-4 or Gemini, where developers can’t see the underlying code or training data.

But a new technique called Fun-Tuning changes that. Developed by a team of university researchers, this method uses Google’s own fine-tuning API for Gemini to craft high-success-rate prompt injections—automatically. The researcher’s findings are currently available in a preprint report.

By abusing Gemini’s training interface, Fun-Tuning figures out the best “prefixes” and “suffixes” to wrap around an attacker’s malicious prompt, dramatically increasing the chances that it’ll be followed. And the results speak for themselves.

In testing, Fun-Tuning achieved up to 82 percent success rates on some Gemini models, compared to under 30 percent with traditional attacks. It works by exploiting subtle clues in the fine-tuning process—like how the model reacts to training errors—and turning them into feedback that sharpens the attack. Think of it as an AI-guided missile system for prompt injection.

Even more troubling, attacks developed for one version of Gemini transferred easily to others. This means a single attacker could potentially develop one successful prompt and deploy it across multiple platforms. And since Google offers this fine-tuning API for free, the cost of mounting such an attack is as low as $10 in compute time.

Google has acknowledged the threat but hasn’t commented on whether it plans to change its fine-tuning features. The researchers behind Fun-Tuning warn that defending against this kind of attack isn’t simple—removing key data from the training process would make the tool less useful for developers. But leaving it in makes it easier for attackers to exploit.

One thing is certain, though. AI prompt injection attacks like this are a sign that the game has entered a new phase—where AI isn’t just the target, but also the weapon.

Don't Miss: Lose weight while eating junk food? Scientists may have found a way

The post Hackers are now using AI to break AI – and it’s working appeared first on BGR.

Today's Top Deals

Hackers are now using AI to break AI – and it’s working originally appeared on BGR.com on Sat, 29 Mar 2025 at 09:01:00 EDT. Please see our terms for use of feeds.

Hackers are now using AI to break AI – and it’s working

It was only a matter of time before hackers started using artificial intelligence to attack artificial intelligence—and now that time has arrived. A new research … The post Hackers are now using AI to break AI – and it’s working appeared first on BGR.

Tags:

Related Posts