Deepfake images can be easily manipulated to remove ‘AI fingerprints’, making it difficult to determine whether an image has been AI generated, a study has found.
It is also possible for hackers to change an image’s AI fingerprint so it falsely appears to come from a different model. This could be used to wrongly blame legitimate tech companies for harmful images their systems never actually created.
Experts say improvements in AI fingerprinting techniques combined with watermarking of AI-generated images would strengthen the detection of deepfakes.
Generative AI is now capable of creating images nearly indistinguishable from real photos, raising concerns about the use of these technologies for scams and misinformation campaigns.
Hinder Investigations
One promising approach to mitigate these risks is AI fingerprinting—a group of techniques that detect unique, invisible traces that AI models leave in their images, which helps identify the specific generator that produced them.
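As a loose illustration of the idea, here is a hypothetical toy sketch (my own, not the study's actual technique): each imaginary generator stamps a fixed, faint noise pattern into its images, and attribution picks the known pattern that correlates best with an image's residual.

```python
import random

random.seed(0)
N = 256  # pixel count of a flattened toy "image"

# Hypothetical per-model fingerprints: a fixed, faint +/-1 pattern that each
# imaginary generator leaves in everything it produces.
fingerprints = {
    "model_A": [random.choice((-1.0, 1.0)) for _ in range(N)],
    "model_B": [random.choice((-1.0, 1.0)) for _ in range(N)],
}

def generate(model):
    """A toy 'generated image': scene content plus the model's fingerprint."""
    scene = [random.gauss(128.0, 2.0) for _ in range(N)]
    return [s + f for s, f in zip(scene, fingerprints[model])]

def attribute(image):
    """Attribute the image to the fingerprint that best matches its residual."""
    mean = sum(image) / len(image)
    residual = [p - mean for p in image]
    return max(fingerprints,
               key=lambda m: sum(r * f for r, f in zip(residual, fingerprints[m])))

print(attribute(generate("model_A")))  # should recover the source model
```

Real fingerprinting methods are far more sophisticated, but the principle is the same: the generator's trace is a weak, consistent signal buried in the image, which is exactly why small perturbations can destroy it.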
Scientists from the University of Edinburgh have found that these fingerprints can be removed or manipulated using various modes of attack. Removing fingerprints would hinder forensic investigations into deepfakes.
In the first part of the study, a security evaluation of fingerprinting techniques for generative AI was performed. Adversarial attacks were then developed that aimed to remove or forge fingerprints across a range of threat scenarios.
These scenarios ranged from powerful attackers with full access to the inner workings of the image generator to low-resource attackers with no special access.
The scientists simulated these attacks on 12 image generators and 14 fingerprinting methods in the largest evaluation of such techniques to date.
Many fingerprinting methods were found to achieve high accuracy in detecting unaltered deepfake images, but their performance dropped dramatically once an image was attacked.
'Smudging Fingerprints'
Fingerprint removal was found to be highly effective, often achieving more than 80% success for attackers with full knowledge of an image generator and just over 50% for simple attacks with no knowledge of the generator’s inner workings.
In several cases, simple changes to an image, such as JPEG compression, resizing or blurring, were enough to 'smudge' the fingerprints.
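The 'smudging' effect can be illustrated with a minimal sketch (my own toy example, not the paper's evaluation): a faint high-frequency checkerboard pattern stands in for a fingerprint, and an ordinary 3x3 box blur heavily attenuates it while leaving the image visually unchanged.

```python
def box_blur(img):
    """Apply a 3x3 mean filter; edge pixels are clamped to the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sum(vals) / 9.0
    return out

def residual_energy(img, base=128.0):
    """Mean absolute deviation from the flat grey background."""
    return sum(abs(p - base) for row in img for p in row) / (len(img) * len(img[0]))

# A flat grey image carrying a faint checkerboard "fingerprint" (+/-2 grey
# levels), standing in for the invisible trace a generator leaves behind.
size = 16
fingerprinted = [[128.0 + (2.0 if (x + y) % 2 == 0 else -2.0) for x in range(size)]
                 for y in range(size)]

before = residual_energy(fingerprinted)
after = residual_energy(box_blur(fingerprinted))
print(f"residual before blur: {before:.2f}")  # prints 2.00
print(f"residual after blur:  {after:.2f}")   # prints 0.22
```

One pass of the blur cuts the fingerprint's amplitude by roughly a factor of nine here, which mirrors the study's finding that everyday image edits can erase the forensic signal without visibly altering the picture.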
Fingerprint forgery—misrepresentation of the AI model used to generate the image—was less effective than removal overall, but half of the image generators evaluated were vulnerable to this kind of attack.
All attacks were imperceptible to the human eye, leaving no visible evidence on the images. None of the evaluated fingerprinting techniques delivered both high accuracy and resistance to attack across all threat scenarios.
By pinpointing where and why current approaches fail, it should be possible to build stronger methods of deepfake detection, especially when paired with watermarking – the process of embedding a hidden digital signature into AI-generated content.
The findings from this work were peer reviewed and will be presented at the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) in Munich this week. A copy of the final version of the paper is available online.
This work was supported by the Edinburgh International Data Facility, the Data-Driven Innovation Programme and the Generative AI Laboratory, all at the University of Edinburgh.
Kai Yao, PhD student at the University of Edinburgh’s School of Informatics and author on the paper, said: “We were surprised to find just how fragile these AI fingerprints truly are. We expected that sophisticated attacks would be effective, but seeing that simple, everyday image edits could effectively ‘smudge’ the forensic evidence was a real wake-up call. It suggests that many of the deepfake detection methods based on image fingerprinting might fail the moment an image is shared or edited in the real world.”
Dr Marc Juarez, Lecturer in Cyber Security and Privacy at the University of Edinburgh and also an author on the paper, said: "Deploying these techniques without considering the threats they face could give a false sense of security. If fingerprinting is to be used to hold bad actors accountable, it must ensure that fingerprints cannot be easily removed or forged, as any accountability tool will itself become a target for attack. The community must therefore move beyond optimising for performance alone and incorporate adversarial robustness into their evaluation methodology."