AI Watermarking: the answer to disinformation’s unstoppable tide?

TLDR:

  • AI watermarking is proposed as a way to distinguish AI-generated content from human-generated content.
  • Such schemes are unlikely to work: the watermarks are easy to remove, and anyone who knows a watermark exists can deliberately defeat it.

Generative AI makes it possible to produce large quantities of images and text in very little time, and with that capability has come a flood of disinformation and fake content. One proposed response is to watermark AI-generated content so it can be identified as such. This article argues that watermarking schemes are unlikely to be effective.

Watermarking schemes for digital images are already in use, and they have proven easy to remove. Stock image sites, for example, overlay text on their previews to render them useless for publication, but that watermark is plainly visible and can be erased by anyone with modest photo-editing skills. Metadata attached to an image, such as the date, time, and location at which a photograph was taken, is even easier to strip.
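To give a sense of how little effort metadata removal takes, here is a minimal sketch using the Pillow library; the file names are placeholders:

```python
from PIL import Image

# Load the original image. EXIF metadata (date, time, GPS, etc.)
# lives alongside the pixel data, not inside it.
original = Image.open("photo.jpg")

# Copy only the pixels into a fresh image object, leaving the
# metadata behind, then save the clean copy.
stripped = Image.new(original.mode, original.size)
stripped.putdata(list(original.getdata()))
stripped.save("photo_no_metadata.jpg")
```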

For an AI watermark to be effective, it would need to survive cropping, rotation, and other routine edits, and it must not be conspicuous. One simple technique is to manipulate the least significant bits of an image’s pixels, creating a pattern that is invisible to human viewers but detectable by a program that knows where to look. This method is fragile, however: merely rotating or resizing the image can destroy the watermark by accident.
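A minimal sketch of the least-significant-bit idea, using NumPy; the image and the bit pattern are illustrative stand-ins:

```python
import numpy as np

def embed_lsb(pixels: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Hide a bit pattern in the least significant bits of the first pixels."""
    flat = pixels.flatten()  # flatten() returns a copy, so the original is untouched
    # Clear each target pixel's lowest bit, then OR in one watermark bit.
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    return flat.reshape(pixels.shape)

def read_lsb(pixels: np.ndarray, n: int) -> np.ndarray:
    """Recover the first n hidden bits."""
    return pixels.flatten()[:n] & 1

# A tiny 8x8 grayscale "image" and a 16-bit watermark pattern.
image = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)
mark = np.array([1, 0, 1, 1, 0, 0, 1, 0] * 2, dtype=np.uint8)

stamped = embed_lsb(image, mark)
assert np.array_equal(read_lsb(stamped, mark.size), mark)  # survives a plain copy

# But interpolation during a resize or rotation rewrites pixel values,
# scrambling the low-order bits and destroying the hidden pattern.
```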

There are more sophisticated watermarking proposals that are robust to a wider variety of common edits. But robustness against accidents is not enough: a scheme must also withstand an adversary who knows the watermark is there and wants it gone. If the watermark lives in the least significant bits of an image, that adversary can remove it simply by overwriting those bits.
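To make the attack concrete, here is a sketch of how an adversary could scrub any LSB watermark without knowing its pattern, again with NumPy on an illustrative random image:

```python
import numpy as np

# Stand-in for an image suspected of carrying an LSB watermark.
suspect = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)

# Overwrite every pixel's lowest bit with random noise. Whatever
# pattern was hidden there is destroyed, while the visible change
# is at most one brightness level per pixel.
noise = np.random.randint(0, 2, size=suspect.shape, dtype=np.uint8)
scrubbed = (suspect & 0xFE) | noise
```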

As an alternative to watermarking, some companies are working on proving the authenticity of camera-generated images using metadata and cryptographic signatures. This approach is more workable because the incentives point the other way: a signature attests that a photograph is genuine, which is exactly what its owner wants to preserve, whereas an AI watermark is a liability its creator wants to strip.
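A minimal sketch of that signing idea, using the Python cryptography package; the key handling, image bytes, and metadata format here are illustrative assumptions, and a real system would keep the private key in the camera’s secure hardware:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Illustrative key; in practice this never leaves the camera.
camera_key = Ed25519PrivateKey.generate()

image_bytes = b"...raw image data..."           # placeholder pixels
metadata = b"2024-01-15T09:30:00Z;48.85,2.35"   # placeholder date/location

# The camera signs the image together with its metadata at capture time.
signature = camera_key.sign(image_bytes + metadata)

# Anyone holding the corresponding public key can check that neither
# the pixels nor the metadata were altered after capture.
try:
    camera_key.public_key().verify(signature, image_bytes + metadata)
    print("authentic")
except InvalidSignature:
    print("tampered, or not from this camera")
```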

Watermarking text-based generative AI is even more challenging. Analogous techniques can be devised, such as boosting the probability that the model picks certain words, giving the text a subtle statistical style. But any watermark based on word choice is likely to be defeated simply by rewording the text.
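A toy sketch of the word-boosting idea; the vocabulary, the secret “green list” of boosted words, and the bias strength are all illustrative, not any production scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = np.array(["the", "a", "quick", "swift", "fox", "dog", "jumps", "leaps"])

# Secret subset of the vocabulary whose sampling probability is boosted.
green = np.isin(vocab, ["swift", "leaps", "dog"])
BIAS = 2.0  # how hard the green words' logits are pushed up

def sample_word(logits: np.ndarray) -> str:
    """Sample one word after nudging the green-listed logits upward."""
    boosted = logits + BIAS * green
    probs = np.exp(boosted) / np.exp(boosted).sum()
    return rng.choice(vocab, p=probs)

# Generate 200 "tokens" from random base logits.
text = [sample_word(rng.standard_normal(len(vocab))) for _ in range(200)]

# Detection: watermarked text over-uses green words in a way a
# statistical test can flag, without visibly changing the style.
green_fraction = np.isin(text, vocab[green]).mean()
print(f"green-word fraction: {green_fraction:.2f}, chance level: {green.mean():.2f}")
```

Rewording defeats this because every substituted synonym resamples the green/non-green coin flip, washing the statistical signal back toward chance.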

Moreover, whoever deploys a text watermark faces a dilemma: the detection tools can be made public or kept secret. Public tools hand attackers an oracle, letting them repeatedly edit their text until the tool gives the all-clear. Secret tools, on the other hand, hinder any effort to label AI-generated content automatically at scale.
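The attack a public detector enables is just a guess-and-check loop. In this schematic sketch, detect_watermark and reword are hypothetical stand-ins for the released detection tool and for any paraphrasing method (manual editing, a thesaurus, another model):

```python
def launder(text: str, detect_watermark, reword, max_rounds: int = 50) -> str:
    """Repeatedly paraphrase until the public detector gives the all-clear.

    detect_watermark and reword are hypothetical stand-ins, not real APIs.
    """
    for _ in range(max_rounds):
        if not detect_watermark(text):
            return text  # the detector no longer flags this text
        text = reword(text)
    return text
```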

In conclusion, while watermarking AI-generated content may look like an answer to disinformation, it is unlikely to succeed. Watermarks have proven easy to remove, both by accident and by design, and any scheme can be defeated by an adversary who knows it exists. Rather than relying on watermarking, approaches that align with their users’ incentives, such as cryptographically proving the authenticity of camera-generated images, may be more effective.