
Google DeepMind launches tool to detect AI-generated text


In the AI era, copyright and content legitimacy have received significant attention. Whether the question is if a piece of content was used with authorization for AI training, or if a realistic image or video was generated by artificial intelligence, tools that verify content provenance have become key to the industry. Google DeepMind has now introduced a text-focused AI watermarking tool.

SynthID, Google DeepMind’s watermarking tool focused on detecting AI-generated text

The new tool, called SynthID, was designed to watermark and detect text generated by Google's Gemini models. However, DeepMind, Google's AI research division, has opened it up to third-party developers. This means that any external AI company can use SynthID's resources and APIs to add watermarking and text detection to its own products.

SynthID joins similar DeepMind tools developed to identify AI-generated images, music, and videos. For text, it works by subtly adjusting the probabilities the model assigns to candidate words as it generates each one, embedding a statistical signature into the output without noticeably changing it. Detection then works in reverse: SynthID scores how likely a watermarked model (like Gemini) would be to produce the particular strings of words, sentences, and paragraphs present in the text.
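To make the idea concrete, here is a minimal sketch of keyed statistical watermarking, the general family of techniques the article describes. This is a toy illustration, not DeepMind's actual algorithm: it assumes a simplified scheme in which a secret key and the previous word pseudo-randomly split the vocabulary into a "preferred" subset, generation is biased toward that subset, and detection counts how often the text lands in it. All names (`green_set`, `detect`) are hypothetical.

```python
import hashlib
import random


def green_set(prev_token: str, vocab: list[str], key: str,
              fraction: float = 0.5) -> set[str]:
    """Pseudo-randomly pick a 'green' subset of the vocabulary,
    seeded by a keyed hash of the previous token. A watermarking
    generator would nudge sampling toward these tokens."""
    digest = hashlib.sha256((key + prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(vocab, int(len(vocab) * fraction)))


def detect(tokens: list[str], vocab: list[str], key: str) -> float:
    """Score a text: the fraction of tokens that fall in the green
    set implied by their predecessor. Unwatermarked text scores
    near the green fraction (0.5 here); watermarked text scores
    well above it."""
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_set(prev, vocab, key)
    )
    return hits / max(len(tokens) - 1, 1)
```

Because the split depends on a secret key, only a party holding the key can reliably detect the signature; ordinary readers see nothing unusual. SynthID's real scheme is more sophisticated (it reweights the model's own probability scores rather than a fixed word list), but the detection principle is the same statistical one.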

Google DeepMind says the watermark is designed to alter the model's output as little as possible. There is a fine line to walk: adjust the word probabilities too aggressively and the quality of the generated text suffers; adjust them too little and the signature becomes hard to detect reliably. Google's AI division claims to have that balance under control.

DeepMind tests to ensure the watermark preserves text quality

To make sure SynthID didn't go too far in modifying the generated text, DeepMind ran tests with human input. Google's AI division served about 20 million passages of Gemini-generated text to people. Some individuals received the text in its original form, while others received the same content watermarked by SynthID. The results showed that watermarked and unwatermarked passages were practically indistinguishable.

The fact that SynthID is open source is great, but there are also some drawbacks to consider. For example, bad actors could study it to learn how to bypass AI-content detection, and use that knowledge to build tools that generate text undetectable by watermarking systems. However, we assume that DeepMind is aware of this risk and prepared for it.



