
Google's SynthID Text tool has finally launched


It’s getting harder to tell what’s been AI-generated on the internet, and that goes especially for text. Text is much easier for AI to fake than audio, images, or video, so watermarking it seems like an impossible task. However, Google appears to have a solution in the form of the SynthID Text tool.

Since AI is so convincing, it’s important to have tools to help people identify if a research paper was spat out by ChatGPT. While cheating on your college report is bad, it’s far from the most harmful thing you can do with AI-generated text. A major issue is the spread of misinformation and other harmful content.

This is where Google SynthID Text comes in

The companies giving us the most powerful AI chatbots are also trying to give us tools to help us identify when something was created by those chatbots. OpenAI developed and tested a tool to help identify when something was created by ChatGPT, but the company hasn’t seen fit to release it.

Google, on the other hand, has released a watermarking tool that can be used to identify whether a section of text is AI-generated. SynthID Text is freely available to developers and businesses starting today. We’re not sure if Google is going to release a user-facing tool that lets everyday users check if text is AI-generated.

Watermarking text?

This seems like something that should be nearly impossible to do. Watermarking AI-generated images is easier to understand, but text is much easier to change: you can quickly edit or paraphrase whatever a chatbot produces. Google managed to find a way, but it’s not perfect.

This method has to do with what are called tokens. If you’ve been around AI tools, then you’ve probably seen this term tossed around. When you use an AI tool, you’re inputting data and getting data as an output. For example, you might type the prompt “write a story about a rabbit” into a chatbot and get a 100-word story as a response.

Well, the text in your prompt is divided into what are called tokens. These are whole words or pieces of words that the text is broken into so the model can analyze it. The response is also made up of tokens.
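To make the idea of tokens concrete, here is a minimal sketch of a tokenizer. It assumes a tiny, made-up vocabulary (VOCAB) and a simple longest-match rule; real chatbots use learned vocabularies with tens of thousands of entries, so treat this purely as an illustration and not as how any particular model splits text.

```python
# Hypothetical toy vocabulary for illustration only.
VOCAB = {"write", "a", "story", "about", "rabbit", "s"}

def tokenize(text, vocab=VOCAB):
    """Split text into tokens using greedy longest-match per word."""
    tokens = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Try the longest possible piece first, then shrink it.
            for j in range(len(word), i, -1):
                if word[i:j] in vocab:
                    tokens.append(word[i:j])
                    i = j
                    break
            else:
                # Nothing in the vocabulary matched: emit a single character.
                tokens.append(word[i])
                i += 1
    return tokens

print(tokenize("Write a story about rabbits"))
# ['write', 'a', 'story', 'about', 'rabbit', 's']
```

Notice that “rabbits” becomes two tokens, “rabbit” and “s”, which is what “sections of words” means in practice.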

According to Google, when a model generates text, it gives each candidate token a score based on how likely it is to be used in the response. What SynthID Text does is embed additional information in the generated text by “modulating the likelihood of tokens being generated.” The pattern formed by the model’s word choices together with the adjusted scores serves as the watermark. That pattern is then “compared with the expected pattern of scores for watermarked and unwatermarked text, helping SynthID detect if an AI tool generated the text or if it might come from other sources,” says Google.
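The general idea of nudging token likelihoods and then checking for a statistical pattern can be sketched in a few lines. To be clear, this is not Google’s algorithm: it is a deliberately simplified keyed-bias scheme, and every name in it (KEY, VOCAB, g, generate, score) is hypothetical. The “model” here is just a uniform random choice over ten words.

```python
import hashlib
import math
import random

KEY = "demo-key"  # hypothetical secret key shared by generator and detector
VOCAB = ["the", "a", "rabbit", "hare", "ran", "hopped",
         "fast", "quickly", "home", "away"]

def g(prev, tok, key=KEY):
    """Keyed pseudorandom bit for a (previous token, candidate token) pair."""
    digest = hashlib.sha256(f"{key}|{prev}|{tok}".encode()).digest()
    return digest[0] & 1

def generate(n, rng, bias=0.0):
    """Sample n tokens from a uniform toy model.

    With bias > 0, candidates whose keyed bit is 1 become more likely.
    That nudge is the watermark in this simplified scheme.
    """
    tokens = ["the"]
    for _ in range(n):
        prev = tokens[-1]
        weights = [math.exp(bias * g(prev, t)) for t in VOCAB]
        tokens.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return tokens

def score(tokens):
    """Fraction of tokens whose keyed bit is 1.

    Roughly 0.5 is expected for unwatermarked text, higher for watermarked.
    """
    bits = [g(p, t) for p, t in zip(tokens, tokens[1:])]
    return sum(bits) / len(bits)

rng = random.Random(0)
plain = generate(2000, rng, bias=0.0)
marked = generate(2000, rng, bias=3.0)
print("unwatermarked score:", round(score(plain), 3))
print("watermarked score:  ", round(score(marked), 3))
```

The detector never needs the original prompt or model in this sketch. It only needs the key, which lets it recompute the bits and see whether they skew away from what chance would produce.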

Limitations

It’s a lot to take in, but the important thing to note is that it’s a pretty effective tool. It just isn’t a watertight solution. SynthID Text isn’t as accurate when it comes to shorter bits of text, so it’ll have more luck with a novel or a college report than with a piece of advertising copy.
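A bit of arithmetic shows why length matters. The sketch below assumes a hypothetical detector that reads one pseudorandom bit per token and expects half of them to be 1 in ordinary text; the function name z_score and the 60% figure are made up for illustration and are not taken from Google’s documentation.

```python
import math

def z_score(fraction_ones, n_tokens):
    """How many standard errors the observed fraction sits above 0.5.

    Assumes each token contributes an independent fair coin flip
    when the text carries no watermark.
    """
    standard_error = 0.5 / math.sqrt(n_tokens)
    return (fraction_ones - 0.5) / standard_error

# The same 60% skew is weak evidence in a short text
# and strong evidence in a long one.
for n in (25, 100, 400):
    print(n, round(z_score(0.6, n), 2))
# 25 1.0
# 100 2.0
# 400 4.0
```

Under these assumptions, confidence grows with the square root of the token count, so a short ad slogan simply doesn’t contain enough tokens to stand out from chance.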

Also, this tool will struggle with text that was translated from another language or rewritten. This makes sense, as either would change most of the tokens of the original text.

Responses to factual questions are another issue for SynthID Text. This is because it’s hard to adjust the token scores without changing the actual factual information in the response. If you’re talking about the natural habitat of a certain bird, there’s very little that you can change in your response before you start changing actual facts.
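Here is a small numerical sketch of that problem. The scores below are invented: “creative” stands for a moment where four next words are equally plausible, and “factual” for a moment where one word is overwhelmingly the right answer. The helper names and the bias value are assumptions for illustration only.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def favored_mass(logits, favored, bias):
    """Total probability of the watermark-favored tokens after adding bias."""
    adjusted = [x + bias if i in favored else x for i, x in enumerate(logits)]
    probs = softmax(adjusted)
    return sum(probs[i] for i in favored)

favored = {1, 3}                    # hypothetical tokens the watermark prefers
creative = [0.0, 0.0, 0.0, 0.0]     # four equally likely next words
factual = [10.0, 0.0, 0.0, 0.0]     # one clearly correct next word

for name, logits in (("creative", creative), ("factual", factual)):
    before = favored_mass(logits, favored, bias=0.0)
    after = favored_mass(logits, favored, bias=2.0)
    print(name, "shift in favored probability:", round(after - before, 4))
```

In the creative case the same nudge moves a large share of probability toward the favored words, while in the factual case it barely registers. Pushing harder would start steering the model away from the correct word, which is exactly the trade-off described above.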

In a bit of a surprising announcement, Google stated that this tool was integrated into Gemini months ago, and most of us didn’t even know. Hopefully, this tool will lead the way for other tools that will help us detect AI-generated content.


