OpenAI’s attempts to watermark AI text hit limits

It’s proving tough to rein in systems like ChatGPT

Did a human write that, or ChatGPT? It can be hard to tell — perhaps too hard, its creator OpenAI thinks, which is why it is working on a way to “watermark” AI-generated content.

In a lecture at the University of Texas at Austin, computer science professor Scott Aaronson, currently a guest researcher at OpenAI, revealed that OpenAI is developing a tool for “statistically watermarking the outputs of a text [AI system].” Whenever a system — say, ChatGPT — generates text, the tool would embed an “unnoticeable secret signal” indicating where the text came from.

OpenAI engineer Hendrik Kirchner built a working prototype, Aaronson says, and the hope is to build it into future OpenAI-developed systems.

“We want it to be much harder to take [an AI system’s] output and pass it off as if it came from a human,” Aaronson said in his remarks. “This could be helpful for preventing academic plagiarism, obviously, but also, for example, mass generation of propaganda — you know, spamming every blog with seemingly on-topic comments supporting Russia’s invasion of Ukraine without even a building full of trolls in Moscow. Or impersonating someone’s writing style in order to incriminate them.”

Exploiting randomness

Why the need for a watermark? ChatGPT is a strong example. The chatbot developed by OpenAI has taken the internet by storm, showing an aptitude not only for answering challenging questions but also for writing poetry, solving programming puzzles and holding forth on any number of philosophical topics.

While ChatGPT is highly amusing, and genuinely useful, the system raises obvious ethical concerns. Like many of the text-generating systems before it, ChatGPT could be used to write high-quality phishing emails and malware, or to cheat on school assignments. And as a question-answering tool, it’s factually inconsistent, a shortcoming that led programming Q&A site Stack Overflow to ban answers originating from ChatGPT until further notice.

To grasp the technical underpinnings of OpenAI’s watermarking tool, it’s helpful to know why systems like ChatGPT work as well as they do. These systems understand input and output text as strings of “tokens,” which can be words but also punctuation marks and parts of words. At their core, the systems are constantly generating a mathematical function called a probability distribution to decide the next token (e.g., a word) to output, taking into account all previously outputted tokens.

In the case of OpenAI-hosted systems like ChatGPT, after the distribution is generated, OpenAI’s server does the job of sampling tokens according to the distribution. There’s some randomness in this selection; that’s why the same text prompt can yield a different response.
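To make the sampling step concrete, here is a minimal sketch in Python. It is an illustration only, not OpenAI’s code: the three candidate tokens and their scores are made up, and the `softmax` step is an assumption about how raw model scores become a probability distribution (a common approach, but not one described in the lecture).

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def sample_next_token(distribution, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(distribution)
    weights = [distribution[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores for what might follow the prompt "The sky is".
    logits = {" blue": 2.0, " clear": 1.1, " falling": 0.6}
    distribution = softmax(logits)
    rng = random.Random()  # unseeded, so each run can differ
    for _ in range(3):
        print(sample_next_token(distribution, rng))
```

Because the generator is unseeded, running the script twice can print different tokens for the same scores, which is the behavior the article describes for repeated prompts.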

OpenAI’s watermarking tool acts like a “wrapper” over existing text-generating systems, Aaronson said during the lecture, leveraging a cryptographic function running at the server level to “pseudorandomly” select the next token. In theory, text generated by the system would still look random to you or me, but anyone possessing the “key” to the cryptographic function would be able to uncover a watermark.

“Empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text came from [an AI system]. In principle, you could even take a long text and isolate which parts probably came from [the system] and which parts probably didn’t,” Aaronson said. “[The tool] can do the watermarking using a secret key and it can check for the watermark using the same key.”
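The sketch below shows one way a keyed, pseudorandom token choice and a matching detector could fit together. It is a toy under stated assumptions, not the unpublished OpenAI prototype: `toy_model`, `VOCAB`, `KEY`, `CONTEXT` and `THRESHOLD` are all hypothetical, HMAC-SHA256 stands in for the unspecified cryptographic function, and the selection rule (take the token with the largest r^(1/p), where r is the keyed score and p the model’s probability) is an assumed construction chosen because it picks each token with its model probability when r is uniform.

```python
import hashlib
import hmac
import math
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "tree"]
CONTEXT = 3                # recent tokens fed to the keyed function (assumption)
THRESHOLD = 1.5            # illustrative cutoff, not a calibrated value
KEY = b"demo-secret-key"   # hypothetical secret key


def toy_model(history):
    """Stand-in for a language model: a distribution over VOCAB fixed by the history."""
    seed = hashlib.sha256(" ".join(history).encode()).digest()
    rng = random.Random(seed)
    weights = [rng.random() + 0.05 for _ in VOCAB]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(VOCAB, weights)}


def keyed_score(key, context, token):
    """Map (key, recent tokens, candidate) to a number strictly inside (0, 1)."""
    msg = "\n".join(list(context) + [token]).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    n = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (n + 0.5) / 2**52


def generate(key, length):
    """Replace the random draw with a keyed choice: maximize log(r) / p."""
    tokens = []
    for _ in range(length):
        probs = toy_model(tokens)
        context = tokens[-CONTEXT:]
        tokens.append(max(
            VOCAB,
            key=lambda t: math.log(keyed_score(key, context, t)) / probs[t],
        ))
    return tokens


def sample_plain(rng, length):
    """Ordinary, unwatermarked sampling from the same toy model."""
    tokens = []
    for _ in range(length):
        probs = toy_model(tokens)
        tokens.append(rng.choices(VOCAB, weights=[probs[t] for t in VOCAB], k=1)[0])
    return tokens


def detect(key, tokens):
    """Average ln(1 / (1 - r)); near 1.0 for text unrelated to the key."""
    total = 0.0
    for i, tok in enumerate(tokens):
        r = keyed_score(key, tokens[max(0, i - CONTEXT):i], tok)
        total += math.log(1.0 / (1.0 - r))
    return total / len(tokens)


if __name__ == "__main__":
    marked = generate(KEY, 300)
    plain = sample_plain(random.Random(), 300)
    print("watermarked, right key:", round(detect(KEY, marked), 2))
    print("watermarked, wrong key:", round(detect(b"some other key", marked), 2))
    print("plain text, right key: ", round(detect(KEY, plain), 2))
```

In this toy setup, only the first score should land clearly above the cutoff. The detector needs the key but neither the model nor the prompt, which matches the idea of checking for the watermark with the same secret used to embed it.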

Key limitations

Watermarking AI-generated text isn’t a new idea. Previous attempts, mostly rules-based, have relied on techniques like synonym substitutions and syntax-specific word changes. But outside of theoretical research published by the German institute CISPA last March, OpenAI’s appears to be one of the first cryptography-based approaches to the problem.

When contacted for comment, Aaronson declined to reveal more about the watermarking prototype, save that he expects to co-author a research paper in the coming months. OpenAI also declined, saying only that watermarking is among several “provenance techniques” it’s exploring to detect outputs generated by AI.

Unaffiliated academics and industry experts, however, shared mixed opinions. They note that the tool is server-side, meaning it wouldn’t necessarily work with all text-generating systems. And they argue that it’d be trivial for adversaries to work around.

“I think it would be fairly easy to get around it by rewording, using synonyms, etc.,” Srini Devadas, a computer science professor at MIT, told TechCrunch via email. “This is a bit of a tug of war.”

Jack Hessel, a research scientist at the Allen Institute for AI, pointed out that it’d be difficult to imperceptibly fingerprint AI-generated text because each token is a discrete choice. Too obvious a fingerprint might result in odd words being chosen that degrade fluency, while too subtle a fingerprint would leave room for doubt when it is sought out.

Yoav Shoham, the co-founder and co-CEO of AI21 Labs, an OpenAI rival, doesn’t think that statistical watermarking will be enough to help identify the source of AI-generated text. He calls for a “more comprehensive” approach that includes differential watermarking, in which different parts of text are watermarked differently, and AI systems that more accurately cite the sources of factual text.

This specific watermarking technique also requires placing a lot of trust — and power — in OpenAI, experts noted.

“An ideal fingerprinting would not be discernable by a human reader and enable highly confident detection,” Hessel said via email. “Depending on how it’s set up, it could be that OpenAI themselves might be the only party able to confidently provide that detection because of how the ‘signing’ process works.”

In his lecture, Aaronson acknowledged the scheme would only really work in a world where companies like OpenAI are ahead in scaling up state-of-the-art systems, and where they all agree to be responsible players. Even if OpenAI were to share the watermarking tool with other text-generating system providers, like Cohere and AI21 Labs, this wouldn’t prevent others from choosing not to use it.

“If [it] becomes a free-for-all, then a lot of the safety measures do become harder, and might even be impossible, at least without government regulation,” Aaronson said. “In a world where anyone could build their own text model that was just as good as [ChatGPT, for example] … what would you do there?”

That’s how it’s played out in the text-to-image domain. Unlike OpenAI, whose DALL-E 2 image-generating system is only available through an API, Stability AI open-sourced its text-to-image tech (called Stable Diffusion). While DALL-E 2 has a number of filters at the API level to prevent problematic images from being generated (plus watermarks on images it generates), the open source Stable Diffusion does not. Bad actors have used it to create deepfaked porn, among other harmful content.

For his part, Aaronson is optimistic. In the lecture, he expressed the belief that, if OpenAI can demonstrate that watermarking works and doesn’t impact the quality of the generated text, it has the potential to become an industry standard.

Not everyone agrees. As Devadas points out, the tool needs a key, meaning it can’t be completely open source — potentially limiting its adoption to organizations that agree to partner with OpenAI. (If the key were to be made public, anyone could deduce the pattern behind the watermarks, defeating their purpose.)

But it might not be so far-fetched. A representative for Quora said the company would be interested in using such a system, and it likely wouldn’t be the only one.

“You could worry that all this stuff about trying to be safe and responsible when scaling AI … as soon as it seriously hurts the bottom lines of Google and Meta and Alibaba and the other major players, a lot of it will go out the window,” Aaronson said. “On the other hand, we’ve seen over the past 30 years that the big internet companies can agree on certain minimal standards, whether because of fear of getting sued, desire to be seen as a responsible player, or whatever else.”
