How to Stop ChatGPT from Going Off the Rails

When WIRED asked me to cover this week’s newsletter, my first instinct was to ask ChatGPT – OpenAI’s viral chatbot – to see what it came up with. It’s what I’ve been doing all week with emails, recipes, and LinkedIn posts. Productivity has plummeted, but cheeky limericks about Elon Musk are up 1,000 percent.
I asked the bot to write a column about itself in the style of Steven Levy, but the results weren’t great. ChatGPT offered some generic commentary on the promise and pitfalls of AI, but couldn’t really capture Steven’s voice or say anything new. As I wrote last week, it was fluent but not entirely convincing. It did make me wonder, though: could I have gotten away with it? And what systems could catch people using AI for things they really shouldn’t, whether that’s work emails or college essays?
To find out, I spoke to Sandra Wachter, a professor of technology and regulation at the Oxford Internet Institute, who speaks eloquently about building transparency and accountability into algorithms. I asked her how that might look for a system like ChatGPT.
Amit Katwala: ChatGPT can churn out anything from classical poetry to dull marketing copy, but one big topic of conversation this week has been whether it could help students cheat. Do you think you could tell if one of your students used it to write a term paper?
Sandra Wachter: It’s going to be a game of cat and mouse. The technology may not yet be good enough to fool me as someone who teaches law, but it may be good enough to convince somebody who isn’t in that field. I wonder whether the technology will improve over time to the point where it can trick me too. We may need technical tools to make sure that what we’re seeing was created by a human being, in the same way we have tools for detecting deepfakes and edited photos.
That seems inherently harder to do for text than for deepfaked imagery, because there are fewer artifacts and telltale signs. A reliable solution may need to come from the company that generates the text in the first place.
You do need buy-in from whoever creates that tool. But if I’m offering services to students, maybe I’m not the kind of company that’s going to submit to that. And there may be a situation where even if you do put watermarks in, they’re removable. Very tech-savvy groups will probably find a way. But there is an actual tech tool [built with OpenAI’s input] that lets you detect whether output is artificially generated.
What would a version of ChatGPT designed with harm reduction in mind look like?
A couple of things. First, I would really argue that whoever is creating these tools puts watermarks in place. And maybe the EU’s proposed AI Act can help, because it deals with transparency around bots, saying you should always be made aware when something isn’t real. But companies might not want to do that, and maybe the watermarks can be removed. So then it’s about fostering research into independent tools that examine AI output. And in education, we have to be more creative about how we assess students and how we write assignments: what kinds of questions can we ask that are less easy to fake? It has to be a combination of technology and human oversight that helps us contain the disruption.