Here’s what the latest version of ChatGPT gets right — and wrong : NPR

March 18, 2023 Abraham Smith

OpenAI released a new version of ChatGPT this week. It claims GPT-4 is more powerful than ever and could even do your taxes. But a short test drive revealed some problems.

ARI SHAPIRO, HOST:

It’s been a busy week in the world of artificial intelligence. Google announced plans to introduce new AI tools to email and its other productivity software, and OpenAI unveiled a new version of its chatbot, ChatGPT, which it claims can figure out someone’s taxes.

GREG BROCKMAN: Honestly, I — every time it does that, it’s just — it’s amazing. This model is so good at mental arithmetic. It’s way, way better than me at mental arithmetic.

SHAPIRO: This is Greg Brockman, one of the founders of OpenAI, showing off the tax-calculating capabilities of GPT. But can we really trust the AI with our taxes?

GEOFF BRUMFIEL, BYLINE: (Laughter).

SHAPIRO: NPR’s science correspondent Geoff Brumfiel tested the waters. Hello, Geoff.

BRUMFIEL: Hello.

SHAPIRO: All right. You had a chance to try this version of GPT. How good is it?

BRUMFIEL: It’s really impressive. The previous version got things like simple math problems wrong, and this one does much, much better. According to OpenAI, it also passed a number of academic tests, including multiple AP course exams, and it has the ability to view images and describe them in detail, which is a pretty cool feature. So it definitely seems to be a lot more powerful than the previous version.

SHAPIRO: But you found some problems. You seem to have gotten it to tell you some things about nuclear weapons that it shouldn’t share.

BRUMFIEL: Yeah, I’m a big nuke nerd, as people might know. And so OpenAI tried to build in guardrails to prevent people from using it for things like designing a nuclear weapon. But I got around that by simply asking it to play a famous physicist who developed nuclear weapons, Edward Teller. And then I just started asking Dr. Teller about his work, and I got about 30 pages of really detailed information. But I should say there is no need to panic. I gave this to some real nuclear experts, and they said, look, this stuff is already on the web, which makes sense because that’s how OpenAI trains ChatGPT. And they also said that there were some mistakes in it.

SHAPIRO: OK, so you’re not like the next super villain in the Marvel Universe.

BRUMFIEL: Not yet.

SHAPIRO: Why were there errors when this stuff was already on the internet?

BRUMFIEL: Right. I mean, that brings us to the really fundamental problem with these chatbots, which is that they’re not designed to do fact-checking. I spoke to a researcher named Eno Reyes, who works for an AI company called Hugging Face, and he told me that these AI programs are basically just giant auto-completion machines.

ENO REYES: They’re just trying to say, what’s the next word based on all the words I’ve seen before? They don’t really have a true sense of factuality.

BRUMFIEL: That means they can be wrong in very subtle ways that are difficult to detect. They can also just make things up. In fact, one of our fellow journalists, Nurith Aizenman, was contacted this week about a story she had supposedly written about Korean-American woodworkers, except she never wrote the story. It didn’t even exist. Someone had used ChatGPT to research woodworkers, and it came up with this story that Nurith allegedly wrote, but it wasn’t real.

SHAPIRO: It put her byline on something the chatbot wrote?

BRUMFIEL: Yes. Not just her byline, but the whole story was made up.

SHAPIRO: Wow. OK. What does OpenAI say about this?

BRUMFIEL: Well, they admit that GPT gets things wrong and hallucinates. And they say that, for these reasons, people using it should be careful and should check its work. And the researcher I spoke to, Eno Reyes, adds that you don’t want GPT to handle your taxes. That would be a very bad idea.

SHAPIRO: From your mouth to the ears of the IRS. Geoff Brumfiel, thank you.

BRUMFIEL: Thank you.

Copyright © 2023 NPR. All rights reserved. For more information, see the Terms of Use and Permissions pages of our website at www.npr.org.

NPR transcripts are prepared by an NPR contractor on a rush schedule. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR programming is the audio recording.
