
579 points paulpauper | 1 comments
aerhardt No.43604214
My mom told me yesterday that Paul Newman had massive problems with alcohol. I was somewhat skeptical, so this morning I asked ChatGPT a very simple question:

"Is Paul Newman known for having had problems with alcohol?"

All of the models up to o3-mini-high told me he had no known problems. Here's o3-mini-high's response:

"Paul Newman is not widely known for having had problems with alcohol. While he portrayed characters who sometimes dealt with personal struggles on screen, his personal life and public image were more focused on his celebrated acting career, philanthropic work, and passion for auto racing rather than any issues with alcohol. There is no substantial or widely reported evidence in reputable biographies or interviews that indicates he struggled with alcohol abuse."

There is plenty of evidence online that he struggled a lot with alcohol, including testimony from his long-time wife Joanne Woodward.

I sent my mom the ChatGPT reply and in five minutes she found an authoritative source to back her argument [1].

I use ChatGPT for many tasks every day, but I couldn't fathom that it would get something so simple so wrong.

Lesson(s) learned... Including not doubting my mother's movie trivia knowledge.

[1] https://www.newyorker.com/magazine/2022/10/24/who-paul-newma...

1. adelpozo No.43605938
I can’t reproduce this. Maybe others reported the error and someone adjusted the expected answer; I do not know enough about OpenAI's operations to say for sure.

The reason this bothers me is that comments like this reinforce the beliefs of people who could otherwise find value in these tools.

But I think points like this would be better made with shared chats or screenshots, since we do not have something like a core dump or stack trace to attach.

And while I am not saying OP did this, I have seen technically skilled engineers assert or imply that LLMs/chatbots aren’t good or useful to them, and then, looking at their chat log, I find a multitude of topics that I am sure would impact the result of the query.

Yes, it can be a UX problem. Yes, it can be an algorithmic problem. But these are just tools that can be used wrong, not a perfect mechanical brain.