AI assistants misrepresent news content 45% of the time

(www.bbc.co.uk)

423 points sohkamyung | 1 comments | 22 Oct 25 13:39 UTC

Narciss ◴[22 Oct 25 15:06 UTC] No.45670278[source]▶

> All participating organizations then generated responses to each question from each of the four AI assistants. This time, we used the free/consumer versions of ChatGPT, Copilot, Perplexity and Gemini. Free versions were chosen to replicate the default (and likely most common) experience for users. Responses were generated in late May and early June 2025.

First of all, none of the SOTA models we're currently using had been released by late May or early June. Gemini 2.5 came out on June 17, and GPT-5 and Claude Opus 4.1 at the beginning of August.

On top of that, to use free models for anything like this is absolutely wild. I use the absolute best models, and the research versions of this whenever I do research. Anything less is inviting disaster.

You have to use the right tools for the right job, and any report that is more than a month old is useless in the AI world at this point in time, beyond a snapshot of how things 'used to be'.

replies(5): >>45670334 #>>45670358 #>>45670859 #>>45670920 #>>45672440 #

filoeleven ◴[22 Oct 25 15:46 UTC] No.45670920[source]▶

>>45670278 #

> On top of that, to use free models for anything like this is absolutely wild. I use the absolute best models, and the research versions of this whenever I do research. Anything less is inviting disaster.

"I contend we are both atheists, I just believe in one fewer god than you do. When you understand why you dismiss all the other possible gods, you will understand why I dismiss yours." - Stephen F Roberts

replies(1): >>45672861 #

Narciss ◴[22 Oct 25 18:01 UTC] No.45672861[source]▶

>>45670920 #

It ain’t a God, it’s a tool.

One knife does not cut potatoes. Doesn’t mean that all knives don’t cut potatoes. Use the right tool for the job.

Though I do love a well-placed quote.
