    421 points sohkamyung | 11 comments
    scarmig ◴[] No.45669929[source]
    If you dig into the actual report (I know, I know, how passé), you see how they get the numbers. Most of the errors are "sourcing issues": the AI assistant doesn't cite a claim, or it (shocking) cites Wikipedia instead of the BBC.

    Other issues: the report doesn't even say which particular models it's querying [ETA: discovered they do list this in an appendix], aside from saying it's the consumer tier. And it leaves off Anthropic (in my experience, by far the best at this type of task), favoring Perplexity and (perplexingly) Copilot. The article also intermingles claims from the recent report and the earlier one, based on research conducted a year ago, leaving out critical context that... things have changed.

    This article contains significant issues.

    replies(7): >>45669943 #>>45670942 #>>45671401 #>>45672311 #>>45672577 #>>45675250 #>>45679322 #
    afavour ◴[] No.45669943[source]
    > or it (shocking) cites Wikipedia instead of the BBC.

    No... the problem is that it cites Wikipedia articles that don't exist.

    > ChatGPT linked to a non-existent Wikipedia article on the “European Union Enlargement Goals for 2040”. In fact, there is no official EU policy under that name. The response hallucinates a URL but also, indirectly, an EU goal and policy.

    replies(6): >>45670006 #>>45670093 #>>45670094 #>>45670184 #>>45670903 #>>45672812 #
    kenjackson ◴[] No.45670093[source]
    Actually, there was a Wikipedia article with this name, but it was deleted in June -- because it was AI-generated. Unfortunately, AI falls for this much like humans do.

    https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletio...

    replies(4): >>45670306 #>>45670779 #>>45671331 #>>45672567 #
    Workaccount2 ◴[] No.45670306[source]
    This is likely because of the knowledge cutoff.

    I have seen a few cases before where "hallucinations" turned out to be things that did exist, but no longer do.

    replies(1): >>45670633 #
    1. 1980phipsi ◴[] No.45670633[source]
    The fix for this is for the AI to double-check all links before providing them to the user. I frequently ask ChatGPT to double-check that references actually exist when it provides them. It should be built in!
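
    A minimal sketch of what such a built-in check could look like, assuming a small Python helper around the requests library (the function name and example URLs are illustrative, not anything the vendors actually ship):

      import requests

      def verify_citations(urls, timeout=5):
          """Split cited URLs into ones that resolve and ones that look hallucinated."""
          ok, broken = [], []
          for url in urls:
              try:
                  resp = requests.get(url, timeout=timeout, allow_redirects=True)
                  (ok if resp.status_code < 400 else broken).append(url)
              except requests.RequestException:
                  broken.append(url)  # DNS failure, timeout, malformed URL, ...
          return ok, broken

      ok, broken = verify_citations([
          "https://en.wikipedia.org/wiki/European_Union",
          "https://en.wikipedia.org/wiki/European_Union_Enlargement_Goals_for_2040",
      ])
      # The second URL 404s today, so the assistant could drop or flag it instead of citing it.
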
    replies(4): >>45670762 #>>45670808 #>>45670935 #>>45673056 #
    2. rideontime ◴[] No.45670762[source]
    But that would mean OpenAI would lose even more money on every query.
    replies(2): >>45672453 #>>45674673 #
    3. blitzar ◴[] No.45670808[source]
    I have found myself doing the same "citation needed" loop - but with AI this is a dangerous game, as it will now double down on whatever it made up and go looking for citations to justify its answer.

    Pre-prompting to cite sources is obviously a better way of going about things.

    replies(1): >>45671537 #
    4. janwl ◴[] No.45670935[source]
    I thought people here hated it when LLMs made HTTP requests?
    replies(2): >>45671214 #>>45671608 #
    5. macintux ◴[] No.45671214[source]
    I don't know for certain what you're referring to, but the "bulk downloads" of the Internet that AI companies are executing for training are the problem I've seen cited, and that doesn't relate to LLMs checking their sources at query time.
    6. ◴[] No.45671537[source]
    7. zahlman ◴[] No.45671608[source]
    It's bad when they indiscriminately crawl for training, and not ideal (but understandable) to use the Internet to communicate with them (and to have online accounts associated with that, etc.) rather than running them locally.

    It's not bad when they use the Internet at generation time to verify the output.

    replies(1): >>45677156 #
    8. mdhb ◴[] No.45672453[source]
    Almost as though it's not a sustainable business model and relies on tricking people in order to keep the lights on.
    9. dingnuts ◴[] No.45673056[source]
    Gemini will lie to me when I ask it to cite things: it will either pull up relevant sources or just hallucinate them.

    IDK how you people go through that experience more than a handful of times before you get pissed off and stop using these tools. I've wasted so much time because of believable lies from these bots.

    Sorry, not even lies, just bullshit. The model has no conception of truth so it can't even lie. Just outputs bullshit that happens to be true sometimes.

    10. ModernMech ◴[] No.45674673[source]
    Better make each query count then.
    11. Dylan16807 ◴[] No.45677156{3}[source]
    Also for the most part this verification can use a HEAD request.
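
    For instance, a rough sketch of that HEAD-based check, assuming the Python requests library (with a GET fallback, since some servers reject HEAD):

      import requests

      def link_exists(url, timeout=5):
          """Cheaply confirm a URL resolves; HEAD fetches only the headers, not the body."""
          try:
              resp = requests.head(url, allow_redirects=True, timeout=timeout)
              if resp.status_code == 405:  # server doesn't allow HEAD
                  resp = requests.get(url, stream=True, timeout=timeout)
              return resp.status_code < 400
          except requests.RequestException:
              return False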