
421 points sohkamyung | 1 comment | source
falcor84 ◴[] No.45669518[source]
> 45% of all AI answers had at least one significant issue.

> 31% of responses showed serious sourcing problems – missing, misleading, or incorrect attributions.

> 20% contained major accuracy issues, including hallucinated details and outdated information.

I'm generally against whataboutism, but here I think we absolutely have to compare it to human-written news reports. Famously, Michael Crichton introduced the "Gell-Mann amnesia effect" [0], saying:

> Briefly stated, the Gell-Mann Amnesia effect works as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them.

This has absolutely been my experience. I couldn't find proper figures, but I would put good money on significantly over 45% of human-written news articles having "at least one significant issue".

[0] https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect

replies(6): >>45669594 #>>45669605 #>>45669612 #>>45669644 #>>45669939 #>>45670193 #
AyyEye ◴[] No.45669605[source]
Human news isn't a good comparison because this is second order -- LLMs are downstream of human news. It's a game of stochastic telephone. All the human error is carried through, with additional hallucinations on top.
replies(1): >>45669757 #
falcor84 ◴[] No.45669757[source]
But the issue is that the vast majority of "human news" is second order (at best), essentially paraphrasing releases by news agencies like Reuters or Associated Press, or scientific articles, and typically doing a horrible job at it.

Regarding scientific reporting, there's as usual a relevant xkcd ("New Study") [0], and in this case even better, there's a fabulous one from PhD Comics ("Science News Cycle") [1].

[0] https://xkcd.com/1295/

[1] https://phdcomics.com/comics/archive.php?comicid=1174

replies(2): >>45670114 #>>45670115 #
dgfitz ◴[] No.45670115[source]
You understand that an LLM can only poorly regurgitate whatever it's fed, right? An LLM will _always_ be less useful than a primary/secondary source, because it can't fucking think.
replies(1): >>45670603 #
falcor84 ◴[] No.45670603[source]
Regardless of how you define "think", you still need to get a baseline of whether human reporters do that effectively.