

423 points sohkamyung | 12 comments
1. falcor84 No.45669518
> 45% of all AI answers had at least one significant issue.

> 31% of responses showed serious sourcing problems – missing, misleading, or incorrect attributions.

> 20% contained major accuracy issues, including hallucinated details and outdated information.

I'm generally against whataboutism, but here I think we absolutely have to compare it to human-written news reports. Famously, Michael Crichton introduced the "Gell-Mann amnesia effect" [0], saying:

> Briefly stated, the Gell-Mann Amnesia effect works as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them.

This has absolutely been my experience. I couldn't find proper figures, but I would put good money on significantly more than 45% of human-written news articles having "at least one significant issue".

[0] https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect

replies(6): >>45669594 #>>45669605 #>>45669612 #>>45669644 #>>45669939 #>>45670193 #
2. No.45669594
3. AyyEye No.45669605
Human news isn't a good comparison because this is second order: LLMs are downstream of human news. It's a game of stochastic telephone. All the human error is carried through, with additional hallucinations on top.
replies(1): >>45669757 #
4. bgwalter No.45669612
I'd say the 45% is on top of mistakes by journalists themselves. "AI" takes certain newspapers as gospel, and it is easy to find omissions, hallucinations, misunderstandings, etc. without even fact-checking the original articles.
5. bux93 No.45669644
The problem highlighted here is that AI summaries misrepresent the original stories. This just opens a floodgate of slop that is 45% worse than the source, which, as you point out, wasn't stellar to begin with.
replies(1): >>45669913 #
6. falcor84 No.45669757
But the issue is that the vast majority of "human news" is second order (at best), essentially paraphrasing releases by news agencies like Reuters or Associated Press, or scientific articles, and typically doing a horrible job at it.

Regarding scientific reporting, there's as usual a relevant xkcd ("New Study") [0], and in this case even better, there's a fabulous one from PhD Comics ("Science News Cycle") [1].

[0] https://xkcd.com/1295/

[1] https://phdcomics.com/comics/archive.php?comicid=1174

replies(2): >>45670114 #>>45670115 #
7. vidarh No.45669913
A whole lot of news is regurgitated wire-service reports, so how reporters do matters greatly: if they're doing badly, then it's entirely possible that an AI summary of the wire-service releases would be an improvement (probably not, but without a baseline we don't know).

It's also not clear how well humans do when consuming either one, or whether the net effect of an AI summary, even one with substantial issues, is to leave the reader better or worse informed.

E.g. if it helps a person digest more material by getting more focused reports, it's entirely possible that flawed summaries would still in aggregate lead to a better understanding of a subject.

On its own, this article is just pure sensationalism.

8. intended No.45669939
Yes, I absolutely see the case for the faster, cheaper, more efficient solution for making random content.

Why stop at what humans can do? And why be fettered by any expectations of accuracy, or even the feasibility of retractions?

Truly, efficiency unbound.

9. Vetch No.45670114
Then the point still stands: this makes things even worse, given that it's adding its own hallucinations on top instead of simply relaying the content or, ideally, identifying issues in the reporting.
10. dgfitz No.45670115
You understand that an LLM can only poorly regurgitate whatever it's fed, right? An LLM will _always_ be less useful than a primary/secondary source, because it can't fucking think.
replies(1): >>45670603 #
11. wat10000 No.45670193
That's not comparable. Reading news reports and summarizing them is about a thousand times easier than writing those news reports in the first place. If you want to see how humans fare at this task, have some people answer questions about the news and then compare their answers to the original reporting. I'm not sure if the average human would fare too well at this either, but it's completely different from the question of how accurate the original news itself is.
12. falcor84 No.45670603
Regardless of how you define "think", you still need to get a baseline of whether human reporters do that effectively.