AI assistants misrepresent news content 45% of the time

(www.bbc.co.uk)

421 points sohkamyung | 2 comments | 22 Oct 25 13:39 UTC

iainctduncan ◴[22 Oct 25 15:44 UTC] No.45670881[source]▶

I'm curious how many people have actually taken the time to compare AI summaries with the sources they summarize. I did for a few and... it was really bad. In my experience, they don't summarize at all; they do a random condensation, which is not the same thing. In one instance I looked at, a key takeaway in the result was the opposite of what it should have been. I don't trust them at all now.

replies(10): >>45671039 #>>45671541 #>>45671813 #>>45672108 #>>45672572 #>>45672678 #>>45673123 #>>45674739 #>>45674888 #>>45675283 #

1. coffeebeqn ◴[22 Oct 25 17:07 UTC] No.45672108[source]▶

>>45670881 #

I’ve been looking at the Gemini call summaries and they almost always have at least one serious issue. Just yesterday Gemini claimed we had decided on something we had not. That was probably the most important detail, and it got it completely backwards. Worse than useless.

replies(1): >>45672665 #

2. roadside_picnic ◴[22 Oct 25 17:48 UTC] No.45672665[source]▶

>>45672108 (TP) #

I used to be a bit nervous about Gemini recording every call. Sometimes when there was a major disagreement I would review the summaries to make sure I hadn't said anything I shouldn't have, only to find an arbitrary, unrelated bullet point attributed to me. I quickly realized there was nothing to worry about.

Similarly, I've had PMs blindly copy/paste summaries into larger project notes and ultimately create tickets based on either a misunderstanding from the LLM or a straight-up hallucination. I've repeatedly had conversations where a PM asks "when do you think Xyz will be finished?" and I have to ask in response "where and when did we even discuss Xyz? I'm not even sure what Xyz means in this context, so clarification would help." They then just delete the ticket/bullet etc. once they realize they never bothered to sanity check what they were pasting.
