
421 points sohkamyung | 1 comment
scarmig ◴[] No.45669929[source]
If you dig into the actual report (I know, I know, how passé), you see how they get the numbers. Most of the errors are "sourcing issues": the AI assistant doesn't cite a claim, or it (shocking) cites Wikipedia instead of the BBC.

Other issues: the report doesn't even say which particular models it's querying [ETA: discovered they do list this in an appendix], aside from saying it's the consumer tier. And it leaves off Anthropic (in my experience, by far the best at this type of task), favoring Perplexity and (perplexingly) Copilot. The article also intermingles claims from the recent report and the one based on research conducted a year ago, leaving out critical context that... things have changed.

This article contains significant issues.

replies(7): >>45669943 #>>45670942 #>>45671401 #>>45672311 #>>45672577 #>>45675250 #>>45679322 #
afavour ◴[] No.45669943[source]
> or it (shocking) cites Wikipedia instead of the BBC.

No... the problem is that it cites Wikipedia articles that don't exist.

> ChatGPT linked to a non-existent Wikipedia article on the “European Union Enlargement Goals for 2040”. In fact, there is no official EU policy under that name. The response hallucinates a URL but also, indirectly, an EU goal and policy.

replies(6): >>45670006 #>>45670093 #>>45670094 #>>45670184 #>>45670903 #>>45672812 #
kenjackson ◴[] No.45670093[source]
Actually there was a Wikipedia article by this name, but it was deleted in June -- because it was AI-generated. Unfortunately AI falls for this much like humans do.

https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletio...

replies(4): >>45670306 #>>45670779 #>>45671331 #>>45672567 #
bunderbunder ◴[] No.45670779[source]
The biggest problem with that citation isn't that the article has since been deleted. The biggest problem is that that particular Wikipedia article was never a good source in the first place.

That seems to be the real challenge with AI for this use case. It has no real critical thinking skills, so it's not really competent to choose reliable sources. So instead we're lowering the bar to just asking that the sources actually exist. I really hate that. We shouldn't be lowering intellectual standards to meet AI where it's at. These intellectual standards are important and hard-won, and we need to be demanding that AI be the one to rise to meet them.

replies(2): >>45670872 #>>45671358 #
kenjackson ◴[] No.45671358[source]
I get what you're saying. But you are now asking for a level of intelligence and critical thinking that I honestly believe is higher than the average person's. I think it's absolutely doable, but I also feel like we shouldn't make it sound like the current behavior is abhorrent or somehow indicative of a failure in the technology.
replies(2): >>45671503 #>>45676755 #
Paracompact ◴[] No.45676755[source]
The bar for an industry should be the good-faith effort of the average industry professional, not the unconscionably minimal effort of the average grifter trying to farm content.

These grifters simply were not attracted to these gigs in these quantities prior to AI, but now the market incentives have changed. Should we "blame" the technology for its abuse? I think AI is incredible, but market endorsement is different from intellectual admiration.