scarmig:
If you dig into the actual report (I know, I know, how passé), you see how they get the numbers. Most of the errors are "sourcing issues": the AI assistant doesn't cite a claim, or it (shocking) cites Wikipedia instead of the BBC.

Other issues: the body of the report doesn't say which particular models were queried, aside from noting they're the consumer tier [ETA: they do list this in an appendix]. And it leaves out Anthropic (in my experience, by far the best at this type of task) while including Perplexity and (perplexingly) Copilot. The article also intermingles claims from the recent report with those from an earlier one based on research conducted a year ago, leaving out critical context that... things have changed.

This article contains significant issues.

afavour:
> or it (shocking) cites Wikipedia instead of the BBC.

No... the problem is that it cites Wikipedia articles that don't exist.

> ChatGPT linked to a non-existent Wikipedia article on the “European Union Enlargement Goals for 2040”. In fact, there is no official EU policy under that name. The response hallucinates a URL but also, indirectly, an EU goal and policy.

hnuser123456:
Do we have any good research on how much less often larger, newer models just make stuff up like this? As it stands, it's pretty clear that LLMs are categorically a bad idea for directly querying for information in any context where accuracy matters. If you're using an LLM to research something that needs to be accurate, the LLM should be making a tool call to a web search, be asked only to summarize relevant facts from the material it finds, and have citations supplied by hard-coding the UI to link the pages the LLM actually reviewed. The LLM itself cannot be trusted to generate its own citations: it will just generate something that looks like a relevant citation, along with whatever imaginary content it wants to attribute to that non-existent source.
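
A minimal sketch of that separation, in Python. `search_web` and `summarize_with_llm` are hypothetical stand-ins for whatever search and model APIs you wire in, not any real library's interface:

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    text: str

def search_web(query: str) -> list[Page]:
    """Hypothetical search-tool call: returns real pages with real URLs."""
    raise NotImplementedError("wire this up to an actual search API")

def summarize_with_llm(question: str, pages: list[Page]) -> str:
    """Hypothetical model call. The prompt constrains the model to the
    supplied page text; it is never asked to emit URLs or citations."""
    raise NotImplementedError("wire this up to an actual model API")

def answer(question: str) -> dict:
    pages = search_web(question)
    summary = summarize_with_llm(question, pages)
    # Citations are taken straight from the retrieved pages by the
    # app layer, so the model never gets a chance to invent a source.
    return {"summary": summary, "sources": [p.url for p in pages]}
```

The point being that the "sources" list can only ever contain URLs the search tool actually returned, whatever the model hallucinates in its summary.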

bigbuppo:
The problem is that people are using it as a substitute for web search, and the web search company has decided to kill off search as a product and pivot to video... err, I mean, pivot to AI chatbots. So hard, in fact, that they replaced one of the common ways to reach emergency services on their mobile phones with an AI chatbot that can't help you in an emergency.

Not to mention that the AI companies have been so abusive to the rest of the internet that they're often blocked from various websites, so it's not as if they could reach legitimate information anyway.