This is the main point that proves to me that these companies are mostly selling us snake oil. Yes, there is a great deal of utility from even the current technology. It can detect patterns in data that no human could; that alone can be revolutionary in some fields. It can generate data that mimics anything humans have produced, and certain permutations of that can be insightful. It can produce fascinating images, audio, and video. Some of these capabilities raise safety concerns, particularly in the wrong hands, and important questions that society needs to address. These hurdles are surmountable, but they require focusing on the reality of what these tools can do, instead of on whatever a group of serial tech entrepreneurs looking for the next cashout opportunity tell us they can do.
The constant anthropomorphization of this technology is dishonest at best, and harmful and dangerous at worst.
Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.
As far as I can tell smart engineers are using AI tools, particularly people doing coding, but even non-coding roles.
The criticism feels about three years out of date.
A few days ago, I asked free ChatGPT to tell me the head brewer of a small brewery in Corpus Christi. It told me that the brewery didn't exist, which it did, because we were going there in a few minutes, but after re-prompting it, it gave me some phone number that it found in a business filing. (ChatGPT has been using web search for RAG for some time now.)
Hallucinations are still a massive problem IMO.
That’s not the case in my experience. Gemini is almost as good as Claude for most of the things I try.
That said, for queries tgat don’t use agentic search or rag, hallucination is as bad a problem as ever and it won’t improve because hallucination is all these models do. In Karpathy’s phrase they “dream text”. Agentic search and rag and similar techniques disguise the issue because they stuff the context of the model with real results, so the scope for it to go noticeably off the rails is less. But it’s still very visible if you ask for references, links etc many/most/sometimes all will be hallucinations depending on the prompt.