They hallucinate because it's an ill-defined problem with two conflicting use cases:
1. If I tell it the first two lines of a story, I want the LLM to complete the story. This requires hallucination, because it has to make up things. The story has to be original.
2. If I ask it a question, I want it to reply with facts. It should not make up stuff.
LMs were originally designed for (1) because researchers thought that (2) was out of reach. But it turned out that, without any fundamental changes, LMs could do a little bit of (2). Things have improved since that discovery, but not to the point where hallucination has disappeared or is under control.
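To make the point concrete, here is a toy sketch (not how a real LLM works internally, just a bigram sampler over a made-up two-sentence corpus). The names `CORPUS`, `train_bigrams` and `complete` are mine, purely for illustration. The thing to notice is that the same sampling loop serves both use cases, and it has no way to say "I don't know":

```python
import random
from collections import defaultdict

# Made-up toy corpus for illustration: one "fact" and one story fragment.
CORPUS = [
    "paris is the capital of france",
    "once upon a time a dragon lived in paris",
]

def train_bigrams(sentences):
    """Map each word to the list of words observed right after it."""
    table = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            table[prev].append(nxt)
    return table

def complete(table, prompt, n_words, rng):
    """Extend the prompt by sampling one next word at a time."""
    vocab = sorted({w for nxts in table.values() for w in nxts})
    words = prompt.split()
    for _ in range(n_words):
        # Unseen context: there is no "refuse" branch, so it still picks a word.
        candidates = table.get(words[-1]) or vocab
        words.append(rng.choice(candidates))
    return " ".join(words)

table = train_bigrams(CORPUS)
rng = random.Random(0)
# Use case (1): inventing a continuation is exactly what we want.
print(complete(table, "once upon", 5, rng))
# Use case (2): the corpus knows nothing about germany, yet a word comes out anyway.
print(complete(table, "the capital of germany", 1, rng))
```

Nothing in `complete` distinguishes "write me a story" from "tell me a fact"; whether the output happens to be true depends only on what the training text contained.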