This is only true given a large enough corpus of data, and enough memory to capture as many unique dimensions as required, no?
> However, a non-hallucinating model could be easily created, using a question-answer database and a calculator, which answers a fixed set of questions such as “What is the chemical symbol for gold?” and well-formed mathematical calculations such as “3 + 8”, and otherwise outputs IDK.
This is… saying that if you constrain the prompts and the training data, you will always get a response that is either from the training data or IDK.
Which seems to be a strong claim, at least to my ignorant eyes.
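If I'm reading the construction right, it's something like the toy below (a minimal Python sketch of my reading, not the paper's actual setup; the question set and the arithmetic grammar are my own stand-ins):

```python
import ast
import operator

# Fixed question-answer database: the entire "knowledge" of this toy model.
QA = {
    "what is the chemical symbol for gold?": "Au",
}

# Only these operators count as "well-formed mathematical calculations".
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def eval_arith(node):
    """Evaluate a parsed expression made only of numbers and OPS."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](eval_arith(node.left), eval_arith(node.right))
    raise ValueError("not a well-formed calculation")

def answer(prompt: str) -> str:
    key = prompt.strip().lower()
    if key in QA:                      # exact hit in the QA database
        return QA[key]
    try:                               # otherwise try it as arithmetic
        return str(eval_arith(ast.parse(prompt, mode="eval").body))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return "IDK"                   # everything else: abstain

print(answer("What is the chemical symbol for gold?"))  # Au
print(answer("3 + 8"))                                   # 11
print(answer("Who won the 1987 World Cup?"))             # IDK
```

The only reason this thing never hallucinates is that everything outside the enumerated set gets IDK, which is the trade-off I'm getting at next.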
This veers into spherical-cow territory: you wouldn't have the typical language skills we associate with an LLM, because you'd have to constrain the domain so that it's unable to generate anything else. However, many domains are not internally consistent and would generate special cases at their boundaries. So in this case, being able to say IDK would only be possible for the class of questions the model can gauge as outside its distribution.
Edit: I guess that is what they are working to show? That any given model will hallucinate, and these are the bounds?