
65 points appwiz | 2 comments
cryptica ◴[] No.44383456[source]
I suspect hallucinations in LLMs are the result of contradictions in their training sets that get baked into the model.

I suspect it's just like with humans. People who learn quickly but don't carefully curate their knowledge to resolve contradictions as they learn tend to make similar mistakes on subjects they haven't invested much time in studying thoroughly.

If I were an AI researcher, what I would try to do is find the highest-quality information possible on a very small number of axiomatic topics, with as few contradictions as possible, and train it into the LLM until it can generate text and basic reasoning that is fully accurate. Then, once we have this basic but fully rational AI, start feeding it new data, but before giving it any piece of data to learn from, first ask the AI whether the new data contradicts any of its current knowledge. You only let it update its weights on the new data as-is if there is no contradiction with its existing knowledge. If it does contradict existing knowledge, either discard it or feed it the data with a synthetic preamble like "Some people believe that..." so that it's aware this belief system exists but knows it's not to be internalized as its own beliefs.
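Roughly, the gating loop I have in mind would look something like this Python sketch. The ask_model and fine_tune callables are hypothetical placeholders ("query the current model" and "do one fine-tuning update on a piece of text"), and the prompt wording is just illustrative, not any real library's API:

    # Sketch of the gated-ingestion loop. ask_model and fine_tune are
    # hypothetical placeholders, not a real training API.
    from typing import Callable

    def ingest(
        new_texts: list[str],
        ask_model: Callable[[str], str],   # hypothetical: ask the current model a question
        fine_tune: Callable[[str], None],  # hypothetical: update weights on one text
    ) -> None:
        for text in new_texts:
            # Ask the model itself whether the new text conflicts with what it knows.
            verdict = ask_model(
                "Does the following statement contradict anything you currently "
                f"hold to be true? Answer YES or NO.\n\n{text}"
            )
            if verdict.strip().upper().startswith("NO"):
                # No conflict detected: learn the text as-is.
                fine_tune(text)
            else:
                # Conflict: learn it as a reported belief, not as the model's own knowledge.
                fine_tune("Some people believe that " + text)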

Or maybe there is a way to detect contradictions by looking at the weights themselves. You could roll back a round of training if the weights update in a way that suggests a conflicting piece of information was learned in that round. Maybe there could even be a separate ANN, trained to detect contradictions, that watches the LLM's weights during training and decides when to roll a round back.
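A very rough PyTorch-flavoured sketch of that rollback idea, where train_one_round, contradiction_detector, and the threshold are all made-up placeholders rather than any established method:

    # Sketch of a checkpoint-and-rollback training round. The contradiction
    # detector and its threshold are purely speculative placeholders.
    import copy
    import torch
    from torch.nn.utils import parameters_to_vector

    def guarded_round(model, train_one_round, contradiction_detector, threshold=0.5):
        # Snapshot the weights before the round so we can roll back.
        before_state = copy.deepcopy(model.state_dict())
        before_vec = parameters_to_vector(model.parameters()).detach()

        train_one_round(model)  # hypothetical: one round of training

        after_vec = parameters_to_vector(model.parameters()).detach()
        delta = after_vec - before_vec

        # A second network (trained separately) scores how "contradiction-like"
        # this weight update looks; this scoring scheme is speculative.
        score = contradiction_detector(delta).item()
        if score > threshold:
            model.load_state_dict(before_state)  # roll the round back
            return False
        return True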

replies(1): >>44383869 #
1. AdieuToLogic ◴[] No.44383869[source]
> I suspect hallucinations in LLMs are the result of contradictions in their training sets that get baked into the model.

A simpler explanation, and I posit a correct one, is that people anthropomorphize an algorithm: the output of a particular path through a statistical token-generating model gets described as a "hallucination" simply because it is unexpected to the person interpreting the text.

> I suspect it's just like with humans.

Therein lies the problem.

replies(1): >>44384587 #
2. geon ◴[] No.44384587[source]
Yes. "Hallucinations" are not an edge case or an error mode, but part of the normal operation of the LLM.