(openai.com)

277 points simianwords | 1 comments | 06 Sep 25 07:41 UTC | HN request time: 0.213s | source

1. sp1982 ◴[06 Sep 25 21:44 UTC] No.45153152[source]▶

This makes sense. I recently did an experiment to test GPT5 on hallucinations on cricket data where there is a lot of statistical pressure. It is far better to say idk than a wrong answer. Most current benchmarks don’t test for that. https://kaamvaam.com/machine-learning-ai/llm-eval-hallucinat...

↑

Why language models hallucinate