Over fifty new hallucinations in ICLR 2026 submissions

(gptzero.me)

504 points puttycat | 1 comments | 07 Dec 25 13:16 UTC


jameshart ◴[07 Dec 25 14:50 UTC] No.46182056[source]▶

Is the baseline assumption of this work that an erroneous citation is LLM hallucinated?

Did they run the checker across a body of papers before LLMs were available and verify that there were no citations in peer reviewed papers that got authors or titles wrong?

replies(5): >>46182229 #>>46182238 #>>46182245 #>>46182375 #>>46186305 #

miniwark ◴[07 Dec 25 15:12 UTC] No.46182245[source]▶

>>46182056 #

They explain in the article what they consider a proper citation, an erroneous one, and a hallucination, in the section "Defining Hallucitations". They also say that they have many false positives, mostly real papers that are not available online.

That said, I am also very curious about the results their tool would give for papers from the 2010s and before.

replies(1): >>46182626 #

sigmoid10 ◴[07 Dec 25 15:58 UTC] No.46182626[source]▶

>>46182245 #

If you look at their examples in the "Defining Hallucitations" section, I'd say those could be 100% human errors. Shortening authors' names, leaving out authors, misattributing authors, misspelling or misremembering the paper title (or having an old preprint-title, as titles do change) are all things that I would fully expect to happen to anyone in any field where things ever get published. Modern tools have made the citation process more comfortable, but if you go back to the old days, you'd probably find those kinds of errors everywhere. If you look at the full list of "hallucinations" they claim to have discovered, the only ones I'd not immediately blame on human screwups are the ones where a title and the authors got zero matches for existing papers/people. If you really want to do this kind of analysis correctly, you'd have to match the claim of the text and verify it with the cited article. Because I think it would be even more dangerous if you can get claims accepted by simply quoting an existing paper correctly, while completely ignoring its content (which would have worked here).
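The distinction drawn above, between citations that are close to a real paper and citations with zero matches, can be sketched roughly as follows. This is only an illustration under stated assumptions: `known_papers` is a hypothetical local list of (title, authors) records, the function names are made up for this example, and the 0.8 threshold is arbitrary. It is not how the GPTZero tool works, and it does not check whether the cited paper actually supports the claim.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def title_similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def surnames(authors):
    """Reduce author names to a set of lowercase surnames (last token)."""
    return {name.split()[-1].lower() for name in authors if name.strip()}


def classify_citation(cited_title, cited_authors, known_papers, near_threshold=0.8):
    """Classify a citation against hypothetical (title, authors) records.

    Returns "exact", "near_match" (plausibly a human slip such as a typo,
    a shortened name, or a missing author) or "no_match".
    The default threshold is an arbitrary illustrative value.
    """
    best_score = 0.0
    best_authors = []
    for title, authors in known_papers:
        score = title_similarity(cited_title, title)
        if score > best_score:
            best_score, best_authors = score, authors

    if best_score == 1.0 and surnames(cited_authors) == surnames(best_authors):
        return "exact"
    if best_score >= near_threshold:
        return "near_match"
    return "no_match"


# Hypothetical reference data for illustration only.
known = [("Attention Is All You Need", ["Ashish Vaswani", "Noam Shazeer"])]
print(classify_citation("Attention is all you ned", ["A. Vaswani"], known))
```

Only the "no_match" bucket corresponds to the cases described above as hard to blame on human error; everything in "near_match" would still need a person to look at it.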

replies(4): >>46182966 #>>46183764 #>>46190440 #>>46190555 #

1. mike_hearn ◴[08 Dec 25 10:10 UTC] No.46190555[source]▶

>>46182626 #

There are other issues. In January they claimed that a US health report contained "fabricated" and "AI generated" citations with the headline being a claim from a Cigna Group report. Their claim that it's fabricated is based on nothing more than the URL now being a redirect of the type common in corporate website reorgs.

I did some checking and found the report does exist, but the citation is still not quite correct. Then I discovered someone is running some LLM based citation checker already, which already fact checked this claim and did a correct writeup that seems a lot better than what this GPTZero tool does.

https://checkplease.neocities.org/maha/html/17-loneliness-73...

The mistakes in the citation are the sort of mistake that could have been made by either a human or an AI, really. The visualization in the report is confusing and does contain the 73% number (rounded up), but it's unclear how to interpret the numbers because it's some sort of "vitality index" and not what you'd expect based on how it's introduced. At first glance I actually mis-interpreted it the same way the report does, so it's hard to view this as clear evidence of AI misuse. Yet the GPTZero folks do make very strong claims based on nothing more than a URL scraper script.
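To illustrate why a link check alone says little about fabrication, here is a minimal sketch of how the outcome of fetching a cited URL might be bucketed. Everything in it is an assumption made for illustration: the function name, the labels, and the example.com URLs are hypothetical, and it takes an already observed status code and final URL as inputs instead of doing any network access. It is not a description of the GPTZero script.

```python
from urllib.parse import urlparse


def classify_link_check(original_url, status_code, final_url):
    """Bucket the result of fetching a cited URL.

    original_url: the URL as written in the citation
    status_code:  HTTP status of the final response after redirects
    final_url:    the URL the request ended up at

    None of the labels means "fabricated"; a link check cannot show that.
    """
    if status_code in (404, 410):
        return "missing"  # page gone, the document may still exist elsewhere
    if 200 <= status_code < 300:
        o, f = urlparse(original_url), urlparse(final_url)
        same_host = o.netloc.lower() == f.netloc.lower()
        same_path = o.path.rstrip("/") == f.path.rstrip("/")
        if same_host and same_path:
            return "resolves"
        return "redirected"  # e.g. a site reorg, needs a manual look
    return "inconclusive"  # server errors, blocks, rate limits, and so on


# Hypothetical URLs for illustration only.
print(classify_link_check(
    "https://example.com/reports/2024/loneliness",
    200,
    "https://example.com/newsroom",
))
```

A "redirected" or "missing" result is a prompt to go looking for the document, as was done by hand above, and not evidence about who or what wrote the citation.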
