Did they run the checker across a body of papers before LLMs were available and verify that there were no citations in peer reviewed papers that got authors or titles wrong?
That said, I am also very curious what result their tool would give to papers from the 2010s and before.
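For anyone who wants to run that experiment: checking a cited title and first author against Crossref's public REST API is pretty easy. Here's a minimal sketch; the endpoint and response fields are Crossref's real API, but the function name and the naive string-matching heuristic are just illustrative:

```python
# Sketch: look up a citation on Crossref and compare the top hit's metadata.
# A real checker would need fuzzier matching (subtitles, diacritics, etc.).
import requests

def check_citation(title: str, first_author_family: str) -> dict:
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return {"found": False}
    top = items[0]
    found_title = (top.get("title") or [""])[0]
    families = [a.get("family", "") for a in top.get("author", [])]
    return {
        "found": True,
        "doi": top.get("DOI"),
        "title_matches": found_title.strip().lower() == title.strip().lower(),
        "author_matches": first_author_family.lower() in (f.lower() for f in families),
        "crossref_title": found_title,
    }

# A well-known pre-LLM paper should come back with matching metadata.
print(check_citation("Attention Is All You Need", "Vaswani"))
```

Run this over the reference lists of a pile of 2010s papers and you'd get a rough baseline error rate before anyone could have used an LLM.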
That also makes some of those errors easier to make. A bad auto-import of paper metadata can silently mangle some of the publication details, and replacing an early preprint with the peer-reviewed article of record takes annoying manual intervention.
You'd think so, but apparently it isn't for these folks. On the other hand, saying "we've found 50 hallucinations in scientific papers" generates a lot more clicks than "we've found 50 common citation mistakes that people make all the time".
I did some checking and found that the report does exist, but the citation is still not quite correct. I then discovered that someone is already running an LLM-based citation checker, which has fact-checked this claim and produced a write-up that seems a lot better than what this GPTZero tool does.
https://checkplease.neocities.org/maha/html/17-loneliness-73...
The mistakes in the citation are the sort that could have been made by either a human or an AI. The visualization in the report is confusing and does contain the 73% number (rounded up), but it's unclear how to interpret the numbers, because they form some sort of "vitality index" rather than what you'd expect from how it's introduced. At first glance I actually misinterpreted it the same way the report does, so it's hard to view this as clear evidence of AI misuse. Yet the GPTZero folks make very strong claims based on nothing more than a URL scraper script.