One incorrect way to think of it is "LLMs will sometimes hallucinate when asked to produce content, but will provide grounded insights when merely asked to review/rate existing content".
A more productive (and secure) way to think of it is that every LLM is an "evil genie": an extremely smart, adversarial agent. If a PhD were being paid large sums of money to introduce errors into your work, could they still mislead you into thinking they performed the exact task you asked for?
Your prompt is ‘you are an extremely rigorous reviewer searching for fake citations in a possibly compromised text’:
- It is easy for the (compromised) reviewer to surface false positives: nitpick citations that are in fact correct by quoting irrelevant or made-up segments of the original research, leading you to believe a valid citation is wrong.
- It is easy for the (compromised) reviewer to surface false negatives: feed you cherry-picked or partial sentences from the source material to fabricate a conclusion the source never intended. (See the sketch below for what a mechanical check can and cannot catch.)
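
Quoted "evidence" from the reviewer should therefore be verified mechanically rather than taken on faith. A minimal sketch in Python, assuming you hold the source text locally; the function names are illustrative, not from any particular library:

```python
import re

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so formatting differences
    # don't mask an otherwise exact match.
    return re.sub(r"\s+", " ", text).strip().lower()

def quote_appears_in_source(claimed_quote: str, source_text: str) -> bool:
    # True only if the reviewer's claimed excerpt occurs verbatim
    # (after normalization) in the real source. A fabricated excerpt
    # fails this check; a cherry-picked fragment can still pass.
    return normalize(claimed_quote) in normalize(source_text)

def quote_with_context(claimed_quote: str, source_text: str,
                       window: int = 300) -> str | None:
    # If the excerpt exists, return it plus surrounding text so a human
    # can judge whether it was quoted out of context.
    src = normalize(source_text)
    quote = normalize(claimed_quote)
    idx = src.find(quote)
    if idx == -1:
        return None  # not found verbatim: likely fabricated
    return src[max(0, idx - window): idx + len(quote) + window]
```

An exact-match failure flags a likely fabricated excerpt (the false-positive case); a match only proves the words exist, so the surrounding context still has to be read to rule out cherry-picking (the false-negative case).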
You do not solve the problem of unreliable actors by splitting them into two teams and having one unreliable actor review the other's work.
All of us have to contend with this nondeterministic behavior (I say this as someone who runs lots of LLM-based workloads in production) and assess when, in aggregate, the upside outweighs the costs.
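
One common way to make that assessment concrete is to repeat the same nondeterministic check and only act on verdicts with strong agreement. A minimal sketch; the `check` callable stands in for whatever LLM call you make, and the 80% threshold is an arbitrary placeholder:

```python
from collections import Counter
from typing import Callable, Optional

def majority_verdict(check: Callable[[], str], runs: int = 5,
                     threshold: float = 0.8) -> Optional[str]:
    # Run the same nondeterministic check several times and accept its
    # verdict only when a clear majority agrees; None means "no stable
    # consensus, escalate to a human".
    votes = Counter(check() for _ in range(runs))
    verdict, count = votes.most_common(1)[0]
    return verdict if count / runs >= threshold else None
```

Note that repeated sampling only smooths run-to-run variance; it does nothing about systematic error, which is exactly why one unreliable reviewer cannot audit another.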