ulrashida:
Unfortunately, while catching false citations is useful, in my experience that's not usually the problem affecting paper quality. Far more prevalent are authors who mis-cite materials, either drawing support from sources that don't actually say what is claimed, or stripping away nuance with cherry-picked quotes simply because Google Scholar suggested them as a top result.

The time it takes to find these errors is orders of magnitude greater than checking whether a citation exists, since you need to both read and understand the source material.

These bad actors should be subject to a three-strikes rule: this steady corrosion of knowledge is no accident on their part.

hippo22:
This does seem like the type of thing LLMs would actually excel at, though: take the list of citations and claims in a paper and check whether the cited works support the claims.
bryanrasmussen:
Sure, except when they hallucinate that the cited works support the claims when they do not. At that point you're back to needing to read the cited works yourself to see whether they support the claims.
mike_hearn:
Sometimes this kind of problem can be fixed by adjusting the prompt.

You don't say "here's a paper, find me invalid citations". You put less pressure on the model by chunking the text into sentences or paragraphs, extracting the citations for each chunk, and presenting both with a prompt like:

The following claim may be evidenced by the text of the article that follows. Please invoke the found_claim tool with a list of the specific sentence(s) in the text that support the claim, or an empty list indicating you could not find support for it in the text.

In other words you make it a needle-in-a-haystack problem, which models are much better at.
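A minimal sketch of this chunk-and-verify approach, assuming the OpenAI Python SDK's tool-calling interface; the model name, the found_claim schema, and the example claim data are illustrative placeholders, not anything prescribed in the thread:

    # Verify one claim against its cited source by forcing a found_claim tool call.
    import json
    from openai import OpenAI

    client = OpenAI()

    FOUND_CLAIM_TOOL = {
        "type": "function",
        "function": {
            "name": "found_claim",
            "description": "Report the sentences in the source text that support the claim.",
            "parameters": {
                "type": "object",
                "properties": {
                    "supporting_sentences": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Exact sentences supporting the claim; empty if none.",
                    }
                },
                "required": ["supporting_sentences"],
            },
        },
    }

    PROMPT = (
        "The following claim may be evidenced by the text of the article that follows. "
        "Invoke the found_claim tool with a list of the specific sentence(s) in the text "
        "that support the claim, or an empty list if you could not find support for it.\n\n"
        "Claim: {claim}\n\nArticle text:\n{source_text}"
    )

    def check_claim(claim: str, source_text: str) -> list[str]:
        """Return the sentences the model reports as supporting the claim (empty = no support found)."""
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user",
                       "content": PROMPT.format(claim=claim, source_text=source_text)}],
            tools=[FOUND_CLAIM_TOOL],
            # Force the model to answer via the tool rather than free-form prose.
            tool_choice={"type": "function", "function": {"name": "found_claim"}},
        )
        call = response.choices[0].message.tool_calls[0]
        return json.loads(call.function.arguments)["supporting_sentences"]

    # Usage sketch: pair each extracted claim with the text of the work it cites,
    # then flag the claims for which the model finds no supporting sentences.
    # claims = [("Smith (2020) showed X improves Y.", text_of_smith_2020), ...]
    # unsupported = [c for c, src in claims if not check_claim(c, src)]

The model still has to be trusted not to hallucinate supporting sentences, but because the tool call returns verbatim quotes, they can be string-matched against the source text as a cheap sanity check.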