
223 points benkaiser | 7 comments
1. michaelsbradley ◴[] No.42537899[source]
I’ve been pretty impressed with ChatGPT’s promising capabilities as a research assistant/springboard for complex inquiries into the Bible and patristics. Just one example:

   Can you provide short excerpts from works in Latin and Greek written between 600 and 1300 that demonstrate the evolution over those centuries specifically of literary references to Jesus' miracle of the loaves and fishes?
https://chatgpt.com/share/675858d5-e584-8011-a4e9-2c9d2df783...
replies(3): >>42538262 #>>42538286 #>>42538320 #
2. edflsafoiewq ◴[] No.42538262[source]
How certain are you that's correct? IME these "search problems" are the kind of thing that almost always provokes hallucinations.

For example, I looked up the quotation provided from Isidore of Seville's De fide catholica contra Iudaeos, Lib. II, cap. 19, using this copy on WikiSource, https://la.wikisource.org/wiki/De_fide_catholica_contra_Iuda.... The quote certainly does not appear under LIBER SECUNDUS, CAPUT XIX. Nor could I find it in whole or in fragment anywhere in the document, nor indeed any mention of the miracle of loaves and fishes (granted, I could have missed one, since I relied on Ctrl+F and my very rusty Latin).

Perhaps the copy on WikiSource is incomplete, or perhaps there are differing manuscripts, but perhaps also the quote was a complete hallucination to begin with.
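A plain Ctrl+F can miss a genuine quote because of medieval Latin orthography (u/v and i/j interchange, punctuation, accents). A minimal sketch of a more forgiving substring check — the function names and normalization rules here are my own illustration, not anything from the thread:

```python
import re
import unicodedata

def normalize_latin(text: str) -> str:
    """Normalize Latin text for fuzzy substring matching: strip
    diacritics, lowercase, fold v->u and j->i (common medieval
    orthographic variants), and collapse punctuation/whitespace."""
    text = unicodedata.normalize("NFD", text)
    text = "".join(c for c in text if unicodedata.category(c) != "Mn")
    text = text.lower().replace("v", "u").replace("j", "i")
    text = re.sub(r"[^a-z]+", " ", text)
    return text.strip()

def quote_appears(source_text: str, quote: str) -> bool:
    """True if the normalized quote occurs as a substring of the
    normalized source text."""
    return normalize_latin(quote) in normalize_latin(source_text)
```

For example, `quote_appears(page_text, "quinque panes")` would still match a source that prints "Quinque panes," with different casing and punctuation. It is only a sanity check, of course: it tells you a quote is absent from one particular copy, not that the LLM invented it.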

replies(1): >>42538305 #
3. FearNotDaniel ◴[] No.42538286[source]
I am by no means a professional in this area, but as a keen amateur I would worry about my inability to discern facts from hallucinations in such a scenario. While I can imagine such output provides a useful “springboard” set of references for someone already skilled in the right area, without being able to look up the original texts myself and make sense of the Latin/Greek I would not feel confident that such texts even really exist, let alone that they contain the actual words the LLM claims or that the translations are any good. And that’s before you get into questions of the “status” of any given work: was it considered accurate or apocryphal at the time of writing? For which audience was it intended, using what kind of literary devices? What, if any, is the modern scholarly consensus on the value, truth or legitimacy of the text? Etc.
replies(1): >>42538318 #
4. FearNotDaniel ◴[] No.42538305[source]
Exactly - it’s the same problem when using (current) LLMs for major programming tasks: generally useless if you don’t already have enough knowledge of the language/platform to spot and correct the mistakes, plus enough awareness of software design and architecture to recognise what is going to be secure, performant and maintainable in the long run.
5. wizzwizz4 ◴[] No.42538318[source]
> without being able to look up the original texts myself

Rule of thumb: if you can't look up the original texts, you can assume they weren't actually in the training data. The training data is, however, likely to include a lot of people quoting those texts, meaning that the model predicts "SOURCE says OPEN QUOTATION MARK" and then tries to autocomplete it. If you can verify it, you might not need to; but if you can't verify it, it's certainly wrong.

replies(1): >>42563955 #
6. jpc0 ◴[] No.42538320[source]
On topics where humans spend their entire lives studying, I don't think you would be able to convince me an LLM is accurate unless you yourself are such an expert and your expertise is corroborated by other experts.
7. nickpsecurity ◴[] No.42563955{3}[source]
"Rule of thumb: if you can't look up the original texts, you can assume they weren't actually in the training data."

That's not reliable. I've found such texts on the Internet in various forms (e.g. studybible.info). Google Books also has scanned copies of many ancient writings. There are probably obscure sites people would miss. If searching for them, the search algorithms might bury them in favor of newer, click-bait content.

Telling for sure what wasn't in the training data should be considered impossible right now. If it matters, we need to use models with open, legal-to-share training data. If that's impossible, one might at least use a model whose training data is accessible to them (e.g. free + licensed).