Garbage is garbage and failure to reason is failure to reason no matter the language. If your LLM can't translate your problem to a Prolog program that solves your problem- Prolog can't solve your problem.
Here's just one more observation: the problems where translating reasoning to Prolog will work best are the ones with lots of example Prolog to be found on the web, e.g. wolf-goat-cabbage problems and the like. With problems like that it is much easier for an LLM to generate a correct translation of the problem to Prolog, and so a correct solution, simply because there are lots of examples. But if you choose a problem that's rarely attacked with Prolog code, say some mathematical problem that comes up in nuclear physics, then an LLM will be much more likely to generate garbage Prolog, while e.g. Fortran would be a better target language. From what I can see, the papers linked in the article above concentrate on the Prolog-friendly kind of problem, like logical puzzles and the like. That smells like cherry-picking to me, or just good, old confirmation bias.
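To make the point concrete: a puzzle like wolf-goat-cabbage has a near-canonical Prolog formulation that appears all over the web, so an LLM has seen many variants of something like the following. This is an illustrative sketch in my own words (predicate names and state representation are my choices, not any particular paper's code):

```prolog
% Classic wolf-goat-cabbage puzzle, the kind of problem with
% abundant Prolog examples online.
% State: state(Farmer, Wolf, Goat, Cabbage), each on bank w or e.

opposite(w, e).
opposite(e, w).

% A state is safe if the goat is never left with the wolf or the
% cabbage unless the farmer is on the same bank.
safe(state(F, W, G, C)) :-
    ( G \= W ; G = F ),
    ( G \= C ; G = F ).

% Moves: the farmer crosses alone or takes one passenger along.
move(state(F,W,G,C), state(F1,W,G,C))  :- opposite(F, F1).  % alone
move(state(F,F,G,C), state(F1,F1,G,C)) :- opposite(F, F1).  % with wolf
move(state(F,W,F,C), state(F1,W,F1,C)) :- opposite(F, F1).  % with goat
move(state(F,W,G,F), state(F1,W,G,F1)) :- opposite(F, F1).  % with cabbage

% Depth-first search that avoids revisiting states.
solve(Goal, Goal, _, []).
solve(S, Goal, Seen, [S1|Path]) :-
    move(S, S1),
    safe(S1),
    \+ member(S1, Seen),
    solve(S1, Goal, [S1|Seen], Path).

% ?- solve(state(w,w,w,w), state(e,e,e,e), [state(w,w,w,w)], Path).
```

An LLM can pattern-match its way to something like this because thousands of close variants are in its training data. There is no comparable corpus for, say, a Prolog encoding of a nuclear cross-section calculation, which is exactly the asymmetry I'm pointing at.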
Again, Prolog is not magick. The article above and the papers it links to seem to take this attitude of "just add Prolog" and that will make LLMs suddenly magickally reason with fairy dust on top. Ain't gonna happen.
No, the question is why do you think I think that? I never said anything like that.
Competent CS students fail Prolog courses all the time. A lot of Prolog on the internet will either be wrong, or it will be so riddled with unnecessary or unclear backtracking that an LLM won't be able to make any more sense of it than it does of ordinary text.
It frightens me that HN is so popular with people who will strain credulity in this regard. It's like a whole decade of people engaging in cosmic-ordering wishes about crypto has now led to those same people wishing for new things as if the wishes themselves are evidence of future outcomes.
Edit: you can see some of the results here btw:
https://arxiv.org/abs/2405.04776
Their argument is that CoT can only improve performance of LLMs in reasoning tasks when the prompter already knows the answer and can somehow embed it in their prompt. The paper I link above supports this intuition with empirical results, summarised in the abstract as follows:
While our problems are very simple, we only find meaningful performance improvements from chain of thought prompts when those prompts are exceedingly specific to their problem class, and that those improvements quickly deteriorate as the size n of the query-specified stack grows past the size of stacks shown in the examples. We also create scalable variants of three domains commonly studied in previous CoT papers and demonstrate the existence of similar failure modes. Our results hint that, contrary to previous claims in the literature, CoT's performance improvements do not stem from the model learning general algorithmic procedures via demonstrations but depend on carefully engineering highly problem specific prompts.
And if that's the case and I need to know the solution to a reasoning task before I can prompt an LLM to solve it- then why do I need to prompt an LLM to solve it? Or, if I'm just asking an LLM to generate the Prolog code I can write myself then what's the point of that? As I argue in another comment, an LLM will only do well in generating correct code if it has seen some sufficient number of examples of the code I'm asking it to generate anyway. So I don't think that CoT, used to generate Prolog, is really adding anything to my capability to solve problems by coding in Prolog.
I have no idea how o1 works internally and I prefer not to speculate but it doesn't seem to be some silver bullet that will make LLMs capable of reasoning.
So in a lot of ways seeing this restores some faith in humanity. Great work, and thanks for giving me a chance to look at it!
With all the hot news in Prolog these days I'd think you should submit this! But also I hate submitting any of my own work and prefer to live in the comment section so I'd understand if you feel the same way.
[1] https://books.google.com/books/about/Prolog_Programming_for_...
There's a recent survey of the latest advances in ILP (Inductive Logic Programming) here:
https://arxiv.org/pdf/2102.10556
>> With all the hot news in Prolog these days I'd think you should submit this!
You mean to HN? I guess I could. I tend to think HN will not find it particularly interesting. Go ahead and submit it yourself though if you feel like it :)
_________________
[1] One of those times Bratko told me that I'm good with Prolog. I replied that I'm going to be saying he said that to everyone who will listen for the rest of my life XD