Garbage is garbage and failure to reason is failure to reason no matter the language. If your LLM can't translate your problem to a Prolog program that solves your problem- Prolog can't solve your problem.
Here's just one more observation: the problems where translating reasoning to Prolog will work best are the ones with lots of example Prolog to be found on the web, e.g. wolf-goat-cabbage problems and the like. With problems like that it is much easier for an LLM to generate a correct translation of the problem to Prolog, and so a correct solution, simply because there are lots of examples. But if you choose a problem that's rarely attacked with Prolog code, say some mathematical problem that comes up in nuclear physics, then an LLM will be much more likely to generate garbage Prolog, while e.g. Fortran would be a better target language. From what I can see, the papers linked in the article above concentrate on the Prolog-friendly kind of problem, like logical puzzles and the like. That smells like cherry-picking to me, or just good, old confirmation bias.
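To make the point concrete: a puzzle like wolf-goat-cabbage has a near-canonical Prolog formulation that appears all over the web, so an LLM has seen many variants of something like the following. This is an illustrative sketch in my own words (predicate names and state representation are my choices, not any particular paper's code):

```prolog
% Classic wolf-goat-cabbage puzzle, the kind of problem with
% abundant Prolog examples online.
% State: state(Farmer, Wolf, Goat, Cabbage), each on bank w or e.

opposite(w, e).
opposite(e, w).

% A state is safe if the goat is never left with the wolf or the
% cabbage unless the farmer is on the same bank.
safe(state(F, W, G, C)) :-
    ( G \= W ; G = F ),
    ( G \= C ; G = F ).

% Moves: the farmer crosses alone or takes one passenger along.
move(state(F,W,G,C), state(F1,W,G,C))  :- opposite(F, F1).  % alone
move(state(F,F,G,C), state(F1,F1,G,C)) :- opposite(F, F1).  % with wolf
move(state(F,W,F,C), state(F1,W,F1,C)) :- opposite(F, F1).  % with goat
move(state(F,W,G,F), state(F1,W,G,F1)) :- opposite(F, F1).  % with cabbage

% Depth-first search that avoids revisiting states.
solve(Goal, Goal, _, []).
solve(S, Goal, Seen, [S1|Path]) :-
    move(S, S1),
    safe(S1),
    \+ member(S1, Seen),
    solve(S1, Goal, [S1|Seen], Path).

% ?- solve(state(w,w,w,w), state(e,e,e,e), [state(w,w,w,w)], Path).
```

An LLM can pattern-match its way to something like this because thousands of close variants are in its training data. There is no comparable corpus for, say, a Prolog encoding of a nuclear cross-section calculation, which is exactly the asymmetry I'm pointing at.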
Again, Prolog is not magick. The article above and the papers it links to seem to take this attitude of "just add Prolog" and that will make LLMs suddenly magickally reason with fairy dust on top. Ain't gonna happen.
No, the question is why do you think I think that? I never said anything like that.
Competent CS students fail Prolog courses all the time. A lot of Prolog on the internet will either be wrong, or it will be so riddled with unnecessary or unclear backtracking that an LLM won't be able to make any more sense of it than it does of ordinary text.
It frightens me that HN is so popular with people who will strain credulity in this regard. It's like a whole decade of people engaging in cosmic-ordering wishes about crypto has now led to those same people wishing for new things as if the wishes themselves are evidence of future outcomes.
Edit: you can see some of the results here btw:
https://arxiv.org/abs/2405.04776
Their argument is that CoT can only improve performance of LLMs in reasoning tasks when the prompter already knows the answer and can somehow embed it in their prompt. The paper I link above supports this intuition with empirical results, summarised in the abstract as follows:
While our problems are very simple, we only find meaningful performance improvements from chain of thought prompts when those prompts are exceedingly specific to their problem class, and that those improvements quickly deteriorate as the size n of the query-specified stack grows past the size of stacks shown in the examples. We also create scalable variants of three domains commonly studied in previous CoT papers and demonstrate the existence of similar failure modes. Our results hint that, contrary to previous claims in the literature, CoT's performance improvements do not stem from the model learning general algorithmic procedures via demonstrations but depend on carefully engineering highly problem specific prompts.
And if that's the case and I need to know the solution to a reasoning task before I can prompt an LLM to solve it- then why do I need to prompt an LLM to solve it? Or, if I'm just asking an LLM to generate the Prolog code I can write myself then what's the point of that? As I argue in another comment, an LLM will only do well in generating correct code if it has seen some sufficient number of examples of the code I'm asking it to generate anyway. So I don't think that CoT, used to generate Prolog, is really adding anything to my capability to solve problems by coding in Prolog.
I have no idea how o1 works internally and I prefer not to speculate but it doesn't seem to be some silver bullet that will make LLMs capable of reasoning.
So in a lot of ways seeing this restores some faith in humanity. Great work, and thanks for giving me a chance to look at it!
With all the hot news in Prolog these days I'd think you should submit this! But also I hate submitting any of my own work and prefer to live in the comment section so I'd understand if you feel the same way.
[1] https://books.google.com/books/about/Prolog_Programming_for_...
There's a recent survey of the latest advances in ILP (Inductive Logic Programming) here:
https://arxiv.org/pdf/2102.10556
>> With all the hot news in Prolog these days I'd think you should submit this!
You mean to HN? I guess I could. I tend to think HN will not find it particularly interesting. Go ahead and submit it yourself though if you feel like it :)
_________________
[1] One of those times Bratko told me that I'm good with Prolog. I replied that I'm going to be saying he said that to everyone who will listen for the rest of my life XD