I can imagine many uses for flows where LLM’s can implement the outer layers above.
For example, how might an arbitrary statement like "Scholars believe that professional competence of a teacher is a prerequisite for improving the quality of the educational process in preschools" be put in a lean-like language? What about "The theoretical basis of the October Revolution lay in a development of Marxism, but this development occurred through three successive rounds of theoretical debate"?
Or have I totally misunderstood what people mean when they say that developments in automatic theorem proving will solve LLM's hallucination problem?
First, you need to encode "memes" and relations between them at scale. This is not a language problem, it is data handling problem.
Second, at some point of time you will need to query memes and relations between them, again, at scale. While expression of queries is a language problem, an implementation will heavily use what SQL engines does use.
And you need to look at Cyc: https://en.wikipedia.org/wiki/Cyc
It does what you are toing do for 40 (forty) years now.
It can also be achieved informally and in a fragments way in barely-mathematical disciplines, like biology, linguistics, and even history. We have chains of logical conclusions that do not follow strictly, but with various probabilistic limitations, and under modal logic of sorts. Several contradictory chains follow under the different (modal) assumptions / hypotheses, and often both should be considered. This is where probabilistic models like LLMs could work together with formal logic tools and huge databases of facts and observations, being the proverbial astute reader.
In some more relaxed semi-disciplines, like sociology, psychology, or philosophy, we have a hodgepodge of contradictory, poorly defined notions and hand-wavy reasoning (I don't speak about Wittgenstein here, but more about Freud, Foucault, Derrida, etc.) Here, I think, the current crop of LLMs is applicable most directly, with few augmentations. Still a much, much wider window of context might be required to make it actually productive, by the standards of the field.
Generally, I think many people who haven't studied mathematics don't realize how huge the gulf is between "being logical/reasonable" and applying mathematical logic as in a complicated proof. Neither is really of any help for the other. I think this is actually the orthodox position among mathematicians; it's mostly people who might have taken an undergraduate math class or two who might think of one as a gateway to the other. (However there are certainly some basic commonalities between the two. For example, the converse error is important to understand in both.)
> This takes for granted a formal setting, which is what I'm questioning in any of these 'real world' contexts.
A formal model of semantics would likely be a low-level physical representation of possible states augmented with sound definitions of higher-level concepts and objects. I don't think humans are capable of developing a formal semantics that would work for your sentences (it's taken us hundreds of years to approach formalization of particle physics), but I think that an automated prover with access to physical experiments and an LLM could probably start building a more comprehensive semantics.
But logic is very relevant to "being logical/reasonable", and seeing how mathematicians apply logic in their proofs is very relevant, and a starting point for more complex applications. I see mathematics as the simplest kind of application of logic you can have if you use only your brain for thinking, and not also a computer.
"Being logical/reasonable" also contains a big chunk of intuition/experience, and that is where machine learning will make a big difference.
Are there any checks for the consistency of all facts?
If Cyc has 100000 axioms/facts, then that's a problem.
>An inference engine is a computer program that tries to derive answers from a knowledge base. The Cyc inference engine performs general logical deduction.[8] It also performs inductive reasoning, statistical machine learning and symbolic machine learning, and abductive reasoning. The Cyc inference engine separates the epistemological problem from the heuristicproblem. For the latter, Cyc used a community-of-agents architecture in which specialized modules, each with its own algorithm, became prioritized if they could make progress on the sub-problem.
Not really.
They can probably isolate 100k facts from some domain for you if you pay for it, but their idea is to run inference on millions of "common sense" and everyday facts and have been for decades.
Now what we've seen with e.g. Chess and Go, is that when you can give a tensor model "real experience" at the speed of however many GPUs you have, the model is quickly capable of superhuman performance. Automatic theorem proving means you can give the model "real experience" without armies of humans writing down proofs, you can generate and validate proofs as fast as a computer will let you and the LLM has "real experience" with every new proof it generates along with an objective cost function, was the proof correct?
Now, this might not let us give the LLMs real experience with being a teacher or dealing with Marxist revolutionaries, but it would be a sea change in the ability of LLMs to do math, and it probably would let us give LLMs real experience in writing software.
> how might an arbitrary statement like "Scholars believe that professional competence of a teacher is a prerequisite for improving the quality of the educational process in preschools" be put in a lean-like language?
Totally out of scope in the any near future for me at least. But that doesn't prevent it from being super useful for a narrower scope. For example:
- How might we take a statement like "(x + 1) (x - 5) = 0" and encode it formally?
- Or "(X X^T)^-1 X Y = B"?
- Or "6 Fe_2 + 3 H_2 O -> ?"?
We can't really do this for a huge swath of pretty narrow applied problems. In the first, what kind of thing is X? Is it an element of an algebraically closed field? In the second, are those matrices of real numbers? In the third, is that 6 times F times e_2 or 6 2-element iron molecules?
We can't formally interpret those as written, but you and I can easily tell what's meant. Meanwhile, current ways of formally writing those things up is a massive pain in the ass. Anything with a decent amount of common sense can tell you what is probably meant formally. We know that we can't have an algorithm that's right 100% of the time for a lot of relevant things, but 99.99% is pretty useful. If clippy says 'these look like matrices, right?' and is almost always right, then it's almost always saving you lots of time and letting lots more people benefit from formal methods with a much lower barrier to entry.
From there, it's easy to imagine coverage and accuracy of formalizable statements going up and up and up until so much is automated that it looks kinda like 'chatting about real-life statements' again. I doubt that's the path, but from a 'make existing useful formal methods super accessible' lens it doesn't have to be.
You probably say "now a human can verify it" but what if the humans are wrong? What is the source of truth?
The source of truth here is the code we wrote for the formal verification system.