
169 points mattmarcus | 2 comments
lsy ◴[] No.43614042[source]
The article puts scare quotes around "understand" etc. to try to head off critiques around the lack of precision or scientific language, but I think this is a really good example of where casual use of these terms can get pretty misleading.

Because code LLMs have been trained on the syntactic form of the program and not its execution, it's not correct — even if the correlation between variable annotations and requested completions was perfect (which it's not) — to say that the model "understands nullability", because nullability means that under execution the variable in question can become null, which is not a state that it's possible for a model trained only on a million programs' syntax to "understand". You could get the same result if e.g. "Optional" means that the variable becomes poisonous and checking "> 0" is eating it, and "!= None" is an antidote. Human programmers can understand nullability because they've hopefully run programs and understand the semantics of making something null.
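
To make that concrete, here is roughly the shape of the thing being discussed: a minimal Python sketch, with made-up names, of what "nullability" means under execution rather than as syntax.

    from typing import Optional

    def find_user_age(name: str) -> Optional[int]:
        # The lookup can fail at runtime and return None; that execution
        # behaviour is what nullability actually refers to.
        ages = {"alice": 34, "bob": 29}
        return ages.get(name)

    age = find_user_age("carol")
    if age is not None and age > 0:   # the null-check the model learns to emit
        print(f"age is {age}")
    else:
        print("no age on record")     # the execution path a syntax-only model never observes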

The paper could use precise, scientific language (e.g. "the presence of nullable annotation tokens correlates to activation of vectors corresponding to, and emission of, null-check tokens with high precision and accuracy") which would help us understand what we can rely on the LLM to do and what we can't. But it seems like there is some subconscious incentive to muddy how people see these models in the hopes that we start ascribing things to them that they aren't capable of.
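
For what it's worth, a claim phrased that way is also easy to operationalize. A toy sketch of the kind of measurement meant here (the data is invented, not from the paper): count how often a null-check appears in completions for prompts that did and did not carry an Optional annotation, then report precision and recall.

    # Toy evaluation sketch with made-up completions.
    completions = [
        # (prompt_had_optional_annotation, completion emitted by the model)
        (True,  "if user is not None:\n    user.save()"),
        (True,  "user.save()"),
        (False, "count += 1"),
        (False, "if value is not None:\n    total += value"),
    ]

    def has_null_check(code: str) -> bool:
        return "is not None" in code or "!= None" in code

    tp = sum(1 for annotated, code in completions if annotated and has_null_check(code))
    fp = sum(1 for annotated, code in completions if not annotated and has_null_check(code))
    fn = sum(1 for annotated, code in completions if annotated and not has_null_check(code))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    print(f"precision={precision:.2f} recall={recall:.2f}")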

replies(9): >>43614302 #>>43614352 #>>43614384 #>>43614470 #>>43614508 #>>43614723 #>>43615651 #>>43616059 #>>43616871 #
waldrews ◴[] No.43614508[source]
I was going to say "so you believe the LLMs don't have the capacity to understand" but then I realized that the precise language would be something like "the presence of photons in this human's retinas in patterns encoding statements about LLMs having understanding correlates to the activation of neuron signaling chains corresponding to, and emission of, muscle activations engaging keyboard switches, which produce patterns of 'no they don't' with high frequency."

The critiques of mental state applied to LLMs are increasingly applicable to us biologicals, and that's the philosophical abyss we're staring down.

replies(3): >>43615279 #>>43615833 #>>43615903 #
mjburgess ◴[] No.43615833[source]
No, it's not. He gave you modal conditions on "understanding". He said: predicting the syntax of valid programs and their operational semantics, i.e., the behaviour of the computer as it runs.

I would go much further than this; but this is a de minimis criterion that the LLM already fails.

What zealots eventually discover is that they can hold their "fanatical proposition" fixed in the face of all opposition to the contrary, by tearing down the whole edifice of science, knowledge, and reality itself.

If you wish to assert, against any reasonable thought, that the sky is a pink dome you can do so -- first by insisting that our eyes are broken, and then, eventually, that we live in some paranoid "philosophical abyss" carefully constructed to permit your paranoia.

This absurdity is exhausting, and I'd wish one day to find fanatics who'd realise it quickly and abate it -- but alas, I never have.

If you find yourself hollowing-out the meaning of words to the point of making no distinctions, denying reality to reality itself, and otherwise arriving at a "philosophical abyss", be aware that it is your cherished propositions which are the madness and nothing else.

Here: no, the LLM does not understand. Yes, we do. It is your job to begin from reasonable premises and abduce reasonable theories. If you do not, you will not.

replies(1): >>43617492 #
og_kalu ◴[] No.43617492[source]
>No, it's not. He gave you modal conditions on "understanding". He said: predicting the syntax of valid programs and their operational semantics, i.e., the behaviour of the computer as it runs.

LLMs are perfectly capable of predicting the behavior of programs. You don't have to take my word for it, you can test it yourself. So he gave modal conditions they already satisfy. Can I conclude they understand now?
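
You can test it with something as simple as the toy program below: ask the model what it prints, then run it and compare. (Just an illustration; nothing special about the example.)

    # Ask the model: "What does this print?" Then execute it and check.
    def collatz_steps(n: int) -> int:
        steps = 0
        while n != 1:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
        return steps

    print([collatz_steps(n) for n in range(1, 8)])
    # Expected output: [0, 1, 7, 2, 5, 8, 16]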

>If you find yourself hollowing-out the meaning of words to the point of making no distinctions, denying reality to reality itself, and otherwise arriving at a "philosophical abyss", be aware that it is your cherished propositions which are the madness and nothing else.

The only people denying reality are those who insist that it is not 'real' understanding and yet cannot distinguish this 'real' from 'fake' property in any verifiable manner, the very definition of an invented property.

Your argument boils down to 'I think it's absurd, so it cannot be so'. Is that the best you can do? Do you genuinely think that's a remotely convincing argument?

replies(1): >>43618329 #
1. kannanvijayan ◴[] No.43618329[source]
LLMs are reasonably competent at surfacing the behaviour of simple programs when that behaviour is a relatively straightforward structural extension of the training data they've managed to correlate together.

It's very clear that LLMs lack understanding when you use them for anything remotely sophisticated. I say this as someone who leverages them extensively on a daily basis - mostly for development. They're very powerful tools and I'm grateful for their existence and the force multiplier they represent.

Try to get one to act as a storyteller and the limitations in understanding glare out. You try to goad some creativity and character out of it and it spits out generally insipid recombinations of obvious tropes.

In programming, I use AI strictly as an auto-complete extension. Even in that limited context, the latest models make trivial mistakes in certain circumstances that reveal their lack of understanding. The ones that stand out are the circumstances where the local change to make is very obvious and simple, but the context of the code is something that the ML hasn't seen before.

In those cases, I see them slapping together code that's semantically wrong in the local context but that pattern-matches well against the outer context.
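
A made-up illustration of that failure shape (not a completion I saved, just the pattern):

    # Local semantics: dict.get returns None when the key is missing.
    def lookup_port(config: dict) -> int:
        port = config.get("port")  # may be None
        # A completion that pattern-matches the surrounding "parse then cast"
        # style might emit:
        #     return int(port)     # TypeError whenever "port" is absent
        # while the local semantics actually require something like:
        return int(port) if port is not None else 8080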

It's very clear that the ML doesn't even have a SIMPLE understanding of the language semantics, despite having been trained on presumably multiple billions of lines of code from all sorts of different programming languages.

If you train a human on half a dozen programming languages, you can readily expect that by the end of that training they will, all by themselves, simply through mastering the individual languages, have constructed their own internal generalized model of programming languages as a whole, and will have become aware of some semantic generalities. And if I had asked that human to make that same small completion for me, they would have gotten it right. They would have understood that the language semantics are a stronger implicit context than the surrounding syntax.

MLs just don't do that. They're very impressive tools, and they are a strong step forward toward some machine model of understanding (sophisticated pattern matching is likely a fundamental prerequisite for understanding), but ascribing understanding to them at this point is jumping the gun. They're not there yet.

replies(1): >>43620994 #
2. og_kalu ◴[] No.43620994[source]
>LLMs are reasonably competent at surfacing the behaviour of simple programs when that behaviour is a relatively straightforward structural extension of the training data they've managed to correlate together. It's very clear that LLMs lack understanding when you use them for anything remotely sophisticated.

No, because even those 'sophisticated' examples still get very non-trivial attempts. If I applied the same standard of understanding we use for humans, I would rarely class LLMs as having no understanding of a topic. Understanding does not mean perfection or the absence of mistakes, except in fiction and our collective imaginations.

>Try to get one to act as a storyteller and the limitations in understanding glare out. You try to goad some creativity and character out of it and it spits out generally insipid recombinations of obvious tropes.

I do, and creativity is not really the issue with some of the new SOTA models. I understand what you are saying - default prose often isn't great, and every model besides 2.5-pro essentially collapses when handling details/story instructions for longform writing - but it's not really creativity that's the problem.

>The ones that stand out are the circumstances where the local change to make is very obvious and simple

Obvious and simple to you, maybe, but with auto-complete the context the model actually has is dubious at best. It's not like Copilot pastes in all the code from 10 files just because you have 10 files open. What actually gets included in the auto-complete context is largely beyond your control, with no way to see what makes the cut and what doesn't.

I don't use auto-complete very often. For me, it doesn't compare to pasting in relevant code myself and asking for what I want. We have very different experiences.