170 points mattmarcus | 21 comments
lsy ◴[] No.43614042[source]

The article puts scare quotes around "understand" etc. to try to head off critiques around the lack of precision or scientific language, but I think this is a really good example of where casual use of these terms can get pretty misleading.

Because code LLMs have been trained on the syntactic form of the program and not its execution, it's not correct — even if the correlation between variable annotations and requested completions was perfect (which it's not) — to say that the model "understands nullability", because nullability means that under execution the variable in question can become null, which is not a state that it's possible for a model trained only on a million programs' syntax to "understand". You could get the same result if e.g. "Optional" means that the variable becomes poisonous and checking "> 0" is eating it, and "!= None" is an antidote. Human programmers can understand nullability because they've hopefully run programs and understand the semantics of making something null.
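
For concreteness, here is a minimal Python sketch of the kind of code at issue (my own illustration, not an example from the paper): the Optional annotation marks a value that may be None at runtime, and the None check is the sort of completion being requested.

    from typing import Optional

    def first_positive(xs: list[int]) -> Optional[int]:
        # Optional[int] is the nullable annotation: at runtime the
        # returned value may be None.
        for x in xs:
            if x > 0:
                return x
        return None

    value = first_positive([-2, 0, 3])
    if value is not None:   # the null check ("!= None" above, written idiomatically)
        print(value > 0)    # only well-defined because None has been ruled out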

The paper could use precise, scientific language (e.g. "the presence of nullable annotation tokens correlates to activation of vectors corresponding to, and emission of, null-check tokens with high precision and accuracy") which would help us understand what we can rely on the LLM to do and what we can't. But it seems like there is some subconscious incentive to muddy how people see these models in the hopes that we start ascribing things to them that they aren't capable of.

replies(9): >>43614302 #>>43614352 #>>43614384 #>>43614470 #>>43614508 #>>43614723 #>>43615651 #>>43616059 #>>43616871 #
1. waldrews ◴[] No.43614508[source]

I was going to say "so you believe LLMs don't have the capacity to understand" but then I realized that the precise language would be something like "the presence of photons in this human's retinas in patterns encoding statements about LLMs having understanding correlates to the activation of neuron signaling chains corresponding to, and emission of, muscle activations engaging keyboard switches, which produce patterns of 'no they don't' with high frequency."

The critiques of mental state applied to LLMs are increasingly applicable to us biologicals, and that's the philosophical abyss we're staring down.

replies(3): >>43615279 #>>43615833 #>>43615903 #
2. xigency ◴[] No.43615279[source]

This only applies to people who understand how computers and computer programs work, because someone who doesn't externalize their thinking process would never ascribe human elements of consciousness to inanimate materials.

Certainly many ancient people worshiped celestial objects or crafted idols by their own hands and ascribed to them powers greater than themselves. That doesn't really help in the long run compared to taking personal responsibility for one's own actions and motives, the best interests of their tribe or community, and taking initiative to understand the underlying cause of mysterious phenomena.

replies(1): >>43615772 #
3. ◴[] No.43615772[source]
4. mjburgess ◴[] No.43615833[source]

No, it's not. He gave you modal conditions on "understanding": predicting the syntax of valid programs and their operational semantics, i.e., the behaviour of the computer as it runs.

I would go much further than this, but this is a de minimis criterion that the LLM already fails.

What zealots eventually discover is that they can hold their "fanatical proposition" fixed in the face of all opposition to the contrary, by tearing down the whole edifice of science, knowledge, and reality itself.

If you wish to assert, against any reasonable thought, that the sky is a pink dome you can do so -- first that our eyes are broken, and then, eventually that we live in some paranoid "philosophical abyss" carefully constructed to permit your paranoia.

This absurdity is exhausting, and I wish one day to find fanatics who'd realise it quickly and abate it -- but alas, I never have.

If you find yourself hollowing-out the meaning of words to the point of making no distinctions, denying reality to reality itself, and otherwise arriving at a "philosophical abyss" be aware that it is your cherished propositions which are the madness and nothing else.

Here: no, the LLM does not understand. Yes, we do. It is your job to begin from reasonable premises and abduce reasonable theories. If you do not, you will not.

replies(1): >>43617492 #
5. shafyy ◴[] No.43615903[source]

Countering the argument that LLMs are just glorified probability machines and do not understand or think with "how do you know humans are not the same" has been the biggest achievement of AI hypemen (and yes, it's mostly men).

Of course, now you can say "how do you know that our brains are not just efficient computers that run LLMs", but I feel like the onus of proof lies on the makers of this claim, not on the other side.

It is very likely that human intelligence is not just autocomplete on crack, given all we know about neuroscience so far.

replies(2): >>43616482 #>>43618260 #
6. mlinhares ◴[] No.43616482[source]

BuT iT CoUlD Be, cAn YoU PrOvE ThAT IT is NOt?

I'm having a great experience using Cursor, but I don't feel like trying to overhype it; it just makes me tired to see all this hype. It's a great tool, makes me more productive, nothing beyond that.

replies(1): >>43619136 #
7. og_kalu ◴[] No.43617492[source]

>No, it's not. He gave you modal conditions on "understanding": predicting the syntax of valid programs and their operational semantics, i.e., the behaviour of the computer as it runs.

LLMs are perfectly capable of predicting the behavior of programs. You don't have to take my word for it, you can test it yourself. So he gave modal conditions they already satisfy. Can I conclude they understand now?
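
For instance (an illustrative probe I'm making up here, not one from the paper or this thread), correctly predicting the output of the following snippet requires knowing Python's runtime behaviour, not just its syntax:

    # Ask the model to predict the printed output before running it.
    def append_item(item, bucket=[]):   # mutable default argument
        bucket.append(item)
        return bucket

    print(append_item(1))   # prints [1]
    print(append_item(2))   # prints [1, 2]: the same default list is reused across calls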

>If you find yourself hollowing-out the meaning of words to the point of making no distinctions, denying reality to reality itself, and otherwise arriving at a "philosophical abyss" be aware that it is your cherished propositions which are the madness and nothing else.

The only people denying reality are those who insist that it is not 'real' understanding and yet cannot distinguish this 'real' from 'fake' property in any verifiable manner, the very definition of an invented property.

Your argument boils down to 'I think it's so absurd so it cannot be so'. That's the best you can do? Do you genuinely think that's a remotely convincing argument?

replies(1): >>43618329 #
8. BobbyTables2 ◴[] No.43618260[source]

I’ve come to realize AI works as well as it does because it was trained extensively on the same kinds of things people normally ask. So, it already has the benefit of vast amounts of human responses.

Of course, ask it a PhD level question and it will confidently hallucinate more than Beavis & Butthead.

It really is a damn glorified autocomplete, unfortunately very useful as a search engine replacement.

replies(1): >>43619114 #
9. kannanvijayan ◴[] No.43618329{3}[source]

LLMs are reasonably competent at surfacing the behaviour of simple programs when the behaviour of those programs is a relatively straightforward structural extension of enough of their training set that they've managed to correlate together.

It's very clear that LLMs lack understanding when you use them for anything remotely sophisticated. I say this as someone who leverages them extensively on a daily basis - mostly for development. They're very powerful tools and I'm grateful for their existence and the force multiplier they represent.

Try to get one to act as a storyteller and the limitations in understanding glare out. You try to goad some creativity and character out of it and it spits out generally insipid recombinations of obvious tropes.

In programming, I use AI strictly as an auto-complete extension. Even in that limited context, the latest models make trivial mistakes in certain circumstances that reveal their lack of understanding. The ones that stand out are the circumstances where the local change to make is very obvious and simple, but the context of the code is something that the ML hasn't seen before.

In those cases, I see them slapping together code that's semantically wrong in the local context, but pattern matches well against the outer context.

It's very clear that the ML doesn't even have a SIMPLE understanding of the language semantics, despite having been trained on presumably multiple billions of lines of code from all sorts of different programming languages.

If you train a human against half a dozen programming languages, you can readily expect by the end of that training that they will, all by themselves, simply through mastering the individual languages, have constructed their own internal generalized models of programming languages as a whole, and would become aware of some semantic generalities. And if I had asked that human to make that same small completion for me, they would have gotten it right. They would have understood that the language semantics are a stronger implicit context compared to the surrounding syntax.

MLs just don't do that. They're very impressive tools, and they are a strong step forward toward some machine model of understanding (sophisticated pattern matching is likely a fundamental prerequisite for understanding), but ascribing understanding to them at this point is jumping the gun. They're not there yet.
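
To make that failure mode concrete, here is a hypothetical Python illustration (invented for this comment, not an actual completion I logged): the file locally defines a dataclass, so language semantics dictate attribute access, but a completion that pattern-matches a dict-heavy surrounding codebase produces code that fails at runtime.

    from dataclasses import dataclass

    @dataclass
    class User:
        name: str
        email: str

    def greeting(user: User) -> str:
        # Locally correct: a dataclass field is read with attribute access.
        return f"Hello, {user.name}"
        # A completion that pattern-matches the outer, dict-heavy context,
        # e.g. returning f"Hello, {user['name']}", raises TypeError at
        # runtime because dataclass instances are not subscriptable.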

replies(1): >>43620994 #
10. uh_uh ◴[] No.43619114{3}[source]

The LLM is a glorified autocomplete in as much as you are a glorified replicator. Yes, it was trained on autocomplete but that doesn't say much about what capabilities might emerge.

replies(1): >>43619130 #
11. shafyy ◴[] No.43619130{4}[source]

> Yes, it was trained on autocomplete but that doesn't say much about what capabilities might emerge.

No, but we know how it works and it is just a stochastic parrot. There is no magic in there.

What is more surprising to me is that humans are so predictable that a glorified autocomplete works this well. Then again, it's not that surprising...

replies(1): >>43619959 #
12. shafyy ◴[] No.43619136{3}[source]

That's great for you. I'm not diminishing your experience or taking it away. I think we agree on the hype.

13. uh_uh ◴[] No.43619959{5}[source]

Sorry but this is nonsense. Do you have a theory about when certain LLM capabilities emerge? AFAIK we don't have a good theory about when and why they do emerge.

But even if we knew how something works (which in the present case we don't), that shouldn't diminish our opinion of it. Will you have a lesser opinion of human intelligence once we figure out how it works?

replies(3): >>43620237 #>>43620572 #>>43624170 #
14. sfn42 ◴[] No.43620237{6}[source]

I'm sure at any given point there's hundreds of this exact discussion occurring in various threads on HN.

LLMs are cool, a lot of people find them useful. Hype bros are full of crap and there's no point arguing with them because it's always a pointless discussion. With crypto and NFTs it's future predictions, which are just inherently impossible to reason about; with AI it's partially that, and partially the whole "do they have human properties" thing, which is equally impossible to reason about.

It gets discussed to death every single day.

replies(1): >>43620534 #
15. shafyy ◴[] No.43620534{7}[source]

100%

16. shafyy ◴[] No.43620572{6}[source]

> Do you have a theory about when certain LLM capabilities emerge?

We do know how LLMs work, correct? We also know what they are capable of and what they are not (of course this line is often blurred by hype).

I am not an expert at all on LLMs or neuroscience. But it is apparent that having a discussion with a human vs. with an LLM is a completely different ballpark. I am not saying that we will never have technology that can "understand" and "think" like a human does. I am just saying, this is not it.

Also, just because a lot of progress in LLMs has been made in the past 5 years doesn't mean that we can just extrapolate future progress from it. Local maxima and technology limits are a thing.

replies(1): >>43624870 #
17. og_kalu ◴[] No.43620994{4}[source]

>LLMs are reasonably competent at surfacing the behaviour of simple programs when the behaviour of those programs is a relatively straightforward structural extension of enough of their training set that they've managed to correlate together. It's very clear that LLMs lack understanding when you use them for anything remotely sophisticated.

No, because even those 'sophisticated' examples still get very non-trivial attempts. If I were to use the same standard of understanding we ascribe to humans, I would rarely class LLMs as having no understanding of some topic. Understanding does not mean perfection or the absence of mistakes, except in fiction and our collective imaginations.

>Try to get one to act as a storyteller and the limitations in understanding glare out. You try to goad some creativity and character out of it and it spits out generally insipid recombinations of obvious tropes.

I do, and creativity is not really the issue with some of the new SOTA. I mean, I understand what you are saying: default prose often isn't great, and every single model besides 2.5-pro cannot handle details/story instructions for longform writing without essentially collapsing, but it's not really creativity that's the problem.

>The ones that stand out are the circumstances where the local change to make is very obvious and simple

Obvious and simple to you maybe, but with auto-complete, the context the model actually has is dubious at best. It's not like Copilot is pasting in all the code from 10 files if you have 10 files open. What actually gets into the context for auto-complete is largely beyond your control, with no way to see what makes the cut and what doesn't.

I don't use auto-complete very often. For me, it doesn't compare to pasting in relevant code myself and asking for what I want. We have very different experiences.

18. slowmovintarget ◴[] No.43624170{6}[source]

There has been, to date, no demonstrated emergence from LLMs. There has been probabilistic drift in their outputs based on their inputs (training set, training time, reinforcement, fine-tuning, system prompts, and inference parameters). All of these effects on outputs are predictable, and all are first order effects. We don't have any emergence yet.

We do have proofs that hallucination will always be a problem. We have proofs that the "reasoning" for models that "think" is actually regurgitation of human explanations written out. When asked to do truly novel things, the models fail. When asked to do high-precision things, the models fail. When asked to do high-accuracy things, the models fail.

LLMs don't understand. They are search engines. We are experience engines, and philosophically, we don't have a way to tokenize experience, we can only tokenize its description. So while LLMs can juggle descriptions all day long, these algorithms do so disconnected from the underlying experiences required for understanding.

replies(1): >>43624815 #
19. uh_uh ◴[] No.43624815{7}[source]

Examples of emergence:

1. Multi-step reasoning with backtracking when DeepSeek R1 was trained via GRPO.

2. Translation of languages they haven't even seen via in-context learning.

3. Arithmetic: heavily correlated with model size, but it does appear.

I could go on.

Although it's not an LLM but a deep learning model trained via RL, would you say that AlphaGo's move 37 also doesn't count as emergence and that the model has no understanding of Go?

20. uh_uh ◴[] No.43624870{7}[source]

> We do know how LLMs work, correct?

NO! We have working training algorithms. We still don't have a complete understanding of why deep learning works in practice, and especially not why it works at the current level of scale. If you disagree, please cite me the papers because I'd love to read them.

To put it another way: just because you can breed dogs, it doesn't necessarily mean that you have a working theory of genes or even that you know they exist. Which was actually the human condition for most of history.

replies(1): >>43631151 #
21. shafyy ◴[] No.43631151{8}[source]

We do know in general how LLMs work. Now, it's of course not always possible to say why a specific output is generated given an input, but we do know HOW it does it.

To translate it to your analogy with dogs: we do know how the anatomy of dogs works, but we do not know why they sometimes fetch the stick and sometimes not.