169 points mattmarcus | 14 comments
lsy ◴[] No.43614042[source]
The article puts scare quotes around "understand" etc. to try to head off critiques about its lack of precise, scientific language, but I think this is a really good example of how casual use of these terms can get pretty misleading.

Because code LLMs have been trained on the syntactic form of programs and not their execution, it's not correct to say that the model "understands nullability", even if the correlation between variable annotations and requested completions were perfect (which it's not). Nullability means that under execution the variable in question can become null, and that is not a state a model trained only on the syntax of a million programs can "understand". You could get the same result if, say, "Optional" meant that the variable becomes poisonous, checking "> 0" meant eating it, and "!= None" were an antidote. Human programmers can understand nullability because they have (hopefully) run programs and understand the semantics of making something null.
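
To make the surface-vs-execution distinction concrete, here is a minimal Python sketch (the function name and strings are purely illustrative): the "Optional" annotation and the None/"> 0" checks are the tokens a code model is trained to continue, while "can become null" is something that only exists when the code actually runs.

    from typing import Optional

    def describe(s: Optional[str]) -> str:
        # Surface pattern the model sees in training data: an Optional
        # annotation, then a None check (the "!= None" mentioned above,
        # written idiomatically) guarding the "> 0" comparison.
        if s is not None and len(s) > 0:
            return f"{len(s)} characters"
        return "nothing to measure"

    # The execution semantics only show up at runtime, when s really is None:
    print(describe(None))     # -> nothing to measure
    print(describe("hello"))  # -> 5 characters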

The paper could use precise, scientific language (e.g. "the presence of nullable annotation tokens correlates with activation of vectors corresponding to, and emission of, null-check tokens, with high precision and accuracy"), which would help us understand what we can and can't rely on the LLM to do. But it seems like there is some subconscious incentive to muddy how people see these models, in the hopes that we start ascribing things to them that they aren't capable of.

replies(9): >>43614302 #>>43614352 #>>43614384 #>>43614470 #>>43614508 #>>43614723 #>>43615651 #>>43616059 #>>43616871 #
waldrews ◴[] No.43614508[source]
I was going to say "so you believe LLMs don't have the capacity to understand", but then I realized that the precise language would be something like "the presence of photons in this human's retinas, in patterns encoding statements about LLMs having understanding, correlates with the activation of neuron signaling chains corresponding to, and emission of, muscle activations engaging keyboard switches, which produce patterns of 'no they don't' with high frequency."

The critiques of mental state applied to LLMs are increasingly applicable to us biologicals, and that's the philosophical abyss we're staring down.

replies(3): >>43615279 #>>43615833 #>>43615903 #
1. shafyy ◴[] No.43615903[source]
Countering the argument that LLMs are just glorified probability machines and do not understand or think with "how do you know humans are not the same?" has been the biggest achievement of AI hypemen (and yes, it's mostly men).

Of course, now you can say "how do you know that our brains are not just efficient computers that run LLMs", but I feel the burden of proof lies on the makers of that claim, not on the other side.

It is very likely that human intelligence is not just autocomplete on crack, given all we know about neuroscience so far.

replies(2): >>43616482 #>>43618260 #
2. mlinhares ◴[] No.43616482[source]
BuT iT CoUlD Be, cAn YoU PrOvE ThAT IT is NOt?

I'm having a great experience using Cursor, but I don't feel like trying to overhype it; it just makes me tired to see all this hype. It's a great tool and it makes me more productive, nothing beyond that.

replies(1): >>43619136 #
3. BobbyTables2 ◴[] No.43618260[source]
I’ve come to realize AI works as well as it does because it was trained extensively on the same kinds of things people normally ask. So, it already has the benefit of vast amounts of human responses.

Of course, ask it a PhD-level question and it will confidently hallucinate more than Beavis & Butthead.

It really is a damn glorified autocomplete, though unfortunately a very useful one as a search-engine replacement.

replies(1): >>43619114 #
4. uh_uh ◴[] No.43619114[source]
The LLM is a glorified autocomplete inasmuch as you are a glorified replicator. Yes, it was trained on autocomplete, but that doesn't say much about what capabilities might emerge.
replies(1): >>43619130 #
5. shafyy ◴[] No.43619130{3}[source]
> Yes, it was trained on autocomplete but that doesn't say much about what capabilities might emerge.

No, but we know how it works and it is just a stochastic parrot. There is no magic in there.

What is more surprising to me is that humans are so predictable that a glorified autocomplete works this well. Then again, it's not that surprising...

replies(1): >>43619959 #
6. shafyy ◴[] No.43619136[source]
That's great for you. I'm not diminishing your experience or taking it away. I think we agree on the hype.
7. uh_uh ◴[] No.43619959{4}[source]
Sorry, but this is nonsense. Do you have a theory of when certain LLM capabilities emerge? AFAIK we don't have a good theory of when and why they emerge.

But even if we knew how something works (which in the present case we don't), that shouldn't diminish our opinion of it. Will you have a lesser opinion of human intelligence once we figure out how it works?

replies(3): >>43620237 #>>43620572 #>>43624170 #
8. sfn42 ◴[] No.43620237{5}[source]
I'm sure at any given point there are hundreds of instances of this exact discussion occurring in various threads on HN.

LLMs are cool, and a lot of people find them useful. Hype bros are full of crap and there's no point arguing with them, because it's always a pointless discussion. With crypto and NFTs it's future predictions, which are inherently impossible to reason about; with AI it's partially that, and partially the whole "do they have human properties" thing, which is equally impossible to reason about.

It gets discussed to death every single day.

replies(1): >>43620534 #
9. shafyy ◴[] No.43620534{6}[source]
100%
10. shafyy ◴[] No.43620572{5}[source]
> Do you have a theory about when certain LLM capabilities emerge?

We do know how LLMs work, correct? We also know what they are and aren't capable of (though of course this line is often blurred by hype).

I am not an expert at all on LLMs or neuroscience. But it is apparent that having a discussion with a human vs. with an LLM is a completely different ballpark. I am not saying that we will never have technology that can "understand" and "think" like a human does. I am just saying, this is not it.

Also, just because a lot of progress on LLMs has been made in the past 5 years doesn't mean we can simply extrapolate future progress from it. Local maxima and technology limits are a thing.

replies(1): >>43624870 #
11. slowmovintarget ◴[] No.43624170{5}[source]
There has been, to date, no demonstrated emergence from LLMs. There has been probabilistic drift in their outputs based on their inputs (training set, training time, reinforcement, fine-tuning, system prompts, and inference parameters). All of these effects on outputs are predictable, and all are first-order effects. We don't have any emergence yet.

We do have proofs that hallucination will always be a problem. We have proofs that the "reasoning" of models that "think" is actually regurgitation of human explanations that were written out. When asked to do truly novel things, the models fail. When asked to do high-precision things, the models fail. When asked to do high-accuracy things, the models fail.

LLMs don't understand. They are search engines. We are experience engines, and philosophically we don't have a way to tokenize experience; we can only tokenize its description. So while LLMs can juggle descriptions all day long, these algorithms do so disconnected from the underlying experiences required for understanding.

replies(1): >>43624815 #
12. uh_uh ◴[] No.43624815{6}[source]
Examples of emergence:

1. Multi-step reasoning with backtracking when DeepSeek R1 was trained via GRPO.

2. Translation of languages they haven't even seen via in-context learning.

3. Arithmetic: heavily correlated with model size, but it does appear.

I could go on.

It's not an LLM but a deep learning model trained via RL; still, would you say that AlphaZero's move 37 also doesn't count as emergence and that the model has no understanding of Go?

13. uh_uh ◴[] No.43624870{6}[source]
> We do know how LLMs work, correct?

NO! We have working training algorithms. We still don't have a complete understanding of why deep learning works in practice, and especially not why it works at the current level of scale. If you disagree, please cite me the papers because I'd love to read them.

To put it another way: just because you can breed dogs doesn't necessarily mean that you have a working theory of genes, or even that you know they exist. That was actually the human condition for most of history.

replies(1): >>43631151 #
14. shafyy ◴[] No.43631151{7}[source]
We do know in general how LLMs work. Now, it's of course not always possible to say why a specific output is generated for a given input, but we do know HOW they do it.

To translate it to your analogy with dogs: we do know how the anatomy of a dog works, but we do not know why it sometimes fetches the stick and sometimes doesn't.