
169 points by mattmarcus | 3 comments
lsy No.43614042
The article puts scare quotes around "understand" etc. to try to head off critiques about its lack of precise, scientific language, but I think this is a really good example of how casual use of these terms can get pretty misleading.

Because code LLMs have been trained on the syntactic form of programs and not on their execution, it's not correct (even if the correlation between variable annotations and requested completions were perfect, which it's not) to say that the model "understands nullability". Nullability means that under execution the variable in question can become null, and that is not a state a model trained only on the syntax of a million programs can "understand". You could get the same result if, e.g., "Optional" meant that the variable becomes poisonous, checking "> 0" is eating it, and "!= None" is an antidote. Human programmers can understand nullability because they have (hopefully) run programs and understand the semantics of making something null.
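For concreteness, here is a minimal sketch of the pattern being discussed, assuming Python's Optional annotation (the function and names are illustrative, not taken from the paper):

    from typing import Optional

    def is_positive(x: Optional[int]) -> bool:
        # Surface pattern: "Optional[int]" in the signature strongly predicts
        # a None check like the one below in training data.
        # Execution semantics: x really can be None at runtime, and the check
        # is what prevents a TypeError on the "> 0" comparison.
        if x is not None and x > 0:
            return True
        return False

    print(is_positive(None))  # False
    print(is_positive(3))     # True

A human reading this knows why the check is there; a model trained only on program text has, at most, learned that these tokens travel together.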

The paper could use precise, scientific language (e.g. "the presence of nullable annotation tokens correlates with activation of vectors corresponding to, and emission of, null-check tokens with high precision and accuracy"), which would help us understand what we can rely on the LLM to do and what we can't. But it seems like there is some subconscious incentive to muddy how people see these models in the hopes that we start ascribing things to them that they aren't capable of.

replies(9): >>43614302 #>>43614352 #>>43614384 #>>43614470 #>>43614508 #>>43614723 #>>43615651 #>>43616059 #>>43616871 #
uh_uh No.43614384
We don't really have a clue what they are and aren't capable of. Prior to the LLM boom, many people – and I include myself in this – thought it'd be impossible to get to the level of capability we have now purely from statistical methods, and yet here we are. If you have a strong theory that proves some bounds on LLM capability, then please put it forward. In the absence of that, your sceptical attitude is just as sus as the article's.
replies(3): >>43614753 #>>43615682 #>>43616131 #
1. Baeocystin No.43615682
I majored in CogSci at UCSD in the '90s. I've been interested and active in the machine learning world for decades. The LLM boom took me completely and utterly by surprise, and it continues to do so; frankly, I am most mystified by the folks who downplay it. These giant matrices are already so far beyond what we thought was (relatively) easily achievable that even if progress stopped tomorrow, we'd have years of work to put in just to understand how we got here. That doesn't mean we've hit AGI, but what we already have is truly remarkable.
replies(2): >>43616325 #>>43622356 #
2. chihuahua No.43616325
The funny thing is that 1/3 of people think LLMs are dumb and will never amount to anything. Another third think that it's already too late to prevent the rise of superhuman AGI that will destroy humanity, and are calling for airstrikes on any data center that does not submit to their luddite rules. And the last third use LLMs for writing small pieces of code.
3. Workaccount2 No.43622356
Pretty much until 2022, the de facto orthodoxy for AI was "The creative pursuits will forever be outside the reach of computers".

People are pretty quiet about creative pursuits actually being the low-hanging fruit on the AI tree.