
169 points | mattmarcus | source
lsy ◴[] No.43614042[source]
The article puts scare quotes around "understand" etc. to try to head off critiques about its lack of precise, scientific language, but I think this is a really good example of how casual use of these terms can get pretty misleading.

Because code LLMs have been trained on the syntactic form of programs and not their execution, it isn't correct, even if the correlation between variable annotations and requested completions were perfect (which it's not), to say that the model "understands nullability". Nullability means that under execution the variable in question can become null, and that is not a state a model trained only on the syntax of a million programs can "understand". You could get the same result if, e.g., "Optional" meant the variable becomes poisonous, checking "> 0" meant eating it, and "!= None" were the antidote. Human programmers can understand nullability because they have (hopefully) run programs and understand the semantics of making something null.
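Concretely, here's roughly what "nullability under execution" looks like (a minimal Python sketch; the function and values are made up for illustration):

    from typing import Optional

    def parse_port(raw: str) -> Optional[int]:
        # The annotation only describes what can happen at runtime:
        # this really does return None for bad input.
        return int(raw) if raw.isdigit() else None

    port = parse_port("not-a-number")
    if port is not None and port > 0:
        print(f"connecting on port {port}")
    else:
        # This branch is the runtime meaning of "nullable" that
        # syntax alone never exercises.
        print("no usable port")

The model only ever sees text like the annotation and the check; it never sees which branch actually runs.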

The paper could use precise, scientific language (e.g. "the presence of nullable annotation tokens correlates with activation of vectors corresponding to, and emission of, null-check tokens with high precision and accuracy"), which would help us understand what we can and can't rely on the LLM to do. But it seems like there is some subconscious incentive to muddy how people see these models, in the hope that we start ascribing to them capabilities they don't have.
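To make that concrete, the precise claim would correspond to a tally something like the following (a rough sketch, not anything from the paper; `complete` stands in for whatever completion API was used, and the regex is only a crude proxy for "null-check tokens"):

    import re
    from typing import Callable, Iterable, Tuple

    # Crude proxy for "the completion contains a null check".
    NULL_CHECK = re.compile(r"is\s+(not\s+)?None|[!=]=\s*None")

    def score(signatures: Iterable[str],
              complete: Callable[[str], str]) -> Tuple[float, float]:
        # For each function signature, ask the model for a completion and
        # check whether a None-check appears exactly when the signature
        # carries an Optional annotation.
        tp = fp = fn = tn = 0
        for sig in signatures:
            nullable = "Optional[" in sig
            checked = bool(NULL_CHECK.search(complete(sig)))
            if nullable and checked:
                tp += 1
            elif checked:
                fp += 1
            elif nullable:
                fn += 1
            else:
                tn += 1
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        return precision, recall

A claim stated at that level tells you exactly what behavior was measured, without any commitment about what the model "understands".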

replies(9): >>43614302 #>>43614352 #>>43614384 #>>43614470 #>>43614508 #>>43614723 #>>43615651 #>>43616059 #>>43616871 #
uh_uh ◴[] No.43614384[source]
We don't really have a clue what they are and aren't capable of. Prior to the LLM boom, many people (and I include myself in this) thought it'd be impossible to get to the level of capability we have now purely from statistical methods, and yet here we are. If you have a strong theory that proves some bounds on LLM capability, then please put it forward. In the absence of that, your sceptical attitude is just as sus as the article's.
replies(3): >>43614753 #>>43615682 #>>43616131 #
kubav027 ◴[] No.43616131[source]
An LLM also has no idea what it is capable of. This feels like a difference from humans: having some understanding of a problem also means knowing, or "feeling", the limits of that understanding.
replies(2): >>43619024 #>>43619041 #
uh_uh ◴[] No.43619041[source]
1. Many humans don't have an idea of the limits of their competence. It's called the Dunning–Kruger effect.

2. LLMs regularly tell me if what I'm asking for is possible or not. I'm not saying they're always correct, but they seem to have at least some sense of what's in the realm of possibility.

replies(1): >>43621738 #
kubav027 ◴[] No.43621738[source]
1. The Dunning–Kruger effect describes the difference between expected and actual performance. It does not say that humans confidently give wrong answers when they do not know the correct ones.

2. That is not my experience. Almost half the time, the LLM gives a wrong answer without any warning, and it is up to me to check correctness. Even if I follow up, it often continues to give wrong answers.

replies(1): >>43624916 #
uh_uh ◴[] No.43624916[source]
1. "It is not saying that humans confidently give wrong answers if they do not know correct ones." And I didn't say that they do either, so you might have hallucinated that.

2. What are you arguing about? I obviously didn't say they're always correct.