
454 points | nathan-barry | 2 comments
kibwen ◴[] No.45645307[source]
To me, the diffusion-based approach "feels" more akin to what's going on in an animal brain than the token-at-a-time approach of the in-vogue LLMs. Speaking for myself, I don't generate words one at a time based on previously spoken words; I start by having some fuzzy idea in my head, and the challenge is in serializing it into language coherently.
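
The distinction drawn here can be made concrete with a toy sketch. The two "models" below are random stand-ins rather than real networks (the function names and tiny vocabulary are purely illustrative), but the control flow shows the difference: autoregressive decoding commits to one token at a time, left to right, while a discrete-diffusion-style decoder starts from a fully masked sequence and refines every position over several passes.

    import random

    VOCAB = ["the", "cat", "sat", "on", "a", "mat", "."]
    MASK = "<mask>"

    def toy_next_token(prefix):
        """Stand-in for an autoregressive model: sample p(next token | prefix)."""
        return random.choice(VOCAB)

    def toy_denoise(sequence):
        """Stand-in for a diffusion/unmasking model: propose tokens for masked slots."""
        return [tok if tok != MASK else random.choice(VOCAB) for tok in sequence]

    def autoregressive_decode(length=7):
        tokens = []
        for _ in range(length):            # one token per step, strictly left to right
            tokens.append(toy_next_token(tokens))
        return tokens

    def diffusion_decode(length=7, steps=4):
        tokens = [MASK] * length           # start from an all-masked, "fuzzy" sequence
        for step in range(steps):
            proposal = toy_denoise(tokens)
            # Unmask a growing fraction of positions each pass; every position is
            # proposed in parallel rather than generated in a fixed order.
            n_keep = int(length * (step + 1) / steps)
            keep = set(random.sample(range(length), n_keep))
            tokens = [proposal[i] if (i in keep or tokens[i] != MASK) else MASK
                      for i in range(length)]
        return tokens

    if __name__ == "__main__":
        print("autoregressive: ", " ".join(autoregressive_decode()))
        print("diffusion-style:", " ".join(diffusion_decode()))

The contrast is only about the order of commitment; real diffusion language models score and re-mask positions with a learned model rather than at random.
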
replies(14): >>45645350 #>>45645383 #>>45645401 #>>45645402 #>>45645509 #>>45645523 #>>45645607 #>>45645665 #>>45645670 #>>45645891 #>>45645973 #>>45647491 #>>45648578 #>>45652892 #
ma2rten ◴[] No.45645523[source]
Interpretability research has found that autoregressive LLMs also plan ahead for what they are going to say.
replies(2): >>45645712 #>>45646027 #
aidenn0 ◴[] No.45645712[source]
This seems likely just from the simple fact that they can reliably generate contextually correct sentences in e.g. German Imperfekt.
replies(3): >>45651812 #>>45651822 #>>45653730 #
1. ma2rten ◴[] No.45651812{3}[source]
It's actually true on many levels, if you think about what is needed for generating syntactically and grammatically correct sentences, coherent text, and working code.
replies(1): >>45658031 #
2. aidenn0 ◴[] No.45658031[source]
Just generating syntactically and grammatically correct sentences doesn't need much lookahead; prefixes to sentences that cannot be properly completed are going to be extremely unlikely to be generated.
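
One hedged way to probe this claim empirically is to score how probable a causal LM finds a prefix that can still be completed grammatically versus one that has painted itself into a corner. The model name ("gpt2") and the example prefixes below are illustrative choices, not anything from the thread.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # illustrative choice; any causal LM would do
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    def prefix_logprob(text):
        """Total log-probability the model assigns to the given prefix."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        # log p(token_i | tokens_<i) for every token after the first
        log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
        token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
        return token_lp.sum().item()

    # A prefix that can be completed normally vs. one that is hard to complete.
    print(prefix_logprob("Yesterday I went to the"))
    print(prefix_logprob("Yesterday I went to the the of"))

If hard-to-complete prefixes already receive much lower total log-probability, the model needs no explicit lookahead to avoid generating them.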