I have this feeling about LLM-generated React frontends: they all look the same.
Models don't emit something they don't know. They remix and rewrite what they know. There's no invention, just recall...
People really need to stop saying this. I get that it was the Smart Guy Thing To Say in 2023, but by this point it’s pretty clear that it’s not true in any way that matters for most practical purposes.
Coding LLMs have clearly been trained on conversations where a piece of code is shown, a transformation is requested (rewrite this from Python to Go), and then the transformed code is shown. They’re not just learning codebases; they’re learning what working with code looks like.
Thus you can ask an LLM to refactor a program in a language it has never seen, and it will “know” what refactoring means, because it has seen it done many times, and it will stand a good chance of doing the right thing.
That’s why they’re useful. They’re doing something way more sophisticated than just “recombining codebases from their training data”, and anyone chirping 2023 sound bites is going to miss that.
The tasks where it works great are things I'd expect to be part of the dataset (GitHub, blog posts), or they are "classic" LM tasks (understand + copy-paste/patch). The actual intelligence, in my opinion, is still very limited. So while it's true that it's not "just recall", it still might be "mostly recall".
BTW: Copy-paste is something that works great in any attention-based model. On the other hand, models like RWKV usually fail at it and are not suited for this IMHO (but I think they have much better potential for AGI).
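To make the copy-paste point a bit more concrete, here is a toy sketch (not how any real model is implemented) of why attention copies easily: every earlier token stays addressable, so a query that matches one key pulls that token's value back out almost exactly. The one-hot position keys, the `scale` sharpness factor, and the helper names are all assumptions made up for illustration. A recurrent model with a fixed-size state has no such per-token lookup, which is roughly the intuition behind the RWKV remark.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values, scale=20.0):
    # Dot-product attention for a single query.
    # `scale` is a hypothetical sharpness factor so the match dominates.
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

tokens = ["def", "foo", "(", "x", ")"]
vocab = sorted(set(tokens))

# Toy setup: keys encode position, values encode token identity.
keys = [one_hot(pos, len(tokens)) for pos in range(len(tokens))]
values = [one_hot(vocab.index(t), len(vocab)) for t in tokens]

def copy_token(position):
    # Query for a position, then decode the attended value back to a token.
    query = one_hot(position, len(tokens))
    out = attend(query, keys, values)
    return vocab[max(range(len(vocab)), key=lambda i: out[i])]

copied = [copy_token(p) for p in range(len(tokens))]
```

In this toy, `copied` reproduces `tokens` one position at a time. Real models learn their keys and queries rather than using one-hot positions, so treat this only as an illustration of the lookup mechanism.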