600 points antirez | 1 comment
krupan ◴[] No.44627207[source]
What is the overall feedback loop with LLMs writing code? Do they learn as they go like we do? Do they just learn from reading code on GitHub? If the latter, what happens as less and less code gets written by human experts? Do the LLMs then stagnate in their progress and start to degrade? Kind of like making analog copies of analog copies of analog copies?
replies(2): >>44627376 #>>44627756 #
Herring ◴[] No.44627376[source]
Code and math are similar to chess/go, where verification is (reasonably) easy, so you can generate your own high-quality training data. It's not super straightforward, but you should still expect more progress in the coming years.
replies(1): >>44627810 #
cesarb ◴[] No.44627810[source]
> Code and math are similar to chess/go, where verification is (reasonably) easy

Verification for code would be a formal proof, and these are hard; with a few exceptions like seL4, most code does not have any formal proof. Games like chess and go are much easier to verify. Math is in the middle; it also needs formal proofs, but most of math is doing these formal proofs themselves, and even then there are still unproven conjectures.

replies(1): >>44628629 #
Herring ◴[] No.44628629[source]
Verification for code is just running it. Maybe "verification" was the wrong word. The model just needs a sense that code X leads to outcome Y, for a large number of (high-quality) XY pairs, to learn how to navigate the space better, same as with games.
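
To make the "XY pairs" idea concrete, here is a minimal sketch, not any lab's actual pipeline. It assumes each candidate snippet comes with a small assertion-based test string; `run_candidate`, `TESTS`, and `CANDIDATES` are hypothetical names made up for illustration. It runs everything in-process with `exec`, so a real setup would need sandboxing and timeouts on top.

```python
def run_candidate(source: str, test_source: str) -> tuple[bool, str]:
    """Execute a candidate snippet, then its tests, in a fresh namespace.

    Returns (passed, detail). Illustration only: no sandbox, no timeout.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)       # defines the candidate's functions
        exec(test_source, namespace)  # assertions raise on failure
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


# Hypothetical task: implement add(a, b).
TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"

CANDIDATES = [
    "def add(a, b):\n    return a + b",  # correct
    "def add(a, b):\n    return a - b",  # wrong result
    "def add(a, b)\n    return a + b",   # does not even parse
]

# Each (code X, outcome Y) pair is one labeled example.
pairs = [(src, run_candidate(src, TESTS)) for src in CANDIDATES]

for src, (passed, detail) in pairs:
    print("PASS" if passed else "FAIL", "|", detail)
```

The outcome is only as informative as the tests: a candidate that passes two assertions has not been shown correct, which is roughly the objection raised upthread.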
replies(1): >>44638458 #
krupan ◴[] No.44638458[source]
"just running it" is an absurd oversimplification
replies(1): >>44639918 #
Herring ◴[] No.44639918[source]
If you can do better please go ahead?
replies(1): >>44641438 #
1. sn9 ◴[] No.44641438{3}[source]
There's an entire field of formal verification that LLMs can take advantage of.

You can write machine-checked proofs in Coq or Dafny, or use model checkers such as TLC for TLA+ specifications, to actually verify your code.

This will be required for any software where correctness matters.
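
As a rough illustration of what a model checker does (this is a toy in plain Python, not TLA+, TLC, Coq, or Dafny): exhaustively explore every interleaving of a tiny concurrent program and check an invariant in each reachable state. The two-process counter, the state layout, and the names `successors`, `invariant`, and `check` are all assumptions invented for this sketch.

```python
from collections import deque

# State: (pc_a, pc_b, tmp_a, tmp_b, counter).
# Each process runs: pc 0 -> read counter into tmp; pc 1 -> write tmp + 1; pc 2 -> done.
INIT = (0, 0, 0, 0, 0)


def successors(state):
    """Yield every state reachable by letting one process take one step."""
    pc = list(state[0:2])
    tmp = list(state[2:4])
    counter = state[4]
    for i in (0, 1):
        p, t, c = pc.copy(), tmp.copy(), counter
        if pc[i] == 0:        # non-atomic increment, part 1: read
            t[i] = c
        elif pc[i] == 1:      # non-atomic increment, part 2: write
            c = t[i] + 1
        else:                 # this process has finished
            continue
        p[i] += 1
        yield (p[0], p[1], t[0], t[1], c)


def invariant(state):
    """Once both processes are done, the counter must equal 2."""
    pc_a, pc_b, _, _, counter = state
    return not (pc_a == 2 and pc_b == 2) or counter == 2


def check(init):
    """Breadth-first search over all reachable states.

    Returns a shortest trace to an invariant violation, or None if none exists.
    """
    parent = {init: None}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


counterexample = check(INIT)
for step in counterexample or []:
    print(step)
```

The search finds the classic lost update: both processes read 0 before either writes, so the final counter is 1. Tests that "just run it" can miss this if the scheduler never happens to produce that interleaving; exhaustive exploration cannot, at least for state spaces small enough to enumerate.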