Coding with LLMs in the summer of 2025 – an update

(antirez.com)

600 points antirez | 1 comments | 20 Jul 25 11:04 UTC

krupan ◴[20 Jul 25 17:16 UTC] No.44627207[source]▶

What is the overall feedback loop with LLMs writing code? Do they learn as they go like we do? Do they just learn from reading code on GitHub? If the latter, what happens as less and less code gets written by human experts? Do the LLMs then stagnate in their progress and start to degrade? Kind of like making analog copies of analog copies of analog copies?

replies(2): >>44627376 #>>44627756 #

Herring ◴[20 Jul 25 17:35 UTC] No.44627376[source]▶

>>44627207 #

Code and math are similar to chess/go, where verification is (reasonably) easy so you can generate your own high-quality training data. It's not super straightforward, but you should still expect more progress in coming years.

replies(1): >>44627810 #

cesarb ◴[20 Jul 25 18:20 UTC] No.44627810[source]▶

>>44627376 #

> Code and math are similar to chess/go, where verification is (reasonably) easy

Verification for code would be a formal proof, and these are hard; with a few exceptions like seL4, most code does not have any formal proof. Games like chess and go are much easier to verify. Math is in the middle; it also needs formal proofs, but producing those proofs is most of what doing math is, and even then there are still unproven conjectures.

replies(1): >>44628629 #

Herring ◴[20 Jul 25 19:42 UTC] No.44628629[source]▶

>>44627810 #

Verification for code is just running it. Maybe "verification" was the wrong word. The model just needs a sense of code X leads to outcome Y for a large number of (high-quality) XY pairs, to learn how to navigate the space better, same as with games.
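To make "code X leads to outcome Y" concrete, here is a minimal sketch of how such pairs could be produced by executing candidate snippets against a few test cases. Everything in it (`run_candidate`, `label_candidate`, the `add` examples) is hypothetical and only for illustration. It also runs code in-process with `exec`, which a real pipeline would replace with a sandbox and timeouts. Note that a passing score only says the code agreed with these particular cases, not that it is correct in general.

```python
def run_candidate(source: str, func_name: str, args: tuple) -> dict:
    """Execute `source`, call `func_name(*args)`, and record what happened."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # assumption: trusted input; sandbox this in practice
        result = namespace[func_name](*args)
        return {"status": "ok", "result": result}
    except Exception as exc:
        # Crashes are outcomes too: record the exception type as the result.
        return {"status": "error", "result": type(exc).__name__}


def label_candidate(source: str, func_name: str, cases: list) -> dict:
    """Build one (code, outcome) record from a list of (args, expected) cases."""
    outcomes = [run_candidate(source, func_name, args) for args, _ in cases]
    passed = sum(
        o["status"] == "ok" and o["result"] == expected
        for o, (_, expected) in zip(outcomes, cases)
    )
    return {"code": source, "outcomes": outcomes, "passed": passed, "total": len(cases)}


# Hypothetical test cases and candidates for a function `add(a, b)`.
CASES = [((2, 3), 5), ((-1, 1), 0)]
GOOD = "def add(a, b):\n    return a + b\n"
BAD = "def add(a, b):\n    return a - b\n"
CRASH = "def add(a, b):\n    return a + undefined_name\n"

if __name__ == "__main__":
    for candidate in (GOOD, BAD, CRASH):
        record = label_candidate(candidate, "add", CASES)
        print(record["passed"], "/", record["total"])
```

Each record pairs a piece of code with what it actually did when run, which is the kind of signal the comment describes.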

replies(1): >>44638458 #

krupan ◴[21 Jul 25 18:12 UTC] No.44638458[source]▶

>>44628629 #

"just running it" is an enormously absurd simplification

replies(1): >>44639918 #

Herring ◴[21 Jul 25 20:14 UTC] No.44639918[source]▶

>>44638458 #

If you can do better, please go ahead.

replies(1): >>44641438 #

sn9 ◴[21 Jul 25 23:14 UTC] No.44641438[source]▶

>>44639918 #

There's an entire field of formal verification that LLMs can take advantage of.

You can incorporate proofs with Coq or Dafny or use model checkers or TLA+ to actually verify your code.

This will be required for any software where correctness matters.
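As a rough illustration of what a model checker does, here is a toy explicit-state checker in Python. It is not TLA+, Coq, or Dafny, and it is far simpler than any real tool. It exhaustively explores every reachable state of a small model and returns a counterexample trace if an invariant fails. The model (two processes incrementing a shared counter) and all names are assumptions made up for this sketch.

```python
from collections import deque


def check_invariant(initial, next_states, invariant):
    """Breadth-first search over all reachable states.

    Returns None if the invariant holds everywhere, otherwise the shortest
    trace (list of states) from `initial` to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in next_states(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# State: (counter, procs); each proc is (pc, tmp).
# pc 0 = about to read, 1 = about to write, 2 = done.
INITIAL = (0, ((0, 0), (0, 0)))


def racy_next(state):
    """Non-atomic increment: read the counter, then write tmp + 1 back."""
    counter, procs = state
    for i, (pc, tmp) in enumerate(procs):
        if pc == 0:
            new_proc, new_counter = (1, counter), counter
        elif pc == 1:
            new_proc, new_counter = (2, tmp), tmp + 1
        else:
            continue
        yield (new_counter, procs[:i] + (new_proc,) + procs[i + 1:])


def atomic_next(state):
    """Atomic increment: read and write happen in a single step."""
    counter, procs = state
    for i, (pc, tmp) in enumerate(procs):
        if pc == 0:
            yield (counter + 1, procs[:i] + ((2, tmp),) + procs[i + 1:])


def final_count_is_two(state):
    """Invariant: once both processes are done, the counter must equal 2."""
    counter, procs = state
    return not all(pc == 2 for pc, _ in procs) or counter == 2


if __name__ == "__main__":
    print(check_invariant(INITIAL, racy_next, final_count_is_two))
    print(check_invariant(INITIAL, atomic_next, final_count_is_two))
```

Unlike running a handful of tests, the search covers every interleaving of this small model, so the lost update in the racy version is found rather than hit by luck. The guarantee applies only to the model, not to the real program it abstracts.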
