
600 points antirez | 8 comments
1. krupan ◴[] No.44627207[source]
What is the overall feedback loop with LLMs writing code? Do they learn as they go like we do? Do they just learn from reading code on GitHub? If the latter, what happens as less and less code gets written by human experts? Do the LLMs then stagnate in their progress and start to degrade? Kind of like making analog copies of analog copies of analog copies?
replies(2): >>44627376 #>>44627756 #
2. Herring ◴[] No.44627376[source]
Code and math are similar to chess/go, where verification is (reasonably) easy so you can generate your own high-quality training data. It's not super straightforward, but you should still expect more progress in coming years.
replies(1): >>44627810 #
3. steveklabnik ◴[] No.44627756[source]
> Do they learn as they go like we do?

It's complicated. You have to understand that when you ask an LLM something, you have the model itself, which is kind of like a function: put something in, get something out. However, you also pass an argument to that function: the context.

So, in a literal sense, no, they do not learn as they go: the model, that function, is unchanged by what you send it. But the context can be modified. So, in some sense, an LLM in an agentic loop that goes and reads some code from GitHub can include that information in the context it uses later, so it will "learn" within the session.
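A minimal sketch of that distinction, with a toy stand-in for the model (all names here are illustrative, not any real API): the "model" is a pure function whose weights never change between calls, and anything it "learns" lives only in the context it is handed.

```python
def model(context: list[str]) -> str:
    """Stands in for the LLM: a pure function of its input.
    Nothing inside it changes between calls."""
    # Toy behavior: the answer improves only if the context contains the fact.
    if any("uses tabs" in msg for msg in context):
        return "This repo indents with tabs."
    return "I don't know this repo's style."

context: list[str] = ["How does this repo indent?"]
print(model(context))  # the model alone can't know

# An agentic loop "learns" by appending what it read into the context...
context.append("Observation from GitHub: this repo uses tabs.")
print(model(context))  # same unchanged function, better answer

# ...but a fresh session starts with an empty context, so the
# "knowledge" does not persist across sessions.
```

The point is that both calls hit the exact same function; only the argument grew.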

> If the latter, what happens as less and less code gets written by human experts?

So, this is still a possible problem, because future LLMs will end up being trained on code written by today's LLMs. Whether or not that is actually a problem is yet to be seen; I don't have a good handle on the debates in this area, personally.

4. cesarb ◴[] No.44627810[source]
> Code and math are similar to chess/go, where verification is (reasonably) easy

Verification for code would be a formal proof, and those are hard; with a few exceptions like seL4, most code has no formal proof at all. Games like chess and go are much easier to verify. Math is in the middle: it also needs formal proofs, but most of math is about constructing those proofs in the first place, and even then there are still unproven conjectures.

replies(1): >>44628629 #
5. Herring ◴[] No.44628629{3}[source]
Verification for code is just running it. Maybe "verification" was the wrong word. The model just needs a sense that code X leads to outcome Y for a large number of (high-quality) X/Y pairs, to learn how to navigate the space better, same as with games.
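A sketch of what "just running it" means as a training signal, under the assumption (mine, not the commenter's) that candidates are executed against test cases and labeled pass/fail; none of the names here come from any real training pipeline.

```python
# Two candidate solutions to the same task; the second is subtly wrong.
candidates = [
    "def double(x): return x * 2",
    "def double(x): return x + 2",  # happens to pass only for x == 2
]

# Input/expected-output test cases used as the ground truth.
tests = [(0, 0), (2, 4), (5, 10)]

def outcome(src: str) -> bool:
    """Execute the candidate and check it against every test case."""
    namespace: dict = {}
    try:
        exec(src, namespace)
        return all(namespace["double"](x) == y for x, y in tests)
    except Exception:
        return False  # crashing code is labeled a failure, not discarded

# Each (code, outcome) pair is the X/Y signal the comment describes:
# cheap to produce at scale, unlike a formal proof.
pairs = [(src, outcome(src)) for src in candidates]
```

Execution is much weaker than proof (it only checks the inputs you tried), but it scales, which is what matters for generating training data.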
replies(1): >>44638458 #
6. krupan ◴[] No.44638458{4}[source]
"just running it" is an enormously absurd simplification
replies(1): >>44639918 #
7. Herring ◴[] No.44639918{5}[source]
If you can do better please go ahead?
replies(1): >>44641438 #
8. sn9 ◴[] No.44641438{6}[source]
There's an entire field of formal verification that LLMs can take advantage of.

You can write proofs in Coq or Dafny, or use model checkers or TLA+, to actually verify your code.
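For a flavor of what that looks like, here is a minimal sketch in Lean 4 (a proof assistant in the same family as the Coq and Dafny tools mentioned above; the function and lemma names are my own illustration). The program and its correctness statement sit side by side, and a proof checker, not a test run, vouches for it on all inputs:

```lean
-- A trivial program...
def double (n : Nat) : Nat := n * 2

-- ...and a machine-checked guarantee that holds for every Nat,
-- not just the inputs a test suite happened to exercise.
theorem double_eq_add (n : Nat) : double n = n + n := by
  unfold double
  omega
```

This is the gap between execution-based checking and formal verification: running code samples the input space, while a proof covers it.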

This will be required for any software where correctness matters.