
600 points | antirez | 1 comment | source
quantumHazer ◴[] No.44625120[source]
I'm going a little offtopic here, but I disagree with the OP's use of the term "PhD-level knowledge", although I have a huge amount of respect for antirez (besides the fact that we were born on the same island).

This phrasing can be misleading and points to a broader misunderstanding about the nature of doctoral studies, one that has been influenced by the marketing and hype discourse surrounding AI labs.

The assertion that there is a defined "PhD-level knowledge" is pretty useless. The primary purpose of a PhD is not simply to acquire a vast amount of pre-existing knowledge, but rather to learn how to conduct research.

replies(6): >>44625135 #>>44626038 #>>44626244 #>>44626345 #>>44632846 #>>44633598 #
antirez ◴[] No.44625135[source]
Agree with that. Read it as expert-level knowledge, without all the other stuff LLMs can't do as well as humans. The way LLMs express knowledge is kind of alien, as it is different, so indeed those are all poor simplifications. For instance, an LLM can't code as well as a top human coder, but it can write a non-trivial program from the first to the last character without iterating.
replies(1): >>44625422 #
spyckie2 ◴[] No.44625422[source]
Hey antirez,

What sticks out to me is Gemini catching bugs before production release, was hoping you’d give a little more insight into that.

The reason being that we expect AI to create bugs and we catch them, but if Gemini is spotting bugs by acting as a QA (not just by writing and passing tests), then that piques my interest.

replies(1): >>44627752 #
jacobr1 ◴[] No.44627752[source]
Our team has pretty aggressively started using LLMs for automated code review. It will look at our PRs and post comments. We keep adding more material for it to consider, from a summarized version of our API guidelines to general prompts like: "You are an expert software engineer and QA professional. Review this PR and point out any bugs or other areas of technical risk. Make concise suggestions for improvement where applicable." It catches a ton of stuff.

Another thing we've started doing is having it look at build failures and write a report on suggested root causes before a human even looks at it, which saves time.

Or (and we haven't rolled this out automatically yet but are testing a prototype) having it triage alarms from our metrics, with access to the logs and codebase to investigate.
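The PR-review step described above can be sketched roughly like this. This is a minimal illustration, not their actual pipeline: the function names, the `call_llm` stub, and the guidelines text are all assumptions; a real setup would call an actual LLM API and post the result as PR comments.

```python
# Hypothetical sketch of an LLM-based PR review step.
# call_llm is a stub standing in for a real LLM API client.

REVIEW_PROMPT = (
    "You are an expert software engineer and QA professional. "
    "Review this PR and point out any bugs or other areas of technical risk. "
    "Make concise suggestions for improvement where applicable."
)

def build_review_request(diff: str, guidelines: str) -> dict:
    """Combine the review prompt, summarized team guidelines, and the PR diff."""
    return {
        "system": REVIEW_PROMPT,
        "user": (
            f"Team API guidelines (summary):\n{guidelines}\n\n"
            f"PR diff:\n{diff}"
        ),
    }

def review_pr(diff: str, guidelines: str, call_llm) -> str:
    """Assemble the request and return the LLM's review comments."""
    request = build_review_request(diff, guidelines)
    return call_llm(request)

if __name__ == "__main__":
    # Stand-in for a real model call, just to show the flow.
    fake_llm = lambda req: "No issues found."
    print(review_pr("+ x = 1", "Prefer explicit returns.", fake_llm))
```

The key design point from the comment above is that the prompt is not just "find bugs": it bundles summarized team guidelines alongside the diff, which is what makes the review comments specific enough to be useful.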

replies(2): >>44630651 #>>44654172 #
infecto ◴[] No.44630651{3}[source]
I'm surprised more folks haven't rolled these out as paid products. I've been getting tons of use out of systems like Cursor's Bugbot. The signal-to-noise ratio is high, and while it's not always right, it catches a lot of bugs I would have missed.
replies(1): >>44632454 #
senko ◴[] No.44632454{4}[source]
There are a few: Greptile, Ellipsis, GH Copilot (integrated with GH)

I feel many also try "review and fix automatically", as it's tempting to "just" pass the generated comments to a second agent to apply them.

But that opens a whole other can of worms and pretty soon you're just another code assistant service.

replies(1): >>44654193 #
dearilos ◴[] No.44654193{5}[source]
If you use specific prompts based on your team's tribal knowledge and standards, it works really well.

A generic "look at this code for bugs" prompt doesn't end up working well, and that's what most code reviewers do.
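The contrast drawn here can be illustrated with a small sketch: instead of one generic "find bugs" instruction, each piece of team-specific knowledge becomes an explicit, checkable rule in the prompt. The rules below are invented examples, not any real team's standards.

```python
# Hypothetical example: turning team "tribal knowledge" into explicit review rules.
TEAM_RULES = [
    "All database calls must go through the repository layer, never raw SQL in handlers.",
    "Public API responses must use the shared error envelope.",
    "Every feature flag must reference a removal ticket in a comment.",
]

def build_team_prompt(rules: list[str]) -> str:
    """Generic instruction plus one numbered line per team-specific rule."""
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, 1))
    return (
        "Review this PR. In addition to general bugs, flag any violation of "
        "these team standards:\n" + numbered
    )

if __name__ == "__main__":
    print(build_team_prompt(TEAM_RULES))
```

The point is that the LLM reviewer is only as good as what you tell it to look for: generic prompts produce generic comments, while enumerated team standards give it concrete things to check.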

replies(1): >>44660655 #
jacobr1 ◴[] No.44660655{7}[source]
Yep - this is the key to making it work