577 points simonw | 27 comments
1. righthand ◴[] No.44724896[source]
Did you understand the implementation or just that it produced a result?

I would hope an LLM could spit out a cobbled-together answer to a common interview question.

Today a colleague presented data changes and used an LLM to build an app to display the JSON for the presentation. Why did they not just pipe the JSON into our already-working app that displays this data?

People around me are, for the most part, using LLMs to enhance their presentations, not to actually implement anything useful. I have been watching my coworkers use them that way for months.

Another example? A different coworker wanted to build a document macro to perform bulk updates on courseware content, swapping old words for new words. To build the macro, they first wrote a rubric inside a Word doc to prompt an LLM correctly.

That filled-in rubric was then used to generate a program template for the macro. To define the requirements for the macro, the coworker then used a slideshow slide listing bullet points of functionality: in this case, Find+Replace words in courseware slides/documents using a list of words from another text document. Given the complexity of this system, I can't believe my colleague saved any time. The presentation was interesting, though, and that is what they got compliments on.
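
For contrast, the underlying task is small enough to fit in a plain script. A rough sketch (file and folder names made up, assuming a tab-separated old/new word list and plain-text course files; real .docx/.pptx content would need a library such as python-docx on top of this):

    import pathlib

    # Load old -> new word pairs from a tab-separated list, one pair per line.
    pairs = []
    for line in pathlib.Path("wordlist.txt").read_text().splitlines():
        if line.strip():
            old, new = line.split("\t")
            pairs.append((old, new))

    # Apply every replacement to each courseware file.
    for doc in pathlib.Path("courseware").glob("*.txt"):
        text = doc.read_text()
        for old, new in pairs:
            text = text.replace(old, new)
        doc.write_text(text)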

However, the solutions are useless for anyone but the implementer.

replies(3): >>44724928 #>>44728396 #>>44728544 #
2. simonw ◴[] No.44724928[source]
I scanned the code and understood what it was doing, but I didn't spend much time on it once I'd seen that it worked.

If I'm writing code for production systems using LLMs I still review every single line - my personal rule is I need to be able to explain how it works to someone else before I'm willing to commit it.

I wrote a whole lot more about my approach to using LLMs to help write "real" code here: https://simonwillison.net/2025/Mar/11/using-llms-for-code/

replies(4): >>44725190 #>>44725429 #>>44727620 #>>44732217 #
3. th0ma5 ◴[] No.44725190[source]
[flagged]
replies(4): >>44725300 #>>44725355 #>>44725480 #>>44726957 #
4. CamperBob2 ◴[] No.44725300{3}[source]
I missed the part where he said he was going to put the Space Invaders game into production. Link?
5. bnchrch ◴[] No.44725355{3}[source]
You do realize you're talking to the creator of Django, Datasette, and Lanyrd, right?
replies(3): >>44726099 #>>44726690 #>>44731602 #
6. photon_lines ◴[] No.44725429[source]
This is why I love using the DeepSeek chain-of-reasoning output ... I can actually go through and read what it's 'thinking' to validate whether it's basing its solution on valid facts/assumptions. Either way, thanks for all of your valuable write-ups on these models; I really appreciate them, Simon!
replies(1): >>44728174 #
7. ajcp ◴[] No.44725480{3}[source]
They said "production systems", not "critical production applications".

Also, the 'if' doesn't negate anything, since they say "I still", meaning the behavior is actively happening or ongoing; they don't use a hypothetical or conditional after "still", as in "I still would".

8. ◴[] No.44726099{4}[source]
9. tough ◴[] No.44726690{4}[source]
that made me chuckle
10. dang ◴[] No.44726957{3}[source]
Please don't cross into personal attack in HN comments.

https://news.ycombinator.com/newsguidelines.html

Edit: twice is already a pattern - https://news.ycombinator.com/item?id=44110785. No more of this, please.

Edit 2: I only just realized that you've been frequently posting abusive replies in a way that crosses into harangue if not harassment:

https://news.ycombinator.com/item?id=44725284 (July 2025)

https://news.ycombinator.com/item?id=44725227 (July 2025)

https://news.ycombinator.com/item?id=44725190 (July 2025)

https://news.ycombinator.com/item?id=44525830 (July 2025)

https://news.ycombinator.com/item?id=44441154 (July 2025)

https://news.ycombinator.com/item?id=44110817 (May 2025)

https://news.ycombinator.com/item?id=44110785 (May 2025)

https://news.ycombinator.com/item?id=44018000 (May 2025)

https://news.ycombinator.com/item?id=44008533 (May 2025)

https://news.ycombinator.com/item?id=43779758 (April 2025)

https://news.ycombinator.com/item?id=43474204 (March 2025)

https://news.ycombinator.com/item?id=43465383 (March 2025)

https://news.ycombinator.com/item?id=42960299 (Feb 2025)

https://news.ycombinator.com/item?id=42942818 (Feb 2025)

https://news.ycombinator.com/item?id=42706415 (Jan 2025)

https://news.ycombinator.com/item?id=42562036 (Dec 2024)

https://news.ycombinator.com/item?id=42483664 (Dec 2024)

https://news.ycombinator.com/item?id=42021665 (Nov 2024)

https://news.ycombinator.com/item?id=41992383 (Oct 2024)

That's abusive, unacceptable, and not even a complete list!

You can't go after another user like this on HN, regardless of how right you are or feel you are or who you have a problem with. If you keep doing this, we're going to end up banning you, so please stop now.

replies(1): >>44730761 #
11. shortrounddev2 ◴[] No.44727620[source]
Serious question: if you have to read every line of code in order to validate it in production, why not just write every line of code instead?
replies(1): >>44727896 #
12. simonw ◴[] No.44727896{3}[source]
Because it's much, much faster to review a hundred lines of code than it is to write a hundred lines of code.

(I'm experienced at reading and reviewing code.)

replies(3): >>44729060 #>>44731587 #>>44739188 #
13. vessenes ◴[] No.44728174{3}[source]
Nota bene: there is a fair amount of research indicating that a model's outputs and 'thoughts' do not necessarily align with its chain-of-reasoning output.

You can validate this pretty easily by asking some logic or coding questions: you will likely notice that the final output is not necessarily the logical conclusion of the thinking; sometimes it is significantly orthogonal to it, or returns to reasoning in the middle.

All that to say - good idea to read it, but stay vigilant on outputs.

replies(1): >>44758905 #
14. bsder ◴[] No.44728396[source]
> However the solutions are absolutely useless for anyone else but the implementer.

Disposable code is where AI shines.

AI generating the boilerplate code for an obtuse build system? Yes, please. AI generating an animation? Go for it. (Look at how much work 3Blue1Brown had to put into that; if AI can help with that kind of thing, it has my blessings.) AI enabling someone who doesn't program to generate a prototype that they can then point at an actual programmer? Excellent.

This is fine because you don't need to understand the result. You have a concrete pass/fail gate and don't care about what's underneath. This is real value. The problem is that it isn't gigabuck value.

The stuff that would be gigabuck value is unfortunately where AI falls down: fix this bug in a product, add this feature to an existing codebase, and so on.

AI is also a problem because disposable code is what you would assign to junior programmers in order for them to learn.

replies(1): >>44734358 #
15. magic_hamster ◴[] No.44728544[source]
The LLM is the solution.
16. paufernandez ◴[] No.44729060{4}[source]
Simon, don't you fear "atrophy" in your writing ability?
replies(2): >>44729483 #>>44732302 #
17. simonw ◴[] No.44729483{5}[source]
I think it will happen a bit, but I'm not worried about it.

My ability to write with a pen has suffered enormously now that I do most of my writing on a phone or laptop - but I'm writing way more.

I expect I'll become slower at writing code without an LLM, but the volume of (useful) code I produce will be worth the trade-off.

18. otabdeveloper4 ◴[] No.44731587{4}[source]
Absolutely false for anything but the most braindead corporate CRUD code.

We hate reading code and will avoid the hassle every time, but that doesn't mean it is easy.

replies(1): >>44732317 #
19. otabdeveloper4 ◴[] No.44731602{4}[source]
Off-topic, but Django is really bad and a huge pile of code smell. (Not a Django programmer; I manage them and can compare Django-infected projects to normal projects.)
20. larodi ◴[] No.44732217[source]
I think this is the right way to do it: produce with the LLM, debug, read every line, and delete lots of it.

Many people fear this approach for production, but it is reasonable compared to someone with a single Coursera course under their belt writing production JS code.

Yet we tend to say "the LLM wrote this and that", which implies the model did all the work. In reality it should be understood as a complex heavy-lifting machine that is expected to be operated by a very well-prepared operator.

The fact that I got a Kango and drilled some holes does not make me an engineer, right? And it takes an engineer to sign off on the building, even though it was ArchiCAD doing the math.

21. DonHopkins ◴[] No.44732302{5}[source]
Reading other people's (or LLMs') code is one of the best ways to improve your own coding abilities. Lazily using LLMs to avoid reading any code is called "vibe coding", and those people's abilities atrophy no matter who or what wrote the code they refuse to read.
22. DonHopkins ◴[] No.44732317{5}[source]
>We hate reading code and will avoid the hassle every time, but that doesn't mean it is easy.

Speak for yourself. I love reading code! It's hard and it takes a lot of energy, but if you hate it, maybe you should find something else to do.

Being a programmer who hates reading code is like being a bus driver who hates looking at the road: dangerous and menacing to the public and your customers.

23. giantrobot ◴[] No.44734358[source]
> AI is also a problem because disposable code is what you would assign to junior programmers in order for them to learn.

It's also giving PHBs the ability to hand ill-conceived ideas to a magic robot, receive "code" they can't understand, and throw it into production. All the while firing what real developers they had on staff.

replies(1): >>44739322 #
24. yencabulator ◴[] No.44739188{4}[source]
This sounds like a recipe for destructive bugs and security vulnerabilities to slip into production.

Reviewing is really hard to do well. Like, on a psychological level. Your brain just starts nodding and humming along, pretending to understand. Humans have to consciously "perform review" to actually review; see, for example, https://en.wikipedia.org/wiki/Pointing_and_calling, checklists in aviation and health care, and Tom Gilb's JPL-inspired "Inspection" process for spec reviews.

Even HN gets a steady drip of "look at my vibecoded project" -- "umm, you just leaked your API keys".
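
Even a crude scan of the project tree would catch most of those before publishing. A sketch (the patterns here are illustrative, not exhaustive):

    import pathlib, re

    # A few well-known credential shapes; illustrative, not exhaustive.
    PATTERNS = [
        re.compile(r"AKIA[0-9A-Z]{16}"),       # AWS access key ID
        re.compile(r"ghp_[0-9A-Za-z]{36}"),    # GitHub personal access token
        re.compile(r"sk-[0-9A-Za-z_-]{20,}"),  # common "sk-" style API keys
    ]

    for path in pathlib.Path(".").rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for pat in PATTERNS:
            for hit in pat.findall(text):
                print(f"{path}: possible leaked credential {hit[:8]}...")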

It's just that reviewing doesn't matter for a space invaders clone.

replies(1): >>44739536 #
25. yencabulator ◴[] No.44739322{3}[source]
I expect many of those companies to fail in the 3mo-2y timeline, so in many ways I welcome PHBs to embrace their full stupidity. Same for the people who funded them.

I do feel semi-sorry for anyone who paid for the services by those companies, though. Maybe something good will arise from that too, in the end; for example, it'd be nice if US society taught more critical reading skills to its members.

The interesting game for the non-PHBs among us is figuring out if/how we can use LLMs in less risky ways, and what is possible there. For example, I'd love to see work put into LLMs helping with the formal correctness of software; there's a hard backstop there where either the proof checks or it doesn't. Code changes needed to enable less-painful proofs would hopefully be largely refactorings, where reviews should be easier, and it might even work out to fuzz-test that the old and new implementations return matching output for the same input. Or similarly, an LLM-powered test-coverage improver that only writes new tests (old-school/branch-based/mutation-based; there's plenty of room there).
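
The refactoring case in particular needs very little machinery. A minimal sketch of the differential check I mean, with toy stand-ins for the old and new implementations:

    import random

    def old_impl(xs):
        # Stand-in for the pre-refactor code.
        total = 0
        for x in xs:
            total += x
        return total

    def new_impl(xs):
        # Stand-in for the LLM-refactored version under review.
        return sum(xs)

    # Same random inputs into both; outputs must match exactly.
    rng = random.Random(0)  # fixed seed so any failure is reproducible
    for _ in range(10_000):
        xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        assert old_impl(xs) == new_impl(xs), f"mismatch on {xs!r}"
    print("old and new agree on 10,000 random inputs")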

26. simonw ◴[] No.44739536{5}[source]
Reviewing isn't nearly as hard if you told the model exactly what to write already: https://simonwillison.net/2025/Mar/11/using-llms-for-code/#t...
27. Breza ◴[] No.44758905{4}[source]
That's a good note. I use DeepSeek for early planning of a project because of how valuable its reasoning output can be. It's common that I'll describe my problem and first draft architecture and see something in the output like "Since this has to be mobile optimized..." Then I'll stop generation, edit the original prompt to specify that I don't have to worry about mobile, and run it again.