
577 points | simonw | 3 comments
righthand
Did you understand the implementation or just that it produced a result?

I would hope an LLM could spit out a cobbled-together answer to a common interview question.

Today a colleague presented data changes and used an LLM to build an app to display the JSON for the presentation. Why did they not just pipe the JSON into our already working app that displays this data?

People around me for the most part are using LLMs to enhance their presentations, not to actually implement anything useful. I have been watching my coworkers use them that way for months.

Another example? A different coworker wanted to build a document macro to perform bulk updates on courseware content - swapping old words for new ones. To build the macro, they first wrote a rubric to prompt an LLM correctly inside of a Word doc.

That filled-in rubric was then used to generate a program template for the macro. To define the requirements for the macro, the coworker then used a slideshow slide to list bullet points of functionality - in this case, to Find+Replace words in courseware slides/documents using a list of words from another text document. Due to the complexity of the system, I can’t believe my colleague saved any time. The presentation was interesting, though, and that is what they got compliments on.

However, the solutions are absolutely useless to anyone but the implementer.
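
For a sense of scale, here is roughly what a plain script doing the same bulk find-and-replace could look like. This is just a sketch that fills in details the story above doesn't specify (a folder of .docx files, a two-column CSV of old/new word pairs, the python-docx package), so treat it as illustrative rather than what my coworker actually needed:

    # Illustrative sketch: bulk find-and-replace across a folder of .docx files,
    # driven by a CSV of "old,new" word pairs. Assumes python-docx is installed.
    import csv
    from pathlib import Path

    from docx import Document

    def load_word_pairs(csv_path):
        """Read (old, new) pairs from a two-column CSV file."""
        with open(csv_path, newline="") as f:
            return [(row[0], row[1]) for row in csv.reader(f) if len(row) >= 2]

    def replace_in_docx(path, pairs):
        """Apply every old -> new replacement to each run in the document."""
        doc = Document(path)
        for paragraph in doc.paragraphs:
            for run in paragraph.runs:
                for old, new in pairs:
                    if old in run.text:
                        run.text = run.text.replace(old, new)
        # Note: words split across runs are missed by this naive pass.
        doc.save(path)

    pairs = load_word_pairs("word_list.csv")
    for docx_path in Path("courseware").glob("*.docx"):
        replace_in_docx(str(docx_path), pairs)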

simonw
I scanned the code and understood what it was doing, but I didn't spend much time on it once I'd seen that it worked.

If I'm writing code for production systems using LLMs, I still review every single line - my personal rule is that I need to be able to explain how it works to someone else before I'm willing to commit it.

I wrote a whole lot more about my approach to using LLMs to help write "real" code here: https://simonwillison.net/2025/Mar/11/using-llms-for-code/

1. photon_lines
This is why I love using the DeepSeek chain-of-reasoning output ... I can actually go through and read what it's 'thinking' to validate whether it's basing its solution on valid facts / assumptions. Either way, thanks for all of your valuable write-ups on these models - I really appreciate them, Simon!
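
If you're calling the API rather than using the chat UI, the reasoning trace comes back as its own field, so it's easy to log and skim before trusting the answer. A minimal sketch (assuming DeepSeek's OpenAI-compatible endpoint, the deepseek-reasoner model, and the documented reasoning_content field - adjust if those change):

    # Sketch: fetch DeepSeek's reasoning trace separately from the final answer,
    # so the "thinking" can be read and sanity-checked before trusting the result.
    # Assumes the OpenAI-compatible DeepSeek endpoint and the deepseek-reasoner model.
    from openai import OpenAI

    client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

    response = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[{"role": "user", "content": "Is 1013 prime? Answer briefly."}],
    )

    message = response.choices[0].message
    print("=== reasoning ===")
    print(message.reasoning_content)  # the chain-of-reasoning trace to review
    print("=== answer ===")
    print(message.content)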
2. vessenes
Nota bene - there is a fair amount of research indicating that a model's outputs and ‘thoughts’ do not necessarily align with its chain-of-reasoning output.

You can validate this pretty easily by asking some logic or coding questions: you will likely notice that the final output is not necessarily the logical conclusion of the end of the thinking; sometimes it is significantly orthogonal to it, or the model returns to reasoning in the middle of the answer.

All that to say - it's a good idea to read it, but stay vigilant about the outputs.

3. Breza
That's a good note. I use DeepSeek for the early planning of a project because of how valuable its reasoning output can be. It's common that I'll describe my problem and first-draft architecture and see something in the output like "Since this has to be mobile-optimized..." Then I'll stop generation, edit the original prompt to specify that I don't have to worry about mobile, and run it again.