
Using LLMs at Oxide

(rfd.shared.oxide.computer)
694 points steveklabnik | 6 comments
mcqueenjordan ◴[] No.46178624[source]
As usual with Oxide's RFDs, I found myself vigorously head-nodding while reading. Somewhat rarely, I found a part that I found myself disagreeing with:

> Unlike prose, however (which really should be handed in a polished form to an LLM to maximize the LLM’s efficacy), LLMs can be quite effective writing code de novo.

Don't the same arguments against using LLMs to write one's prose also apply to code? Was this structure of the code and ideas within the engineers'? Or was it from the LLM? And so on.

Before I'm misunderstood as an LLM minimalist, I want to say that I think they're incredibly good at solving blank page syndrome -- just getting a starting point on the page is useful. But the code you actually want to ship is so far from what LLMs write that I think of them more as a crutch for blank page syndrome than as "good at writing code de novo".

I'm open to being wrong and want to hear any discussion on the matter. My worry is that this is another one of the "illusion of progress" traps, similar to the one that currently fools people with the prose side of things.

replies(9): >>46178640 #>>46178642 #>>46178818 #>>46179080 #>>46179150 #>>46179217 #>>46179552 #>>46180049 #>>46180734 #
lukasb ◴[] No.46178642[source]
One difference is that clichéd prose is bad and clichéd code is generally good.
replies(1): >>46178649 #
joshka ◴[] No.46178649[source]
Depends on what your prose is for. If it's for documentation, then prose which matches the expected tone and form of other similar docs would be clichéd from this perspective. I think this is a really good use of LLMs - making docs consistent across a large library / codebase.
replies(3): >>46178656 #>>46178665 #>>46178668 #
minimaxir ◴[] No.46178656[source]
I have been testing agentic coding with Claude 4.5 Opus, and the problem is that it's too good at documentation and test cases: it's thorough in a way that goes out of scope, so I have to edit it down to raise the signal-to-noise ratio.
replies(2): >>46178944 #>>46179605 #
1. girvo ◴[] No.46178944{3}[source]
The “change capture”/straitjacket-style tests LLMs like to output drive me nuts. But humans write those all the time too, so I shouldn’t be that surprised either!
replies(1): >>46180074 #
2. mulmboy ◴[] No.46180074[source]
What do these look like?
replies(1): >>46180900 #
3. pmg101 ◴[] No.46180900[source]

  1. Take every single function, even private ones.
  2. Mock every argument and collaborator.
  3. Call the function.
  4. Assert the mocks were called in the expected way.
These tests help you find inadvertent changes, yes, but they also create constant noise about changes you intend.
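The four steps above can be sketched concretely. This is a minimal hypothetical example (the `PriceService`, `get_price`, and `apply` names are invented for illustration) showing the "mock every collaborator, then assert on the mocks" style the parent comment describes:

```python
import unittest
from unittest import mock


class PriceService:
    """Hypothetical class under test."""

    def __init__(self, repo, tax):
        self.repo = repo
        self.tax = tax

    def total(self, item_id):
        price = self.repo.get_price(item_id)
        return self.tax.apply(price)


class TestPriceServiceMockStyle(unittest.TestCase):
    """The 'mock the world, then test your mocks' pattern."""

    def test_total_calls_collaborators(self):
        # Step 2: mock every collaborator.
        repo = mock.Mock()
        tax = mock.Mock()
        repo.get_price.return_value = 100
        tax.apply.return_value = 110

        # Step 3: call the function.
        service = PriceService(repo, tax)
        result = service.total("sku-1")

        # Step 4: assert the mocks were called in the expected way.
        # These assertions pin down the implementation, not the behaviour,
        # so any internal refactor breaks them even if `total` still works.
        repo.get_price.assert_called_once_with("sku-1")
        tax.apply.assert_called_once_with(100)
        self.assertEqual(result, 110)
```

Note that if `total` were refactored to, say, batch its repository lookups, this test would fail even though the observable result is unchanged - which is exactly the "constant noise about changes you intend".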
replies(3): >>46182913 #>>46183812 #>>46185551 #
4. ornornor ◴[] No.46182913{3}[source]
Juniors on one of the teams I work with only write this kind of test. It’s tiring, and I have to tell them to test the behaviour, not the implementation. And yet every time they do the same thing. Or rather, their AI IDE spits these out.
5. senbrow ◴[] No.46183812{3}[source]
These tests also break encapsulation in many cases because they're not testing the interface contract, they're testing the implementation.
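For contrast, a behaviour-focused version of the same kind of test exercises the interface contract through a simple fake rather than asserting on mock calls. This is a sketch with invented names (`PriceService`, `FakeRepo`), not anyone's actual code:

```python
import unittest


class PriceService:
    """Hypothetical class under test (a minimal sketch)."""

    def __init__(self, repo, tax_rate):
        self.repo = repo
        self.tax_rate = tax_rate

    def total(self, item_id):
        return self.repo.get_price(item_id) * (1 + self.tax_rate)


class FakeRepo:
    """A simple fake standing in for a real price store."""

    def __init__(self, prices):
        self.prices = prices

    def get_price(self, item_id):
        return self.prices[item_id]


class TestPriceServiceBehaviour(unittest.TestCase):
    def test_total_includes_tax(self):
        service = PriceService(FakeRepo({"sku-1": 100.0}), tax_rate=0.10)
        # Assert only the observable result; internal refactors
        # (caching, batching, renamed helpers) won't break this.
        self.assertAlmostEqual(service.total("sku-1"), 110.0)
```

Because nothing here asserts *how* `total` arrives at its answer, the test respects encapsulation: it pins down the contract, not the call sequence.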
6. girvo ◴[] No.46185551{3}[source]
You beat me to it, and yep these are exactly it.

“Mock the world, then test your mocks”: after nearly two decades of doing this professionally, I’m simply not convinced these have any value at all.