
Using LLMs at Oxide

(rfd.shared.oxide.computer)
694 points | steveklabnik | 1 comment
john01dav ◴[] No.46178567
> Wherever LLM-generated code is used, it becomes the responsibility of the engineer. As part of this process of taking responsibility, self-review becomes essential: LLM-generated code should not be reviewed by others if the responsible engineer has not themselves reviewed it. Moreover, once in the loop of peer review, generation should more or less be removed: if code review comments are addressed by wholesale re-generation, iterative review becomes impossible.

My general procedure for using an LLM to write code, which is in the spirit of what is advocated here, is:

1) First, feed the existing relevant code into the LLM. This is usually just a few source files from a larger project.

2) Describe what I want to do, either giving an architecture or letting the LLM generate one. I tell it not to write any code at this point.

3) Let it talk through the plan, and make sure that I like it. I converse with it to address any deficiencies I see, and I almost always find some.

4) I then tell it to generate the code.

5) I skim and test the code to see whether it is generally correct, and have it make corrections as needed.

6) Closely read the entire generated artifact at this point, and make manual corrections (occasionally automated corrections like "replace all C-style casts with the appropriate C++-style casts", followed by a review of the diff; see the sketch just below this list).
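
To make that cast cleanup in step 6 concrete, here is a minimal, hypothetical C++ illustration (ratio_before and ratio_after are invented names, not code from the thread):

    #include <cassert>

    // Before: C-style casts, as generated code often contains them.
    double ratio_before(int done, int total) {
        return (double)done / (double)total;
    }

    // After: the equivalent C++-style casts, the kind of mechanical
    // correction described in step 6, applied and then reviewed as a diff.
    double ratio_after(int done, int total) {
        return static_cast<double>(done) / static_cast<double>(total);
    }

    int main() {
        // Behavior is unchanged; only the cast style differs.
        assert(ratio_before(3, 4) == ratio_after(3, 4));
        return 0;
    }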

The hardest part for me is #6, where I feel a strong emotional bias against doing it, since at that point I am not yet aware of any errors that would compel such a close read.

This allows me to operate at a higher level of abstraction (architecture) and removes the drudgery of turning an architectural idea into written, precise code. But in doing so, you are abandoning those details to a non-deterministic system. This is different from, for example, using a compiler or a higher-level VM language: with those tools, you can understand how they work, quickly form a good idea of what you are going to get, and rely on robust assurances. Understanding LLMs helps, but not to the same degree.

1. qudat ◴[] No.46181793
Insert before step 4: have it generate tests that fail, review them, then have it implement the code and make sure the tests pass (see the sketch at the end of this comment).

Insert before that: have it create tasks with beads and force it to let you review each task before it marks it complete.
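
As a rough sketch of the "failing tests first" step (hypothetical C++ with plain asserts; parse_port and its expected behavior are invented for illustration, not from the thread): write and review the expectations against a deliberately failing stub, then have the LLM replace the stub and rerun until the asserts pass.

    #include <cassert>
    #include <optional>
    #include <string>

    // Stub for a hypothetical parse_port(): deliberately unimplemented so the
    // tests below fail first. The LLM's job is to replace this body and make
    // the asserts pass.
    std::optional<int> parse_port(const std::string& s) {
        (void)s;
        return std::nullopt;
    }

    int main() {
        // Review these expectations before letting the LLM implement parse_port().
        assert(parse_port("8080") == std::optional<int>(8080));
        assert(!parse_port("not-a-port").has_value());
        assert(!parse_port("70000").has_value());  // out of TCP port range
        return 0;
    }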