←back to thread

1480 points sandslash | 5 comments | | HN request time: 1.157s | source
Show context
blobbers ◴[] No.44316344[source]
Software 3.0 is the code generated by the machine, not the prompts that generated it. The prompts don't even yield the same output; there is randomness.

The new software world is the massive amount of code that will be burped out by these agents, and it should quickly dwarf the human output.

replies(4): >>44316389 #>>44318479 #>>44318855 #>>44322374 #
1. pelagicAustral ◴[] No.44316389[source]
I think that if you give the same task to three different developers you'll get three different implementations. It's not a random result if you do get the functionality that was expected, and at that, I do think the prompt plays an important role in offering a view of how the result was achieved.
replies(2): >>44317798 #>>44329445 #
2. klabb3 ◴[] No.44317798[source]
> I think that if you give the same task to three different developers you'll get three different implementations.

Yes, but if you want them to be compatible you need to define a protocol and conformance test suite. This is way more work than writing a single implementation.

The code is the real spec. Every piece of unintentional non-determinism can be a hazard. That’s why you want the code to be the unit of maintenance, not a prompt.

replies(1): >>44318372 #
3. imiric ◴[] No.44318372[source]
I know! Let's encode the spec into a format that doesn't have the ambiguities of natural language.
replies(1): >>44319386 #
4. klabb3 ◴[] No.44319386{3}[source]
Right. Great idea. Maybe call it ”formal execution spec for LLM reference” or something. It could even be versioned in some kind of distributed merkle tree.
5. blobbers ◴[] No.44329445[source]
Interestingly, I was generating some scraping code today. I prompted it fairly generically and it decided to spit out some Selenium code. I reported the stack trace failure, and it gave me a new script with playwright. That failed also and it gave me some suggestions to fix it. I asked it to update the whole script rather than snippets, and it responded with "Hey let's not use either of these and here we'll use the site's API." and proceeded to do that.

Kind of crazy, it basically found 3 different hammers to hit the nail I wanted. The API unfortunately seems to be timeing out (I had to add the timeout=10 to the post u_u)