(github.com)

425 points sfarshid | 3 comments | 24 Aug 25 16:18 UTC | source

1. giantg2 ◴[24 Aug 25 16:35 UTC] No.45005590[source]▶

There's a lot of "it kind of worked" in here.

If we actually want stuff that works, we need to come up with a new process. If we get "almost" good code from a single invocation, we're just going to get a lot of almost-good code from a loop. What we likely need is a Cucumber-esque format with example tables for requirements that we can distill an AI to use. It would build the tests first and then build the code to pass those tests.
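To make the example-table idea concrete, here is a minimal sketch in plain Python rather than actual Cucumber. Everything in it is hypothetical and invented for illustration: the `EXAMPLES` table, the `order_total` function, and the `parse_examples` and `failing_rows` helpers. The table is the requirement, and the checker only reports which rows the code under test does not satisfy yet.

```python
from typing import Callable

# Hypothetical requirement, written as a Gherkin-style example table.
EXAMPLES = """
| price | quantity | total |
| 10    | 2        | 20    |
| 5     | 0        | 0     |
| 3     | 7        | 21    |
"""


def parse_examples(table: str) -> list[dict[str, int]]:
    """Turn a pipe-delimited table into one dict per data row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table.strip().splitlines()
    ]
    header, *body = rows
    return [dict(zip(header, map(int, row))) for row in body]


def order_total(price: int, quantity: int) -> int:
    """Hypothetical code under test (the part an agent would write)."""
    return price * quantity


def failing_rows(fn: Callable[[int, int], int], table: str) -> list[dict[str, int]]:
    """Return every example row whose expected total the function misses."""
    return [
        row
        for row in parse_examples(table)
        if fn(row["price"], row["quantity"]) != row["total"]
    ]


if __name__ == "__main__":
    print(failing_rows(order_total, EXAMPLES))
```

An empty list from `failing_rows` would be the signal that the code meets every example, which gives a loop a concrete stopping condition instead of "it kind of worked".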

replies(1): >>45005602 #

2. ghuntley ◴[24 Aug 25 16:36 UTC] No.45005602[source]▶

>>45005590 (TP) #

Strangely enough, TLA+ and other formal proofs work very well for driving Ralph.

replies(1): >>45005709 #

3. giantg2 ◴[24 Aug 25 16:48 UTC] No.45005709[source]▶

>>45005602 #

I would consider that expected rather than strange. The thing blocking adoption is that most devs/people find those formal languages difficult or boring. That's even true of things like Cucumber: it's boring, and most organizations care little for robust QA.

We put a coding agent in a while loop