
207 points todsacerdoti | 2 comments | HN request time: 0.494s | source
alganet ◴[] No.46007432[source]
> "Take a look at those tests!"

A math module that is not tested for division by zero. Classical LLM development.

The suite is mostly happy paths, which is consistent with what I've seen LLMs do.

Once you set up coverage and tell it "there's a hidden branch on line 95 that the report can't display, and we need to cover it", things get less fun.
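The division-by-zero gap above can be sketched in a few lines. This is a minimal illustration, not the actual module from the thread; `safe_div` is a hypothetical helper:

```python
# Hypothetical math helper; the zero branch is the one
# happy-path-only suites tend to skip.
def safe_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b

# Happy path -- what LLM-generated suites usually cover:
assert safe_div(10, 2) == 5.0

# Unhappy path -- the branch that goes untested:
try:
    safe_div(1, 0)
    raised = False
except ZeroDivisionError:
    raised = True
assert raised
```

Without the second test, line coverage can look fine while the error branch is never executed.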

replies(1): >>46007852 #
1. mjaniczek ◴[] No.46007852[source]
It's entirely happy paths right now; it would be best to allow the test runner to also test for failures (check expected stderr and return code), then we could write those missing tests.
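The failure-mode checking described here can be sketched with Python's `subprocess`. This is an illustrative stand-in, not FAWK's actual test runner; the child command is a placeholder:

```python
import subprocess
import sys

# Run a child process that is expected to fail, then assert on both
# its exit code and its stderr -- the two things the test runner
# would need to check for non-happy-path tests.
result = subprocess.run(
    [sys.executable, "-c",
     "import sys; sys.stderr.write('boom\\n'); sys.exit(2)"],
    capture_output=True,
    text=True,
)

assert result.returncode == 2
assert "boom" in result.stderr
```

A runner that records expected stderr and return code per test case could compare them the same way.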

I think you can find a test somewhere in there with commented-out code saying "FAWK can't do this yet, but yadda yadda yadda".

replies(1): >>46008087 #
2. alganet ◴[] No.46008087[source]
It's funny because I'm evaluating LLMs for just this specific case (covering tests) right now, and it does that a lot.

I say "we need 100% coverage on that critical file". It runs for a while, tries to cover it, fails, then stops and says "Success! We covered 60% of the file (the rest is too hard). I added a comment." That 60% was the coverage before the LLM ran.