
376 points meetpateltech | 1 comment
ianbutler ◴[] No.44007168[source]
I'm super curious to see how this actually does at finding significant bugs. We've been working in this space on https://www.bismuth.sh for a while, and one of the things we're focused on is deep validation of the code being output.

There are so many of these "vibe coding" tools, and there has to be real engineering rigor at some point. I saw them demo "find the bug", but the bugs they found were pretty superficial, and that's something we've seen in our internal benchmark from both Devin and Cursor: a lot of noise, false positives, and superficial fixes.

replies(1): >>44007263 #
1. ◴[] No.44007263[source]