
385 points meetpateltech | 9 comments
1. nadis ◴[] No.44008123[source]
In the preview video, I appreciated Katy Shi's comment on "I think this is a reflection of where engineering work has moved over the past where a lot of my time now is spent reviewing code rather than writing it."

Preview video from Open AI: https://www.youtube.com/watch?v=hhdpnbfH6NU&t=878s

As I think about what "AI-native" development, or just the future of building software, looks like, it's interesting to me that - right now - developers are still just reading code and tests rather than looking at simulations.

While a new(ish) concept for software development, simulations could provide a wider range of outcomes and, especially for the front end, are far easier to evaluate than code/tests alone. I'm biased because this is something I've been exploring, but it really hit me over the head looking at the Codex launch materials.

replies(2): >>44008199 #>>44010123 #
2. ai-christianson ◴[] No.44008199[source]
> rather than looking at simulations

You mean like automated test suites?

replies(1): >>44008290 #
3. tough ◴[] No.44008290[source]
automated visual fuzzy-testing with some self-reinforcement loops

There are already libraries for QA testing, and VLMs can give critique on a series of screenshots automated by a Playwright script per branch
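
A minimal sketch of that loop, assuming a Playwright-captured screenshot fed to a local vision model served through the Ollama HTTP API (the model name, endpoint, and critique prompt here are all illustrative assumptions, not a specific tool's API):

```python
# Sketch: send a branch's screenshot to a local VLM for visual critique.
# Assumes an Ollama server at localhost:11434 with a vision model pulled.
import base64
import json
import urllib.request

def build_critique_payload(model: str, prompt: str, png_bytes: bytes) -> dict:
    """Build an Ollama /api/generate request; the screenshot goes in the
    `images` field as base64, and `stream: False` asks for one JSON reply."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,
    }

def critique_screenshot(png_bytes: bytes, model: str = "llava") -> str:
    payload = build_critique_payload(
        model,
        "You are a QA reviewer. List any visual defects on this page.",
        png_bytes,
    )
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Screenshot capture would come from a Playwright script, roughly:
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p:
#       page = p.chromium.launch().new_page()
#       page.goto("http://localhost:3000")
#       print(critique_screenshot(page.screenshot()))
```

Running this once per branch and diffing the critiques against the base branch is the self-reinforcement loop part.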

replies(1): >>44008539 #
4. ai-christianson ◴[] No.44008539{3}[source]
Cool. Putting vision in the loop is a great idea.

Ambitious idea, but I like it.

replies(2): >>44008641 #>>44009970 #
5. tough ◴[] No.44008641{4}[source]
SmolVLM, Gemma, LLaVA, in case you wanna play with some of the ones I've tried.

https://huggingface.co/blog/smolvlm

Recently both llama.cpp and Ollama got better support for them too, which makes this kind of integration with local/self-hosted models more attainable and less expensive

replies(1): >>44008693 #
6. tough ◴[] No.44008693{5}[source]
also this for the visual regression testing parts, but you can add some AI onto the mix ;) https://github.com/lost-pixel/lost-pixel
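
The core mechanism behind visual regression tools like that is roughly the following (a bare sketch of the idea, not lost-pixel's actual API - real tools add anti-aliasing tolerance and perceptual diffing on top):

```python
# Sketch: compare a baseline screenshot to a candidate pixel by pixel and
# fail the check when the fraction of changed pixels exceeds a threshold.
# Pixels are modeled as (r, g, b) tuples in a flat list for simplicity.
def diff_ratio(baseline: list, candidate: list, tolerance: int = 0) -> float:
    """Fraction of pixels whose per-channel difference exceeds `tolerance`."""
    if len(baseline) != len(candidate):
        raise ValueError("screenshots must have the same dimensions")
    changed = sum(
        1
        for a, b in zip(baseline, candidate)
        if any(abs(x - y) > tolerance for x, y in zip(a, b))
    )
    return changed / len(baseline)

def passes_regression(baseline: list, candidate: list,
                      max_ratio: float = 0.01) -> bool:
    """Pass when at most `max_ratio` of pixels differ from the baseline."""
    return diff_ratio(baseline, candidate) <= max_ratio
```

The AI layer then only has to look at the screenshots that fail this cheap pixel gate.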
7. ericghildyal ◴[] No.44009970{4}[source]
I used Cline to build a tiny testing helper app and this is exactly what it did!

It made changes in TS/Next.js given just the boilerplate from create-next-app, ran `yarn dev`, then opened its mini LLM browser and navigated to localhost to verify everything looked correct.

It found one mistake, fixed the issue, then ran `yarn dev` again, opened a new browser, navigated to localhost (pointing at the original server it brought up, not the new one at another port), and confirmed the change was correct.

I was very impressed, but still laughed at how it somehow backed its way into a flow that worked, but only because Next has hot-reloading.

8. fosterfriends ◴[] No.44010123[source]
++ Kind of my whole thesis with Graphite. As more code gets AI-generated, the weight shifts to review, testing, and integration. Even as someone helping build AI code reviewers, we'll _need_ humans stamping forever - for many reasons, but fundamentally for accountability. A computer can never be held accountable.

https://constelisvoss.com/pages/a-computer-can-never-be-held...

replies(1): >>44010360 #
9. hintymad ◴[] No.44010360[source]
> A computer can never be held accountable

I think the issue is not about humans being entirely replaced. Instead, the issue is that if AI replaces enough knowledge workers while there's no new or expanded market to absorb the workforce, the new balance of supply and demand will mean that many of us will see suppressed pay or, worse, lose our jobs for good.