
179 points | anerli | 1 comment | 25 Apr 25 17:00 UTC

Hey HN, Anders and Tom here - we’ve been building an end-to-end testing framework powered by visual LLM agents to replace traditional web testing.

We know there's a lot of noise about different browser agents. If you've tried any of them, you know they're slow, expensive, and inconsistent. That's why we built an agent specifically for running test cases and optimized it just for that:

- Pure vision instead of an error-prone "set-of-marks" system (the colorful boxes you see in browser-use, for example)

- Use a tiny VLM (Moondream) instead of OpenAI/Anthropic computer use for dramatically faster and cheaper execution

- Use two agents: one for planning and adapting test cases and one for executing them quickly and consistently.

The idea is that the planner builds up a general plan, which the executor runs. We can save this plan and re-run it with only the executor for quick, cheap, and consistent runs. When something goes wrong, the executor can kick back out to the planner agent, which re-adjusts the test.
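
To make that loop concrete, here is a minimal sketch of the plan / execute / re-plan flow. It is not Magnitude's actual API: `Step`, `Plan`, `PlanRunner`, and the `planner` / `executor` callables are hypothetical names chosen for illustration, and restarting the whole test after a re-plan is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One concrete action for the executor, e.g. "click 'Sign in'"."""
    description: str


@dataclass
class Plan:
    """Ordered steps the planner produced for one test case."""
    steps: List[Step] = field(default_factory=list)


class PlanRunner:
    """Hypothetical runner: replays a saved plan with the cheap executor
    and only calls the (slow, expensive) planner when a step fails."""

    def __init__(
        self,
        planner: Callable[[str, Optional[Step]], Plan],
        executor: Callable[[Step], bool],
    ) -> None:
        self.planner = planner
        self.executor = executor
        self.cache: Dict[str, Plan] = {}  # saved plans, keyed by test case
        self.planner_calls = 0            # counts how often we paid for planning

    def _plan(self, test_case: str, failed_step: Optional[Step] = None) -> Plan:
        # Ask the planner for a fresh plan and save it for later runs.
        self.planner_calls += 1
        plan = self.planner(test_case, failed_step)
        self.cache[test_case] = plan
        return plan

    def run(self, test_case: str, max_replans: int = 1) -> bool:
        plan = self.cache.get(test_case)
        if plan is None:
            plan = self._plan(test_case)
        replans = 0
        while True:
            # Execute steps in order; stop at the first one that fails.
            failed = next((s for s in plan.steps if not self.executor(s)), None)
            if failed is None:
                return True
            if replans >= max_replans:
                return False
            # Kick back out to the planner with the failing step as context.
            # Assumption: the adjusted plan is re-run from the start.
            plan = self._plan(test_case, failed)
            replans += 1
```

With a stable UI, the second and later calls to `run` never touch the planner. Once a step stops matching the page, a single planner call replaces the saved plan and subsequent runs are executor-only again.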

It’s completely open source. Would love to have more people try it out and tell us how we can make it great.

Repo: https://github.com/magnitudedev/magnitude


pandemic_region [25 Apr 25 19:46 UTC] No.43797848

>>43796003 (OP) #

Bang me sideways, "AI-native" is a thing now? What does that even mean?

replies(3): >>43797886 #>>43797967 #>>43797972 #

1. mcbuilder [25 Apr 25 20:01 UTC] No.43797967

>>43797848 #

It definitely means something. On first hearing it, I'd guess it means an app designed around being interacted with by an LLM. Browser interaction is one of those things that is a great killer app for LLMs IMO.

For instance, I just discovered there are a ton of high-quality scans of film and slides available on the Library of Congress website, but I don't really enjoy their interface. I could build a scraping tool and get too much info, or suffer and just click through their search UI. Or I could ask my browser-tool-wielding LLM agent to automate the boring stuff and provide a map of the subjects I would be interested in, and give me a different way to discover things. I've just discovered the entire browser automation thing, and I'm having fun having my LLM go "research" for a few minutes while I go do something else.


Show HN: Magnitude – open-source, AI-native test framework for web apps