159 points jonasnelle | 1 comments | 20 Nov 24 20:22 UTC

Hey HN, we're Alexi and Jonas, the co-founders of Autotab (https://autotab.com). Autotab is a Chrome-based browser you can teach to do complex tasks, with a simple API for running them from your app or backend.

Here is a walkthrough of how it works: https://youtu.be/63co74JHy1k, and you can try it for free at https://autotab.com by downloading the app.

Why a dedicated editor?

The number one blocker we've found in building more flexible, agentic automations is performance quality, BY FAR (https://www.langchain.com/stateofaiagents#barriers-and-chall...). For all the talk of cost, latency, and safety, the fact is most people are still just struggling to get agents to work. The keys to solving reliability are better models, yes, but also intent specification. Even humans don't zero-shot these tasks from a prompt. They need to be shown how to perform them, and then they refine their understanding through questions and feedback over time. It is also quite difficult to formulate complete requirements on the spot from memory.

The editor makes it easy to build the specification up as you step through your workflow, while generating successful task trajectories for the model. This is the only way we've been able to get the reliability we need for production use cases.

But why build a browser?

Autotab started as a Chrome extension (with a Show HN post! https://news.ycombinator.com/item?id=37943931). As we iterated with users, we realized that we needed to focus on creating the control surface for intent specification, and that being stuck in a Chrome side panel wasn't going to work. We also knew that we needed a level of control for the model that we couldn't get without owning the browser. In Autotab, the browser becomes a canvas on which the user and the model take turns showing and explaining the task.

Key features:

1. Self-healing automations that don't break when sites change

2. Dedicated authoring tool that builds memory for the model while defining steps for the automation

3. Control flows and deep configurability to keep automations on track, even when navigating complex reasoning tasks

4. Works with any website (no site-specific APIs needed)

5. Runs securely in the cloud or locally

6. Simple REST API + client libraries for Python, Node

We'd love to get any early feedback from the HN community, ideas for where you'd like the product to go, or experiences in this space. We will be in the comments for the next few hours to respond!


pacifi30 ◴[20 Nov 24 22:43 UTC] No.42198908[source]▶

>>42197741 (OP) #

Pretty slick. I recorded a session for ordering from a restaurant website, and it did repeat the entire workflow. It had some issues when a modal popped up, but all in all well done! We have been trying to robotify the task of ordering from restaurants for our clients, and it seems like your solution could work well for us. I am guessing that you want your users to use the Autotab browser, so what is the use for the API?

replies(2): >>42198946 #>>42198960 #

jonasnelle ◴[20 Nov 24 22:49 UTC] No.42198946[source]▶

>>42198908 #

Thanks! We think of the browser as an authoring tool where you create, test and refine skills.

After you've done that, the API is great for cases where you want to incorporate Autotab into a larger data flow or product.

For instance, say Company A has taught Autotab to migrate their customers' data, so their customers just see a sync button in the Company A product, which kicks off an Autotab run via API. Same for restaurant booking, if you'd want that to happen programmatically.
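A rough sketch of what the sync-button case could look like on Company A's backend. Everything specific here is an assumption for illustration: the base URL, the `/skills/{id}/runs` route, the `inputs` payload shape, and the skill id are hypothetical placeholders, not Autotab's actual API. The function only builds the request, so nothing is sent.

```python
import json
import urllib.request

# Hypothetical base URL and route, NOT Autotab's real API.
API_BASE = "https://api.example.com/v1"


def build_run_request(skill_id: str, inputs: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would start one run of a skill."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/skills/{skill_id}/runs",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def on_sync_clicked(customer_id: str, api_key: str) -> urllib.request.Request:
    """Handler behind the sync button: one run of the migration skill per customer."""
    # "migrate-customer-data" is a made-up skill id for this example.
    return build_run_request("migrate-customer-data", {"customer_id": customer_id}, api_key)
```

Passing the returned request to `urllib.request.urlopen` would send it. The real endpoint, authentication scheme, and payload fields would need to come from the API docs.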

replies(2): >>42199080 #>>42199368 #

pacifi30 ◴[20 Nov 24 23:06 UTC] No.42199080{3}[source]▶

>>42198946 #

Understood! How does it work if we have several different restaurants to order from? Do I need to record each ordering session and create skills for each restaurant, or can it infer the flow on its own given the task to order from a restaurant? Secondly, are there any docs or samples showing how to integrate this with your API?

replies(1): >>42199147 #

jonasnelle ◴[20 Nov 24 23:17 UTC] No.42199147{4}[source]▶

>>42199080 #

Depends on how different the flows are for different restaurants. If they're just different names but use the same booking system, you'd typically use an input and have Autotab find the correct restaurant first. If they're totally different booking systems, you can try the instruct (open-ended agentic) step, but my guess is that will be too slow and unreliable for now, so you'd probably want to record different skills for each.

Docs are here with sample code: https://docs.autotab.com/api-reference
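On the caller's side, the split described above (one skill with an input for restaurants on a shared booking system, separate recorded skills otherwise) could be sketched roughly like this. The skill ids, booking-system names, and input field names are all hypothetical and only show the shape of the routing. They do not come from Autotab's docs.

```python
from dataclasses import dataclass

# Hypothetical skill ids: one recorded skill per distinct booking system.
SKILL_BY_BOOKING_SYSTEM = {
    "shared-booking-system": "order-via-shared-system",
    "custom-site-a": "order-custom-site-a",
}


@dataclass(frozen=True)
class Restaurant:
    name: str
    booking_system: str


def plan_run(restaurant: Restaurant, order: list[str]) -> tuple[str, dict]:
    """Pick the recorded skill for the restaurant's booking system and build its inputs."""
    try:
        skill_id = SKILL_BY_BOOKING_SYSTEM[restaurant.booking_system]
    except KeyError:
        # No recorded skill yet: record one rather than guessing.
        raise LookupError(
            f"no recorded skill for booking system {restaurant.booking_system!r}"
        ) from None
    # Restaurants on the same system share a skill and differ only by input.
    return skill_id, {"restaurant_name": restaurant.name, "order": order}
```

Two restaurants on the same booking system resolve to the same skill id with different `restaurant_name` inputs. An unknown booking system raises `LookupError`, which is the signal to record a new skill for it.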


Show HN: Autotab – Programmable AI browser for turning web tasks into APIs