Show HN: Open source alternative to Perplexity Comet

1. layer8 ◴[10 Jul 25 19:02 UTC] No.44524345[source]▶

I would prefer this as a browser extension, not as its own browser application.

replies(4): >>44524916 #>>44527052 #>>44529634 #>>44530097 #

2. felarof ◴[10 Jul 25 20:04 UTC] No.44524916[source]▶

We would've preferred to build this as a browser extension too.

But we strongly believe that building a good agent co-pilot requires a bunch of changes at the Chromium C++ code level. For example, Chromium has an accessibility tree for every website, but doesn't expose it as an API to Chrome extensions. Having access to the accessibility tree would greatly improve agent execution.

We are also building a bunch of changes in C++ for agents to interact with websites -- functions like click, and elements with indexes. You can inject JS to do this, but it is 20-40X slower.
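
To make the "elements with indexes" idea concrete, here is a minimal sketch of numbering the interactive nodes of an accessibility-style tree so an agent can refer to them by index ("click 2"). This is not BrowserOS or Chromium code: `AXNode`, `INTERACTIVE_ROLES`, and `index_interactive` are hypothetical names, and the tree is a hand-built stand-in for what a browser would supply.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Roles treated as interactive in this sketch (an assumption, not Chromium's list).
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


@dataclass
class AXNode:
    """Hypothetical, simplified stand-in for one accessibility-tree node."""
    role: str
    name: str = ""
    children: list[AXNode] = field(default_factory=list)


def index_interactive(root: AXNode) -> dict[int, AXNode]:
    """Walk the tree depth-first and give every interactive node an index."""
    indexed: dict[int, AXNode] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.role in INTERACTIVE_ROLES:
            indexed[len(indexed)] = node
        # Push children in reverse so they are visited in document order.
        stack.extend(reversed(node.children))
    return indexed


def describe(indexed: dict[int, AXNode]) -> list[str]:
    """Render the compact listing an agent could be shown instead of raw HTML."""
    return [f"[{i}] {node.role}: {node.name}" for i, node in indexed.items()]


page = AXNode("document", "Example", [
    AXNode("heading", "Sign in"),
    AXNode("textbox", "Email"),
    AXNode("group", "", [AXNode("checkbox", "Remember me")]),
    AXNode("button", "Submit"),
])
elements = index_interactive(page)
lines = describe(elements)
print("\n".join(lines))
```

The agent then only has to emit an index, and the browser side maps it back to the node to act on. Whether that mapping runs in C++ or in injected JS is the performance question raised above.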

replies(4): >>44525458 #>>44526251 #>>44527064 #>>44530334 #

3. esafak ◴[10 Jul 25 20:50 UTC] No.44525458[source]▶

>>44524916 #

Could you upstream that change in order to make it an extension in the future? I think people would not value it any less.

replies(1): >>44525486 #

4. felarof ◴[10 Jul 25 20:54 UTC] No.44525486{3}[source]▶

>>44525458 #

We don't mind upstreaming. But I don't think Google Chrome/Chromium wants to expose it as an API to Chrome extensions; otherwise they would've done this a long time ago.

From Google's perspective, extensions are meant to be lightweight applications with restricted access.

replies(1): >>44526840 #

5. layer8 ◴[10 Jul 25 22:12 UTC] No.44526251[source]▶

>>44524916 #

Would this be possible for Firefox?

replies(1): >>44526919 #

6. jazzyjackson ◴[10 Jul 25 23:26 UTC] No.44526840{4}[source]▶

>>44525486 #

I'm not really interested in AI agents for my web browser, but it would be pretty cool to see a fork of Chromium available that, aside from being de-googled, relaxes all the "restricted access" to make it more fun to modify and customize the way you guys are. Just a thought: there may be more of a market for the framework than for the product :)

See Sciter. A very cool, super lightweight alternative to Electron, but unfortunately it seems like a single developer project and I could never get any of the examples to run.

https://sciter.com/

replies(3): >>44526902 #>>44530171 #>>44530495 #

7. felarof ◴[10 Jul 25 23:36 UTC] No.44526902{5}[source]▶

>>44526840 #

Yes, we want to do this too! We'll expose much richer APIs.

What use-cases do you have in mind? Like scraping?

8. felarof ◴[10 Jul 25 23:39 UTC] No.44526919{3}[source]▶

>>44526251 #

IIRC, Firefox's web extension API does not provide access to the accessibility tree either.

replies(1): >>44527501 #

9. arjunchint ◴[11 Jul 25 00:02 UTC] No.44527052[source]▶

>>44524345 (TP) #

We had this exact thought as well: you don't need a whole browser to implement the agentic capabilities. You can implement the whole thing with the limited permissions of a browser extension.

Google immediately rolls out plenty of zero-day exploit patches, not to mention all the other features that Google doesn't push to Chromium. I wouldn't trust a random open source project for my day-to-day browser.

Check out rtrvr.ai for a working implementation. We are an AI Web Agent browser extension that meets you where your workflows already are.

replies(2): >>44527600 #>>44529016 #

10. arjunchint ◴[11 Jul 25 00:04 UTC] No.44527064[source]▶

>>44524916 #

I mean you could build the agent with a first-principles understanding of the DOM instead of just hacking something together with the accessibility tree.

11. dotancohen ◴[11 Jul 25 01:17 UTC] No.44527501{4}[source]▶

>>44526919 #

I'm not GP, but I agree that if your goal is to empower the end user and protect him from corporate overlords, then Firefox is a more logical choice to fork from.

replies(2): >>44527891 #>>44528989 #

12. felarof ◴[11 Jul 25 01:35 UTC] No.44527600[source]▶

>>44527052 #

Brave Browser (70M+ users) has validated that a Chromium fork can be a viable path. And it can in fact provide better privacy and security.

A Chrome extension is not a bad idea either. Just saying that owning the underlying source code has some strong advantages in the long term (being able to use C++ for the a11y tree, DOM handling, etc. -- which will be 20-40X faster than injecting JS from a Chrome extension).

replies(1): >>44529464 #

13. axus ◴[11 Jul 25 02:28 UTC] No.44527891{5}[source]▶

>>44527501 #

I wonder if someone on the Chromium team will upstream all these BrowserOS changes, or go "Not Invented Here" and re-implement it all for Gemini / Google Assistant.

Can't imagine Firefox acquiring a Firefox fork!

replies(1): >>44528314 #

14. dotancohen ◴[11 Jul 25 04:10 UTC] No.44528314{6}[source]▶

>>44527891 #

The hooks needed for integration will probably be implemented in Chrome, not Chromium. Nobody else should have them.

15. esperent ◴[11 Jul 25 06:35 UTC] No.44528989{5}[source]▶

>>44527501 #

Isn't the Firefox code notoriously hard to fork and work with? I'm sure that nearly all of these Chrome forks would prefer to fork Firefox, but there's a reason they don't.

replies(1): >>44530889 #

16. esperent ◴[11 Jul 25 06:39 UTC] No.44529016[source]▶

>>44527052 #

> I wouldn't trust a random open source project for my day-to-day browser.

Given that you're working on a direct competitor, this comment reads as fearmongering, designed to drive people over to your product.

replies(2): >>44529489 #>>44530206 #

17. arjunchint ◴[11 Jul 25 07:57 UTC] No.44529464{3}[source]▶

>>44527600 #

Honestly excited to see the benchmark result and comparison!

Our benchmark results [https://www.rtrvr.ai/blog/web-bench-results] show that we are 7x faster than browser-use, so I'm curious to see if your claims live up to the hype.

18. arjunchint ◴[11 Jul 25 08:01 UTC] No.44529489{3}[source]▶

>>44529016 #

I personally talked to another agentic browser player in the space, fellou.ai, and asked how they are keeping up with all the Chromium pushes, since you need a dedicated team to handle the merges. They flat out told me they are targeting tech enthusiasts who are not as interested in the security of their browser.

As an ex-Google engineer, I know the immense engineering effort and infrastructure setup needed to develop Chrome. It is very implausible that two people can handle all the effort to serve a secure browser with 15+ million lines of constantly changing C++ code.

A sandboxed browser extension is the natural form factor for these agentic capabilities.

replies(1): >>44534388 #

19. casslin ◴[11 Jul 25 08:27 UTC] No.44529634[source]▶

>>44524345 (TP) #

try https://github.com/nanobrowser/nanobrowser

20. Imustaskforhelp ◴[11 Jul 25 09:26 UTC] No.44530097[source]▶

>>44524345 (TP) #

Exactly my thoughts when I saw nanobrowser being mentioned here.

21. Imustaskforhelp ◴[11 Jul 25 09:35 UTC] No.44530171{5}[source]▶

>>44526840 #

I always wonder what sort of JS engine such projects use, since at the end of the day, imo, it is all just a dance b/w the JS engine, HTML and CSS. HTML and CSS feel like a mostly solved problem; the hard part is the JS engine.

Sciter uses QuickJS, and I just checked: it's about 35-36x slower than V8 with JIT.

Another interesting rabbit hole: I found Duktape in the QuickJS benchmarks, and I saw https://blogcpp.org/ listed as one of the projects using Duktape, but I can't even find the project on GitHub. We really need some better way of preserving open source stuff, I guess.

replies(2): >>44538439 #>>44538945 #

22. Imustaskforhelp ◴[11 Jul 25 09:38 UTC] No.44530206{3}[source]▶

>>44529016 #

Conflict of interest at its heart.

I have no skin in the game, but there are people who are using Dia (The Browser Company), and Dia is closed source, so it would be nice to see those people jump to BrowserOS at least.

I personally would prefer it as an extension, but as the author of BrowserOS noted, extensions have some limitations. I just wish Google/Chromium would push those changes upstream, I guess.

replies(1): >>44534464 #

23. tunnelfx ◴[11 Jul 25 09:58 UTC] No.44530334[source]▶

>>44524916 #

How is that accessibility tree different from the “accessibility snapshot” that you can get from Playwright for example?

I was tackling a similar problem a few weeks ago and I found that Playwright MCP was the most usable solution in my case. It doesn't use an extension; instead it debugs the browser tabs (I guess using the DevTools protocol). But I agree the experience was suboptimal.

24. nicoburns ◴[11 Jul 25 10:22 UTC] No.44530495{5}[source]▶

>>44526840 #

I'm making a new sciter-style webview https://github.com/DioxusLabs/blitz

25. layer8 ◴[11 Jul 25 11:22 UTC] No.44530889{6}[source]▶

>>44528989 #

The reason they usually don't is for better compatibility with more web pages.

26. felarof ◴[11 Jul 25 16:52 UTC] No.44534388{4}[source]▶

>>44529489 #

Also, ex-Google engineer here :) Rtrvr looks like a great product too!

Definitely understand that keeping up with security patches is important. But this is an engineering challenge and not implausible to do -- Perplexity is 1/1000th the size of Google and they could build a better product. So, "you can just do things".

We are still on day 1 of launch. We will only get better from here. And we won't be 2 people forever. We plan to hire, expand the team and take on the engineering challenges.

27. felarof ◴[11 Jul 25 16:57 UTC] No.44534464{4}[source]▶

>>44530206 #

Thank you for your support! Yes, we want to do some cool things that we can't do as an extension.

C++ APIs for the DOM tree and a11y. We eventually want to ship a small fine-tuned LLM and package it with the browser binary too.

Just getting started!

28. mdaniel ◴[12 Jul 25 01:12 UTC] No.44538439{6}[source]▶

>>44530171 #

> We really need some better way of preserving open source stuff I guess

Not some rando's blog engine in C++, or other kinds of stupid throw-away code

Anyway <https://web.archive.org/web/20241122030659/https://github.co...> -> $(fossil clone https://code.rosaelefanten.org/blogcpp) which takes a stunning amount of time but then reports

    Round-trips: 15   Artifacts sent: 0  received: 2751
    Clone done, wire bytes sent: 4127  received: 124543126  remote: 2a03:4000:34:5e::1
    Rebuilding repository meta-data...
      100.0% complete...
    Extra delta compression... none found
    Vacuuming the database...
    project-id: 40a055cb170ae83c46b4ed9bf3b6a60e6e541aa0
    server-id:  cee9059305219c887fd29c677cbafb372252518a
    admin-user: mdaniel (password is "x2hEAaXUDj")
    opening the new ./blogcpp.fossil repository in directory ./blogcpp...
    3rdparty/ConfigParser/ConfigParser.cpp
    ...
    project-name: BlogC++
    repository:   /home/runner/blogcpp.fossil
    project-code: 40a055cb170ae83c46b4ed9bf3b6a60e6e541aa0
    checkout:     fbea390316bc3aace7de0a9ccdba90ecc1949a10 2024-12-17 22:03:16 UTC
    parent:       9e604d205e7922ef1af87952fe1bebef0cbac336 2022-01-15 02:22:33 UTC
    tags:         trunk
    comment:      fuck you github (user: Cthulhux)
    check-ins:    20

29. jazzyjackson ◴[12 Jul 25 03:03 UTC] No.44538945{6}[source]▶

>>44530171 #

Good callout wrt being slower than JIT. Ofc for certain applications it's not a showstopper, e.g., if you're not using JavaScript for your MVC but doing more of a progressive enhancement thing.

CSS2 is closer to trivial, but CSS3 is practically a 3D game engine with all of its matrix transforms, transitions, animations, variables - not to mention all the different layout schemes (Sciter blogged about introducing display:flex and display:grid two months ago)

The most interesting part of Sciter to me is that data persistence goes way beyond localStorage (string key: string value) or filesystem API, instead it's DyBase [0][1] behind the scenes, which looks to be a very intriguing style of storing trees of data in the host language's datatype (including whatever classes you define) without mucking about with the leaky abstractions of an ORM.

[0] https://github.com/c-smile/sciter-js-sdk/blob/main/docs/md/s...

[1] http://www.garret.ru/dybase.html
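
To illustrate the contrast with a string-only store, here is a small Python sketch. It does not use DyBase or Sciter's storage API: the standard-library `shelve` module stands in as an assumed analogue of an object store, and `Folder` is a hypothetical application class.

```python
import json
import shelve
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Folder:
    """Hypothetical application class holding a small tree of data."""
    name: str
    notes: list[str] = field(default_factory=list)
    children: list["Folder"] = field(default_factory=list)


tree = Folder("root", ["todo"], [Folder("work", ["ship v1"])])

# 1) String-only store (localStorage-like): you serialize to text yourself,
#    and what comes back is plain dicts, not Folder objects.
string_store: dict[str, str] = {}
string_store["tree"] = json.dumps(asdict(tree))
loaded_plain = json.loads(string_store["tree"])

# 2) Object store: shelve pickles the value, so the class instances
#    (including nested children) come back as they were stored.
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "store")
    with shelve.open(path) as db:
        db["tree"] = tree
    with shelve.open(path) as db:
        loaded_obj = db["tree"]

print(type(loaded_plain).__name__, type(loaded_obj).__name__)
```

In the first case the application has to rebuild its classes from the parsed dictionaries; in the second the store hands back the same shape of object that went in, which is the property the comment above finds appealing.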