Show HN: I made a Chrome extension that can automate any website

(browserflow.app)

707 points namukang | 1 comments | 17 Nov 21 15:16 UTC

menthe ◴[18 Nov 21 04:26 UTC] No.29261972[source]▶

As a web scraper, I'll say that because he is hooking into the browser like a debugger or remotely controlled browser, just as Puppeteer would, he is instantly detected by bot management solutions such as Cloudflare, PerimeterX, and Datadome, and will get consistently banned on page reload on literally any site that cares about bots.

He'd be better off running some JavaScript on the page instead (à la Tampermonkey, but it can be done really nicely with some server-served TypeScript) to scrape the pages stealthily and perform actions.

replies(4): >>29262248 #>>29262765 #>>29262768 #>>29263957 #

Siira ◴[18 Nov 21 10:23 UTC] No.29263957[source]▶

>>29261972 #

Can you provide any guides on this? How will the server run the JS on their page automatically?

replies(1): >>29264238 #

menthe ◴[18 Nov 21 11:19 UTC] No.29264238[source]▶

>>29263957 #

The easiest approach is to use an extension like Tampermonkey, which can load (and reload) “scripts” from a web server. There are a few project templates on GitHub with TypeScript + webpack (e.g. https://github.com/xiaomingTang/template-ts-tampermonkey). You can automate with any of your favorite TypeScript libs, from the comfort of your IDE, with hot reload included. Pretty nifty, and projects can quickly get pretty big that way! I usually have one “script” that has broad permissions (e.g. all sites) with some form of router at the root of the code that branches to the different sites to evaluate.
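
To make the "router at the root" idea concrete, here is a rough sketch of how such a dispatch could look. This is only an illustration under assumptions: the hostnames, the `routes` table, and the handler bodies are hypothetical placeholders, not code from the linked template or from any real project. The handlers return strings so the routing logic can be checked outside a browser; real ones would touch the DOM instead.

```typescript
// Hypothetical per-site handlers; names and hostnames are illustrative only.
type Handler = (url: URL) => string;

interface Route {
  // Matches this hostname exactly, or any subdomain of it.
  host: string;
  handler: Handler;
}

const routes: Route[] = [
  { host: "example.com", handler: (url) => `scrape listing at ${url.pathname}` },
  { host: "example.org", handler: () => "fill in form" },
];

function matchesHost(hostname: string, host: string): boolean {
  return hostname === host || hostname.endsWith(`.${host}`);
}

// Returns the matching handler's result, or null when no route matches.
function dispatch(href: string, table: Route[] = routes): string | null {
  const url = new URL(href);
  const route = table.find((r) => matchesHost(url.hostname, r.host));
  return route ? route.handler(url) : null;
}

// Inside a userscript the page URL comes from the browser;
// outside a browser (no global location) this does nothing.
const loc = (globalThis as { location?: { href: string } }).location;
if (loc) {
  console.log(dispatch(loc.href));
}
```

Keeping the route table as plain data means adding a site is one more entry, and the matching logic stays in a single place as the project grows.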

replies(1): >>29264865 #

Siira ◴[18 Nov 21 12:55 UTC] No.29264865[source]▶

>>29264238 #

Thanks!

From what I understand, this is only useful for doing scrapes manually by launching the target URL in a GUI Chrome instance? Or can this somehow work on a headless server? (I don't understand how one can automate this.)
