
454 points | positiveblue | 10 comments
impure ◴[] No.45066528[source]
Well, if you have a better way to solve this that’s open I’m all ears. But what Cloudflare is doing is solving the real problem of AI bots. We’ve tried to solve this problem with IP blocking and user agents, but they do not work. And this is actually how other similar problems have been solved. Certificate authorities aren’t open and yet they work just fine. Attestation providers are also not open and they work just fine.
replies(6): >>45066914 #>>45067091 #>>45067829 #>>45072492 #>>45072740 #>>45072778 #
1. Voultapher ◴[] No.45072778[source]
> Well, if you have a better way to solve this that’s open I’m all ears.

Regulation.

Make it illegal to request the content of a webpage by crawler if a website operator doesn't explicitly allow it via robots.txt. Institute a government agency that is tasked with enforcement. If you as a website operator can show that traffic came from bots, you can open a complaint with the government agency and they take care of shaking painful fines out of the offending companies. Force cloud hosts to keep books on who was using what IP addresses. Will it be a 100% fix? No. Will it have a massive chilling effect if done well? Absolutely.
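
To make the opt-in rule concrete, here is a rough sketch of what "explicitly allowed via robots.txt" could mean in code. As far as I know, ordinary robots.txt handling (for example Python's `urllib.robotparser`) is default-allow: a path with no matching rule counts as permitted. The hypothetical helper `explicitly_allowed` below flips that, so a crawler is only cleared when an `Allow` rule actually covers the path. It is a simplification under stated assumptions: plain prefix matching only, no wildcards, and rules from the `*` group are merged with the crawler's own group instead of picking the most specific group. The bot names are made up.

```python
def explicitly_allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Opt-in check: True only if an Allow rule for `agent` (or '*') covers
    `path` and no longer Disallow rule overrides it. Hypothetical helper."""
    agents = []        # user-agents named by the group currently being read
    in_rules = False   # True once the current group has had a rule line
    best_allow = -1    # length of the longest matching Allow prefix
    best_disallow = -1 # length of the longest matching Disallow prefix
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            applies = "*" in agents or agent.lower() in agents
            # An empty rule value matches nothing, so it is skipped.
            if applies and value and path.startswith(value):
                if field == "allow":
                    best_allow = max(best_allow, len(value))
                else:
                    best_disallow = max(best_disallow, len(value))
    # Longest match wins; on a tie the Allow rule wins.
    return best_allow >= 0 and best_allow >= best_disallow


ROBOTS = """
User-agent: GoodBot
Allow: /public/
Disallow: /public/private/

User-agent: *
Disallow: /
"""

good_public = explicitly_allowed(ROBOTS, "GoodBot", "/public/page.html")
good_private = explicitly_allowed(ROBOTS, "GoodBot", "/public/private/x")
other_public = explicitly_allowed(ROBOTS, "OtherBot", "/public/page.html")
no_file = explicitly_allowed("", "GoodBot", "/")
```

Under this reading, a missing or empty robots.txt grants nothing, which is the opposite of today's convention and the whole point of the proposal.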

replies(4): >>45072849 #>>45073524 #>>45075933 #>>45078127 #
2. akoboldfrying ◴[] No.45073524[source]
The biggest issue right now seems to be people renting their residential IP addresses to scraper companies, who then distribute large scrapes across these mostly distinct IPs. These addresses are from all over the world, not just your own country, so we'll either need a World Government, or at least massive intergovernmental cooperation, for regulation to help.
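
A small sketch of why this defeats the usual defense. It assumes a plain per-IP sliding-window limiter (the class name `PerIpRateLimiter`, the limit of 10 requests per 60 seconds, and the addresses are all made up for illustration, not anyone's real setup). The same 1000 requests are sent once from a single address and once spread over 1000 distinct addresses.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most `max_requests` per IP within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def count_allowed(limiter: PerIpRateLimiter, ips: list[str]) -> int:
    # For simplicity, every request arrives at the same instant (t = 0).
    return sum(limiter.allow(ip, now=0.0) for ip in ips)


# One scraper hammering from a single address (placeholder IP).
single_ip = ["203.0.113.7"] * 1000
# The same 1000 requests spread over 1000 distinct placeholder addresses.
many_ips = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]

single = count_allowed(PerIpRateLimiter(10, 60.0), single_ip)
spread = count_allowed(PerIpRateLimiter(10, 60.0), many_ips)
print(single, spread)
```

With one address the limiter cuts the scrape off after 10 requests. Spread across distinct addresses, no single IP ever reaches the limit, so every request gets through.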
replies(1): >>45073607 #
3. Voultapher ◴[] No.45073607[source]
I don't think we need a world government to make progress on that point.

The companies buying these services are buying them from other companies. Countries or larger blocs like the EU can exert significant pressure on such companies by declaring the use of such services illegal when interacting with websites hosted in the country or bloc, or run by companies based there.

replies(1): >>45073970 #
4. akoboldfrying ◴[] No.45073970{3}[source]
It just seems too easy to skirt around via middlemen. The EU (say) could prosecute an EU company directly doing this residential scraping, and it could probably keep tabs on a handful of bank accounts of known bad actors in other countries, and then investigate and prosecute EU companies transferring money to them. But how do you stop an EU company paying a Moldovan company (that has existed for 10 days) for "internet services", that pays a Brazilian company, that pays a Russian company to do the actual residential scraping? And then there's all the crypto channels and other quid pro quo payment possibilities.
replies(1): >>45075414 #
5. Voultapher ◴[] No.45075414{4}[source]
Genuinely, this isn't a tech-specific or even novel problem. There is plenty of prior art when it comes to inhibiting unwanted behavior.

> But how do you stop an EU company paying a Moldovan company (that has existed for 10 days) for "internet services", that pays a Brazilian company, that pays a Russian company to do the actual residential scraping?

The same example could be made with money laundering, and yes, it's a real and sizable issue. Yet the majority of money is not laundered. How does the EU company make sure it will not be held liable, especially the people who made the decision? Maybe on a technical level the perfect crime is possible, and not getting caught is possible or even likely given a certain approach. But the uncertainty around it will dissuade many, if not all. The same goes for companies selling the services: you might think you have a foolproof way to circumvent the measures put in place, but what if you don't and the government comes knocking?

replies(1): >>45079156 #
6. jlarocco ◴[] No.45075933[source]
I'm not anti-government, but a technical solution that eliminates the problem is infinitely better than regulating around it.

The internet is too big and distributed to regulate. Nobody will agree on what the rules should be, and certain groups or countries will disagree in any case and refuse to enforce them.

Existing regulation rarely works, and enforcement is half-assed, at best. Ransomware is regulated and illegal, but we see articles about major companies infected all the time.

I don't think registering with Cloudflare is the answer, but regulation definitely isn't the answer.

replies(1): >>45091650 #
7. zimmund ◴[] No.45078127[source]
> Institute a government agency that is tasked with enforcement.

You're forgetting about the first W in WWW...

replies(1): >>45091669 #
8. akoboldfrying ◴[] No.45079156{5}[source]
Your money laundering analogy is apt. I know very little about that topic, and I especially don't know how much money laundering is really out there (nor do governments), but I'm confident that a lot is. Do AML laws have a chilling effect on it? I think they must, since they surely increase the cost and risk, and similar legislation for scraping should have a similar effect. But AML is a pretty bad solution to money laundering, and I despair if AML-for-scraping is the best possible solution to scraping.
9. account42 ◴[] No.45091650[source]
The problem is that a technical solution is impossible.
10. account42 ◴[] No.45091669[source]
So what you're saying is that if I were to host a BitTorrent tracker in Sweden then the US can't do anything about it?