
707 points namukang | 1 comment
moritonal ◴[] No.29257791[source]
Whilst nice, how is this going to handle the changing nature of the web? It's nice that it detects "lists" and such, but a few changes to CSS are going to trash that automation, right?

I'm also fairly sure you'll break (either directly, or on a user's behalf) a few EULAs that specifically ban scraping.

replies(2): >>29258424 #>>29260327 #
kreeben ◴[] No.29258424[source]
Didn't this case [0] set a precedent that "scraping is not against the law" regardless of EULA?

[0] https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn

replies(4): >>29258469 #>>29258707 #>>29259962 #>>29260430 #
1. detuur ◴[] No.29259962[source]
This might be true in the USA, but the EU has a thing called database rights[0]. Essentially, any collection of data can, under certain circumstances, be protected under database rights, which prevent other parties from copying (parts of) it. This was originally created to protect such things as phone books and other directories, but when I was a student (I don't remember the context anymore), we were specifically warned that scraping certain websites would violate their database rights, and thus be illegal. So using scrapers in the EU is something you should be very careful with, especially if your business depends on it.

[0] https://en.wikipedia.org/wiki/Database_right