"Non-consensually", as if you had to ask for permission to perform a GET request to an open HTTP server.
Yes, I know about weev. That was a travesty.
"Non-consensually", as if you had to ask for permission to perform a GET request to an open HTTP server.
Yes, I know about weev. That was a travesty.
robots.txt is a polite request to please not scrape these pages because it's probably not going to be productive. It was never meant to be a binding agreement, otherwise there would be a stricter protocol around it.
It's kind of like leaving a note for the deliveryman saying please don't leave packages on the porch. It's fine for low stakes situations, but if package security is of utmost importance to you, you should arrange to get it certified or to pick it up at the delivery center. Likewise if enforcing a rule of no scraping is of utmost importance you need to require an API token or some other form of authentication before you serve the pages.
"Taking" versus "giving" is neither here nor there for this discussion. The question is are you expressing a preference on etiquette versus a hard rule that must be followed. I personally believe robots.txt is the former, and I say that as someone who serves more pages than they scrape
No one is calling for the criminalization of door-to-door sales and no one is worried about how much door-to-door sales increases water consumption.
There are some states where the considerations for self defense do not include a duty to retreat if possible, either in general (“stand your ground" law) or specifically in the home (“castle doctrine"), but all the other requirements (imminent threat of certain kinds of serious harm, proportional force) for self-defense remain part of the law in those states, and trespassing by/while disregarding a ”no soliciting” would not, by itself, satisfy those requirements.