
Cloudflare.com's Robots.txt

(www.cloudflare.com)
145 points | sans_souse | 1 comment
yapyap No.42164094
That’s cool, if any scrapers would still respect the robots.txt, that is.
replies(4): >>42164168 #>>42165000 #>>42165017 #>>42165663 #
1. marginalia_nu No.42165017
They may or may not, though respecting robots.txt is a nice way of not having your IP range end up on blacklists. With Cloudflare in particular, that can be a bit of a pain.

They're pretty nice to deal with if you're upfront about what you are doing and clearly identify your bot, as well as register it with their bot detection. There's a form floating around somewhere for that.
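
For anyone wondering what "respecting robots.txt and clearly identifying your bot" can look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the contact URL, and the sample rules are all made up for illustration; this is not Cloudflare's actual robots.txt or anything from their registration process.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical bot identity: a product token plus a contact URL, so site
# operators can tell who is crawling. Both values are placeholders.
USER_AGENT = "ExampleBot/1.0 (+https://example.com/bot-info)"

# Invented sample rules for illustration only.
SAMPLE_ROBOTS = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str,
              user_agent: str = USER_AGENT) -> bool:
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS)
print(may_fetch(parser, "https://example.com/index.html"))
print(may_fetch(parser, "https://example.com/private/data"))
```

In a real crawler you would presumably download the site's robots.txt first (for example with `RobotFileParser.set_url()` and `read()`), call `may_fetch` before every request, and send the same `USER_AGENT` string in the request headers so the identity you check against the rules is the one the site actually sees.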