
Cloudflare.com's Robots.txt

(www.cloudflare.com)
145 points sans_souse | 6 comments
1. jsheard ◴[] No.42164090[source]
This is what happens if your robot isn't nice:

  > curl -I -H "User-Agent: Googlebot" https://www.cloudflare.com
  HTTP/2 403
replies(1): >>42164220 #
2. jamesog ◴[] No.42164220[source]
That's not from robots.txt but from their Bot Management feature, which blocks requests that identify themselves as Googlebot but don't come from known Google IPs.
replies(1): >>42164616 #
3. speedgoose ◴[] No.42164616[source]
Are GCP IPs considered Google IPs?
replies(3): >>42164648 #>>42164657 #>>42165651 #
4. crop_rotation ◴[] No.42164648{3}[source]
No, I am very sure they are not.
5. jgrahamc ◴[] No.42164657{3}[source]
No.
6. judge2020 ◴[] No.42165651{3}[source]
For reference: https://developers.google.com/search/docs/crawling-indexing/...
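
To make the check discussed in this thread concrete, here is a rough sketch of the idea: a request whose User-Agent claims to be Googlebot is only trusted if its source IP falls inside a list of known crawler ranges. This is not Cloudflare's actual implementation. The function names and the 403/200 decision are hypothetical, and the ranges are documentation-reserved placeholder blocks, not real Google crawler ranges.

```python
import ipaddress

# ASSUMPTION: placeholder ranges for illustration only (reserved documentation
# blocks). A real check would load the crawler ranges published by the operator.
KNOWN_CRAWLER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def claims_googlebot(user_agent: str) -> bool:
    """Return True if the User-Agent string mentions Googlebot."""
    return "googlebot" in user_agent.lower()


def ip_in_known_ranges(ip: str, ranges=KNOWN_CRAWLER_RANGES) -> bool:
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network and vice versa.
    return any(addr in net for net in ranges)


def decide(user_agent: str, ip: str) -> int:
    """Hypothetical policy: reject Googlebot claims from unknown addresses."""
    if claims_googlebot(user_agent) and not ip_in_known_ranges(ip):
        return 403
    return 200


if __name__ == "__main__":
    print(decide("Googlebot", "203.0.113.7"))  # claim from an unlisted address
    print(decide("Googlebot", "192.0.2.10"))   # claim from a listed address
    print(decide("curl/8.0", "203.0.113.7"))   # no Googlebot claim at all
```

Under this sketch, the curl request at the top of the thread would be rejected because the claim and the source address don't match, regardless of what robots.txt says.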