←back to thread

287 points govideo | 3 comments | | HN request time: 0.666s | source

I have a domain that is not live. As expected, loading the domain returns: Error 1016.

However...I have a subdomain with a not obvious name, like: userfileupload.sampledomain.com

This subdomain IS LIVE but has NOT been publicized/posted anywhere. It's a custom URL for authenticated users to upload media with presigned url to my Cloudflare r2 bucket.

I am using CloudFlare for my DNS.

How did the internet find my subdomain? Some sample user agents are: "Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko) Version/5.1 Safari/534.20.8", "Mozilla/5.0 (Linux; Android 9; Redmi Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.89 Mobile Safari/537.36",

The bots are GET requests which are failing, as designed, but I'm wondering how the bots even knew the subdomain existed?!

Show context
paxys ◴[] No.43287654[source]
Not sure why everyone is going on about certificate transparency logs when the answer is right there in the user agent. The company is scanning the ipv4 space and came upon your IP and port.
replies(6): >>43287671 #>>43287702 #>>43287703 #>>43287895 #>>43287976 #>>43288126 #
p0w3n3d ◴[] No.43288126[source]
Finding IP does not mean finding the domain. When doing HTTP request to IP you specify the domain you want to connect to. For example you can configure your /etc/hosts to have xxxnakedhamsters.google.com pointing to 8.8.8.8 and make the http request, which will cause Google getting the domain request (i.e. header Host: xxxnakedhamsters.google.com) and it will refuse it or try to redirect to http. Of course it's only related to HTTP because HTTPS will require certificate. That's why they're speaking about certificates.
replies(4): >>43288228 #>>43288802 #>>43289275 #>>43292054 #
lewiscollard ◴[] No.43288802[source]
Depending on the web server's configuration, you very much _can_ find the domain which is configured on an IP address, by attempting to connect to that IP address via HTTPS and seeing what certificate gets served. Here's an example:

https://138.68.161.203/

> Web sites prove their identity via certificates. Firefox does not trust this site because it uses a certificate that is not valid for 138.68.161.203. The certificate is only valid for the following names: exhaust.lewiscollard.com, www.exhaust.lewiscollard.com

replies(1): >>43289108 #
jchw ◴[] No.43289108[source]
I don't think that does you any good for Cloudflare, though. They will definitely be using SNI.
replies(2): >>43289333 #>>43296431 #
kelnos ◴[] No.43289333[source]
That doesn't really matter, though. While OP is using Cloudflare, the actual server behind it is still a publicly-accessible IP address that an IPv4 space scanner can easily stumble upon.
replies(1): >>43289570 #
1. jchw ◴[] No.43289570[source]
I misunderstood, I thought the subdomain was an R2 bucket. If it's just normal Cloudflare proxying to some backend this is probably the most likely answer.

That said, while I think it's not the case here, using Cloudflare doesn't mean the underlying host is accessible, as even on the free tier you can use Cloudflare Tunnels, which I often do.

replies(1): >>43296440 #
2. ratg13 ◴[] No.43296440[source]
they only state they are using cloudflare for DNS, they didn't say if they were proxying the connection
replies(1): >>43297617 #
3. jchw ◴[] No.43297617[source]
Also a valid point. I guess without more details all we can really do is speculate about the exact setup. That said, I do now agree that the most likely answer is "the underlying host was accessible and caught by an IPv4 scanner" since well, that's pretty much what it says anyway.