
287 points govideo | 6 comments

I have a domain that is not live. As expected, loading the domain returns: Error 1016.

However, I have a subdomain with a non-obvious name, like: userfileupload.sampledomain.com

This subdomain IS LIVE but has NOT been publicized or posted anywhere. It's a custom URL where authenticated users upload media to my Cloudflare R2 bucket via presigned URLs.

I am using Cloudflare for my DNS.

How did the internet find my subdomain? Some sample user agents are: "Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko) Version/5.1 Safari/534.20.8", "Mozilla/5.0 (Linux; Android 9; Redmi Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.89 Mobile Safari/537.36".

The bots are sending GET requests, which fail as designed, but I'm wondering how the bots even knew the subdomain existed?!

Kikawala ◴[] No.43285768[source]
Is it available under HTTPS? Then it's probably in a Certificate Transparency log.
replies(2): >>43285819 #>>43286390 #
1. govideo ◴[] No.43285819[source]
Yes, HTTPS via Cloudflare's automatic HTTPS. Thanks for the info.
replies(3): >>43286059 #>>43286190 #>>43290843 #
2. thisisgvrt ◴[] No.43286059[source]
Automated agents can tail the certificate log to discover new domains as the certs are issued. But if you want to explore subdomains manually, https://crt.sh/ is a nice tool.
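
For the manual route, a rough sketch of pulling the logged names for a domain out of crt.sh. This assumes crt.sh's JSON output (the `output=json` query parameter) and its `name_value` field, which can hold several newline-separated names; the function names here are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def names_from_crtsh(entries):
    """Collect the unique hostnames found in a list of crt.sh JSON entries."""
    names = set()
    for entry in entries:
        # Assumption: name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return sorted(names)


def fetch_crtsh(domain, timeout=30):
    """Ask crt.sh for certificates issued for subdomains of `domain`."""
    # "%" is the wildcard in crt.sh searches; quote() encodes it as %25.
    query = urllib.parse.quote(f"%.{domain}")
    url = f"https://crt.sh/?q={query}&output=json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Something like `names_from_crtsh(fetch_crtsh("sampledomain.com"))` would then list every name that has appeared in a logged cert, which is roughly what the scanners are doing in bulk.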
3. snailmailman ◴[] No.43286190[source]
Yeah, this is a surprisingly little-known fact: all certs being logged means all subdomain names get logged.

Wildcard certs can hide the subdomains, but then your cert works on all subdomains. This could be an issue if the certs get compromised.

Usually there isn’t sensitive information in subdomain names, but I suspect it often accidentally leaks information about infrastructure setups. "vaultwarden.example.com" existing tells you someone is probably running a vaultwarden instance, even if it’s not publicly accessible.

The same kind of info can leak via dns records too, I think?
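
To illustrate the wildcard trade-off, here is a simplified sketch of leftmost-label wildcard matching. It is not a full certificate validator, and `wildcard_matches` is a hypothetical helper; it only shows that a single `*.example.com` entry covers every first-level subdomain, so the log reveals none of their names.

```python
def wildcard_matches(pattern, hostname):
    """Simplified check: does a cert name such as '*.example.com' cover hostname?"""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        # No wildcard: the names must be identical.
        return pattern == hostname
    suffix = pattern[1:]  # e.g. ".example.com"
    if not hostname.endswith(suffix):
        return False
    leftmost = hostname[: -len(suffix)]
    # The wildcard stands for exactly one non-empty label.
    return leftmost != "" and "." not in leftmost
```

So `vaultwarden.example.com` is covered by `*.example.com` without ever being written to a log, but the same cert is also valid for any other first-level subdomain, which is the compromise risk mentioned above.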

replies(2): >>43286425 #>>43291923 #
4. tialaramex ◴[] No.43286425[source]
> The same kind of info can leak via dns records too, I think?

That's correct. "Passive DNS" is sold by many large public DNS providers. They tell you (for a fee) what questions were asked and answered which meet your chosen criteria. So, e.g., maybe you're interested in what questions and answers matched A? something.internal.bigcorp.example in February 2025.

They won't tell you who asked (IP address, etc.) but they're great for discovering that even though it says 404 for you, bigcorp.famous-brand-hr.example is checked regularly by somebody, probably BigCorp employees who aren't on their VPN - suggesting very strongly that although BigCorp told Famous Brand HR not to list them as a client that is in fact the HR system used by BigCorp.

5. yatralalala ◴[] No.43290843[source]
If your infra is set up as [Cloudflare -> your VM], I'd recommend configuring the firewall on the VM so that it can be accessed only from Cloudflare.

This way, you will force everyone to go through Cloudflare and utilize all those fancy bot blocking features they have.
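
A minimal sketch of the allowlist idea, checking a source address against Cloudflare's ranges. The three CIDR blocks below are only a sample and may be out of date; the current list is published by Cloudflare and should be fetched from there. In practice this belongs in the VM's firewall rules rather than in application code, and `is_from_allowed_range` is a hypothetical helper.

```python
import ipaddress

# Sample only, possibly stale: use Cloudflare's published IP list in practice.
SAMPLE_CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "104.16.0.0/13",
]


def is_from_allowed_range(ip, ranges=SAMPLE_CLOUDFLARE_RANGES):
    """Return True if `ip` falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    # Membership is False when the address and network versions differ.
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)
```

Anything that fails this check connected to the VM directly rather than through the proxy, and can be dropped.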

6. Arrowmaster ◴[] No.43291923[source]
I had coworkers at a previous employer go change settings in Cloudflare trying to troubleshoot instead of reaching out to me. They changed the option that caused the CF proxy to issue a cert for every subdomain instead of using the wildcard. They didn't understand why I was pissed that they had now written every subdomain we had in use to the public record, in addition to doing it without an approved change request.