287 points govideo | 2 comments | 06 Mar 25 22:34 UTC | HN request time: 0.448s | source

I have a domain that is not live. As expected, loading the domain returns: Error 1016.

However...I have a subdomain with a not obvious name, like: userfileupload.sampledomain.com

This subdomain IS LIVE but has NOT been publicized/posted anywhere. It's a custom URL for authenticated users to upload media with presigned url to my Cloudflare r2 bucket.

I am using CloudFlare for my DNS.

How did the internet find my subdomain? Some sample user agents are: "Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko) Version/5.1 Safari/534.20.8", "Mozilla/5.0 (Linux; Android 9; Redmi Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.89 Mobile Safari/537.36",

The bots are GET requests which are failing, as designed, but I'm wondering how the bots even knew the subdomain existed?!

Show context

BLKNSLVR ◴[07 Mar 25 10:26 UTC] No.43288985[source]▶

>>43285725 (OP) #

There are a number of companies, not just Palo Alto Networks, that perform various different scales of scans of the entire IPv4 space, some of them perform these scans multiple times per day.

I setup a set of scripts to log all "uninvited activity" to a couple of my systems, from which I discovered a whole bunch of these scanner "security" companies. Personally, I treat them all as malicious.

There are also services that track Newly Registered Domains (NRDs).

Tangentially:

NRD lists are useful for DNS block lists since a large number of NRDs are used for short term scam sites.

My little, very amateur, project to block them can be found here: https://github.com/UninvitedActivity/UninvitedActivity

Edited to add: Direct link to the list of scanner IP addresses (although hasn't been updated in 8 months - crikey, I've been busy longer than I thought): https://github.com/UninvitedActivity/UninvitedActivity/blob/...

replies(3): >>43289105 #>>43290045 #>>43290272 #

1. drpossum ◴[07 Mar 25 14:09 UTC] No.43290272[source]▶

>>43288985 #

How does an ip scan help with general DNS resolution at all?

replies(1): >>43295758 #

2. BLKNSLVR ◴[07 Mar 25 23:02 UTC] No.43295758[source]▶

>>43290272 (TP) #

They scan certain ports as well, which can provide them with 'fingerprints' as to what's running on those ports, which can then invite further investigation.

If ports 80 or 443 are open and there's a web server fingerprint (Apache, nginx, caddy, etc) then they could use further tools to try to discover domain names etc.

↑

Ask HN: How did the internet discover my subdomain?