However, I have a subdomain with a non-obvious name, like userfileupload.sampledomain.com
This subdomain IS LIVE but has NOT been publicized or posted anywhere. It's a custom URL where authenticated users upload media to my Cloudflare R2 bucket using a presigned URL.
I am using Cloudflare for my DNS.
How did the internet find my subdomain? Some sample user agents are: "Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko) Version/5.1 Safari/534.20.8", and "Mozilla/5.0 (Linux; Android 9; Redmi Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.89 Mobile Safari/537.36".
The bots are sending GET requests, which are failing, as designed, but I'm wondering how the bots even knew the subdomain existed?!
In the context of what OP is asking, this is not true. DNS zones aren't enumerable: the only way to reliably get the complete contents of a zone is to have the zone's primary authoritative server (the one named in its SOA record) approve a zone transfer (AXFR) and send you the zone file. You can ask whether a specific record in that zone exists, but as a random user you can't say "hand over all records in this zone". I'd imagine that tools like Cloudflare that need this kind of functionality perform a dictionary search, since they get 90% of records when importing a domain but always seem to miss inconspicuously named ones.
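To make the "dictionary search" idea concrete, here is a minimal Python sketch of what such a lookup might look like. It is only a guess at the general technique, not how Cloudflare or any particular scanner actually does it. The function names, the `resolver` parameter, and the word list are all made up for illustration.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def resolve(hostname: str) -> Optional[str]:
    """Return one IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised for unknown names.
        return None


def dictionary_search(
    domain: str,
    wordlist: Iterable[str],
    resolver: Callable[[str], Optional[str]] = resolve,
) -> Dict[str, str]:
    """Ask about candidate names one at a time and keep the ones that exist.

    This is the only thing an ordinary client can do: it cannot request the
    whole zone, so it can only find names that happen to be in its word list.
    """
    found: Dict[str, str] = {}
    for word in wordlist:
        candidate = f"{word}.{domain}"
        address = resolver(candidate)
        if address is not None:
            found[candidate] = address
    return found
```

With a word list like `["www", "mail", "api"]`, a name such as `userfileupload` is never even asked about, which would fit the observation that inconspicuously named records get missed.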
> Even if it were not, the message you pasted says outright that they scan the entire IP space, so they could be hitting your server's IP without having a clue there is a subdomain serving your stuff from it.
This is likely what's happening. If the bot isn't using SNI or sending a Host header, then it probably found the server by IP. The fact that there's a heretofore unknown DNS record pointing to it is of no consequence. EDIT: Or the Cert Transparency log, as others have mentioned, though this isn't DNS per se. I learn something new every day :o)
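One way OP could check this is to look at the Host value on the logged requests. The following is a rough Python sketch, assuming the Host header (or its absence) is available from the access logs. The function name and the three labels are invented for this example.

```python
import ipaddress
from typing import Optional


def classify_request(host_header: Optional[str], expected_name: str) -> str:
    """Guess whether a request was aimed at the hostname or at a bare IP."""
    if not host_header:
        # No Host header at all: the client cannot have asked for the name.
        return "ip-scan"

    host = host_header.strip().lower()

    # Drop an optional port: "[2001:db8::1]:443" or "name:443".
    if host.startswith("["):
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        host = host.rsplit(":", 1)[0]

    try:
        ipaddress.ip_address(host)
        return "ip-scan"  # Host is an IP literal, so the name was not used.
    except ValueError:
        pass

    if host.rstrip(".") == expected_name.lower():
        return "knows-name"  # The client really did send the subdomain.
    return "other-name"
```

If most of the bot traffic comes back as `ip-scan`, the bots most likely never knew the subdomain. A `knows-name` result means the name leaked some other way, for example through the Cert Transparency log.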
Configuring BIND as an authoritative server for a corporate domain when I was a wee lad is how I learned DNS. It was and still is bad practice to allow zone transfers without auth. If memory serves I locked it down between servers via key pairs.