
287 points | govideo | 1 comment

I have a domain that is not live. As expected, loading the domain returns: Error 1016.

However, I have a subdomain with a non-obvious name, like: userfileupload.sampledomain.com

This subdomain IS LIVE but has NOT been publicized/posted anywhere. It's a custom URL for authenticated users to upload media via presigned URLs to my Cloudflare R2 bucket.

I am using Cloudflare for my DNS.

How did the internet find my subdomain? Some sample user agents are: "Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko) Version/5.1 Safari/534.20.8", "Mozilla/5.0 (Linux; Android 9; Redmi Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.89 Mobile Safari/537.36".

The bots send GET requests, which fail as designed, but I'm wondering how the bots even knew the subdomain existed?!

1. pagealert No.43305660
The discovery of your unpublished subdomain by bots likely stems from a combination of technical factors related to DNS, server configuration, and bot behavior. Here's a breakdown of the possible reasons and solutions:

1. DNS Leaks or Wildcard Records

Wildcard DNS Entries: If your main domain (sampledomain.com) has a wildcard DNS record (e.g., *.sampledomain.com), any subdomain (including userfileupload.sampledomain.com) resolves automatically to your server’s IP. Even if the main domain is inactive, the wildcard might expose the subdomain.

Exposed Subdomain DNS Records: If the subdomain’s DNS records (e.g., A/CNAME records) are explicitly configured, they are publicly resolvable, so a bot that guesses or enumerates the name can confirm it with an ordinary DNS query.

Fix: Remove or restrict wildcard DNS entries and delete unused subdomain records from your DNS provider (e.g., Cloudflare).
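One way to test the wildcard theory is to resolve a random label that was never created: if it resolves, a wildcard record is probably answering for it. The following is a minimal Python sketch under that assumption; `has_wildcard` and its `resolve` parameter are hypothetical names introduced here for illustration, not part of any Cloudflare tooling.

```python
import secrets
import socket


def has_wildcard(domain, resolve=socket.gethostbyname):
    """Return True if a random, never-created label under `domain` resolves.

    `resolve` is injectable (an assumption of this sketch) so the check can
    be exercised without network access.
    """
    # A 16-hex-character label is very unlikely to exist as a real record.
    probe = f"{secrets.token_hex(8)}.{domain}"
    try:
        resolve(probe)
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve.
        return False
    return True
```

Calling `has_wildcard("sampledomain.com")` performs a real lookup through the system resolver. A positive result only shows that something answers for arbitrary names; it does not say which record is responsible.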

2. Server IP Scanning

IP-Based Discovery: Bots like Expanse systematically scan IP addresses to identify active services. If your subdomain’s server is listening on ports 80/443 (HTTP/HTTPS), bots may:

- Perform a port scan to detect open ports.
- Attempt common subdomains (e.g., userfileupload, upload, media) on the detected IP to guess valid domains.

Fix:

- Block unnecessary ports (e.g., close ports 80/443 if unused).
- Use a firewall (e.g., ufw or Cloudflare Firewall Rules) to reject requests from suspicious IPs.

3. Cloudflare’s Default Behavior

Page Rules or Workers: If the subdomain is configured with Cloudflare Workers, default error pages, or caching rules, it might generate responses that bots can crawl. For example:

- A 404 Not Found page with a custom message could be indexed by search engines.
- Worker scripts might inadvertently expose endpoints (e.g., /_worker.js).

Fix:

- Delete unused subdomains from Cloudflare’s DNS settings.
- Ensure Workers/routes are only enabled for intended domains.

4. Reverse DNS Lookup

IP-to-Domain Mapping: If your server’s IP address is shared or part of a broader range, bots might reverse-resolve the IP to discover associated domains (e.g., via dig -x <IP>).
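To see what a reverse lookup would actually reveal for a given address, the check can be sketched in Python with the standard library. The helper names `ptr_name` and `reverse_lookup`, and the injectable `lookup` parameter, are assumptions made for this illustration. Note that the PTR record is published by whoever controls the IP range, so for an address behind a proxy or shared host it typically names the provider rather than your subdomain.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the DNS name that a reverse (PTR) lookup for `ip` queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip, lookup=socket.gethostbyaddr):
    """Return the hostname published in the PTR record for `ip`, or None.

    `lookup` is injectable (an assumption of this sketch) so the function
    can be tested without network access.
    """
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:
        # socket.herror and socket.gaierror both derive from OSError.
        return None
    return hostname
```

For example, `ptr_name("192.0.2.10")` returns `10.2.0.192.in-addr.arpa`, which is the name that `dig -x 192.0.2.10` queries.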

Fix:

- Use a dedicated IP address for sensitive subdomains.
- Contact your ISP to request removal from public IP databases.

5. Authentication Flaws

Presigned URLs in Error Messages: If the subdomain’s server returns detailed error messages (e.g., 403 Forbidden) when accessed without authentication, bots might parse these messages to infer valid endpoints or credentials.

Fix:

- Customize error pages to show generic messages (e.g., "Access Denied").
- Log and block IPs attempting brute-force access.

How to Prevent Future Discoveries

- Remove Unused DNS Records: Delete any subdomain you no longer use from Cloudflare’s DNS settings entirely.
- Disable Wildcards: Avoid *.sampledomain.com wildcards to limit exposure.
- Firewall Rules: Block IPs from scanners (e.g., Palo Alto Networks, Expanse) using Cloudflare Firewall Rules or a server firewall.
- Monitor Logs: Use tools like grep or Cloudflare logs to track access patterns and block suspicious IPs.
- Use Authentication: Require API keys, tokens, or OAuth for all subdomain requests.

Example Workflow for Debugging

  # Check Cloudflare DNS records for the subdomain:
  dig userfileupload.sampledomain.com +trace

  # Inspect server logs for rejected requests (4xx status; assumes nginx's default combined log format):
  grep -E '" 4[0-9]{2} ' /var/log/nginx/access.log

  # Block Expanse IPs via Cloudflare Firewall:
  # 1. Go to Cloudflare > Firewall > Tools.
  # 2. Add a rule to block the scanner IPs seen in your logs. Alternatively, request exclusion by emailing scaninfo@paloaltonetworks.com, as the Expanse user agent suggests.

By tightening DNS, server, and firewall configurations, you can minimize exposure of your internal subdomains to bots.
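The log inspection step above can also be sketched in Python to tally which clients and user agents are hitting the subdomain. This is a minimal sketch that assumes the access log uses nginx's combined format; `LINE_RE` and `summarize` are hypothetical names introduced here, and lines in any other format are silently skipped.

```python
import re
from collections import Counter

# Assumed combined log format:
# ip - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count requests per (client IP, user agent) and per status code."""
    clients = Counter()
    statuses = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        clients[(match["ip"], match["agent"])] += 1
        statuses[match["status"]] += 1
    return clients, statuses
```

Passing an open file handle for /var/log/nginx/access.log as `lines` works, since the function only iterates over it; `clients.most_common(10)` then lists the busiest IP and user-agent pairs, which can feed the blocking step.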