Ask HN: How did the internet discover my subdomain?

1. BLKNSLVR ◴[07 Mar 25 10:26 UTC] No.43288985[source]▶

There are a number of companies, not just Palo Alto Networks, that scan the entire IPv4 space at various scales; some of them run these scans multiple times per day.

I set up a set of scripts to log all "uninvited activity" to a couple of my systems, from which I discovered a whole bunch of these scanner "security" companies. Personally, I treat them all as malicious.

There are also services that track Newly Registered Domains (NRDs).

Tangentially:

NRD lists are useful for DNS block lists since a large number of NRDs are used for short term scam sites.

My little, very amateur, project to block them can be found here: https://github.com/UninvitedActivity/UninvitedActivity

Edited to add: Direct link to the list of scanner IP addresses (although it hasn't been updated in 8 months - crikey, I've been busy longer than I thought): https://github.com/UninvitedActivity/UninvitedActivity/blob/...

replies(3): >>43289105 #>>43290045 #>>43290272 #

2. mr_mitm ◴[07 Mar 25 10:49 UTC] No.43289105[source]▶

>>43288985 (TP) #

Getting the domain name from the IP address is not trivial, though. In fact, it should be impossible, if the name really hasn't been published (barring guessing attempts), so OP's question stands.

replies(3): >>43289244 #>>43289253 #>>43289396 #

3. venj ◴[07 Mar 25 11:16 UTC] No.43289244[source]▶

>>43289105 #

I had this issue with internal domains indexed by Google. The domains were not published anywhere by my company. They were scanned by leakix.net, which apparently scans the whole web for vulnerabilities and publishes web pages containing the domain names associated with each IP address. I guess they read them from the certificates.
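
A minimal sketch of how names could be read from a certificate, assuming that is what such scanners do (the comment above only guesses at it). The function name `names_from_cert` and the example hostnames are hypothetical; the dict layout is the one Python's `ssl.SSLSocket.getpeercert()` returns for a validated certificate.

```python
def names_from_cert(cert: dict) -> list:
    """Collect DNS names from a certificate dict.

    Assumes the layout produced by ssl.SSLSocket.getpeercert():
    'subjectAltName' is a tuple of (type, value) pairs and
    'subject' is a tuple of RDNs, each a tuple of (key, value) pairs.
    """
    names = []
    # Subject Alternative Names carry the hostnames the certificate covers.
    for kind, value in cert.get("subjectAltName", ()):
        if kind == "DNS" and value not in names:
            names.append(value)
    # Older certificates may only carry the name in the subject's commonName.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName" and value not in names:
                names.append(value)
    return names


# Hypothetical certificate contents, for illustration only.
example_cert = {
    "subject": ((("commonName", "internal.example.com"),),),
    "subjectAltName": (
        ("DNS", "internal.example.com"),
        ("DNS", "wiki.example.com"),
    ),
}
print(names_from_cert(example_cert))
```

Anyone who can complete a TLS handshake with the IP address can see these names, which is why a hostname that appears in a certificate is hard to keep private.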

replies(1): >>43289456 #

4. melevittfl ◴[07 Mar 25 11:18 UTC] No.43289253[source]▶

>>43289105 #

The OP is misunderstanding what's happened, based on what's been posted. The OP has a server with an IP address. They're seeing GET requests in the server's logs and are assuming people have found the server's DNS name.

In fact, the scanners are simply searching the IP address space and sending GET requests to any IP address they find. No DNS discovery needed.

replies(2): >>43289506 #>>43292555 #

5. okasaki ◴[07 Mar 25 11:44 UTC] No.43289396[source]▶

>>43289105 #

    $ host 209.216.230.207
    207.230.216.209.in-addr.arpa domain name pointer news.ycombinator.com.

replies(3): >>43289544 #>>43289769 #>>43290839 #

6. jhart99 ◴[07 Mar 25 11:58 UTC] No.43289456{3}[source]▶

>>43289244 #

There is another source: SNI certs showing up on a server or load balancer during the TLS handshake. When a client connects to a server using SNI without indicating the server name, some servers will reply with a default certificate or give a list of valid server names.

7. alfiedotwtf ◴[07 Mar 25 12:08 UTC] No.43289506{3}[source]▶

>>43289253 #

Are you sure that’s the case? IP addresses != domain, so I’m guessing bots are including the Host header in their requests containing the obfuscated domain.

My guess is OP is using a public DNS server that sells aggregated user requests. All it takes is one request from their machine to a public machine on the internet, and it’s now public knowledge.

8. mr_mitm ◴[07 Mar 25 12:15 UTC] No.43289544{3}[source]▶

>>43289396 #

Not sure what you are trying to tell me. This isn't guaranteed to work. If you define a reverse lookup record for your domain, then that counts as published in my book.

replies(1): >>43290288 #

9. dspillett ◴[07 Mar 25 12:53 UTC] No.43289769{3}[source]▶

>>43289396 #

That is when there is an explicit PTR record, for instance one of my assigned addresses can be named that way due to:

    74.231.187.81.in-addr.arpa. 3600 IN PTR ns2.nogoodnamesareleft.com.

in the zone file for that IPv4 address, but unless they've explicitly configured this, or are using a hosting service that does it without asking, it won't be what is happening.

It isn't practical to do a reverse lookup from “normal” name-to-address records like

    ns2.nogoodnamesareleft.com. IN A 81.187.231.74

(it is possible to build a partial reverse mapping by collecting a huge number of DNS query results, but not really practical unless you are someone like Google or Cloudflare running a popular resolution service)
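
A small sketch of the mechanics described above, using only Python's standard library. The helper names `ptr_query_name` and `reverse_lookup` are made up for illustration. The first shows how the `in-addr.arpa` query name is derived from an address; the second asks the system resolver for a PTR record and returns `None` when there isn't one, which is the usual case when nobody has configured it.

```python
import ipaddress
import socket


def ptr_query_name(ip: str) -> str:
    """Return the name a reverse lookup queries for this address."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str):
    """Return the PTR hostname for ip, or None if no record resolves.

    Needs network access, so it is defined here but not called.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None


print(ptr_query_name("81.187.231.74"))
```

The A record plays no part in either function: a reverse lookup only ever consults the PTR record, if one exists.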

10. yabones ◴[07 Mar 25 13:39 UTC] No.43290045[source]▶

>>43288985 (TP) #

I do something similar. Any hits on the default nginx vhost get logged, logs get parsed out and "repeat offenders" get put on the shitlist. I use ipset/iptables but this can also be done with fail2ban quite simply.

https://nbailey.ca/post/block-scanners/
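
A rough sketch of the "parse the logs, find repeat offenders" step, not the code from the linked post. It assumes an access log where the client IP is the first whitespace-separated field (as in nginx's default combined format); the function name, the threshold of 3, and the sample lines are all made up for illustration.

```python
from collections import Counter


def repeat_offenders(log_lines, threshold=3):
    """Return client IPs that appear at least `threshold` times.

    Assumes the client IP is the first whitespace-separated field.
    """
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            hits[fields[0]] += 1
    return sorted(ip for ip, count in hits.items() if count >= threshold)


# Hypothetical default-vhost log lines (documentation IP ranges).
sample_log = [
    '198.51.100.9 - - [07/Mar/2025:10:00:01 +0000] "GET / HTTP/1.1" 444 0',
    '198.51.100.9 - - [07/Mar/2025:10:00:02 +0000] "GET /.env HTTP/1.1" 444 0',
    '203.0.113.5 - - [07/Mar/2025:10:00:03 +0000] "GET / HTTP/1.1" 444 0',
    '198.51.100.9 - - [07/Mar/2025:10:00:04 +0000] "GET /wp-login.php HTTP/1.1" 444 0',
]
print(repeat_offenders(sample_log))
```

The resulting list is what would then be handed to whatever does the blocking, such as ipset/iptables or fail2ban.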

replies(1): >>43290474 #

11. drpossum ◴[07 Mar 25 14:09 UTC] No.43290272[source]▶

>>43288985 (TP) #

How does an ip scan help with general DNS resolution at all?

replies(1): >>43295758 #

12. drpossum ◴[07 Mar 25 14:10 UTC] No.43290288{4}[source]▶

>>43289544 #

This is correct.

13. immibis ◴[07 Mar 25 14:33 UTC] No.43290474[source]▶

>>43290045 #

This is security theater.

replies(2): >>43293582 #>>43295814 #

14. DonHopkins ◴[07 Mar 25 15:18 UTC] No.43290839{3}[source]▶

>>43289396 #

I love how the ARPANET still lives on through reverse DNS PTRs.

https://www.youtube.com/watch?v=V78GUSOS-EM

15. lxgr ◴[07 Mar 25 18:13 UTC] No.43292555{3}[source]▶

>>43289253 #

That entirely depends on whether the GET requests were providing the (supposed to be hidden) hostname in the `Host` header (and potentially SNI TLS extension).
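
One way to check this against your own logs, as a sketch: classify each request by what its `Host` header contained. The function names, labels, and the hostname `secret.example.com` are hypothetical. Only a "name-known" result suggests the requester actually knew the hidden name; "ip-only" is consistent with a plain IP-space scan.

```python
import ipaddress
from typing import Optional


def host_without_port(host_header: str) -> str:
    """Lowercase the Host value and drop any port and trailing dot."""
    host = host_header.strip().lower()
    if host.startswith("["):  # bracketed IPv6 literal, e.g. [2001:db8::1]:8080
        end = host.find("]")
        return host[1:end] if end != -1 else host
    if host.count(":") == 1:  # name:port or IPv4:port
        host = host.split(":", 1)[0]
    return host.rstrip(".")


def classify_request(host_header: Optional[str], hidden_name: str) -> str:
    """Label a request by whether its Host header reveals the hidden name."""
    if not host_header:
        return "no-host"
    host = host_without_port(host_header)
    try:
        ipaddress.ip_address(host)
        return "ip-only"  # scanner addressed the server by IP
    except ValueError:
        pass
    if host == hidden_name.lower().rstrip("."):
        return "name-known"  # requester supplied the hidden hostname
    return "other-name"


for header in ["203.0.113.7", "secret.example.com:443", "example.org", None]:
    print(header, "->", classify_request(header, "secret.example.com"))
```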

16. Sohcahtoa82 ◴[07 Mar 25 19:35 UTC] No.43293582{3}[source]▶

>>43290474 #

Only kinda.

Doing something like this can prevent you from showing up on Shodan.io, which is used by many users/bots to find servers without running massive scans themselves.

17. BLKNSLVR ◴[07 Mar 25 23:02 UTC] No.43295758[source]▶

>>43290272 #

They scan certain ports as well, which can provide them with 'fingerprints' as to what's running on those ports, which can then invite further investigation.

If ports 80 or 443 are open and there's a web server fingerprint (Apache, nginx, caddy, etc) then they could use further tools to try to discover domain names etc.
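
As a simplified illustration of what a web server "fingerprint" can be, the sketch below pulls the `Server` header out of a raw HTTP response. This is only one possible signal and an assumption about how scanners work, not a description of any particular tool; the function name and the sample response are made up.

```python
def server_banner(raw_response: bytes):
    """Return the Server header value from a raw HTTP response, or None."""
    # Headers end at the first blank line; the first line is the status line.
    head = raw_response.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"server":
            return value.strip().decode("latin-1")
    return None


# Hypothetical response bytes, as a scanner might receive them on port 80.
sample = b"HTTP/1.1 200 OK\r\nServer: nginx/1.24.0\r\nContent-Length: 0\r\n\r\n"
print(server_banner(sample))
```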

18. BLKNSLVR ◴[07 Mar 25 23:08 UTC] No.43295814{3}[source]▶

>>43290474 #

No, it's security by obscurity which is a single, but important, step above security theatre.

To not appear on the radar is to not invite investigation; if they can't see the door they won't try to pry it open.

If you're already on their radar, or if they already know the door is there (even if they can't directly see it), then it's less effective.