253 points akyuu | 2 comments
jcalvinowens ◴[] No.45946279[source]
Scrapers have constantly been running against my cgit server for the past year, but they're bizarrely polite in my case... 2-3 requests per minute.

This whole enterprise is clearly run by exceptionally dumb people, since you can just clone all the code I host there directly from upstreams...

    [16/Nov/2025:16:21:12 +0000] 190.92.214.144:34638 . "GET /cgit/linux/commit/drivers/vlynq?h=v5.15.76&id=59d42cd43c7335a3a8081fd6ee54ea41b0c239be HTTP/1.1" -> 200 3051b 3.42x 0.239ms
    [16/Nov/2025:16:22:15 +0000] 188.239.57.1:40328 . "GET /cgit/linux/commit/kernel/range.c?h=v6.12.31&id=459b37d423104f00e87d1934821bc8739979d0e4 HTTP/1.1" -> 200 2993b 3.42x 0.266ms
    [16/Nov/2025:16:22:56 +0000] 190.92.217.125:56580 . "GET /cgit/linux/commit/kernel?h=v5.15.92&id=f01aefe374d32c4bb1e5fd1e9f931cf77fca621a HTTP/1.1" -> 200 3091b 3.28x 0.250ms
    [16/Nov/2025:16:23:17 +0000] 159.138.10.64:44540 . "GET /cgit/linux/commit/drivers/mtd/mtdcore.c?h=v6.2.15&id=249858575fd3f27904d6bb775e5ab500e9ef3b0f HTTP/1.1" -> 200 3415b 3.47x 0.251ms
    [16/Nov/2025:16:23:58 +0000] 119.13.101.228:44342 . "GET /cgit/linux/commit/drivers/gpio?h=v6.6.93&id=bc7fe1a879fc024942bb9eff173fa619b722d09b HTTP/1.1" -> 200 3582b 3.37x 0.250ms
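
For anyone who wants to check the request rate and source IPs in a log like this, here's a rough sketch. It assumes the exact line layout shown above (a custom format, not a standard nginx/Apache one); `LOG_RE`, `parse_line` and `summarize` are made-up names for illustration, and `SAMPLE` just reuses two of the lines above.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the layout of the log lines quoted above:
# [timestamp] ip:port <field> "METHOD path PROTO" -> status <size>b ...
LOG_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] (?P<ip>[0-9.]+):(?P<port>\d+) \S+ '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" -> (?P<status>\d{3}) (?P<size>\d+)b'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

def summarize(lines):
    """Return (requests per IP, overall requests per minute or None)."""
    recs = [r for r in map(parse_line, lines) if r is not None]
    per_ip = Counter(r["ip"] for r in recs)
    if len(recs) < 2:
        return per_ip, None
    times = [r["time"] for r in recs]
    span = (max(times) - min(times)).total_seconds()
    rate = len(recs) / (span / 60) if span > 0 else None
    return per_ip, rate

SAMPLE = [
    '[16/Nov/2025:16:21:12 +0000] 190.92.214.144:34638 . "GET /cgit/linux/commit/drivers/vlynq?h=v5.15.76&id=59d42cd43c7335a3a8081fd6ee54ea41b0c239be HTTP/1.1" -> 200 3051b 3.42x 0.239ms',
    '[16/Nov/2025:16:22:15 +0000] 188.239.57.1:40328 . "GET /cgit/linux/commit/kernel/range.c?h=v6.12.31&id=459b37d423104f00e87d1934821bc8739979d0e4 HTTP/1.1" -> 200 2993b 3.42x 0.266ms',
]

if __name__ == "__main__":
    per_ip, rate = summarize(SAMPLE)
    print(per_ip, rate)
```

With one request per source IP, per-IP rate limiting never triggers, which is why looking at the aggregate rate and the owning networks is more useful here.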
replies(2): >>45946580 #>>45954740 #
1. jcalvinowens ◴[] No.45954740[source]
It turned out those scattered IPs came from a very small number of places; blocking these five ASNs killed 100% of my crawler spam:

    AS4229    Zenlayer (Singapore) PTE. LTD
    AS21859   ZEN-ECN, US
    AS45102   ALIBABA-CN-NET Alibaba US Technology Co., Ltd., CN
    AS132203  TENCENT-NET-AP-CN Tencent Building, Kejizhongyi Avenue, CN
    AS136907  HUAWEI INTERNATIONAL PTE. LTD.
That's 4392 contiguous IP ranges.
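
A minimal sketch of what matching against those ranges could look like, using Python's standard `ipaddress` module. The prefixes below are placeholders from the RFC 5737 documentation ranges, not the real announcements of the ASNs above (those would have to come from a routing or whois data source), and `build_blocklist` / `is_blocked` are hypothetical helper names.

```python
import ipaddress

# Placeholder prefixes (RFC 5737 documentation ranges), NOT the real
# prefixes announced by the ASNs listed above.
BLOCKED_PREFIXES = [
    "192.0.2.0/24",
    "198.51.100.0/25",
    "198.51.100.128/25",  # adjacent to the previous one, so they merge
    "203.0.113.0/24",
]

def build_blocklist(prefixes):
    """Parse CIDR strings and merge adjacent/overlapping ones into contiguous ranges."""
    nets = (ipaddress.ip_network(p) for p in prefixes)
    return list(ipaddress.collapse_addresses(nets))

def is_blocked(ip, blocklist):
    """True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)

if __name__ == "__main__":
    blocklist = build_blocklist(BLOCKED_PREFIXES)
    print(blocklist)
    print(is_blocked("198.51.100.200", blocklist))
```

A linear scan is fine for a one-off check; for a few thousand ranges on a live server you would more likely load them into a firewall set than test them per request in application code.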
replies(1): >>45975342 #
2. ◴[] No.45975342[source]