597 points classichasclass | 1 comments
bob1029 ◴[] No.45011628[source]
I think a lot of really smart people are letting themselves get taken for a ride by the web scraping thing. Unless the bot activity is legitimately hammering your site and causing issues (not saying this isn't happening in some cases), then this mostly amounts to an ideological game of capture the flag. The difference being that you'll never find their flag. The only thing you win by playing is lost time.

The best way to mitigate the load from diffuse, unidentifiable, grey area participants is to have a fast and well engineered web product. This is good news, because your actual human customers would really enjoy this too.

replies(7): >>45011652 #>>45011830 #>>45011850 #>>45012424 #>>45012462 #>>45015038 #>>45015451 #
phito ◴[] No.45011652[source]
My friend has a small public Gitea instance, only used by him and a few friends. He's getting thousands of requests an hour from bots. I'm sorry, but even if it does not impact his service, at the very least it feels like harassment.
replies(7): >>45011694 #>>45011816 #>>45011999 #>>45013533 #>>45013955 #>>45014807 #>>45025114 #
dmesg ◴[] No.45011694[source]
Yes, and it makes reading your logs needlessly harder. Sometimes I find an odd password being probed, search for it on the web and find an interesting story, for example that a new backdoor was discovered in a commercial appliance.

In that regard, reading my logs has sometimes led me to interesting articles about cybersecurity. Also, log flooding may cause your journaling service to truncate the log, so you miss something important.

replies(3): >>45011747 #>>45011811 #>>45012470 #
rollcat ◴[] No.45012470[source]
> Sometimes I find an odd password being probed, search for it on the web and find an interesting story [...].

Yeah, this is beyond irresponsible. You know the moment you're pwned, __you__ become the new interesting story?

For everyone else, use a password manager to pick a random password for everything.

replies(1): >>45012625 #
Thorrez ◴[] No.45012625[source]
What is beyond irresponsible? Monitoring logs and researching odd things found there?
replies(2): >>45013099 #>>45013232 #
JohnFen ◴[] No.45013099[source]
How are passwords ending up in your logs? Something is very, very wrong there.
replies(2): >>45013284 #>>45014850 #
dmesg ◴[] No.45013284[source]
Does an attacking bot know your webserver is not a misconfigured router exposing its web interface to the net? I am often baffled by the conclusions people draw from half-reading posts. I have had bots attack me with SSH 2.0 login attempts on ports 80 and 443. Some people underestimate how bad at computer science some skids are.
replies(3): >>45014297 #>>45015462 #>>45019689 #
socksy ◴[] No.45014297[source]
Also baffled that three separate people came to that conclusion. Do they not run web servers on the open web or something? Script kiddies are constantly probing URLs, and URLs come up in your logs. Sure, it would be bad if that was how your app was architected. But it's not how it's architected, it's how the skids hope your app is architected. It's not like if someone sends me a request for /wp-login.php that my Rails app suddenly becomes WordPress??
replies(2): >>45014904 #>>45019187 #
JohnFen ◴[] No.45014904[source]
> Do they not run web servers on the open web or something?

Until AI crawlers chased me off of the web, I ran a couple of fairly popular websites. I just so rarely see anybody including passwords in the URLs anymore that I didn't really consider that as what the commenter was talking about.

replies(1): >>45018636 #
1. viridian ◴[] No.45018636[source]
Just about every crawler that tries probing for WordPress vulnerabilities does this, or includes them in the naked headers as part of their deluge of requests.