
770 points ta988 | 2 comments | | HN request time: 0s | source
mentalgear ◴[] No.42551541[source]
Noteworthy from the article (as some commenters suggested blocking them):

"If you try to rate-limit them, they’ll just switch to other IPs all the time. If you try to block them by User Agent string, they’ll just switch to a non-bot UA string (no, really). This is literally a DDoS on the entire internet."

replies(5): >>42551717 #>>42551976 #>>42552122 #>>42552700 #>>42552885 #
loeg ◴[] No.42552122[source]
I'd kind of like to see that claim substantiated a little more. Is it all crawlers that switch to a non-bot UA, or how are they determining it's the same bot? What non-bot UA do they claim?
replies(3): >>42552172 #>>42552177 #>>42555570 #
alphan0n ◴[] No.42555570[source]
I would take anything the author said with a grain of salt. They straight up lied about the configuration of the robots.txt file.

https://news.ycombinator.com/item?id=42551628

replies(2): >>42563001 #>>42567297 #
mplewis ◴[] No.42563001[source]
What is causing you to be so unnecessarily aggressive?
replies(1): >>42563372 #
alphan0n ◴[] No.42563372[source]
Liars should be called out, necessarily. Intellectual dishonesty is cancer. I could be more aggressive if it were something that really mattered.
replies(1): >>42563585 #
nkrisc ◴[] No.42563585[source]
Lying requires intent to deceive. How have you determined their intent?
replies(2): >>42563774 #>>42563827 #
alphan0n ◴[] No.42563827[source]
When someone says:

> Oh, and of course, they don't just crawl a page once and then move on. Oh, no, they come back every 6 hours because lol why not. They also don't give a single flying fuck about robots.txt, because why should they.

Their self-righteous indignation, and the specificity of the pretend subject of that indignation, preclude any doubt about intent.

This guy made a whole public statement that is verifiably false. And then tried to toddler logic it away when he got called out.

replies(1): >>42565944 #
nkrisc ◴[] No.42565944[source]
That may all be true. That still doesn’t mean they intentionally lied.
replies(1): >>42569607 #
alphan0n ◴[] No.42569607{3}[source]
What are the criteria for an intentional lie, then? Admission?

The author responded:

>denschub 2 days ago [–]

>the robots.txt on the wiki is no longer what it was when the bot accessed it. primarily because I clean up my stuff afterwards, and the history is now completely inaccessible to non-authenticated users, so there's no need to maintain my custom robots.txt

Which is verifiably untrue:

HTTP/1.1 200
server: nginx/1.27.2
date: Tue, 10 Dec 2024 13:37:20 GMT
content-type: text/plain
last-modified: Fri, 13 Sep 2024 18:52:00 GMT
etag: W/"1c-62204b7e88e25"
alt-svc: h3=":443", h2=":443"
X-Crawler-content-encoding: gzip
Content-Length: 28

User-agent: *
Disallow: /w/
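
For anyone who wants to check what that robots.txt body would actually tell a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the body is the two directives `User-agent: *` and `Disallow: /w/` on separate lines with Unix newlines. The host `example.org`, the bot name `ExampleBot`, and the helper `is_allowed` are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the quoted body is two lines, each ending in "\n".
ROBOTS_TXT = "User-agent: *\nDisallow: /w/\n"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder host.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.org/w/index.php"))
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.org/wiki/Main_Page"))
    # Size of the body in bytes under the newline assumption above.
    print(len(ROBOTS_TXT.encode("utf-8")))
```

Under that newline assumption the body is 28 bytes, which is consistent with the `Content-Length: 28` header. The rule only covers paths beginning with `/w/`; anything else on the host stays allowed for every user agent.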

replies(1): >>42584946 #
1. nkrisc ◴[] No.42584946{4}[source]
> intentional lie

There are no “intentional” lies, because there are no “unintentional” lies.

All lies are intentional. An “unintentional lie” is better known as “being wrong”.

Being wrong isn’t always lying. What’s so hard about this? An example:

My wife once asked me if I had taken the trash out to the curb, and I said I had. This was demonstrably false, anyone could see I had not. Yet for whatever reason, I mistakenly believed that I had done it. I did not lie to her. I really believed I had done it. I was wrong.

replies(1): >>42593069 #
2. alphan0n ◴[] No.42593069[source]
No worries, I understand. The author admitted to me that he was lying via DM, that he often does this for attention.