
257 points ColinWright | 6 comments
Noumenon72 ◴[] No.45774469[source]
It doesn't seem that abusive. I don't comment things out thinking "this will keep robots from reading this".
replies(2): >>45774493 #>>45774628 #
1. mostlysimilar ◴[] No.45774628[source]
The article mentions using this as a means of detecting bots, not as a complaint that it's abusive.

EDIT: I was chastised, here's the original text of my comment: Did you read the article or just the title? They aren't claiming it's abusive. They're saying it's a viable signal to detect and ban bots.

replies(3): >>45774645 #>>45774743 #>>45776844 #
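The detection signal being discussed can be sketched in a few lines: serve a URL that appears *only* inside an HTML comment, so no human following rendered links ever requests it, then flag any client that does. The path name and Common Log Format assumption below are illustrative, not from the article:

```python
# Hypothetical honeypot path -- it appears only inside an HTML comment,
# so a browser rendering the page will never request it.
HONEYPOT_PATH = "/static/x9q-trap.html"

PAGE = f"""<html><body>
<p>Real content here.</p>
<!-- <a href="{HONEYPOT_PATH}">old link</a> -->
</body></html>"""

def bot_ips(log_lines):
    """Return client IPs that requested the honeypot path.

    Assumes Common Log Format, where the first field is the client IP
    and the quoted request line contains 'GET <path>'.
    """
    hits = set()
    for line in log_lines:
        if f'"GET {HONEYPOT_PATH}' in line:
            hits.add(line.split()[0])  # first CLF field: client IP
    return hits
```

Any IP returned by `bot_ips` parsed the raw HTML (including comments) rather than rendering it, which is the signal the article proposes acting on.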
2. pseudalopex ◴[] No.45774645[source]
Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".[1]

[1] https://news.ycombinator.com/newsguidelines.html

3. woodrowbarlow ◴[] No.45774743[source]
the first few words of the article are:

> Last Sunday I discovered some abusive bot behaviour [...]

replies(2): >>45774770 #>>45774783 #
4. mostlysimilar ◴[] No.45774770[source]
> The robots.txt for the site in question forbids all crawlers, so they were either failing to check the policies expressed in that file, or ignoring them if they had.
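For reference, a well-behaved crawler consults robots.txt before fetching anything, which Python's standard library supports directly via `urllib.robotparser`. The policy lines below mirror the "forbids all crawlers" case quoted above:

```python
from urllib.robotparser import RobotFileParser

# Parse a "deny everything to everyone" policy, the case described above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /",
])

# Any user agent asking about any URL on the site gets a "no".
allowed = rp.can_fetch("MyBot/1.0", "https://example.com/any/page")
print(allowed)  # False
```

A crawler that skips this check, or ignores its answer, is exactly the behaviour the article's author observed.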
5. foobarbecue ◴[] No.45774783[source]
Yeah but the abusive behavior is ignoring robots.txt and scraping to train AI. Following commented URLs was not the crime, just evidence inadvertently left behind.
6. ang_cire ◴[] No.45776844[source]
They call the scrapers "malicious", so they are definitely complaining about them.

> A few of these came from user-agents that were obviously malicious:

(I love the idea that they consider any python or go request to be a malicious scraper...)