Nepenthes is a tarpit to catch AI web crawlers
(zadzmo.org)
646 points
blendergeek
| 3 comments |
16 Jan 25 13:57 UTC
1. GaggiX [16 Jan 25 15:59 UTC] No. 42727038
>>42725147 (OP)
As always, I find it hilarious that some people believe that these companies will train their flagship model on uncurated data, and that text generated by a Markov chain will not be filtered out.
replies(1): >>42730667
2. JTyQZSnP3cQGa8B [16 Jan 25 20:45 UTC] No. 42730667
>>42727038 (TP)
Then why the DDoS on random websites?
replies(1): >>42730774
3. GaggiX [16 Jan 25 20:54 UTC] No. 42730774
>>42730667
I guess that depends on how the web spider is configured; I doubt the curation is done in real time while scraping.