
646 points | blendergeek | 1 comment
quchen ◴[] No.42725651[source]
Unless this concept becomes a mass phenomenon with many implementations, isn’t it pretty easy to filter out? Furthermore, it antagonizes billion-dollar companies that can spin up teams doing nothing but browsing GitHub and HN for software like this to keep it from polluting their data lakes, so I wonder whether this is an efficient approach at all.
replies(9): >>42725708 #>>42725957 #>>42725983 #>>42726183 #>>42726352 #>>42726426 #>>42727567 #>>42728923 #>>42730108 #
Blackthorn ◴[] No.42725957[source]
If it means it makes your own content safe when you deploy it on a corner of your website: mission accomplished!
replies(2): >>42726400 #>>42727416 #
gruez ◴[] No.42726400[source]
>If it means it makes your own content safe

Not really? As mentioned by others, such tarpits are easily mitigated by using a priority queue. For instance, crawlers can prioritize external links over internal links, which means if your blog post makes it to HN, it'll get crawled ahead of the tarpit. If it's discoverable and readable by actual humans, AI bots will be able to scrape it.
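The mitigation gruez describes can be sketched in a few lines. This is a toy illustration, not any real crawler's code: the class name, URLs, and the two-level priority scheme are all hypothetical. The idea is simply that a frontier backed by a priority queue dequeues cross-domain links before same-domain ones, so a tarpit's endless self-links wait behind fresh external URLs.

```python
import heapq
from urllib.parse import urlparse

class CrawlFrontier:
    """Toy crawl frontier illustrating the priority-queue mitigation.

    Priority 0 = external (cross-domain) link, 1 = internal link,
    so links out of a site are always crawled before links within it.
    """

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: keeps FIFO order within a priority level

    def push(self, url, referrer):
        # Internal links (same host as the referring page) get lower priority.
        same_host = urlparse(url).netloc == urlparse(referrer).netloc
        priority = 1 if same_host else 0
        heapq.heappush(self._heap, (priority, self._seq, url))
        self._seq += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]

# Hypothetical example: a tarpit's self-link vs. an outbound blog link.
frontier = CrawlFrontier()
frontier.push("https://tarpit.example/page/1", "https://tarpit.example/")
frontier.push("https://blog.example/post", "https://tarpit.example/")
print(frontier.pop())  # the external blog link is dequeued first
```

A real crawler would add deduplication, per-host rate limits, and depth caps, but even this minimal scheme means a tarpit only starves crawling of its own domain, not the rest of the queue.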