
646 points | blendergeek | 1 comment
hartator No.42725964
There are already “infinite” websites like these on the Internet.

Crawlers (both AI and regular search) have a set number of pages they want to crawl per domain. This number is usually determined by the popularity of the domain.

Unknown websites will get very few crawls per day, whereas popular sites get millions.

Source: I am the CEO of SerpApi.
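
A crawl budget like the one described can be pictured as a simple function of domain popularity. The sketch below is purely illustrative: the popularity score and scaling constants are invented and do not reflect SerpApi's or any real crawler's scheduler.

    # Hypothetical illustration of a popularity-weighted crawl budget.
    # The scaling constants and the popularity score are made up for
    # illustration; no real scheduler is being described here.
    import math

    def daily_crawl_budget(popularity_score: float,
                           base_pages: int = 10,
                           max_pages: int = 1_000_000) -> int:
        """Map a domain popularity score (0.0-1.0) to a per-day page budget.

        Unknown domains get only a handful of fetches; highly popular
        domains get orders of magnitude more, capped at max_pages.
        """
        # Exponential scaling so the budget spans several orders of magnitude.
        budget = base_pages * math.pow(max_pages / base_pages, popularity_score)
        return min(int(budget), max_pages)

    if __name__ == "__main__":
        for score in (0.0, 0.3, 0.7, 1.0):
            print(f"popularity={score:.1f} -> {daily_crawl_budget(score):,} pages/day")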

1. palmfacehn No.42727737
Even a brand-new site will get hit heavily by crawlers: Amazonbot, Applebot, LLM bots, scrapers abusing FB's link preview bot, SEO metric bots, and more than a few crawlers out of China. The desirable, well-behaved crawlers are the only ones who might lose interest.

The typical entry point is a sitemap or RSS feed.

Overall I think the author is misguided in using the tarpit approach. Slow sites get fewer crawls. I would suggest serving easily gzipped content with deeply nested tags instead. There are also tricks with XSL, but I doubt many mature crawlers will fall for that one.
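
One way to read that suggestion: serve markup that is nearly free to generate and transfer (because it compresses extremely well) but still costs the crawler real work to parse. A rough sketch, with arbitrary tag choice and nesting depth:

    # Minimal sketch of the "easily gzipped, deeply nested" idea: generate
    # highly repetitive, deeply nested markup that compresses to a tiny
    # payload, so serving it costs the site almost nothing while the crawler
    # still has to parse the full document. Depth and tag name are arbitrary.
    import gzip

    def nested_page(depth: int = 10_000) -> bytes:
        """Build an HTML document whose body is `depth` nested <div> tags."""
        body = ("<div>" * depth) + "deep" + ("</div>" * depth)
        html = f"<!doctype html><html><body>{body}</body></html>"
        return html.encode("utf-8")

    if __name__ == "__main__":
        raw = nested_page()
        compressed = gzip.compress(raw, compresslevel=9)
        print(f"raw: {len(raw):,} bytes, gzipped: {len(compressed):,} bytes "
              f"(ratio ~{len(raw) / len(compressed):.0f}x)")

The print statement reports the compression ratio, which is the whole point: the wire cost stays tiny while the decompressed, deeply nested tree is what the crawler has to chew on.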