
646 points | blendergeek
hartator No.42725964
There are already “infinite” websites like these on the Internet.

Crawlers (both AI and regular search) have a set number of pages they want to crawl per domain. This number is usually determined by the popularity of the domain.

Unknown websites will get very few crawls per day, whereas popular sites get millions.

Source: I am the CEO of SerpApi.

replies(9): >>42726093 #>>42726258 #>>42726572 #>>42727553 #>>42727737 #>>42727760 #>>42728210 #>>42728522 #>>42742537 #
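The per-domain crawl-budget idea above can be sketched in a few lines. This is a hypothetical illustration, not SerpApi's or any real crawler's logic: the `daily_budget` formula and the `CrawlScheduler` class are made-up names, and the numbers are arbitrary.

```python
def daily_budget(popularity_score: float, base: int = 10, cap: int = 1_000_000) -> int:
    """Hypothetical formula: scale the per-domain daily page quota with popularity."""
    return min(cap, int(base * (1 + popularity_score)))

class CrawlScheduler:
    """Tracks how many pages each domain may still be crawled today."""

    def __init__(self) -> None:
        self.remaining: dict[str, int] = {}  # domain -> pages left today

    def allow(self, domain: str, popularity: float = 0.0) -> bool:
        if domain not in self.remaining:
            self.remaining[domain] = daily_budget(popularity)
        if self.remaining[domain] <= 0:
            return False  # budget exhausted: stop crawling this domain today
        self.remaining[domain] -= 1
        return True

# An unknown site gets a tiny quota; a popular one gets a large quota,
# which is why an "infinite" site on an unknown domain sees few requests.
sched = CrawlScheduler()
unknown_allowed = sum(sched.allow("unknown.example") for _ in range(100))
print(unknown_allowed)  # 10: only the small base budget is ever spent
```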
diggan No.42726093
> There are already “infinite” websites like these on the Internet.

Cool. And how much of the software driving these websites is FOSS that I can download and run for my own (popular enough to be crawled more than daily by multiple scrapers) website?

replies(2): >>42726322 #>>42726514 #
gruez No.42726322
Off the top of my head: https://everyuuid.com/

https://github.com/nolenroyalty/every-uuid

replies(2): >>42726420 #>>42732710 #
diggan No.42726420
Aren't those finite lists? How is a scraper (normal or LLM) supposed to "get stuck" on those?
replies(1): >>42726470 #
gruez No.42726470
Even though 2^128 UUIDs is technically "finite", for all intents and purposes it is infinite to a scraper.
replies(1): >>42728528 #
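The "effectively infinite" point is easy to check with back-of-the-envelope arithmetic. The crawl rate below is a made-up, absurdly generous assumption; even so, exhausting the UUID space takes vastly longer than the age of the universe.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # ~3.15e7 seconds

total_pages = 2 ** 128   # one page per possible UUID, ~3.4e38
rate = 10 ** 9           # hypothetical: 1 billion pages scraped per second

years = total_pages / (rate * SECONDS_PER_YEAR)
print(f"{years:.2e} years")  # ~1.08e+22 years; the universe is ~1.4e10 years old
```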