
454 points positiveblue | 1 comments
seanvelasco ◴[] No.45066752[source]
as a Cloudflare customer, I am happy with their proposition. I personally do not want companies like Perplexity that fake their user-agent and ignore my robots.txt to trespass.

and isn't this why people sign up with Cloudflare in the first place? for bot protection? to me, this is just the same, but with agents.

i love the idea of an open internet, but this requires all parties to be honest. a company like Perplexity that fakes their user-agent to get around blocks disrespects that idea.

my attitude towards agents is positive. if a user used an LLM to access my websites and web apps, i'm all for it. but the LLM providers must disclose who they are - that they are OpenAI, Google, Meta, or the snake oil company Perplexity.

chrisweekly ◴[] No.45067583[source]
Your complaints about "faking their user-agent" remind me of this 15-year-old but still-relevant classic post about the history of the user-agent string:

https://webaim.org/blog/user-agent-string-history/

TLDR the UA string has always been "faked", even in the scenarios you might think are most legitimate.

1. jeroenhd ◴[] No.45072900[source]
The traditional UA fakery (adding Mozilla to the start and then just tacking on browser engine names) was the result of outdated websites breaking browsers.
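
To make that concrete, here is a rough sketch that pulls the product names out of a typical desktop Chrome UA string. The version numbers in the string are only illustrative, and `product_tokens` is a hypothetical helper written for this example, not part of any library.

```python
import re

# A typical desktop Chrome User-Agent string (version numbers are illustrative).
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)


def product_tokens(ua: str) -> list[str]:
    """Return the product names (the part before each '/'), ignoring
    anything inside parenthesised comments."""
    # Drop the "(...)" comment sections so only "name/version" pairs remain.
    without_comments = re.sub(r"\([^)]*\)", " ", ua)
    return [m.group(1) for m in re.finditer(r"([A-Za-z][\w.-]*)/\S+", without_comments)]


# Chrome announces itself as Mozilla, AppleWebKit and Safari as well.
print(product_tokens(CHROME_UA))
```

Only one of those names is the actual browser. The rest are legacy compatibility tokens kept so that old sniffing code keeps working.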

The problematic fakery here is that bots are pretending to be people by emulating browsers to evade rate limits and other technical controls.

That second category has also been with us since the dawn of the internet, but it has always been something worth complaining about. No trustworthy tool or service will pretend to be a real browser, at least not by default.

If AI agents just identified themselves as such, we wouldn't need elaborate schemes to block them when they need to be blocked.
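
As a minimal sketch of how simple blocking becomes when agents are honest: a site could key its policy on the self-declared token in the User-Agent header. The policy table, the `decide` function and the example header values below are all assumptions made up for illustration. The tokens are ones that crawlers publicly announce, but the allow/block choices are arbitrary.

```python
# Hypothetical per-site policy keyed on self-declared crawler tokens.
# The allow/block choices here are arbitrary examples, not a recommendation.
DECLARED_AGENT_POLICY = {
    "GPTBot": "allow",
    "Googlebot": "allow",
    "PerplexityBot": "block",
}


def decide(user_agent: str) -> str:
    """Return "allow" or "block" for a declared agent, else "unknown".

    "unknown" covers real browsers *and* bots that imitate them: the
    header alone cannot tell those two apart, which is the whole problem.
    """
    ua = user_agent.lower()
    for token, action in DECLARED_AGENT_POLICY.items():
        if token.lower() in ua:
            return action
    return "unknown"


# Illustrative header values, not copied from any vendor's documentation.
print(decide("Mozilla/5.0 (compatible; PerplexityBot/1.0)"))
print(decide("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"))
```

A check like this only works for agents that declare themselves. Anything that sends a browser-looking string falls into "unknown", and that is where the elaborate schemes come in.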