
131 points | timshell | 1 comment
imiric No.44378450
I applaud the effort. We need human-friendly CAPTCHAs, as much as they're generally disliked. They're the only solution to the growing spam and abuse problem on the web.

Proof-of-work CAPTCHAs work well for making bots expensive to run at scale, but they still rely on accurate bot detection. Avoiding both false positives and false negatives is crucial, yet no existing approach is reliable enough.

One comment re:

> While AI agents can theoretically simulate these patterns, the effort likely outweighs other alternatives.

For now. Behavioral and cognitive signals seem to work against the current generation of bots, but they will likely be defeated as AI tools become cheaper and more accessible. It's only a matter of time until attackers can train a model on real human input and run inference cheaply enough, or until the benefit of deploying a bot against a specific target outweighs the cost.

So I think we will need a different detection mechanism. Maybe something from the real world, some type of ID, or even micropayments. I'm not sure, but it's clear that bot detection is on the opposite, and currently losing, side of the AI race.

chrismorgan No.44379545
> We need human-friendly CAPTCHAs, as much as they're generally disliked. They're the only solution to the growing spam and abuse problem on the web.

This is wrong, badly wrong.

CAPTCHA stood for “Completely Automated Public Turing test to tell Computers and Humans Apart”. And that’s how people are using such things: to tell computers and humans apart. But that’s not the right problem.

Spam and abuse can come from computers, or from humans.

Productive use can come from humans, or from computers.

Abuse prevention should not be about distinguishing computers and humans: it should be about the actual usage behaviour.

CAPTCHAs are fundamentally solving the wrong problem. Twenty years ago, they were a tolerable proxy for the right problem: imperfect, but generally good enough. But they have become a worse proxy over time.

Also, “human-friendly CAPTCHAs” are just flat-out impossible in the long term. As you identify, it’s only a “for now” thing. Once it’s a target, it ceases to be effective. And the range in humans is so broad that it’s generally distressingly easy to make a bot exceed the lower reaches of human performance.

> Proof-of-work CAPTCHAs work well for making bots expensive to run at scale, but they still rely on accurate bot detection. Avoiding both false positives and negatives is crucial, yet all existing approaches are not reliable enough.

Proof-of-work is even more obviously a temporary solution, security by obscurity: it relies upon symmetry in computation power, which is just wildly incorrect. And all of the implementations I know of have made the bone-headed decision to start with SHA-256 hashing, which amplifies this asymmetry to ludicrous degree (factors of tens of thousands with common hardware, to tens of millions with Bitcoin mining hardware). At that point, forget choosing different iteration counts based on bot detection, it doesn’t even matter.
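To make the asymmetry concrete, here is a minimal hashcash-style SHA-256 proof of work in Python (a hypothetical sketch, not any particular vendor's scheme): the client grinds nonces until the digest has a given number of leading zero bits, so expected work doubles with every added bit of difficulty, while verification is a single hash. This exact loop is what GPUs run tens of thousands of times faster than a low-end phone, and Bitcoin ASICs millions of times faster.

```python
import hashlib
import itertools

def solve_pow(challenge: bytes, difficulty_bits: int) -> int:
    """Find a nonce such that SHA-256(challenge || nonce) begins with
    `difficulty_bits` zero bits. Expected cost: 2**difficulty_bits hashes."""
    target = 1 << (256 - difficulty_bits)  # digests below this value qualify
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify_pow(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs one hash, regardless of difficulty."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = solve_pow(b"example-challenge", 12)   # ~4096 hashes on average
assert verify_pow(b"example-challenge", nonce, 12)
```

Note that "difficulty" here is just an iteration count in disguise: the server can only pick a number of bits that is tolerable for the slowest legitimate client, which an attacker with better hardware then pays a tiny fraction of.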

—⁂—

The inconvenient truth is: there is no Final Ultimate Solution to the Spam Problem (FUSSP).

Dylan16807 No.44382001
> Proof-of-work is even more obviously a temporary solution, security by obscurity: it relies upon symmetry in computation power, which is just wildly incorrect. And all of the implementations I know of have made the bone-headed decision to start with SHA-256 hashing, which amplifies this asymmetry to ludicrous degree (factors of tens of thousands with common hardware, to tens of millions with Bitcoin mining hardware). At that point, forget choosing different iteration counts based on bot detection, it doesn’t even matter.

It takes a long time and enormous amounts of money to make new chips for a specific proof of work. And sites can change their algorithm on a dime. I don't think this is a big issue.

chrismorgan No.44383743
Even disregarding the SHA-256 thing, there is unavoidable, significant asymmetry and range that renders proof of work unviable. One legitimate user may use a low-end phone, another may have a high-end desktop that can work a hundred or more times as fast, whatever technique you use, and an attacker may have a botnet.

It’s important to assume, in security and security-adjacent things, that the attacker has more compute power than the defender. You cannot win in this way.

Proof-of-work is bad rate limiting that relies upon the server having a good estimate of the capabilities of the client. No more, no less.

I bring up the SHA-256 thing as an argument that none of the players in the space are competent. None of them. If you exclude hand-rolled cryptography or known-bad techniques like MD5, SHA-256 is very literally the worst choice remaining: its use in Bitcoin and the rewards available have utterly broken it for this application. If you intend proof of work to actually be the line of defence, you start with something like Argon2d instead. I honestly think that, at this stage, these scripts could replace their proof of work with a “sleep for one second” (maybe adding “or two if I think you’re probably a bot”) routine and have the server trust that they had done so, without compromising their effectiveness.
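For contrast, a memory-hard work function narrows the hardware gap described above. The comment names Argon2d, which is not in Python's standard library, so this sketch substitutes `hashlib.scrypt` (also memory-hard) purely to show the shape of the idea: each attempt forces the prover through a memory-bound KDF, which ASICs and GPUs accelerate far less than raw SHA-256.

```python
import hashlib
import itertools

def solve_memory_hard_pow(challenge: bytes, difficulty_bits: int) -> int:
    """Same hashcash loop as before, but each attempt runs scrypt with a
    ~16 MiB memory cost (128 * n * r bytes), so specialised hardware
    gains far less of an edge than it does over plain SHA-256."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.scrypt(
            nonce.to_bytes(8, "big"),
            salt=challenge,
            n=2**14, r=8, p=1,  # standard interactive-login parameters
            dklen=32,
        )
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify_memory_hard_pow(challenge: bytes, nonce: int,
                           difficulty_bits: int) -> bool:
    """Verification is a single scrypt call (note: the verifier pays the
    memory cost too, unlike with a cheap hash)."""
    digest = hashlib.scrypt(nonce.to_bytes(8, "big"), salt=challenge,
                            n=2**14, r=8, p=1, dklen=32)
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
```

This does not fix the deeper objection in the thread (the defender still has to guess the client's speed), but it does mean an attacker can no longer buy a factor of millions off the shelf.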