
617 points by jbegley | 1 comment
causal No.42938604
One of my chief worries about LLMs in the hands of intelligence agencies is the ability to scale textual analysis. Previously, there at least had to be a human agent taking an interest in you; today an LLM could theoretically read every piece of text you've ever touched and flag anything from legal violations to political sentiments.
Etheryte No.42938955
This was already possible long before LLMs came along. I also doubt an LLM is the best tool for this at scale: if you're sifting through billions of messages, it gets expensive very fast.
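
As a rough back-of-envelope sketch of the cost involved (the per-token price and average message length below are assumed figures for illustration, not any provider's actual pricing):

    # Back-of-envelope cost of running an LLM over a message corpus.
    # All numbers below are illustrative assumptions, not real pricing.
    MESSAGES = 1_000_000_000          # a billion messages
    TOKENS_PER_MESSAGE = 200          # assumed average message length
    PRICE_PER_MTOK = 1.00             # assumed $ per million input tokens

    total_tokens = MESSAGES * TOKENS_PER_MESSAGE
    cost = total_tokens / 1_000_000 * PRICE_PER_MTOK
    print(f"{total_tokens:,} tokens -> ${cost:,.0f}")
    # 200,000,000,000 tokens -> $200,000

Which way that number cuts is debatable: six figures per full pass is a lot for repeated sweeps, but not obviously out of reach for a state budget.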
int_19h No.42942533
It's only expensive if you throw all the data directly at the largest model you have. The usual way to apply LMs to data at this scale is to stage them in a cascade: very small, fast classifiers run first to weed out anything vaguely suspicious (and you tune them to be aggressive: false positives are okay, false negatives are not). Whatever gets through is reviewed by a more capable model. Repeat with as many tiers as throughput requires.
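
A minimal sketch of that cascade in Python, with stand-in stub classifiers (a real deployment would plug in an actual small model and a larger one; the watchlist terms and thresholds here are made up for illustration):

    from typing import Callable, List, Tuple

    # Each stage is (scoring function, threshold), ordered cheap -> expensive.
    # Early stages are tuned aggressively so false negatives are rare;
    # later, costlier stages prune the false positives.
    Stage = Tuple[Callable[[str], float], float]

    def cheap_keyword_score(text: str) -> float:
        """Stand-in for a tiny, fast first-pass classifier."""
        watchlist = {"protest", "leak", "encrypt"}   # illustrative terms only
        return 1.0 if any(w in text.lower() for w in watchlist) else 0.0

    def midsize_model_score(text: str) -> float:
        """Stand-in for a mid-size model; here just a dummy length heuristic."""
        return min(len(text) / 100.0, 1.0)

    def cascade(messages: List[str], stages: List[Stage]) -> List[str]:
        """Only survivors of stage N reach stage N+1."""
        survivors = messages
        for score, threshold in stages:
            survivors = [m for m in survivors if score(m) >= threshold]
        return survivors   # what's left goes to the largest model / a human

    stages: List[Stage] = [
        (cheap_keyword_score, 0.5),   # aggressive first pass over everything
        (midsize_model_score, 0.2),   # more careful pass on the remainder
    ]

    flagged = cascade(["buy milk", "how to encrypt a leak"], stages)
    print(flagged)   # -> ['how to encrypt a leak']

The important property is that each stage only pays its cost on what the previous stage let through, so the expensive model ends up seeing a tiny fraction of the corpus.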

No, OP is right. We are truly at the dystopian point where a sufficiently rich government can track the loyalty of its citizens in real time by monitoring all electronic communications.

Also, "expensive" is relative. Consider how much the US has historically been willing to spend on this kind of surveillance...