    The NSA Selector

    (github.com)
    302 points anigbrowl | 11 comments
    1. jll29 ◴[] No.44045997[source]
    In NSA parlance, a "selector" is primarily a string that semi-uniquely identifies and addresses a person's intercepted data, such as

    - an IP address,

    - an email address,

    - a phone number,

    - a SIM card's MSIN,

    - a person's social security number,

    - a national ID card number,

    - a passport number,

    - a social media handle etc.

    (elsewhere also known as "accessor", "key", "handle" or "index")
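    To make the "key" or "index" reading concrete, here is a minimal sketch in Python of a lookup keyed by selector. Everything in it is an assumption for illustration: `normalize`, `SelectorIndex`, and the record IDs are hypothetical names, not anything from the NSA or from the linked project, and the normalization rules are deliberately naive.

```python
from collections import defaultdict


def normalize(kind, value):
    """Canonicalize a selector so trivially different spellings compare equal.

    Hypothetical rules for illustration only; real normalization is harder.
    """
    value = value.strip()
    if kind == "email":
        return (kind, value.lower())
    if kind == "phone":
        # Keep digits only, dropping "+", spaces, and punctuation.
        return (kind, "".join(ch for ch in value if ch.isdigit()))
    return (kind, value)


class SelectorIndex:
    """Maps a (kind, value) selector to the records filed under it."""

    def __init__(self):
        self._records = defaultdict(list)

    def add(self, kind, value, record):
        self._records[normalize(kind, value)].append(record)

    def select(self, kind, value):
        # .get() avoids creating empty entries for unknown selectors.
        return list(self._records.get(normalize(kind, value), []))


index = SelectorIndex()
index.add("email", "Alice@Example.org", "msg-1")
index.add("email", "alice@example.org ", "msg-2")
index.add("phone", "+1 (555) 010-0000", "call-7")
```

    With this, `index.select("email", "ALICE@example.org")` returns both records filed under that address, which is the "semi-unique" part: the selector picks out the data, not necessarily one person.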

    replies(3): >>44046092 #>>44046548 #>>44047161 #
    2. jonathanstrange ◴[] No.44046092[source]
    They are interesting because combining and updating them is a non-trivial problem, as I've realized today while implementing a user ban system.
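    A minimal sketch of the "combining" half, assuming selectors seen together are treated as the same actor and that a ban applies to the whole group. It uses a plain union-find structure; `SelectorGroups` and its methods are hypothetical names for illustration, not the ban system mentioned above.

```python
class SelectorGroups:
    """Union-find over selectors; a ban on one member covers its whole group."""

    def __init__(self):
        self._parent = {}
        self._banned = set()  # roots of banned groups

    def _find(self, s):
        self._parent.setdefault(s, s)
        root = s
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the walk directly at the root.
        while self._parent[s] != root:
            self._parent[s], s = root, self._parent[s]
        return root

    def link(self, a, b):
        """Record that selectors a and b were observed together."""
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return
        self._parent[rb] = ra
        # If the absorbed group was banned, the merged group stays banned.
        if rb in self._banned:
            self._banned.discard(rb)
            self._banned.add(ra)

    def ban(self, s):
        self._banned.add(self._find(s))

    def is_banned(self, s):
        return self._find(s) in self._banned


groups = SelectorGroups()
groups.ban(("email", "spam@example.org"))
groups.link(("email", "spam@example.org"), ("ip", "203.0.113.7"))
groups.link(("ip", "203.0.113.7"), ("phone", "5550100"))
```

    Here `groups.is_banned(("phone", "5550100"))` is true even though the phone number was never banned directly. The sketch also shows why the problem is non-trivial: union-find cannot undo a link, so one shared IP address merges unrelated people into a banned group, and "updating" a selector has no cheap answer in this structure.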
    replies(1): >>44046528 #
    3. Terr_ ◴[] No.44046528[source]
    There's a certain system I work with where random unauthenticated visitors on the internet end up supplying data like name/phone/email, with no validation... And of course, the business wants to somehow convert that into a list of "real people" and start correlating it to other records.

    I've been trying to stop anything too terrible from happening by asking them to clarify their business requirements, e.g. what should happen when there is malicious impersonation, or what the expected result should be when inconsistencies and overlaps exist.

    It's not like there's no value to the data... but I'm afraid they don't really understand the problem and are hoping the magic computer can somehow *poof* garbage into fine cuisine.

    replies(2): >>44047188 #>>44047302 #
    4. dylan604 ◴[] No.44046548[source]
    Since he's building a sequencer, I'm almost disappointed it wasn't named Selecta.

    Rewind Bo Selecta!!

    5. tantalor ◴[] No.44047161[source]
    That doesn't tell me why this is called selector.
    replies(2): >>44047419 #>>44049544 #
    6. transcriptase ◴[] No.44047188{3}[source]
    “Enrichment” is what they actually want. People think it’s Google, Amazon, Facebook etc selling their data when in reality they are simply letting people target based on it.

    On the other hand, there are hundreds of companies nobody has ever heard of that do buy, collate, clean, and sell access to the data that apps, browser extensions, Windows apps, loyalty card programs, branded credit cards, retailers, and companies that scrape LinkedIn etc. will happily sell.

    You provide what you have and for a price these “enrichment” services will provide what is essentially a dossier of everything that can be even remotely inferred from the thousands of datapoints they have based on your email/name/phone.

    What most people think big tech is doing is actually being done in ways that are far more unsettling by companies with cutesy names and vague services that major companies sign contracts with to improve their signal-to-noise ratio.

    7. grues-dinner ◴[] No.44047302{3}[source]
    Everyone in countries with data protection laws: concern.
    8. Fnoord ◴[] No.44047419[source]
    > In CSS, selectors are patterns used to match, or select, the elements you want to style. Selectors are also used in JavaScript to enable selecting the DOM nodes to return as a NodeList.

    From [1]. This nomenclature was also used in SQL, way before CSS even existed.

    [1] https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_selecto...

    replies(1): >>44047823 #
    9. tantalor ◴[] No.44047823{3}[source]
    That doesn't help at all. What do SQL or CSS have to do with it?
    replies(1): >>44048098 #
    10. salawat ◴[] No.44048098{4}[source]
    My guess is it is meant tongue-in-cheek seeing as it is a way to render into audio all network traffic. The idea being that if you are being targeted by some nefarious, infamous 3 letter agency who has gotten something nefarious running on your system, this could, in theory, be used to identify traffic you didn't intend to have happen.

    Realistically speaking, it's hogwash because you'd have to profile all "loopback" traffic and whatnot, but in theory, with enough time and effort, that dastardly stream of bits could be heard, I guess. The circuit board itself though has quite a bit of punning in it.

    Also... If they really want on your databus, they will get on your databus. After a certain point, cost is not an issue.

    11. aa-jv ◴[] No.44049544[source]
    Probably as a piss-take? After all, the NSA is a wholesale subjugator of human rights - why not take the piss out of it?