
    586 points prawn | 14 comments
    schoen ◴[] No.14502425[source]
    I wrote this article/originally created this list, and I would like to emphasize that there is a second generation of this technology that probably uses dithering parameters or something of that sort, and that does not produce visible dots but still creates a tracking code. We don't know the details but we do know that some companies told governments that they were going to do this, and that some newer printers from companies that the government agencies said were onboard with forensic marking no longer print yellow dots.

    That makes me think that it may have been a mistake to create this list in the first place, because the main practical use of the list would be to help people buy color laser printers that don't do forensic tracking, yet it's not clear that any such printers are actually commercially available.

    replies(8): >>14502841 #>>14503474 #>>14504327 #>>14504357 #>>14504856 #>>14505064 #>>14505539 #>>14507194 #
    1. captainmuon ◴[] No.14504357[source]
    Is somebody working on identifying these modern watermarks? A start would be to print out test pages and compare high resolution scans. Maybe also multiple printouts from the same printer to see what the natural variation is, and if there is a timestamp component.

    I would start, but I'm currently not around a printer...
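
    A minimal sketch of the comparison described above, assuming the scans are already aligned pixel-for-pixel and loaded as grayscale grids (both big assumptions; real scans would need registration first). The function names are hypothetical. Repeated printouts from one printer give a per-pixel noise estimate, and a page from a second printer is flagged only where it differs by more than that noise plus a margin.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale values 0-255, row-major


def per_pixel_spread(scans: List[Image]) -> Image:
    """Max minus min at each pixel across repeated scans of the same page."""
    rows, cols = len(scans[0]), len(scans[0][0])
    return [
        [max(s[r][c] for s in scans) - min(s[r][c] for s in scans)
         for c in range(cols)]
        for r in range(rows)
    ]


def suspicious_pixels(a: Image, b: Image, noise: Image,
                      margin: int = 10) -> List[Tuple[int, int]]:
    """Coordinates where two pages differ by more than noise plus margin."""
    return [
        (r, c)
        for r in range(len(a))
        for c in range(len(a[0]))
        if abs(a[r][c] - b[r][c]) > noise[r][c] + margin
    ]
```

    Anything this flags is only a candidate; it could still be toner variation rather than a deliberate mark.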

    replies(3): >>14504471 #>>14504803 #>>14505308 #
    2. amelius ◴[] No.14504471[source]
    Shouldn't we be able to control a printer on the pixel (dither) level in the first place?
    replies(2): >>14504580 #>>14504600 #
    3. leni536 ◴[] No.14504580[source]
    This would be nice for other reasons too. There can be better halftoning algorithms than the typical pattern-based halftoning of laser printers. It's hard to calibrate though, because printers don't print "pixels". They print dots that typically overlap a lot; DPI for printers is the resolution for positioning the dots, not the size of the dots.
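
    To make the contrast concrete, here is a rough sketch of a pattern-based (ordered) halftone next to Floyd-Steinberg error diffusion, one well-known alternative. It assumes an idealized grid where each output cell is an independent dot, which is exactly what real printers do not give you, so it ignores dot overlap entirely. Input values are ink coverage from 0.0 (none) to 1.0 (full); an output of 1 means "place a dot".

```python
from typing import List

Image = List[List[float]]  # ink coverage in [0.0, 1.0], row-major

# Thresholds derived from the 2x2 Bayer matrix [[0, 2], [3, 1]] as (m + 1) / 5.
BAYER_2X2 = [[0.2, 0.6], [0.8, 0.4]]


def ordered_dither(img: Image) -> List[List[int]]:
    """Pattern-based halftone: compare each pixel to a tiled threshold."""
    return [
        [1 if img[y][x] > BAYER_2X2[y % 2][x % 2] else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


def floyd_steinberg(img: Image) -> List[List[int]]:
    """Error diffusion: push each pixel's rounding error onto its neighbours."""
    h, w = len(img), len(img[0])
    work = [row[:] for row in img]  # copy so the input is left untouched
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0
            out[y][x] = new
            err = old - new
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

    Error diffusion avoids the repeating tile pattern, but whether it looks better on paper depends on the calibration problem mentioned above.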
    replies(1): >>14508284 #
    4. verytrivial ◴[] No.14504600[source]
    Even if we are able to, if the default is to divulge the printer ID via a dithering pattern either at the driver or machine level when given a blob of image data, I think this problem becomes similar to "Can't we all just encrypt our email?" i.e. largely academic.
    replies(1): >>14504979 #
    5. pbhjpbhj ◴[] No.14504803[source]
    Surely we can skip the paper stage and hook up the motors used for head positioning control to a rig, either reading through rotary measure or preferably reading the signal to the motors directly.

    Print the same page, compare the signals sent to the motors? Won't that be a more easily/accurately measured proxy for what's actually being printed? One might need the timing data for the jets on an inkjet too, etc.

    replies(1): >>14504857 #
    6. dom0 ◴[] No.14504857[source]
    No printer controls their motors with >600 dpi resolution. Inkjet printers have print heads with many nozzles; the motors do the rough positioning / slide the head over the paper, the nozzles do the hard work. In laser printers a motor only moves the paper along (all rollers are either free-running or synchronized by gears).

    So for an inkjet you'd have to look at the nozzle timing, which might be difficult depending on how integrated the drivers are (e.g. if they're a custom chip on a flexprint behind the heads... uhm...). For a laser printer you'd have to look at the laser modulation signal. That should be much easier, bugs have done that before.

    Reverse engineering the firmware might be easier... on the other hand, the firmware is probably bolted shut rather well; the printer manufacturers' cartridge DRM is in there somewhere.

    replies(2): >>14505037 #>>14505813 #
    7. lucideer ◴[] No.14504979{3}[source]
    We're currently in a situation where we don't even really fully know which printers are doing it or exactly how, so it would be many giant leaps in the right direction at least.

    It would likely make identifying tracking marks and algorithms a lot easier.

    8. dithering ◴[] No.14505037{3}[source]
    Maybe "just after the electronics, but before the print heads/motors" is the appropriate place to probe. It might be more work than anyone's prepared to put in (and of questionable utility), but you could emulate the motors and heads, and generate an image of what would have been printed.
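
    As a very rough illustration of "generate an image of what would have been printed", the sketch below turns a captured one-dimensional signal into a bitmap. Everything about the capture is an assumption: that the signal is a list of sampled levels, that each scan line has a fixed, known number of samples, and that a simple threshold separates "dot" from "no dot". Real hardware would need line-sync detection and timing recovery, which are not shown.

```python
from typing import List


def rasterize(samples: List[float], samples_per_line: int,
              threshold: float = 0.5) -> List[List[int]]:
    """Fold a sampled modulation signal into rows of 0/1 dots.

    A trailing partial line is dropped rather than padded.
    """
    full_lines = len(samples) // samples_per_line
    return [
        [1 if samples[r * samples_per_line + c] >= threshold else 0
         for c in range(samples_per_line)]
        for r in range(full_lines)
    ]
```

    Two captures of the same page could then be compared dot by dot, without toner or paper noise in the way.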
    9. RegW ◴[] No.14505308[source]
    I suppose the approach is to create a machine learning dataset that maps hi-res scans of sample documents to the printers that produced them. If the resulting classifier can accurately id the printer, you have probably found a watermark, but it might just be natural variations in the manufacturing.
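
    A toy version of that idea, assuming each scan has already been reduced to a numeric feature vector (how to extract useful features is the hard part and is not shown). It uses a nearest-centroid classifier purely because it is short; the function names and the choice of classifier are illustrative, not a recommendation.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple


def fit_centroids(
    samples: Sequence[Tuple[Sequence[float], str]]
) -> Dict[str, List[float]]:
    """Average the feature vectors of each printer ID."""
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for features, printer_id in samples:
        grouped[printer_id].append(features)
    return {
        pid: [sum(col) / len(col) for col in zip(*vecs)]
        for pid, vecs in grouped.items()
    }


def predict(centroids: Dict[str, List[float]],
            features: Sequence[float]) -> str:
    """Return the printer ID whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda pid: dist(centroids[pid], features))
```

    High accuracy on held-out scans would suggest a per-printer signal exists, but as noted it would not tell you whether that signal is a deliberate watermark or manufacturing variation.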
    replies(2): >>14505531 #>>14508048 #
    10. Heliosmaster ◴[] No.14505531[source]
    still, if the result is a "fingerprint" of a printer, it'd be interesting to know, because it can be used by law enforcement too
    11. pbhjpbhj ◴[] No.14505813{3}[source]
    I was assuming the nozzles have some sort of actuator that approximates to the term "motor" - I think you're either driving a tiny heater, or piezo, or a charge deflector plate in inkjet printing? That presumably is where the jitter would physically manifest; so you'd look at the input signal to those elements?

    Reversing the firmware though, good call.

    12. Paul-ish ◴[] No.14508048[source]
    The difficulty in this approach is that you have an extremely large number of classes. Each printer is its own class. Typically, as the number of classes goes up, accuracy goes down. That isn't to say it isn't possible, but it would require a lot of custom hacks to any learning algorithm.

    Also to convince anyone that it works, you would need to test it out on an extremely large number of printers, including ones of the same model. In practice that could be expensive.

    replies(1): >>14508255 #
    13. kpil ◴[] No.14508255{3}[source]
    Nah, it's not feasible to know the printer model if you want to identify a laser-printed dollar.

    A few variants at most.

    14. kpil ◴[] No.14508284{3}[source]
    I think that ALL algorithms are better than the default halftone algorithms.

    I think it's possible to send saturated pixels using PCL, and tell the printer to disable half-toning. It requires that a full page fits in memory, which isn't much (512MB) but typically more than the default.

    For some reason all printers use really vintage memory, so 512MB extra memory is crazy-expensive.
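
    For a sense of scale on the full-page buffer, here is a back-of-the-envelope calculation. The page size, resolutions, bit depths and the four CMYK planes are all assumptions chosen for illustration; actual printers may buffer pages differently.

```python
def page_buffer_bytes(width_in: float, height_in: float, dpi: int,
                      bits_per_pixel: int, planes: int) -> int:
    """Uncompressed raster size for one page, rounded up to whole bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return (pixels * bits_per_pixel * planes + 7) // 8


# US Letter (8.5 x 11 in), four colour planes assumed.
for dpi in (600, 1200):
    for bits in (1, 8):
        size = page_buffer_bytes(8.5, 11, dpi, bits, 4)
        print(f"{dpi} dpi, {bits} bit/plane: {size / 1e6:.1f} MB")
```

    The spread between 1-bit and 8-bit planes is large, so the memory needed depends heavily on what the printer actually stores per dot.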