(www.eff.org)

schoen ◴[06 Jun 17 23:30 UTC] No.14502425[source]▶

I wrote this article/originally created this list, and I would like to emphasize that there is a second generation of this technology that probably uses dithering parameters or something of that sort, and that does not produce visible dots but still creates a tracking code. We don't know the details but we do know that some companies told governments that they were going to do this, and that some newer printers from companies that the government agencies said were onboard with forensic marking no longer print yellow dots.

That makes me think that it may have been a mistake to create this list in the first place, because the main practical use of the list would be to help people buy color laser printers that don't do forensic tracking, yet it's not clear that any such printers are actually commercially available.

replies(8): >>14502841 #>>14503474 #>>14504327 #>>14504357 #>>14504856 #>>14505064 #>>14505539 #>>14507194 #

1. SomeStupidPoint ◴[07 Jun 17 00:45 UTC] No.14502841[source]▶

>>14502425 #

Could you elaborate/speculate on how dithering patterns would be used?

replies(2): >>14502957 #>>14502967 #

2. PatentTroll ◴[07 Jun 17 01:06 UTC] No.14502957[source]▶

>>14502841 (TP) #

This isn't an answer, but I used to examine patents in this space. There are very advanced watermarking methods out there that are stable through transcoding, compression, obfuscation, etc. while being invisible to the naked eye. Really amazing stuff, wouldn't be surprised if there were lots of watermarks on media (audio, video, still image) that aren't readily apparent. One of the big use cases I remember was watermarking movies so that it would be possible to identify the time and place that a cam bootleg was recorded. That's a camcorder aimed at a movie screen and then heavily compressed and distributed over the internet, and the watermarks would still be detectable.

replies(3): >>14503453 #>>14503558 #>>14503814 #

3. schoen ◴[07 Jun 17 01:09 UTC] No.14502967[source]▶

>>14502841 (TP) #

Speculation: Dithering algorithms traditionally include randomness. If you can determine a portion of this randomness by inspecting the output of the algorithm, then the dithering can be used to send a message, for example by using the message as a seed to a PRNG or as an input to a hash whose output serves as the randomness for the dithering operation. If the underlying message space is small enough, you could recover the message by brute force, examining each possible message and the results that it would have produced, and seeing which set of results matches the observed document.

The hand-waving part of this is "if you can determine a portion of this randomness by inspecting the output of the algorithm", because I don't really understand how easy this could be made without knowing the exact underlying signal that the dithering algorithm needs to quantize.
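The PRNG-seeding idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not a description of what any printer actually does: the names `dither` and `recover`, the SHA-256 seeding, and the serial number 4242 are all made up for the example, and the recovery step assumes the exact input signal is known, which is precisely the hand-waving part.

```python
import hashlib
import random


def dither(signal, message_id):
    """1-bit random-threshold dithering; the noise is seeded by the message.

    Hypothetical scheme: the message is hashed and the hash seeds the PRNG
    that supplies the per-pixel thresholds.
    """
    seed = hashlib.sha256(str(message_id).encode()).digest()
    rng = random.Random(seed)
    return [1 if value > rng.random() else 0 for value in signal]


def recover(signal, observed, message_space):
    """Brute force: re-dither the known signal under every candidate message
    and keep the candidates whose output matches the observed dots."""
    return [m for m in message_space if dither(signal, m) == observed]


# Assumed known input: a repeating grayscale ramp with values in [0, 1).
signal = [(i % 64) / 64 for i in range(512)]
printed = dither(signal, 4242)            # 4242 is a made-up serial number
candidates = recover(signal, printed, range(5000))
```

With a message space this small the search is quick. Without the original signal, or after a noisy scan, exact matching like this would not work and some statistical comparison would be needed instead.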

An alternative might be slightly changing some of the values in the matrices at

https://en.wikipedia.org/wiki/Ordered_dithering

in a way that barely reduces perceptual image quality (although I'm not certain how well that can be done). Perhaps there is an algorithm that uses statistics to deduce what matrix was used, and then the perturbations can be read out of the matrix.
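Here is a toy version of the perturbed-matrix idea, again purely as a sketch. The assumptions are mine: a standard 4x4 Bayer matrix, one hidden bit per pair of adjacent ranks (2k and 2k+1 are swapped when the bit is 1), and a test image made of flat gray tiles so that every cell position sees the same mix of gray levels. The function names are hypothetical.

```python
# Standard 4x4 Bayer index matrix; the threshold for rank r is (r + 0.5) / 16.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def embed(bits):
    """Return a perturbed matrix: bit k == 1 swaps ranks 2k and 2k+1."""
    return [[rank ^ bits[rank // 2] for rank in row] for row in BAYER_4X4]


def dither_tile(gray, matrix):
    """Ordered dithering of one flat 4x4 tile with gray level in [0, 1]."""
    return [[1 if gray > (rank + 0.5) / 16 else 0 for rank in row]
            for row in matrix]


def estimate_matrix(tiles):
    """Rank cell positions by how often they print.

    A position that prints more often must have a lower threshold, so the
    ordering of the counts reveals the ordering of the matrix entries.
    """
    counts = {(r, c): sum(t[r][c] for t in tiles)
              for r in range(4) for c in range(4)}
    order = sorted(counts, key=counts.get, reverse=True)
    estimate = [[0] * 4 for _ in range(4)]
    for rank, (r, c) in enumerate(order):
        estimate[r][c] = rank
    return estimate


def extract(estimate):
    """Read each bit by checking whether ranks 2k and 2k+1 were swapped."""
    bits = [0] * 8
    for r in range(4):
        for c in range(4):
            base = BAYER_4X4[r][c]
            if base % 2 == 0:
                bits[base // 2] = estimate[r][c] ^ base
    return bits


secret = [1, 0, 1, 1, 0, 0, 1, 0]
matrix = embed(secret)
tiles = [dither_tile(n / 64, matrix) for n in range(64)]  # 64 flat gray tiles
recovered = extract(estimate_matrix(tiles))
```

On a real photograph the per-position statistics would be much noisier, and swapping ranks shifts thresholds by 1/16, which may or may not be perceptually negligible.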

This is related to research in digital watermarking that's been going on for decades, and I'm definitely not an expert in that or in digital image processing, so I'd love to hear from people who know more.

Nonetheless, looking up close at how printers produce different colors out of CMYK dots, I'm pretty confident that they have some degrees of freedom, and that some of them probably don't make a lot of difference perceptually, and can probably be used to encode a message.

4. kem ◴[07 Jun 17 03:12 UTC] No.14503453[source]▶

>>14502957 #

One of the things I wondered when I read this story is whether it would be possible to develop software that would circumvent this kind of marking, for example by using an autoencoder or something similar to intentionally lose the watermarking details. But what you're discussing seems more advanced than the yellow dots idea.

I was surprised to see printers being involved--I thought something like this leak would all be digital. I was also surprised that the Intercept would not be more savvy about printer identification because it's been publicized so much over the years.

5. jerf ◴[07 Jun 17 03:36 UTC] No.14503558[source]▶

>>14502957 #

A simple but reasonably robust solution to the cam problem is just to screw around with black frames, that is, the frames in the middle of a fade-out. Give yourself 20 places where you can insert an extra frame and choose 10 to insert, and you've already given yourself about 17 bits to play with (there are C(20,10) = 184,756 ways to pick them).

(It's trivial to deal with the audio sync issues.)

Cams may have a lot of spatial unreliability, but they have a lot of temporal resolution.

And that's just my stupid off-the-cuff answer, which is already off to a decent start. And there are in fact purely spatial solutions that do work, to which the temporal solutions can be added. The upshot is don't expect to beat these anytime soon. There are just too many bits to hide in, and so few bits needed for the identification.
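The capacity of the black-frame scheme can be checked by counting the possible insertion patterns. The sketch below does that and shows one hypothetical way to map a copy ID to a pattern and back (lexicographic order of the combinations). The names `encode` and `decode` and the ID 123456 are invented for illustration.

```python
import math
from itertools import combinations, islice

SLOTS = 20      # places where an extra black frame could go
INSERTED = 10   # how many of them actually get one


def capacity_bits(slots, inserted):
    """Bits of information carried by choosing `inserted` of `slots`."""
    return math.log2(math.comb(slots, inserted))


def encode(copy_id):
    """Return the slots that get an extra frame for this copy ID."""
    return next(islice(combinations(range(SLOTS), INSERTED), copy_id, None))


def decode(slots_used):
    """Recover the copy ID by finding the pattern's position in the list."""
    target = tuple(sorted(slots_used))
    for index, combo in enumerate(combinations(range(SLOTS), INSERTED)):
        if combo == target:
            return index
    raise ValueError("not a valid insertion pattern")


patterns = math.comb(SLOTS, INSERTED)       # 184756
bits = capacity_bits(SLOTS, INSERTED)       # a little under 17.5
pattern = encode(123456)
copy_id = decode(pattern)
```

Choosing exactly 10 of 20 slots gives 184,756 patterns, or roughly 17.5 bits. Letting each slot be used or not independently would give 20 bits, and 40 slots with 20 insertions would give about 37 bits.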

replies(2): >>14503589 #>>14504762 #

6. Medaber ◴[07 Jun 17 03:45 UTC] No.14503589{3}[source]▶

>>14503558 #

No. It's not that simple.

For a real example that actually works, see Digimarc:

https://www.digimarc.com/support/product/digimarc-guardian-f...

Images can be cropped, rotated, recompressed, scaled, etc. and the digital watermark remains.

Also see: https://en.wikipedia.org/wiki/Digimarc

and read some of their patents, referenced in the Wikipedia article.

replies(2): >>14504850 #>>14507113 #

7. TheHegemon ◴[07 Jun 17 04:41 UTC] No.14503814[source]▶

>>14502957 #

I have implemented one of those technologies at a previous company. Their claims and what the software was actually capable of were vastly different.

I would say about 20% of the files we sent over had enough recoverable watermark to be useful.

8. TeMPOraL ◴[07 Jun 17 09:19 UTC] No.14504762{3}[source]▶

>>14503558 #

> The upshot is don't expect to beat these anytime soon. There are just too many bits to hide in, and so few bits needed for the identification.

I agree. Some ideas off the top of my head that I literally just came up with:

- if printing an image, drop a few dots in some rows (or columns); data is hidden in the pattern of dropped dots

- if printing text (as in, actual text that is rendered at the driver or printer firmware level, and not by the OS / text editor), slightly alter the shape of some letters (by adding or dropping a dot) to hide a pattern

- if printing an image, try to hide some data in its FFT (e.g. by adjusting differences between low frequencies and hiding a pattern there)

- if recording a video, slightly alter some otherwise stable global characteristic (like avg brightness of a bunch of consecutive frames in an animated movie)

- if recording a video, screw with timing patterns, as you mentioned

There are just so many properties that the difficulty is probably mostly in picking something that's stable through the usual transformations a document will undergo (e.g. scanning, JPG compression).
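To make one of these concrete, here is a rough sketch of the average-brightness idea for video. Everything in it is an assumption for illustration: frames are plain lists of pixel values, each bit nudges a group of 12 consecutive frames up or down by one brightness level, and the detector has the original frames to compare against. The Gaussian noise is only a crude stand-in for what a recapture would do.

```python
import random
from statistics import fmean

GROUP = 12    # frames per embedded bit (assumed)
DELTA = 1.0   # brightness shift on a 0-255 scale (assumed)


def embed(frames, bits):
    """Shift each group of GROUP frames up (bit 1) or down (bit 0)."""
    marked = []
    for i, frame in enumerate(frames):
        k = i // GROUP
        shift = 0.0
        if k < len(bits):
            shift = DELTA if bits[k] else -DELTA
        marked.append([p + shift for p in frame])
    return marked


def extract(original, observed, n_bits):
    """Compare each group's mean brightness against the original's."""
    bits = []
    for k in range(n_bits):
        span = slice(k * GROUP, (k + 1) * GROUP)
        ref = fmean(p for f in original[span] for p in f)
        got = fmean(p for f in observed[span] for p in f)
        bits.append(1 if got > ref else 0)
    return bits


rng = random.Random(0)
payload = [1, 0, 0, 1, 1, 1, 0, 1]
# Toy "video": 96 frames of 64 mid-gray pixels each.
original = [[rng.uniform(100, 150) for _ in range(64)]
            for _ in range(GROUP * len(payload))]
marked = embed(original, payload)
# Per-pixel noise averages out over a whole group of frames.
noisy = [[p + rng.gauss(0, 4) for p in f] for f in marked]
recovered = extract(original, noisy, len(payload))
```

The per-pixel change is far below the noise level, but averaging over hundreds of pixels per group makes the shift detectable. A global brightness or contrast change during recapture would break this naive detector, which is where the real difficulty lies.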

9. hueving ◴[07 Jun 17 09:43 UTC] No.14504850{4}[source]▶

>>14503589 #

>Images can be cropped, rotated, recompressed, scaled, etc. and the digital watermark remains.

And none of those would affect the timing of the black intervals.

Also, this makes digimarc sound crappy (from their site):

>Facebook compresses images once they are posted, sometimes heavily, which can damage our invisible identifiers. Fortunately, there is a simple solution: if you pre-compress your images, then apply our identifiers, they should survive.

So they don't survive compression.

replies(1): >>14505591 #

10. jerf ◴[07 Jun 17 15:17 UTC] No.14507113{4}[source]▶

>>14503589 #

"And that's just my stupid off-the-cuff answer, which is already off to a decent start. And there are in fact purely spatial solutions that do work, to which the temporal solutions can be added."

List of Printers Which Do or Do Not Display Tracking Dots