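A simple but reasonably robust solution to the cam problem is just to screw around with black frames, that is, the frames in the middle of a fade-out. Give yourself 20 places where you can insert an extra frame and choose 10 of them to insert at, and you've already given yourself about 17 bits to play with, since there are C(20,10) = 184,756 ways to choose. Let each of 40 places independently carry or not carry an extra frame and you get a full 40 bits.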
(It's trivial to deal with the audio sync issues.)
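To put rough numbers on the black-frame idea, here is a minimal Python sketch. It assumes each copy gets an integer ID that is mapped to a choice of 10 insert positions out of 20 using the standard combinatorial number system. The function names (`capacity_bits`, `encode`, `decode`) are made up for illustration, and nothing here touches actual video; it only shows the counting and the ID-to-pattern mapping.

```python
from math import comb, log2

def capacity_bits(slots: int, inserts: int) -> float:
    """Bits available when choosing `inserts` positions out of `slots`."""
    return log2(comb(slots, inserts))

def encode(ident: int, slots: int = 20, inserts: int = 10) -> list[int]:
    """Map an integer ID to a sorted list of slot indices (hypothetical scheme)."""
    if not 0 <= ident < comb(slots, inserts):
        raise ValueError("ID does not fit in this many slots")
    chosen = []
    k = inserts
    # Greedy combinatorial number system: walk from the highest slot down and
    # take a slot whenever its binomial weight still fits in the remaining ID.
    for pos in range(slots - 1, -1, -1):
        if k == 0:
            break
        weight = comb(pos, k)
        if ident >= weight:
            chosen.append(pos)
            ident -= weight
            k -= 1
    return sorted(chosen)

def decode(chosen: list[int]) -> int:
    """Recover the ID from the set of slots where an extra frame was seen."""
    return sum(comb(pos, i + 1) for i, pos in enumerate(sorted(chosen)))

if __name__ == "__main__":
    print(comb(20, 10), "patterns,", round(capacity_bits(20, 10), 1), "bits")
    pattern = encode(123456)
    print(pattern, "->", decode(pattern))
```

Someone recovering the ID from a cam only has to count frames in each fade to see which slots got an extra one, then run `decode` on that set. More slots, or more fades, push the capacity up quickly.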
Cams may have a lot of spatial unreliability, but they have a lot of temporal resolution.
And that's just my stupid off-the-cuff answer, which is already off to a decent start. And there are in fact purely spatial solutions that do work, to which the temporal solutions can be added. The upshot is: don't expect to beat these anytime soon. There are just too many bits to hide in, and so few bits needed for the identification.