AV1@Scale: Film Grain Synthesis, The Awakening

(netflixtechblog.com)

252 points CharlesW | 2 comments | 03 Jul 25 16:34 UTC

crazygringo ◴[03 Jul 25 20:54 UTC] No.44459098[source]▶

This fails to acknowledge that synthesized noise can lack the detail and information in the original noise.

When you watch a high-quality encode that includes the actual noise, there is a startling increase in resolution from seeing a still to seeing the video. The noise is effectively dancing over a signal, and at 24 fps the signal is still perfectly clear behind it.

Whereas if you lossily encode each frame in a way that discards the noise and then add back artificial noise to match the original "aesthetically", the original detail is non-recoverable if this is done frame by frame. Watching at 24 fps produces a fundamentally blurrier viewing experience. And it's not subtle -- on old noisy movies the difference in detail can be 2x.

Now, if H.265 or AV1 is actually building its "noise-removed" frames by always taking into account several preceding and following frames while accounting for movement, it could in theory discover the signal of the full detail across time and encode that, and there wouldn't be any loss in detail. But I don't think it does? I'd love to know if I'm mistaken.

But basically, the point is: comparing noise removal and synthesis can't be done using still images. You have to see an actual video comparison side-by-side to determine if detail is being thrown away or preserved. Noise isn't just noise -- noise is detail too.
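To make the "signal across time" idea above concrete, here is a minimal sketch, not anything taken from AV1 or H.265. It assumes a perfectly static, aligned scene (so no motion compensation is needed) and independent Gaussian grain per frame; the signal shape, `sigma`, and the helper names are all made up for illustration. Under those assumptions, averaging 24 frames should shrink the error by roughly a factor of sqrt(24), which is the sense in which detail buried in a single noisy still can reappear over time.

```python
import math
import random


def rms_error(estimate, truth):
    """Root-mean-square difference between two equal-length sample lists."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


def noisy_frame(signal, sigma, rng):
    """One 'frame': the static signal plus independent Gaussian grain."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def temporal_average(frames):
    """Per-sample mean across frames (assumes a static, perfectly aligned scene)."""
    n = len(frames)
    return [sum(samples) / n for samples in zip(*frames)]


rng = random.Random(0)  # fixed seed so the run is repeatable
signal = [math.sin(i / 5.0) for i in range(2000)]  # hypothetical fine detail
sigma = 1.0  # assumption: grain as strong as the signal amplitude
frames = [noisy_frame(signal, sigma, rng) for _ in range(24)]  # one second at 24 fps

single = rms_error(frames[0], signal)
averaged = rms_error(temporal_average(frames), signal)
print(f"single frame RMS error:     {single:.3f}")
print(f"24-frame average RMS error: {averaged:.3f}")
```

A real encoder would have to track motion before combining frames, which is exactly the open question in the comment; this sketch only shows why the combined frames can carry more detail than any one still.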

replies(7): >>44459330 #>>44459689 #>>44460601 #>>44461005 #>>44463130 #>>44465357 #>>44467163 #

kderbe ◴[03 Jul 25 21:25 UTC] No.44459330[source]▶

>>44459098 #

Grain is independent frame-to-frame. It doesn't move with the objects in the scene (unless the video's already been encoded strangely). So long as the synthesized noise doesn't have an obvious temporal pattern, comparing stills should be fine.

Regarding aesthetics, I don't think AV1's synthesized grain takes into account the size of the grains in the source video. Chunky grain from an old film source, with its big silver halide crystals, would then appear as fine grain in the synthesis, which looks wrong (this might be mitigated by a good film denoiser). It also doesn't model film's separate color components properly, but supposedly that doesn't matter because Netflix's video sources are often chroma subsampled to begin with: https://norkin.org/pdf/DCC_2018_AV1_film_grain.pdf

Disclaimer: I just read about this stuff casually so I could be wrong.

replies(6): >>44459691 #>>44460021 #>>44460119 #>>44460217 #>>44460409 #>>44461097 #

1. majormajor ◴[04 Jul 25 04:19 UTC] No.44461097[source]▶

>>44459330 #

> So long as the synthesized noise doesn't have an obvious temporal pattern, comparing stills should be fine.

The problem is that the initial noise-removal and compression passes still removed detail (that is more visible in motion than in stills) that you aren't adding back.

If you do noise-removal well you don't have to lose detail over time.

But it's much harder to do streaming-level video compression on a noisy source without losing that detail.

The grain they're adding somewhat distracts from the compression blurriness but doesn't bring back the detail.

replies(1): >>44461153 #

2. Thorrez ◴[04 Jul 25 04:36 UTC] No.44461153[source]▶

>>44461097 (TP) #

>The grain they're adding somewhat distracts from the compression blurriness but doesn't bring back the detail.

Instead of wasting bits trying to compress noise, they can remove noise first, then compress, then add noise back. So now there aren't wasted bits compressing noise, and those bits can be used to compress detail instead of noise. So if you compare FGS compression vs non-FGS compression at the same bitrate, the FGS compression did add some detail back.
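The "wasted bits" argument above can be illustrated with a toy experiment. This is only a sketch under loud assumptions: `zlib` stands in for a video codec (it is lossless and nothing like AV1), the "image" is a hypothetical 1-D sine pattern, and the grain is Gaussian with a made-up strength. It shows just the first half of the claim, that random grain is expensive to encode; it says nothing about how a real encoder spends the bits it saves.

```python
import math
import random
import zlib


def to_bytes(samples):
    """Round and clamp samples to the 8-bit range, then pack them as bytes."""
    return bytes(max(0, min(255, round(s))) for s in samples)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical clean "picture": a smooth, repeating pattern.
clean = [128 + 100 * math.sin(2 * math.pi * i / 256) for i in range(4096)]
# Same picture with independent grain added (sigma of 8 levels is an assumption).
grainy = [s + rng.gauss(0.0, 8.0) for s in clean]

clean_size = len(zlib.compress(to_bytes(clean), 9))
grainy_size = len(zlib.compress(to_bytes(grainy), 9))
print(f"clean:  {clean_size} bytes")
print(f"grainy: {grainy_size} bytes")
```

The grainy version should come out several times larger, since a compressor cannot predict random noise. Whether a fixed-bitrate encode with FGS ends up with more real detail than one without still depends on how much detail the denoiser removes, which is the point of disagreement in this thread.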
