
268 points prashp | 4 comments
1. ants_everywhere ◴[] No.39215828[source]
I'm curious, was this well-known by experts already? How surprising is this?

I enjoyed the write up.

replies(3): >>39215866 #>>39215887 #>>39218438 #
2. GaggiX ◴[] No.39215866[source]
I have never heard of this problem before, and I have seen a lot of discussion about VAEs from researchers.
3. dwringer ◴[] No.39215887[source]
If you ever tried editing the latents before decoding them with the VAE, first in SD1.5 and then in SDXL, you would have seen that local changes had somewhat unpredictable, global effects on the image in SD1.5, while in SDXL the changes have more predictable effects on the output image, and some of the latent channels correspond more directly to the resulting image channels.

Definitely a fascinating write-up. I have been curious about these differences for a while, though I had never considered this a "problem" per se.
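
For anyone who wants to put a number on "local edits having global effects", here is a rough sketch of one way to measure it. It is not how the write-up did it. The decoders below are toy stand-ins, not the SD1.5 or SDXL VAEs, and `outside_fraction`, `local_decoder` and `leaky_decoder` are hypothetical names made up for this example. The only fact borrowed from Stable Diffusion is the 8x latent-to-pixel scale. The idea is to nudge a single latent position, decode before and after, and see how much of the pixel change falls outside the neighbourhood of the edited position.

```python
import numpy as np

SCALE = 8  # latent-to-pixel factor; 8 matches SD-style VAEs, adjust for others


def outside_fraction(decode, latent, y, x, delta=1.0, channel=0, radius=1):
    """Share of the decoded-image change that lands outside the pixel
    neighbourhood of the edited latent position (y, x).

    decode: any callable mapping a (C, h, w) latent to a (3, H, W) image.
    Returns a value in [0, 1]; values near 0 mean the edit stayed local.
    """
    edited = latent.copy()
    edited[channel, y, x] += delta

    # Per-pixel absolute change, summed over the image channels -> (H, W)
    diff = np.abs(decode(edited) - decode(latent)).sum(axis=0)
    total = diff.sum()
    if total == 0:
        return 0.0

    # Pixel window covering the edited latent cell plus `radius` neighbours
    y0, y1 = max((y - radius) * SCALE, 0), (y + radius + 1) * SCALE
    x0, x1 = max((x - radius) * SCALE, 0), (x + radius + 1) * SCALE
    local = diff[y0:y1, x0:x1].sum()
    return float(1.0 - local / total)


def local_decoder(z):
    # Toy stand-in: nearest-neighbour upsampling of the first three channels,
    # so each latent cell only ever touches its own 8x8 pixel block.
    return np.repeat(np.repeat(z[:3], SCALE, axis=1), SCALE, axis=2)


def leaky_decoder(z):
    # Toy stand-in: same as above, plus a term driven by the mean of
    # channel 0, so a single-cell edit shifts every pixel in the image.
    return local_decoder(z) + z[0].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(4, 8, 8))
    print("local toy decoder:", outside_fraction(local_decoder, z, 4, 4))
    print("leaky toy decoder:", outside_fraction(leaky_decoder, z, 4, 4))
```

With the toy decoders, the purely local one should score essentially zero and the leaky one well above it. To try this on a real model you would pass the actual VAE decode call as `decode` (wrapped so it takes and returns NumPy arrays) and compare the scores across latent positions and channels.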

4. numpad0 ◴[] No.39218438[source]
I once saw someone on Twitter wondering about what was, to them, obviously a bug in the VAE leading to oddly saturated images in the anime space. That's just from my dumb brain's keyword search, though.