268 points prashp | 2 comments
1. prashp No.39215963
Emad (StabilityAI founder) posted on the reddit thread:

"Nice post, you'd be surprised at the number of errors like this that pop up and persist.

This is one reason we have multiple teams working on stuff..

But you still get them"

replies(1): >>39216094 #
2. GaggiX No.39216094
Another example is when people realized that SD v1.5 couldn't generate images that were very dark or very bright. The problem turned out to be in the noise schedule: during training, even the noisiest step still contains a little signal, enough for the model to detect the mean of the actual image. The schedule is set up this way because an epsilon-objective model can't be trained on pure Gaussian noise: recovering the image from the predicted noise means dividing by the signal scale, which would be zero. During inference, though, the first step has no signal at all, so the model reads the mean of its input (zero, since the input is Gaussian noise) and outputs an image with mean 0.
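To make the leak concrete, here is a rough sketch in plain Python. It assumes the "scaled linear" beta schedule commonly used for SD v1.x (beta_start=0.00085, beta_end=0.012, 1000 steps); the all-black toy "image", the sample count, and the helper names are made up for illustration and are not taken from any real codebase.

```python
import math
import random

# Assumption: the "scaled linear" schedule commonly used for SD v1.x.
BETA_START, BETA_END, NUM_STEPS = 0.00085, 0.012, 1000


def terminal_sqrt_alpha_bar(beta_start=BETA_START, beta_end=BETA_END, steps=NUM_STEPS):
    """Signal scale sqrt(alpha_bar_T) left at the noisiest training step."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    log_alpha_bar = 0.0
    for i in range(steps):
        # Betas are linear in sqrt-space, then squared.
        beta = (lo + (hi - lo) * i / (steps - 1)) ** 2
        log_alpha_bar += math.log(1.0 - beta)
    return math.exp(0.5 * log_alpha_bar)


def noisiest_training_input(x0, sqrt_ab, rng):
    """x_T = sqrt(alpha_bar_T) * x0 + sqrt(1 - alpha_bar_T) * eps."""
    sigma = math.sqrt(1.0 - sqrt_ab ** 2)
    return [sqrt_ab * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
n = 4 * 64 * 64                      # hypothetical latent size
sqrt_ab = terminal_sqrt_alpha_bar()  # small, but not zero

dark_image = [-1.0] * n              # toy stand-in for a very dark image
train_mean = mean(noisiest_training_input(dark_image, sqrt_ab, rng))
infer_mean = mean([rng.gauss(0.0, 1.0) for _ in range(n)])  # pure noise

print(f"sqrt(alpha_bar_T)          = {sqrt_ab:.4f}")
print(f"mean of noisiest train x_T = {train_mean:+.4f}")
print(f"mean of inference x_T      = {infer_mean:+.4f}")
```

Averaged over thousands of values, the small leftover signal shifts the mean of the training input well away from zero, while the pure-noise inference input stays near zero. That mismatch is what the model ends up relying on.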

It's not uncommon to find major problems with these systems. I remember inspecting the VQGAN used by Dalle Mega (the largest version of Dalle Mini) and discovering that the vast majority of codebook entries had a magnitude very close to zero, making them completely unusable by the model.
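A check like that can be sketched in a few lines. This is only an illustration: the codebook here is a tiny made-up list of vectors, and the threshold and the function name `dead_entry_fraction` are assumptions, not anything from the actual VQGAN code.

```python
import math


def dead_entry_fraction(codebook, threshold=1e-3):
    """Fraction of codebook vectors whose L2 norm is below `threshold`."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in codebook]
    dead = sum(1 for norm in norms if norm < threshold)
    return dead / len(norms)


# Hypothetical 4-entry codebook: two usable vectors, two near-zero ones.
toy_codebook = [
    [0.5, -0.2, 0.1],
    [1e-5, 0.0, -2e-5],
    [0.0, 0.0, 0.0],
    [-0.3, 0.4, 0.0],
]

fraction = dead_entry_fraction(toy_codebook)
print(f"near-zero entries: {fraction:.0%}")
```

On a real model you would load the actual embedding table and probably look at a histogram of the norms before picking a threshold.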