
220 points Vt71fcAqt7 | 1 comments
cube2222 ◴[] No.41861846[source]
This looks like quite a huge breakthrough, unless I'm missing something?

~25x faster performance than Flux-dev, while offering comparable quality in benchmarks. And visually the examples (surely cherry-picked, but still) look great!

Especially since with GenAI the best way to get good results is to just generate a large amount of them and pick the best (imo). Performance like this will make that much easier/faster/cheaper.

Code is unfortunately "(Coming soon)" for now. Can't wait to play with it!

replies(4): >>41861942 #>>41863225 #>>41864501 #>>41865018 #
Archit3ch ◴[] No.41863225[source]
If you generate 25x more images, you can afford to cherry-pick.
replies(2): >>41863739 #>>41864455 #
cube2222 ◴[] No.41863739[source]
It would be interesting to have benchmarks that take this into account (maybe they already do or I’m misunderstanding how those benchmarks work). I.e. when comparing quality between two different models of vastly different performance, you could be doing best-of-n in the faster model.
replies(1): >>41863919 #
Vt71fcAqt7 ◴[] No.41863919[source]
That sounds like it could be an interesting metric. Worth noting that there is a difference between an algorithmic "best of n" selection (via e.g. an FID score) vs. manual cherry-picking, which takes more factors into account such as user preference and also takes time to evaluate, which is what GP was suggesting.
replies(2): >>41864044 #>>41869835 #
1. psb217 ◴[] No.41869835[source]
This is a bit pedantic, but FID score wouldn't really be a viable metric for best of n selection since it's a metric that's only computable for distributions of samples. FID score is also pretty high variance for small sample sizes, so you need a lot of samples to compute a meaningful FID score.
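
For reference, the usual definition of FID makes the "distributions only" point concrete. It compares the mean and covariance of Inception features over a set of real images (subscript r) and a set of generated images (subscript g), so a single image has no covariance to plug in. The symbols below are the standard ones, not anything specific to the linked paper:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Both covariance estimates get noisy when the sample count is small relative to the feature dimension, which is roughly where the high variance at small sample sizes comes from.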

Better metrics (assuming goal is text->image) would be some sort of inception score or CLIP-based text matching score. These metrics are computable on single samples.
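
A minimal sketch of what best-of-n with a per-sample score could look like, assuming you already have embeddings for the prompt and for each candidate image. The toy vectors stand in for CLIP-style embeddings, and `cosine_similarity` and `best_of_n` are hypothetical helper names for illustration, not part of any library:

```python
import math
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_of_n(candidates: Sequence[T], score_fn: Callable[[T], float]) -> T:
    """Return the candidate with the highest per-sample score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score_fn)


# Toy stand-ins (assumption): in practice these would come from a
# text/image embedding model, one vector per generated image.
text_embedding = [1.0, 0.0, 0.0]
image_embeddings = [
    [0.2, 0.9, 0.1],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.5],
]

best = best_of_n(
    image_embeddings,
    lambda emb: cosine_similarity(emb, text_embedding),
)
```

The point is only that the score is defined for each sample on its own, so `max` over n candidates is well defined. Swapping in a different per-sample scorer just means passing a different `score_fn`.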