Gemini 2.5 Flash Image

(developers.googleblog.com)
1092 points | meetpateltech | 1 comment
minimaxir No.45032993
There is one thing Gemini 2.5 Flash Image can do that no other edit model can: incorporate multiple images simultaneously without shenanigans, thanks to its multimodality. For example, with Flux Kontext, if you want to "put the person in the first image into the second image", you have to concatenate the two images pre-VAE, which can be unwieldy; this model doesn't have that issue. You can even incorporate more than two images, but that may cause too much chaos.
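
For anyone unfamiliar with the concatenation workaround, here is a minimal sketch of the idea, not the actual Flux Kontext pipeline. It assumes the two inputs are stitched side by side in pixel space before being encoded, and it uses plain nested lists of RGB tuples as a stand-in for real image tensors. The helper names `pad_to_height` and `concat_side_by_side` are hypothetical.

```python
# Hypothetical helpers: stitch two images side by side so that a
# single-image edit model receives both inputs as one canvas.
# Images are row-major grids: a list of rows, each a list of RGB tuples.

def pad_to_height(img, height, fill):
    """Copy `img` and append rows of `fill` until it has `height` rows."""
    width = len(img[0]) if img else 0
    padded = [list(row) for row in img]
    padded += [[fill] * width for _ in range(height - len(img))]
    return padded

def concat_side_by_side(left, right, fill=(0, 0, 0)):
    """Return one image with `left` and `right` placed next to each other."""
    height = max(len(left), len(right))
    left_p = pad_to_height(left, height, fill)
    right_p = pad_to_height(right, height, fill)
    # Join matching rows so the right image starts where the left one ends.
    return [l_row + r_row for l_row, r_row in zip(left_p, right_p)]

person = [[(255, 0, 0)] * 2 for _ in range(3)]  # 2 wide, 3 tall, red
scene = [[(0, 0, 255)] * 4 for _ in range(2)]   # 4 wide, 2 tall, blue
combined = concat_side_by_side(person, scene)
print(len(combined[0]), len(combined))  # width and height of the stitched canvas
```

The mismatched sizes are where it gets unwieldy: you have to pick a padding or resizing policy, and the prompt then has to refer to "the left image" and "the right image" rather than two separate inputs.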

In quick testing, prompt adherence does appear to be much better for massive prompts, and the syntactic sugar does appear to be more effective. There are also other tricks not covered that I suspect may allow more control, but I'm still testing.

Given that generations are at the same price as its competitors, this model will shake things up.

replies(3): >>45033165 #>>45034850 #>>45036016 #
1. ojr No.45034850
It can't put two images of people together in one photo; this model still has that issue. Also, I have seen cases where Flux Kontext works better at things like removing objects.