
Gemini 2.5 Flash Image

(developers.googleblog.com)
1092 points by meetpateltech | 3 comments
fariszr ◴[] No.45027760[source]
This is the GPT-4 moment for image editing models. Nano Banana, aka Gemini 2.5 Flash Image, is insanely good. It made a 171-point Elo jump on LMArena!

Just search "nano banana" on Twitter to see the crazy results. An example: https://x.com/D_studioproject/status/1958019251178267111

qingcharles ◴[] No.45029173[source]
I've been testing it for several weeks. It can produce results that are truly epic, but it's still a case of rerolling the prompt a dozen times to get an image you can use. It's not God. It's definitely an enormous step though, and totally SOTA.
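
That rerolling workflow is easy to script. A minimal sketch, assuming the google-genai Python SDK and the public preview model name gemini-2.5-flash-image-preview; the prompt and filenames are placeholders, not anything the commenter specified:

    # Sketch of the reroll loop described above: generate several candidates
    # for the same prompt and keep them all, so a human can pick the usable one.
    # Assumes the google-genai SDK and a GEMINI_API_KEY environment variable.
    from google import genai

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment

    prompt = "A photorealistic product shot of a banana on a marble counter"  # placeholder

    for i in range(5):  # five "rerolls" of the same prompt
        response = client.models.generate_content(
            model="gemini-2.5-flash-image-preview",  # preview name for "nano banana"
            contents=prompt,
        )
        for part in response.candidates[0].content.parts:
            if part.inline_data:  # generated images arrive as inline bytes
                with open(f"candidate_{i}.png", "wb") as f:
                    f.write(part.inline_data.data)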
druskacik ◴[] No.45029473[source]
Is it because the model is not good enough at following the prompt, or because the prompt is unclear?

Something similar has been the case with text models: people write vague instructions and are dissatisfied when the model doesn't correctly guess their intentions. With image models it's even harder for the model to get it right without enough detail.
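
To make the point concrete, compare a vague instruction with one that carries the detail the model would otherwise have to guess (both prompts invented for illustration):

    # Invented example prompts: the first leaves the model guessing the user's
    # intent; the second specifies the edit region, direction, and constraints.
    vague_prompt = "Make this photo look better"

    detailed_prompt = (
        "Brighten the underexposed face by roughly one stop, keep the "
        "background and colors unchanged, and remove the small lens flare "
        "in the top-left corner"
    )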

qingcharles ◴[] No.45029572[source]
No, my prompts are very, very clear. It just won't follow them sometimes. Also, this model seems to prefer shorter prompts, in my experience.
toddmorey ◴[] No.45030646[source]
Remember that in image editing, the source image itself is a huge part of the prompt, and that's often where the ambiguity comes from. The model may clearly understand your instruction to change the color of a shirt but struggle to work out the boundaries of the shirt. I was just struggling to use AI to edit an image where the model kept treating the person's hat as their hair. My guess for that bias is that it was simply trained on more faces without hats than with them.