Gemini 2.5 Flash Image

(developers.googleblog.com)

1092 points meetpateltech | 2 comments | 26 Aug 25 14:01 UTC

Also: https://deepmind.google/models/gemini/image/, https://techcrunch.com/2025/08/26/google-geminis-ai-image-mo...

crustaceansoup ◴[26 Aug 25 16:23 UTC] No.45028708[source]▶

I tried to reproduce the fork/spaghetti example and the fashion bubble example, and neither looks anything like what they present. The outputs are very consistent, too. I am copying/pasting the images out of the advertisement page so they may be lower resolution than the original inputs, but otherwise I'm using the same prompts and getting a wildly different result.

It does look like I'm using the new model, though. I'm getting image editing results that are well beyond what the old stuff was capable of.

replies(2): >>45029225 #>>45030871 #

1. mortenjorck ◴[26 Aug 25 16:57 UTC] No.45029225[source]▶

>>45028708 #

The output consistency is interesting. I just went through half a dozen generations of my standard image model challenge (to date I have yet to see a model that can render piano keyboard octaves correctly, and Gemini 2.5 Flash Image is no different in that regard), and as best I can tell, there are no changes at all between successive attempts: https://g.co/gemini/share/a0e1e264b5e9

This is in stark contrast to ChatGPT, where an edit prompt typically yields both requested and unrequested changes to the image; here it seems to yield neither.

replies(1): >>45030929 #

2. BoorishBears ◴[26 Aug 25 19:15 UTC] No.45030929[source]▶

>>45029225 (TP) #

Flash 2.0 Image had the same issue: it does better than gpt-image at maintaining consistency in edits, but that also introduces a gap where it sometimes gets "locked in" on a particular reference image and struggles to make changes to it.

In some cases you'll pass in multiple images + a prompt and get back something that's almost visually indistinguishable from just one of the images and nothing from the prompt.
