
Gemini 2.5 Flash Image

(developers.googleblog.com)
1092 points meetpateltech | 2 comments
crustaceansoup ◴[] No.45028708[source]
I tried to reproduce the fork/spaghetti example and the fashion bubble example, and neither looks anything like what they present. The outputs are very consistent, too. I am copying/pasting the images out of the advertisement page so they may be lower resolution than the original inputs, but otherwise I'm using the same prompts and getting a wildly different result.

It does look like I'm using the new model, though. I'm getting image editing results that are well beyond what the old stuff was capable of.

replies(2): >>45029225 #>>45030871 #
1. mortenjorck ◴[] No.45029225[source]
The output consistency is interesting. I just went through half a dozen generations of my standard image model challenge (to date I have yet to see a model that can render piano keyboard octaves correctly, and Gemini 2.5 Flash Image is no different in that regard), and as best I can tell, there are no changes at all between successive attempts: https://g.co/gemini/share/a0e1e264b5e9

This is in stark contrast to ChatGPT, where an edit prompt typically yields both requested and unrequested changes to the image; here it seems to yield neither.

replies(1): >>45030929 #
2. BoorishBears ◴[] No.45030929[source]
Flash 2.0 Image had the same issue: it does better than gpt-image at maintaining consistency in edits, but that also introduces a gap where it sometimes gets "locked in" on a particular reference image and struggles to make changes to it.

In some cases you'll pass in multiple images + a prompt and get back something that's almost visually indistinguishable from just one of the images and nothing from the prompt.