Gemini 2.5 Flash Image

(developers.googleblog.com)
1092 points meetpateltech | 21 comments
1. kemyd ◴[] No.45028284[source]
I don't get the hype. Tested it with the same prompts I used with Midjourney, and the results are worse than in Midjourney a year ago. What am I missing?
replies(4): >>45028310 #>>45028315 #>>45028333 #>>45029208 #
2. bonoboTP ◴[] No.45028310[source]
The hype is about image editing, not pure text-to-image. Upload an input image, say what you want changed, get the output. That's the idea. Much better preservation of characters and objects.
replies(3): >>45028324 #>>45028535 #>>45028654 #
3. cdrini ◴[] No.45028315[source]
Hmm, I think the hype is mainly for image editing, not generating. Although note I haven't used it! How are you testing it?
replies(2): >>45028371 #>>45029411 #
4. kemyd ◴[] No.45028324[source]
Thanks for clarifying this. That makes a lot more sense.
5. kemyd ◴[] No.45028371[source]
I tested it with two prompts:

// In this one, Gemini doesn't understand what "cinematic" is

"A cinematic underwater shot of a turtle gracefully swimming in crystal-clear water [...]"

// In this one, the reflection in the water in the background has different buildings

"A modern city where raindrops fall upward into the clouds instead of down, pedestrians calmly walking [...]"

Midjourney created both perfectly.

replies(1): >>45028468 #
6. echelon ◴[] No.45028468{3}[source]
As others have said, this is an image editing model.

Editing models do not excel at aesthetics, but they can take your Midjourney image, adjust the composition, and make it perfect.

These types of models are the Adobe killer.

replies(1): >>45028772 #
7. appenz ◴[] No.45028535[source]
I tested it against Flux Pro Kontext (also image editing), and while it's a very different style and approach, overall I like Flux better. It focuses more on image consistency, adjusts the lighting correctly, and fixes contradictions in the image.
replies(1): >>45029372 #
8. SirMaster ◴[] No.45028654[source]
Can it edit the photo at the original resolution?

Most of my photos these days are 48MP and I don't want to lose a ton of resolution just to edit them.

replies(3): >>45029388 #>>45031152 #>>45032995 #
9. kemyd ◴[] No.45028772{4}[source]
Noted that! The editing capabilities are impressive. I was excited for image gen because of the API (Midjourney doesn't have it yet).
replies(1): >>45029499 #
10. vunderba ◴[] No.45029208[source]
Midjourney hasn't been SOTA for over a year. Even the latest release, version 7, scores extremely low on prompt adherence, only managing to get 2 out of 12 prompts correct. Even Flux Dev running locally consistently outperforms it.

Here's a comparison of Flux Dev, MJ, Imagen, and Flash 2.5.

https://genai-showdown.specr.net/?models=FLUX_1D%2CMIDJOURNE...

That being said, if image fidelity is absolutely paramount and/or your prompts are relatively simple, Midjourney can still be fun to experiment with, particularly if you crank up the weirdness / chaos parameters.

11. qingcharles ◴[] No.45029372{3}[source]
I've been testing it against Flux Pro Kontext for several weeks. I would say it beats Flux in a majority of tests, but Flux still surprises from time to time. Banana definitely isn't the best 100% of the time -- it falls a bit short of that. Evolution, not revolution.
replies(1): >>45030108 #
12. qingcharles ◴[] No.45029388{3}[source]
I don't know. All the testing I've done has output the standard 1024x1024 that all these models are set to output. You might be able to alter the output params on the API or AI Studio.
13. qingcharles ◴[] No.45029411[source]
It actually has impressive image generating ability, IMO. I think the two things go hand-in-hand. Its prompt adherence can be weaker than other models, though.
14. echelon ◴[] No.45029499{5}[source]
David Holz mentioned on Twitter that he was considering a Midjourney API. They're obviously providing it to Meta now, so it might become more broadly available after Midjourney becomes the default image gen for Meta products.

Midjourney wins on aesthetic for sure. Nothing else comes close. Midjourney images are just beautiful to behold.

David's ambition is to beat Google to building a world model you can play games in. He views the image and video business as a temporary intermediate to that end game.

15. vunderba ◴[] No.45030108{4}[source]
Agreed. I find myself alternating between Qwen Image Edit 20B, Kontext, and now Flash 2.5 depending on the situation and style. And of course, Flash isn't open-weights, so if you need more control / less censorship then you're SOL.
replies(2): >>45030780 #>>45035806 #
16. frank_nitti ◴[] No.45030780{5}[source]
Has there been sufficient indication to conclude that these weights will not (now or ever) be released?
replies(1): >>45031129 #
17. vunderba ◴[] No.45031129{6}[source]
Are any of Google's generative models besides AlphaFold open weight? (Veo, Imagen, etc.)

I don't think we can really answer the question of whether Flash's weights will ever be released.

18. vunderba ◴[] No.45031152{3}[source]
Great question. I really doubt it would be able to support arbitrary resolutions. I'm sure that behind the scenes it scales the image down to somewhere around 1 MP before processing, even if they decide to upscale it and return it at the original resolution.
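
To put a rough number on that guess, here is a minimal sketch of the arithmetic, not a description of what Gemini actually does. The ~1 MP budget is only the estimate above, 8000x6000 is just one assumed shape for a 48 MP photo, and `downscale_to_budget` is a hypothetical helper name.

```python
import math


def downscale_to_budget(width: int, height: int, max_pixels: int = 1_000_000) -> tuple[int, int]:
    """Return (width, height) scaled to fit within max_pixels, keeping aspect ratio.

    The default 1 MP budget is an assumption taken from the guess in this
    thread, not a documented limit of any model.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    pixels = width * height
    if pixels <= max_pixels:
        # Already within budget: leave the image untouched.
        return width, height

    # Area scales with the square of the linear factor, hence the sqrt.
    scale = math.sqrt(max_pixels / pixels)

    # Round down so the result never exceeds the pixel budget.
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


if __name__ == "__main__":
    w, h = downscale_to_budget(8000, 6000)  # assumed 48 MP, 4:3 photo
    print(w, h, f"{w * h / (8000 * 6000):.1%} of original pixels")
```

Under those assumptions an 8000x6000 photo comes out at 1154x866, so only about 2% of the original pixels would survive the round trip before any upscaling.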
replies(1): >>45031485 #
19. SirMaster ◴[] No.45031485{4}[source]
So then this doesn't really replace traditional Photoshop editing of my photos, I guess.
20. Workaccount2 ◴[] No.45032995{3}[source]
No, it resizes them.
21. Melchizedek ◴[] No.45035806{5}[source]
It’s good but holy shit is it censored! Try generating any kind of scene on a beach…