
Gemini 2.5 Flash Image

(developers.googleblog.com)
1092 points | meetpateltech
fariszr No.45027760
This is the GPT-4 moment for image-editing models. Nano Banana, aka Gemini 2.5 Flash Image, is insanely good. It made a 171-point Elo jump on LMArena!

Just search "nano banana" on Twitter to see the crazy results. An example: https://x.com/D_studioproject/status/1958019251178267111
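For context on what a 171-point jump means: under the standard Elo expected-score formula, a 171-point rating gap implies roughly a 73% head-to-head preference rate. A minimal sketch (the function name is mine, not from LMArena):

```python
def elo_expected_score(delta: float) -> float:
    """Standard Elo expected score for a player rated
    `delta` points above its opponent."""
    return 1.0 / (1.0 + 10 ** (-delta / 400.0))

# A 171-point lead corresponds to winning ~73% of pairwise matchups.
print(round(elo_expected_score(171), 3))
```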

echelon No.45028352
> This is the gpt 4 moment for image editing models.

No it's not.

We've had rich editing capabilities since gpt-image-1; this is just faster and looks better than the (endearingly?) named "piss filter".

Flux Kontext, SeedEdit, and Qwen Edit are also robustly capable image-editing models, Qwen Edit especially.

Flux Kontext and Qwen can also be fine-tuned and run locally.

Qwen (and its video-gen sister Wan) is also Apache-licensed. It's hard not to cheer Alibaba on, given how open they are compared to their competitors.

We've left behind the days of DALL-E, Stable Diffusion, and Midjourney, when text-to-image generation was prompt-only.

It's also looking like tools such as ComfyUI are less and less necessary, as those capabilities move into the model layer itself.

fariszr No.45031918
I'm sorry, I absolutely don't agree. This model is on a whole other level.

It's not even close. https://twitter.com/fareszr/status/1960436757822103721