Gemini 2.5 Flash Image

(developers.googleblog.com)

1092 points meetpateltech | 5 comments | 26 Aug 25 14:01 UTC

Also: https://deepmind.google/models/gemini/image/, https://techcrunch.com/2025/08/26/google-geminis-ai-image-mo...

fariszr ◴[26 Aug 25 15:15 UTC] No.45027760[source]▶

This is the GPT-4 moment for image editing models. Nano banana, aka Gemini 2.5 Flash, is insanely good. It made a 171 Elo point jump on LMArena!

Just search nano banana on Twitter to see the crazy results. An example. https://x.com/D_studioproject/status/1958019251178267111

replies(19): >>45028040 #>>45028152 #>>45028346 #>>45028352 #>>45029095 #>>45029173 #>>45029967 #>>45030536 #>>45031380 #>>45031995 #>>45032126 #>>45032293 #>>45032553 #>>45034187 #>>45034818 #>>45036034 #>>45036038 #>>45036949 #>>45038452 #

qingcharles ◴[26 Aug 25 16:53 UTC] No.45029173[source]▶

>>45027760 #

I've been testing it for several weeks. It can produce results that are truly epic, but it's still a case of rerolling the prompt a dozen times to get an image you can use. It's not God. It's definitely an enormous step though, and totally SOTA.

replies(5): >>45029418 #>>45029473 #>>45030561 #>>45033110 #>>45035698 #

spaceman_2020 ◴[26 Aug 25 17:09 UTC] No.45029418[source]▶

>>45029173 #

Compared to the amount of effort required in Photoshop to achieve the same results, it's still a vast improvement.

replies(3): >>45029544 #>>45030677 #>>45038122 #

echelon ◴[26 Aug 25 18:54 UTC] No.45030677[source]▶

>>45029418 #

Vibe coding might not be real, but vibe graphics design certainly is.

https://imgur.com/a/internet-DWzJ26B

Anyone can make images and video now.

replies(7): >>45031804 #>>45032019 #>>45032368 #>>45034328 #>>45035380 #>>45035529 #>>45039856 #

lebimas ◴[26 Aug 25 20:37 UTC] No.45032019[source]▶

>>45030677 #

What tools did you use to make those videos from the PG image?

replies(1): >>45032219 #

1. echelon ◴[26 Aug 25 20:55 UTC] No.45032219[source]▶

>>45032019 #

I used a bunch of models in conjunction:

- Midjourney (background)

- Qwen Image (restyle PG)

- Gemini 2.5 Flash (editing in PG)

- Gemini 2.5 Flash (adding YC logo)

- Kling Pro (animation)

I didn't spend too much time correcting mistakes.

I used a desktop model aggregation and canvas tool that I wrote [1] to iterate and structure the work. I'll be open sourcing it soon.

[1] https://getartcraft.com

replies(2): >>45033760 #>>45036596 #

2. kstenerud ◴[26 Aug 25 23:47 UTC] No.45033760[source]▶

>>45032219 (TP) #

The app looks interesting, but I think it needs some documentation. I think I generated something? Maybe? I saw a spinny thing for a while, but then nothing.

I couldn't get the 3D thing to do much. I had assets in the scene, but I couldn't for the life of me figure out how to use the move, rotate, or scale tools. And the people just had their arms pointing outward. Are you supposed to pose them somehow? Maybe I'm supposed to ask the AI to pose them?

I couldn't figure out inpainting either... It's for drawing things into an existing image (I think?), but it doesn't seem to do anything other than show a spinny thing for a while...

I didn't test the video tool because I don't have a Midjourney account.

3. unixhero ◴[27 Aug 25 07:40 UTC] No.45036596[source]▶

>>45032219 (TP) #

What is PG?

replies(2): >>45036640 #>>45036646 #

4. fhd2 ◴[27 Aug 25 07:46 UTC] No.45036640[source]▶

>>45036596 #

Paul Graham, Y Combinator founder.

5. sethaurus ◴[27 Aug 25 07:46 UTC] No.45036646[source]▶

>>45036596 #

In this context, it's Paul Graham, the head Y Combinator guy whose cartoon likeness appears in the generated video: https://news.ycombinator.com/user?id=pg
