
Gemini 2.5 Flash Image

(developers.googleblog.com)
1092 points meetpateltech | 5 comments
fariszr ◴[] No.45027760[source]
This is the GPT-4 moment for image editing models. Nano banana, aka Gemini 2.5 Flash, is insanely good. It made a 171 Elo point jump on LMArena!

Just search "nano banana" on Twitter to see the crazy results. An example: https://x.com/D_studioproject/status/1958019251178267111

replies(19): >>45028040 #>>45028152 #>>45028346 #>>45028352 #>>45029095 #>>45029173 #>>45029967 #>>45030536 #>>45031380 #>>45031995 #>>45032126 #>>45032293 #>>45032553 #>>45034187 #>>45034818 #>>45036034 #>>45036038 #>>45036949 #>>45038452 #
qingcharles ◴[] No.45029173[source]
I've been testing it for several weeks. It can produce results that are truly epic, but it's still a case of rerolling the prompt a dozen times to get an image you can use. It's not God. It's definitely an enormous step though, and totally SOTA.
replies(5): >>45029418 #>>45029473 #>>45030561 #>>45033110 #>>45035698 #
spaceman_2020 ◴[] No.45029418[source]
Compared to the amount of effort required in Photoshop to achieve the same results, it's still a vast improvement.
replies(3): >>45029544 #>>45030677 #>>45038122 #
echelon ◴[] No.45030677[source]
Vibe coding might not be real, but vibe graphics design certainly is.

https://imgur.com/a/internet-DWzJ26B

Anyone can make images and video now.

replies(7): >>45031804 #>>45032019 #>>45032368 #>>45034328 #>>45035380 #>>45035529 #>>45039856 #
lebimas ◴[] No.45032019[source]
What tools did you use to make those videos from the PG image?
replies(1): >>45032219 #
1. echelon ◴[] No.45032219[source]
I used a bunch of models in conjunction:

- Midjourney (background)

- Qwen Image (restyle PG)

- Gemini 2.5 Flash (editing in PG)

- Gemini 2.5 Flash (adding YC logo)

- Kling Pro (animation)

I didn't spend too much time correcting mistakes.

I used a desktop model aggregation and canvas tool that I wrote [1] to iterate and structure the work. I'll be open sourcing it soon.

[1] https://getartcraft.com

replies(2): >>45033760 #>>45036596 #
2. kstenerud ◴[] No.45033760[source]
The app looks interesting, but I think it needs some documentation. I think I generated something? Maybe? I saw a spinny thing for a while, but then nothing.

I couldn't get the 3D thing to do much. I had assets in the scene, but I couldn't for the life of me figure out how to use the move, rotate, or scale tools. And the people just had their arms pointing outward. Are you supposed to pose them somehow? Maybe I'm supposed to ask the AI to pose them?

Inpainting I couldn't figure out either... It's for drawing things into an existing image (I think?) but it doesn't seem to do anything other than show a spinny thing for a while...

I didn't test the video tool because I don't have a Midjourney account.

3. unixhero ◴[] No.45036596[source]
What is PG?
replies(2): >>45036640 #>>45036646 #
4. fhd2 ◴[] No.45036640[source]
Paul Graham, Y Combinator founder.
5. sethaurus ◴[] No.45036646[source]
In this context, it's Paul Graham, the head Y Combinator guy whose cartoon likeness appears in the generated video: https://news.ycombinator.com/user?id=pg