
396 points doener | 2 comments
vunderba ◴[] No.46175068[source]
I've done some preliminary testing with Z-Image Turbo in the past week.

Thoughts

- It's fast (~3 seconds on my RTX 4090)

- Surprisingly capable of maintaining image integrity even at high resolutions (1536x1024, sometimes 2048x2048)

- Prompt adherence is impressive for a 6B-parameter model

Some tests (2 / 4 passed):

https://imgpb.com/exMoQ

Personally, I find it works better as a refiner model downstream of Qwen-Image 20B, which has significantly better prompt understanding but an unnatural "smoothness" to its generated images.

replies(6): >>46175104 #>>46175331 #>>46177028 #>>46177043 #>>46177543 #>>46178707 #
tarruda ◴[] No.46177028[source]
> It's fast (~3 seconds on my RTX 4090)

It is amazing how far behind Apple Silicon is when it comes to running non-language models.

Using the reference code from Z-Image on my M1 Ultra, it takes 8 seconds per step, which is over a minute for the default of 9 steps.
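The gap is easy to put in concrete terms; a quick sketch using the numbers quoted in this thread (the M1 Ultra per-step time above and the ~3 s total from the parent comment):

```python
# Per-image latency for a 9-step sampler at the quoted per-step time.
M1_ULTRA_STEP_S = 8.0   # seconds per step on the M1 Ultra (quoted above)
STEPS = 9               # default step count (quoted above)

m1_total_s = M1_ULTRA_STEP_S * STEPS
print(f"M1 Ultra: {m1_total_s:.0f} s per image")  # 72 s, i.e. "over a minute"

RTX_4090_TOTAL_S = 3.0  # total time per image on an RTX 4090 (quoted upthread)
print(f"Gap: ~{m1_total_s / RTX_4090_TOTAL_S:.0f}x slower")  # ~24x
```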

replies(3): >>46177803 #>>46180602 #>>46183922 #
tails4e ◴[] No.46180602[source]
I heard last year that a potential future of gaming is not rendering but fully AI-generated frames. At 3 seconds per 'frame' now, it's not hard to believe it could reach 60fps in a few short years, which makes it seem more likely such a game could exist. I'm not sure I like the idea, but it seems like it could happen.
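For a rough sense of scale (assuming, simplistically, that per-frame cost just scales down uniformly), here is the speedup that extrapolation implies:

```python
# How much faster generation must get to go from 3 s/frame to 60 fps.
current_s_per_frame = 3.0
target_fps = 60
target_s_per_frame = 1 / target_fps            # ~16.7 ms per frame

speedup_needed = current_s_per_frame / target_s_per_frame
print(f"~{speedup_needed:.0f}x speedup needed")  # ~180x
```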
replies(2): >>46180853 #>>46181090 #
1. snek_case ◴[] No.46181090{3}[source]
The problem is going to be how to control those models to produce a universe that's temporally and spatially consistent. Also think of other issues, such as networked games: how would you even begin to approach that in this new paradigm? You need multiple models to have a shared representation that includes other players, and you need to be able to sync data efficiently across the network.

I get that it's tempting to say "we no longer have to program game engines, hurray", but at the same time, we've already done the work, we already have game engines that are relatively very computationally efficient and predictable. We understand graphics and simulation quite well.

Personally: I think there's an obvious future in using AI tools to generate game content. 3D modelling and animation can be very time-consuming. If you could get an AI model to generate animated characters, you could save a lot of time. You could also empower a lot of indie devs who don't have 3D modelers to help them. AI tools to generate large maps are also super valuable. Replacing the game engine itself is, I think, a taller order than people realize, and maybe not actually desirable.

replies(1): >>46181275 #
2. adventured ◴[] No.46181275[source]
20 years out, what will we all be using routine 10 Gbps pipes in our homes for?

I'm paying $43/month for 500 Mbps at present, and there's nothing special about that at all (in the US or globally). What might we finally use 1 Gbps+ for? Pulling down massive AI-built worlds of entertainment. Movie and TV streaming sure isn't going to challenge our future bandwidth capabilities.
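To put that claim in numbers (assuming a typical 4K stream runs at roughly 25 Mbps; the link speeds are the ones mentioned above):

```python
# Rough capacity of each link in simultaneous 4K streams.
STREAM_4K_MBPS = 25  # assumed typical 4K streaming bitrate
links_mbps = {"today (500 Mbps)": 500, "future (10 Gbps)": 10_000}

for name, mbps in links_mbps.items():
    print(f"{name}: ~{mbps // STREAM_4K_MBPS} simultaneous 4K streams")
# A 10 Gbps pipe fits ~400 such streams; streaming alone won't saturate it.
```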

The worlds are built and shared so quickly in the background that, with some slight limitations, you never notice the world-building going on behind the scenes.

The world building doesn't happen locally. Multiple players connect to the same built world that is remote. There will be smaller hobbyist segments that will still world-build locally for numerous reasons (privacy for one).

The worlds can be constructed entirely before they're downloaded. There are good arguments for both approaches: build the entire world and then allow it to be accessed, or world-build as you play. Both will likely be used over the coming decades, for different reasons and at different times; changes in capabilities will unlock new arguments for either as time goes on, with a likely back and forth where one pulls ahead and then the other.