396 points doener | 3 comments | | HN request time: 0.001s | source
pawelduda ◴[] No.46174861[source]
Did anyone test it on 5090? I saw some 30xx reports and it seemed very fast
replies(2): >>46175501 #>>46177259 #
egeres ◴[] No.46177259[source]
Incredibly fast, on my 5090 with CUDA 13 (& the latest diffusers, xformers, transformers, etc...), 9 samplig steps and the "Tongyi-MAI/Z-Image-Turbo" model I get:

- 1.5s to generate an image at 512x512

- 3.5s to generate an image at 1024x1024

- 26s to generate an image at 2048x2048

It uses almost all of the 32 GB of VRAM, with GPU utilization near 100%. I'm using the script from the HF post: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
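The timing spread (1.5s → 3.5s → 26s) lines up roughly with how the token count grows in a diffusion transformer. A back-of-envelope sketch, assuming a typical latent-diffusion setup with a VAE downsampling factor of 8 and 2×2 patchification (both are assumptions; I haven't checked Z-Image's actual config):

```python
# Token count per denoising step under assumed (not verified) Z-Image
# hyperparameters: VAE downsampling factor 8, 2x2 patchify.
VAE_FACTOR = 8
PATCH = 2

def tokens(side_px: int) -> int:
    latent = side_px // VAE_FACTOR       # spatial side of the latent
    return (latent // PATCH) ** 2        # tokens after patchify

for side in (512, 1024, 2048):
    n = tokens(side)
    print(f"{side}x{side}: {n} tokens ({n // tokens(512)}x the 512px count)")
# -> 512x512: 1024 tokens, 1024x1024: 4096 tokens, 2048x2048: 16384 tokens
```

Under these assumptions, 2048x2048 pushes 16x as many tokens through the model as 512x512, which is close to the ~17x wall-clock ratio reported above.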

replies(1): >>46179262 #
1. SV_BubbleTime ◴[] No.46179262[source]
Weird, even at 2048 I don’t think it should be using all your 32GB VRAM.
replies(1): >>46180877 #
2. egeres ◴[] No.46180877[source]
It stays around 26 GB even at 512x512. I still haven't profiled the execution or looked much into the details of the architecture, but I would assume it trades memory for speed by creating caches for each inference step.
replies(1): >>46182526 #
3. SV_BubbleTime ◴[] No.46182526[source]
IDK. Seems odd. It's an 11 GB model; I don't know what it could be caching in VRAM.
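One thing worth noting before blaming a cache: naively materialized attention matrices at high resolution are enormous, and PyTorch's caching allocator holds onto freed blocks, so nvidia-smi often overstates live usage. A rough sketch of the arithmetic, assuming fp16/bf16 (2 bytes/element) and the same hypothetical VAE-factor-8 / 2×2-patchify setup (not taken from the model config):

```python
# Size of a single naively materialized attention matrix at 2048x2048,
# assuming 2-byte elements and hypothetical latent/patch sizes (f=8, p=2).
BYTES = 2
tokens = ((2048 // 8) // 2) ** 2          # 16384 tokens (assumed config)

attn_matrix = tokens ** 2 * BYTES         # one head, one layer
print(f"one {tokens}x{tokens} attention matrix: {attn_matrix / 2**20:.0f} MiB")
# -> 512 MiB per head per layer if materialized
```

With xformers / flash-style attention these matrices are never materialized, so weights plus transient activations shouldn't reach 26 GB on their own; comparing `torch.cuda.memory_allocated()` against what nvidia-smi reports would show how much is just the allocator's reserved-but-free pool.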