
281 points | GabrielBianconi | 2 comments
1. ozgune No.45066036
The SGLang Team has a follow-up blog post that talks about DeepSeek inference performance on GB200 NVL72: https://lmsys.org/blog/2025-06-16-gb200-part-1/

Just in case you have $3-4M lying around somewhere for some high-quality inference. :)

SGLang quotes a 2.5-3.4x speedup compared to H100s. They also note that more optimizations are coming, but they haven't yet published part 2 of the blog post.

replies(1): >>45074618
2. aurareturn No.45074618
Isn't Blackwell optimized for FP4? This blog post runs DeepSeek at FP8, which is probably the sweet spot, but new models with native FP4 training and inference would be drastically faster than FP8 on Blackwell.
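
For the curious, here's a minimal sketch of what block-scaled FP4 (E2M1) quantization looks like. This is illustrative only, not SGLang's or NVIDIA's kernel code: the block size of 16 and the plain float per-block scale are assumptions loosely modeled on NVFP4 (real kernels store FP8 scales and pack two 4-bit values per byte).

    import numpy as np

    # All non-negative magnitudes representable in FP4 E2M1
    # (1 sign bit, 2 exponent bits, 1 mantissa bit)
    E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

    def quantize_fp4_block(x, block=16):
        # Scale each block so its largest magnitude maps to 6.0 (the E2M1
        # max), then snap every element to the nearest grid point.
        x = x.reshape(-1, block)
        scale = np.abs(x).max(axis=1, keepdims=True) / 6.0
        scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
        scaled = np.abs(x) / scale
        # Nearest representable E2M1 magnitude for each element
        idx = np.abs(scaled[..., None] - E2M1_GRID).argmin(axis=-1)
        q = np.sign(x) * E2M1_GRID[idx]
        return q, scale  # dequantize as q * scale

    w = np.random.randn(4, 16).astype(np.float32)
    q, s = quantize_fp4_block(w)
    print("max abs error:", np.abs(w - q * s).max())

With only eight representable magnitudes per sign, FP4 halves weight memory traffic versus FP8 and, on NVIDIA's Blackwell spec sheets, roughly doubles peak tensor-core throughput, which is where the "drastically faster" expectation comes from.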