Alibaba Cloud says it cut Nvidia AI GPU use by 82% with new pooling system

(www.tomshardware.com)

Paper: https://dl.acm.org/doi/10.1145/3731569.3764815

hunglee2 ◴[20 Oct 25 12:59 UTC] No.45643396[source]▶

The US attempt to slow down China's technological development succeeds on the basis of preventing China from directly following the same path, but may backfire in the sense that it forces innovation by China in a different direction. The overall outcome for us all may be increased efficiency as a result of this forced innovation, especially if Chinese companies continue to open source their advances, so we may in the end have reason to thank the US for their civilisational gatekeeping.

replies(17): >>45643584 #>>45643614 #>>45643618 #>>45643770 #>>45643876 #>>45644337 #>>45644641 #>>45644671 #>>45644907 #>>45645384 #>>45645721 #>>45646056 #>>45646138 #>>45648814 #>>45651479 #>>45651810 #>>45663019 #

segmondy ◴[20 Oct 25 13:23 UTC] No.45643618[source]▶

>>45643396 #

may backfire? it's a bit too late for that.

go to 2024, western labs were crushing it.

it's now 2025, and from china, we have deepseek, qwen, kimi, glm, ernie and many more capable models keeping up with western labs. there are actually now more chinese labs releasing sota models than western labs.

replies(4): >>45643764 #>>45646364 #>>45650725 #>>45650819 #

1. Workaccount2 ◴[20 Oct 25 17:10 UTC] No.45646364[source]▶

>>45643618 #

But they aren't keeping up

They are lauded for their ability-to-cost ratio, or their ability-to-parameter ratio, but virtually everyone using LLMs for productive work is using ChatGPT/Gemini/Claude.

They are kind of like Huffy bicycles. Good value, work well, but if you go to any serious event, no one will be riding one.

replies(2): >>45646880 #>>45647584 #

2. segmondy ◴[20 Oct 25 17:49 UTC] No.45646880[source]▶

>>45646364 (TP) #

they are keeping up. i have been using just chinese models for the last 2 years. chatgpt/gemini/claude have marketing. there's nothing that you can do with those models that can't be done with deepseek, glm or kimi. if there is, do let us know.

replies(1): >>45649214 #

3. MSFT_Edging ◴[20 Oct 25 18:45 UTC] No.45647584[source]▶

>>45646364 (TP) #

The downside of their efficiency and cost-ratio is that they undermine the circular economy of massive data centers, GPU sales, and VC money that is constructing an extremely wasteful bubble.

replies(1): >>45649237 #

4. Workaccount2 ◴[20 Oct 25 20:53 UTC] No.45649214[source]▶

>>45646880 #

They can't attract a large contingent of users because, despite being able to do everything the big-name models can do, they cannot do it as well.

This aligns with the benchmarks as well; they benchmark great for what they are, but still bottom of the barrel when competing for "state of the art."

And yes, it's great that you daily-drive Chinese models, but the vast majority of people try them, say "impressive", then go back to the most performant models.

replies(1): >>45653431 #

5. Workaccount2 ◴[20 Oct 25 20:54 UTC] No.45649237[source]▶

>>45647584 #

The bubble is there in China too; it's just on the government's books instead of private investors' books.

6. vachina ◴[21 Oct 25 07:43 UTC] No.45653431{3}[source]▶

>>45649214 #

I'm not sure if you understood what OP meant by "marketing".
