    28 points addaon | 12 comments
    1. tuananh ◴[] No.42190811[source]
    It's 16TB of DDR5, btw.
    replies(1): >>42190905 #
    2. metadat ◴[] No.42190905[source]
    Yes, 128x128.

    Good for a database, maybe.

    What else?

    replies(6): >>42191283 #>>42191285 #>>42191559 #>>42191737 #>>42191960 #>>42192376 #
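    A quick sanity check on the "128x128" figure, reading it as 128 DIMM slots populated with 128GB modules (that reading is an assumption, not something stated above):

        # Assumed reading of "128x128": 128 DIMM slots x 128GB modules.
        dimms = 128
        gb_per_dimm = 128
        total_gb = dimms * gb_per_dimm              # 16384 GB
        print(total_gb, "GB =", total_gb // 1024, "TB")   # 16384 GB = 16 TB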
    3. rustcleaner ◴[] No.42191283[source]
    Large Language Models.
    4. rustcleaner ◴[] No.42191285[source]
    Qubes OS.
    5. moomoo11 ◴[] No.42191559[source]
    Dumb question, but why don't we see more cracked-out high-memory machines? I mean like 1 petabyte of RAM.

    Or do these already exist?

    replies(2): >>42192410 #>>42194227 #
    6. smolder ◴[] No.42191737[source]
    Serving remote desktops to several hundred developers. Maybe a video content server for a Netflix- or YouTube-type business. Hosting a large search index? Some kind of scientific computing?
    7. HeatrayEnjoyer ◴[] No.42191960[source]
    A half dozen GPT-4 instances
    replies(1): >>42195659 #
    8. guenthert ◴[] No.42192376[source]
    Numeric simulation (HPC). Some, though not all, simulations need lots of memory. In 2018, larger servers running such workloads had 1TiB, so I'm not the least bit surprised that six years later it's 16TiB.
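    As a rough illustration of why some simulations need that much memory, here is a back-of-envelope sketch for a dense 3D grid; the resolution and field count are made-up, illustrative numbers:

        # Illustrative only: memory for one snapshot of a dense 3D simulation grid.
        # Grid resolution and number of field variables are assumed numbers.
        nx = ny = nz = 4096            # grid points per axis
        fields = 8                     # e.g. density, pressure, velocity components, ...
        bytes_per_value = 8            # double precision
        total_bytes = nx * ny * nz * fields * bytes_per_value
        print(total_bytes / 2**40, "TiB")   # 4.0 TiB for a single snapshot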
    9. guenthert ◴[] No.42192410{3}[source]
    I'd think the market share for applications that need a huge amount of memory but little CPU processing power and memory bandwidth is rather small.

    Lenovo's slides indicate that they foresee this server being used for in-memory databases.

    Weren't there also distributed filesystems where the metadata server couldn't be scaled out?
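    Rough arithmetic on that metadata-server case (the per-file metadata size below is an assumption for illustration):

        # Illustrative: how many files a single in-RAM metadata server could track.
        ram_bytes = 16 * 2**40          # 16 TiB of RAM
        bytes_per_file = 1024           # assumed ~1 KiB of metadata per file/inode
        print(ram_bytes // bytes_per_file)   # ~17 billion files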

    10. eqvinox ◴[] No.42194227{3}[source]
    We don't see more of these machines because most tasks are better served by a higher number of smaller machines. The only benefit of boxes like this is having all of that RAM in one box. Very few use cases need that.
    replies(1): >>42200049 #
    11. metadat ◴[] No.42195659{3}[source]
    LLM inference processors (GPUs) don't use DDR; they use special, costly stacked HBM RAM soldered to the board.

    I tested out running Llama on a 512GB machine; it's rather slow and inefficient. Maybe 1 token/sec.
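    A back-of-envelope reason CPU inference lands around that speed: token-by-token generation is roughly memory-bandwidth-bound, since every decoding step streams the model weights from RAM. The model size and bandwidth figures below are assumptions for illustration, not measurements:

        # Rough estimate: CPU LLM decoding is approximately memory-bandwidth-bound.
        # Both numbers below are illustrative assumptions.
        model_bytes = 140e9            # e.g. a ~70B-parameter model at 16-bit weights
        mem_bandwidth = 200e9          # assumed effective DDR bandwidth, bytes/sec
        tokens_per_sec = mem_bandwidth / model_bytes
        print(round(tokens_per_sec, 2), "tokens/sec")   # ~1.4 tokens/sec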

    12. moomoo11 ◴[] No.42200049{4}[source]
    Would be fun for a graph db