623 points magicalhippo | 40 comments
1. narrator ◴[] No.42619363[source]
Nvidia releases a Linux desktop supercomputer that's better price/performance wise than anything Wintel is doing and their whole new software stack will only run on WSL2. They aren't porting to Win32. Wow, it may actually be the year of Linux on the Desktop.
replies(7): >>42619399 #>>42619444 #>>42619549 #>>42619598 #>>42619820 #>>42620944 #>>42622537 #
2. CamperBob2 ◴[] No.42619399[source]
Where does it say they won't be supporting Win32?
replies(1): >>42619432 #
3. narrator ◴[] No.42619432[source]
Here he says that, for the cloud and the PC to be compatible, he's only going to support WSL2, the Windows Subsystem for Linux, which is a Linux API on top of Windows.

Here's a link to the part of the keynote where he says this:

https://youtu.be/MC7L_EWylb0?t=7259

replies(2): >>42619600 #>>42619602 #
4. rvz ◴[] No.42619444[source]
> Wow, it may actually be the year of Linux on the Desktop.

?

Yeah starting at $3,000. Surely a cheap desktop computer to buy for someone who just wants to surf the web and send email /s.

There is a reason why it is for "enthusiasts" and not for the general wider consumer or typical PC buyer.

replies(3): >>42619765 #>>42619826 #>>42619984 #
5. immibis ◴[] No.42619549[source]
Never underestimate the open source world's power to create a crappy desktop experience.
replies(1): >>42621159 #
6. sliken ◴[] No.42619598[source]
Not sure how to judge better price/perf. I wouldn't expect 20 Neoverse N2 cores to do particularly well vs 16 zen5 cores. The GPU side looks promising, but they aren't mentioning memory bandwidth, configuration, spec, or performance.

Did see vague claims of "starting at $3k", max 4TB nvme, and max 128GB ram.

I'd expect AMD Strix Halo (AI Max plus 395) to be reasonably competitive.

replies(2): >>42619674 #>>42619722 #
7. stonogo ◴[] No.42619600{3}[source]
"Linux API on top of Windows" is an interesting way to describe a virtual machine.
replies(3): >>42619614 #>>42619634 #>>42619650 #
8. sliken ◴[] No.42619602{3}[source]
The keynote mentioned that it could be used as a Linux workstation.
9. sedatk ◴[] No.42619614{4}[source]
That’s more like WSL1, yes.
10. SteveNuts ◴[] No.42619634{4}[source]
WSL2 is no longer a VM, afaik.
replies(1): >>42619684 #
11. pulse7 ◴[] No.42619650{4}[source]
WSL1 was "Linux API on top of Windows", WSL2 is "Linux VM on top of Windows"
replies(1): >>42620963 #
12. z4y5f3 ◴[] No.42619674[source]
NVIDIA is likely citing 1 PFlops at FP 4 sparse (they did this for GB200), so that is 128 TFlops BF16 dense, or 2/3 of what RTX 4090 is capable of. I would put the memory bandwidth at 546 GBps, using the same 512 bit LPDDR5X 8533 Mbps as Apple M4 max.
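
The arithmetic behind these estimates can be sketched roughly as follows. This is only a back-of-the-envelope check under the comment's own assumptions, none of which NVIDIA has confirmed: that the 1 PFLOPS figure is FP4 with 2:1 sparsity, that each doubling of operand width halves throughput, and that the memory is a 512-bit LPDDR5X bus at 8533 Mbps per pin. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the estimates in this comment.
# All inputs are the commenter's assumptions, not published specifications.

def dense_bf16_tflops(fp4_sparse_tflops: float) -> float:
    """Convert an FP4 sparse figure to an estimated BF16 dense figure."""
    fp4_dense = fp4_sparse_tflops / 2  # drop the 2:1 structured-sparsity bonus
    fp8_dense = fp4_dense / 2          # FP4 -> FP8 halves throughput
    return fp8_dense / 2               # FP8 -> BF16 halves it again


def bandwidth_gb_per_s(bus_width_bits: int, rate_mbps_per_pin: float) -> float:
    """Peak bandwidth: bits per second over the whole bus, expressed in GB/s."""
    bits_per_second = bus_width_bits * rate_mbps_per_pin * 1e6
    return bits_per_second / 8 / 1e9


print(dense_bf16_tflops(1000))   # 125.0 if 1 PFLOPS is read as 1000 TFLOPS
print(dense_bf16_tflops(1024))   # 128.0 if it is read as 1024 TFLOPS
print(round(bandwidth_gb_per_s(512, 8533), 1))  # about 546.1
```

The 128 TFLOPS in the comment matches the 1024-based reading; a 1000-based reading gives 125 TFLOPS, which does not change the conclusion.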
replies(2): >>42620444 #>>42641848 #
13. gnabgib ◴[] No.42619684{5}[source]
Other way around (1 wasn't, 2 runs in a managed HyperV VM) https://learn.microsoft.com/en-us/windows/wsl/compare-versio...
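
For anyone who wants to check which variant they are actually running, a common heuristic is to look at the kernel release string. The sketch below assumes the usual naming, where WSL1 reports a string ending in "Microsoft" and the stock WSL2 kernel reports one containing "microsoft-standard-WSL2". The example strings and the `wsl_flavor` helper are illustrative, and a custom-built WSL2 kernel may not follow this pattern.

```python
import platform


def wsl_flavor(kernel_release: str) -> str:
    """Heuristically classify a kernel release string (as from `uname -r`)."""
    release = kernel_release.lower()
    if "microsoft" not in release:
        return "not WSL"
    # The stock WSL2 kernel is a real Linux kernel tagged "WSL2";
    # WSL1 is a translation layer that only reports a "Microsoft" suffix.
    return "WSL2" if "wsl2" in release else "WSL1"


# Illustrative release strings, not taken from the thread.
print(wsl_flavor("5.15.90.1-microsoft-standard-WSL2"))
print(wsl_flavor("4.4.0-19041-Microsoft"))
# Classify the machine this script is running on.
print(wsl_flavor(platform.release()))
```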
14. skavi ◴[] No.42619722[source]
It’s actually “10 Arm Cortex-X925 and 10 Cortex-A725” [0]. These are much newer cores and have a reasonable chance of being competitive.

[0]: https://newsroom.arm.com/blog/arm-nvidia-project-digits-high...

replies(3): >>42619778 #>>42622425 #>>42624856 #
15. fooker ◴[] No.42619765[source]
The typical PC buyer is an enthusiast now.
16. sliken ◴[] No.42619778{3}[source]
Good catch, they called it "Grace Blackwell". Changing the CPU cores completely and calling it Grace seems weird. Maybe it was just a mistake during the keynote.
replies(1): >>42620025 #
17. bee_rider ◴[] No.42619820[source]
Seems more like a workstation. So, that’s just a continuation of the last couple of decades of Unix on the Workstation, right?
replies(1): >>42620802 #
18. yjftsjthsd-h ◴[] No.42619826[source]
> Surely a cheap desktop computer to buy for someone who just wants to surf the web and send email /s.

That end of the market is occupied by Chromebooks... AKA a different GNU/Linux.

19. Topfi ◴[] No.42619984[source]
I see the most direct competitor in the Mac Studio, though of course we will have to wait for reviews to gauge how fair that comparison is. The Studio does have a fairly large niche as a solid workstation, though, so I could see this being successful.

For general desktop use, as you described, nearly any piece of modern hardware, from a RasPI, to most modern smartphones with a dock, could realistically serve most people well.

The thing is, you need to serve both low-end use cases like browsing and high-end dev work via workstations, because even for the "average user", there is often one specific program on which they need to rely and which has limited support outside the OS they have grown up with. Of course, there will be some programs like desktop Microsoft Office which will never be ported, but still, Digits could open the doors to some devs working natively on Linux.

A solid, compact, high-performance, yet low power workstation with a fully supported Linux desktop out of the box could bridge that gap, similar to how I have seen some developers adopt macOS over Linux and Windows since the release of the Studio and Max MacBooks.

Again, we have yet to see independent testing, but I would be surprised if anything of this size, simplicity, efficiency and performance was possible in any hardware configuration currently on the market.

replies(1): >>42620450 #
20. wmf ◴[] No.42620025{4}[source]
I don't think it was a mistake; maybe they intend Grace to be a broader brand like Ryzen not one particular model.
replies(1): >>42620566 #
21. gardnr ◴[] No.42620444{3}[source]
Based on your evaluation, it sounds like it will run inference at speed similar to an M4 Max and also allow "startups" to experiment with fine tuning larger models or larger context windows.

It's the best "dev board" setup I've seen so far. It might be part of their larger commercial plan, but it definitely hits the sweet spot for the home enthusiasts who have been pleading for more VRAM.

22. sliken ◴[] No.42620450{3}[source]
I did want an M2 Max Studio, but ended up with a 12-core Zen 4 + Radeon 7800 XT for about half the money.

An Nvidia Project Digits/GB10 for $3k with 128GB RAM does sound tempting, especially since it's very likely to have standard NVMe storage that I can expand or replace as needed, unlike the Apple solution. Decent Linux support is welcome as well.

Here's hoping. If not, I can fall back to a 128GB RAM AMD Strix Halo/395 AI Max plus. CPU perf should be in the same ballpark; it's not likely to come anywhere close on GPU performance, but it's still likely to have decent tokens/sec for casual home tinkering.

replies(1): >>42641858 #
23. kristopolous ◴[] No.42620566{5}[source]
it's an interesting idea. I mean grace hopper was an actual person but nvidia can have whatever arbitrary naming rules they'd like.
24. throw310822 ◴[] No.42620802[source]
They should write an AI-centered OS for it, allowing people to easily write AI-heavy applications. And you'd have the Amiga of 2025.
25. pjmlp ◴[] No.42620944[source]
Because NVidia naturally doesn't want to pay for Windows licenses.

NVidia works closely with Microsoft to develop their cards, all major features come first in DirectX, before landing on Vulkan and OpenGL as NVidia extensions, and eventually become standard after other vendors follow up with similar extensions.

26. pjmlp ◴[] No.42620963{5}[source]
More like,

WSL1 was "Linux API on top of NT kernel picoprocesses", WSL2 is "Linux VM on top of Hyper-V"

replies(1): >>42622524 #
27. tokai ◴[] No.42621159[source]
You're like 15 years out of date.
replies(1): >>42644907 #
28. adrian_b ◴[] No.42622425{3}[source]
For programs dominated by iterations over arrays, these 10 Arm Cortex-X925 + 10 Cortex-A725, all 20 together, should have a throughput similar to that of only 10 of the 16 cores of Strix Halo (assuming that Strix Halo has full Zen 5 cores, which has not been confirmed yet).

For programs dominated by irregular integer and pointer operations, like software project compilation, 10 Arm Cortex-X925 + 10 Cortex-A725 should have a throughput similar to that of a 16-core Strix Halo, but which is faster would depend on cooling (i.e. a Strix Halo configured for a high power consumption will be faster).

There is not enough information to compare the performance of the GPUs from this NVIDIA Digits and from Strix Halo. However, it can be assumed that NVIDIA Digits will be better for ML/AI inference. Whether it can also be competitive for training or for graphics remains to be seen.

replies(1): >>42624797 #
29. mycall ◴[] No.42622524{6}[source]
I wish WSL1 was open sourced
replies(1): >>42622571 #
30. diggan ◴[] No.42622537[source]
> their whole new software stack will only run on WSL2. They aren't porting to Win32

Wait, what do you mean exactly? Isn't WSL2 just a VM essentially? Don't you mean it'll run on Linux (which you also can run on WSL2)?

Or will it really only work with WSL2? I was excited as I thought it was just a Linux Workstation, but if WSL2 gets involved/is required somehow, then I need to run the other direction.

replies(2): >>42622939 #>>42622940 #
31. pjmlp ◴[] No.42622571{7}[source]
Picoprocesses are based on Drawbridge research, so at least there is some reading about how it all works,

https://www.microsoft.com/en-us/research/project/drawbridge/

https://learn.microsoft.com/en-us/archive/blogs/wsl/windows-...

https://www.zdnet.com/article/under-the-hood-of-microsofts-w...

32. awestroke ◴[] No.42622939[source]
No, nobody will run Windows on this. It's meant to run NVIDIA's own flavor of Ubuntu with a patched kernel.
33. hx8 ◴[] No.42622940[source]
Yes, WSL2 is essentially a highly integrated VM. I think it's a bit of a joke to call Ubuntu WSL2, because it seems like most Ubuntu installs are either VMs for Windows PCs or on Azure Cloud.
34. skavi ◴[] No.42624797{4}[source]
How did you come up with these numbers? There don't seem to be many shipping products with these cores. In fact, the only one I could find was the Dimensity 9400 with a single X925 and older generation A720s. And of course the Dimensity is a mobile SoC, so clocks will be low.

Are you projecting based on Arm's stated improvements from their last gen? In that case, what numbers are you using as your baseline?

replies(1): >>42625579 #
35. ksec ◴[] No.42624856{3}[source]
For context, the X925 is what used to be called Cortex X5, and it is now shipping in the MediaTek Dimensity 9400. It has roughly the same performance per clock as a Snapdragon 8 Elite, or roughly 5% lower performance per clock compared to the Apple M3 on Geekbench 6.

Assuming they are not limited by power or heat dissipation I would say that is about as good as it gets.

The hardware is pretty damn good. I am only worried about the software.

36. adrian_b ◴[] No.42625579{5}[source]
For programs rich in array operations, which can be accelerated by SVE or AVX-512, Cortex-X925 has 6 x 128-bit execution pipelines, Cortex-A725 has 2 pipelines, Snapdragon Oryon has 4 pipelines, while a Zen 5 core has the equivalent of 8 Arm execution pipelines (i.e. 2 x 512-bit pipelines, equivalent to 8 x 128-bit) + another 8 execution pipelines that can do only a subset of the operations.

That means a total of 80 execution pipelines for NVIDIA Digits, 48 execution pipelines for Snapdragon Elite and 128 equivalent execution pipelines for Strix Halo, taking into account only the complete execution pipelines, otherwise for operations like FP addition, which can be done in any pipeline, there would be 256 equivalent execution pipelines for Strix Halo.

Because the clock frequencies for multithreaded applications should be similar, if not better for Strix Halo, there is little doubt that the throughput for applications dominated by array operations should be at least 128/80 for Strix Halo vs. NVIDIA Digits, if not much better, because for many instructions even more execution pipelines are available and Zen 5 also has a higher IPC when executing irregular code, especially vs. the smaller Cortex-A725 cores. Therefore the throughput of NVIDIA Digits is smaller or at most equal in comparison with the throughput of 10 cores of Strix Halo.

On the other hand, for integer/pointer processing code, the number of execution units in a Cortex-X925 + a Cortex-A725 is about the same as in 2 Zen 5 cores. Therefore the 20 Arm cores of NVIDIA Digits have about the same number of execution units as 20 Zen 5 cores. Nevertheless, the occupancy of the Zen 5 execution units will be higher for most programs than for the Arm cores, especially because of the bigger and better cache memories, and also because of the lower IPC of Cortex-A725. Therefore the 20 Arm cores must be slower than 20 Zen 5 cores, probably only equivalent to about 15 Zen 5 cores, but the exact equivalence is hard to predict, because it depends on the NVIDIA implementation of things like the cache memories and the memory controller.
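
The vector-pipeline counts in this comment can be reproduced with a few lines of arithmetic. The sketch below takes the per-core pipeline figures exactly as stated in the comment (they are not checked against vendor documentation here), and the 12-core count for the Snapdragon part is only inferred from the comment's 48 pipelines at 4 per core. It counts pipelines and says nothing about clocks, caches or IPC.

```python
# Equivalent 128-bit vector pipelines per core, as claimed in the comment.
PIPES_PER_CORE = {
    "Cortex-X925": 6,
    "Cortex-A725": 2,
    "Oryon": 4,
    "Zen 5": 8,  # 2 x 512-bit pipelines counted as 8 x 128-bit
}


def total_pipes(core_counts: dict[str, int]) -> int:
    """Sum the equivalent 128-bit pipelines over all cores of a chip."""
    return sum(PIPES_PER_CORE[name] * n for name, n in core_counts.items())


digits = total_pipes({"Cortex-X925": 10, "Cortex-A725": 10})
snapdragon = total_pipes({"Oryon": 12})  # core count inferred from 48 / 4
strix_halo = total_pipes({"Zen 5": 16})

print(digits, snapdragon, strix_halo)    # 80 48 128
print(strix_halo / digits)               # 1.6, the 128/80 ratio in the comment
print(digits / PIPES_PER_CORE["Zen 5"])  # 10.0 Zen 5 cores' worth of pipelines
```

The last figure is where the "only 10 of the 16 cores of Strix Halo" comparison comes from.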

37. aparashk ◴[] No.42641848{3}[source]
I can't see how this will work in terms of a TDP. 2/3 of the 4090 power would be several times more power than can be effectively cooled in the physical form factor of an Apple Mini. Either there is severe downclocking happening under full throttle, or NVIDIA have come up with more low power design mojo than Apple has been able to muster for the M4 Max.
38. aparashk ◴[] No.42641858{4}[source]
"Starting from $3K" likely means that the maxed out memory configuration will cost more than $3K.
replies(1): >>42642149 #
39. sliken ◴[] No.42642149{5}[source]
As several other threads have mentioned, Nvidia has statements in various places along the lines of "All Nvidia Digits systems will have 128GB". Looks like you might be able to select different storage or something, or maybe the number of cores (CPU and/or GPU).
40. immibis ◴[] No.42644907{3}[source]
No, it's still relevant in current year. Progress has gone backwards on some projects/distros.