Nvidia's Project Digits is a 'personal AI supercomputer'

1. magicalhippo ◴[07 Jan 25 04:19 UTC] No.42619182[source]▶

>>42619139 (OP) #

Not much was unveiled, but it showed a Blackwell GPU with 1 PFLOP of FP4 compute, 128GB of unified LPDDR5X memory, 20 Arm cores, and ConnectX networking powering two QSFP slots so that multiple units can be stacked.

edit: While the title says "personal", Jensen did say this was aimed at startups and similar, so not necessarily your living room.

replies(1): >>42619341 #

2. computably ◴[07 Jan 25 04:46 UTC] No.42619341[source]▶

>>42619182 (TP) #

From the size and pricing ($3000) alone, it's safe to conclude it has fewer raw FLOPS than a 5090. Since it uses LPDDR5X, it almost certainly has less memory bandwidth too (5090 @ 1.8 TB/s, M4 Max w/ 128GB LPDDR5X @ 546 GB/s). Basically the only advantage is how much VRAM it packs into a small form factor, and presumably greater power efficiency at its smaller scale.
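To put the two bandwidth figures above side by side, here is a rough sketch of how long one full pass over a working set takes when memory bandwidth is the only limit. The 24 GB working set and the helper name `sweep_time_ms` are hypothetical, chosen just for illustration; real workloads will not hit peak bandwidth.

```python
def sweep_time_ms(data_gb: float, bandwidth_gb_s: float) -> float:
    """Time in milliseconds to read data_gb once at the given peak bandwidth."""
    if bandwidth_gb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return data_gb / bandwidth_gb_s * 1000.0


# Hypothetical working set; not a figure from the thread.
WORKING_SET_GB = 24.0

# Bandwidth figures quoted in the comment (1.8 TB/s taken as 1800 GB/s).
for name, bandwidth in (("5090", 1800.0), ("M4 Max (LPDDR5X)", 546.0)):
    print(f"{name}: {sweep_time_ms(WORKING_SET_GB, bandwidth):.1f} ms per pass")

# Ratio of the two quoted bandwidths, independent of the working set size.
print(f"bandwidth ratio: {1800.0 / 546.0:.2f}x")
```

Under these assumptions the gap is simply the ratio of the two bandwidths, roughly 3.3x.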

The only thing it really competes with is the Mac Studio for LocalLlama-type enthusiasts and devs. It isn't cheap enough to dent the used market, nor powerful enough to stand in for bigger cards.

replies(4): >>42619643 #>>42620016 #>>42620598 #>>42622446 #

3. sliken ◴[07 Jan 25 05:44 UTC] No.42619643[source]▶

>>42619341 #

I believe $3,000 is for the unmentioned minimum config; no idea what the mentioned 4TB storage and 128GB RAM version will cost.

Running a model that needs 96GB of RAM isn't cheap (with unified memory, 25% is often reserved for the CPU), so maybe it will win there.
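The 96GB figure follows from the 25% rule of thumb applied to 128GB. A minimal sketch of that arithmetic, where the function name and the fixed-fraction model are assumptions for illustration (the actual split on this device is not stated anywhere in the thread):

```python
def usable_gpu_memory_gb(total_gb: float, cpu_reserved_fraction: float = 0.25) -> float:
    """Memory left for the GPU after a fixed fraction is held back for the CPU."""
    if not 0.0 <= cpu_reserved_fraction < 1.0:
        raise ValueError("cpu_reserved_fraction must be in [0, 1)")
    return total_gb * (1.0 - cpu_reserved_fraction)


# 128GB unified memory with the assumed 25% CPU reservation.
print(usable_gpu_memory_gb(128))  # 96.0
```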

replies(1): >>42619893 #

4. ac29 ◴[07 Jan 25 06:30 UTC] No.42619893{3}[source]▶

>>42619643 #

The NVIDIA press release [0] says "Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage", which makes it sound like the RAM is fixed size.

[0] https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

replies(1): >>42620480 #

5. kcb ◴[07 Jan 25 06:54 UTC] No.42620016[source]▶

>>42619341 #

Making comparisons to the 5090 is silly. That thing draws 500W+ and will require a boat anchor of metal to keep it cool. The device they showed is something more along the lines of a mobile dev kit.

replies(1): >>42631937 #

6. sliken ◴[07 Jan 25 08:26 UTC] No.42620480{4}[source]▶

>>42619893 #

Awesome.

Maybe there will be storage options of 1, 2, and 4TB and optional 25/100/200/400 Gbit interfaces. Or maybe everything except the CPU/GPU is constant, with 50%, 75%, or 100% of the CPU/GPU cores enabled so they can bin their chips.

7. KeplerBoy ◴[07 Jan 25 08:49 UTC] No.42620598[source]▶

>>42619341 #

Of course. It has far fewer FLOPS than the 5090; after all, this will have a TDP of ~50W and run off a regular USB-PD power supply.

It's basically the successor to the AGX Orin and in line with its pricing (considering it comes with a fast NIC). The AGX Orin had RTX 3050 levels of performance.

replies(2): >>42620818 #>>42622559 #

8. krasin ◴[07 Jan 25 09:29 UTC] No.42620818{3}[source]▶

>>42620598 #

Yes and no. The Jetson line (which the Jetson AGX Orin is part of) also provides multi-camera support (with MIPI CSI-2 connectors) and other real-time / microcontroller stuff, as well as rugged options via partners.

I hope to see new Jetsons based on Blackwell sometime in 2026 (they tend to be slow to release those).

replies(1): >>42620911 #

9. KeplerBoy ◴[07 Jan 25 09:45 UTC] No.42620911{4}[source]▶

>>42620818 #

Yeah, I guess it's more a branch off the Jetson line. Or a midpoint between the Jetsons, IGX Orin (not a typo) and Data Center offerings.

10. llm_nerd ◴[07 Jan 25 14:09 UTC] No.42622446[source]▶

>>42619341 #

The product isn't even finalized. It might never come to fruition, and I cannot fathom how they will make the power profile fit. I am skeptical that a $3000 device with 128GB of RAM and a 4TB SSD with the specs provided will even see reality any time within the next year, but let's pretend it will.

However, we do know that it offers 1/4 the TOPS of the new 5090. It will be less powerful than the $600 5070, which of course it will be, given its power limitations.

The only really compelling value comes from how severely Nvidia memory-starves its desktop cards. It's the small opening that Apple found, even though Apple's FP4/FP8 performance is a world below what Nvidia is offering. So purely from that perspective this is a winning product, as 128GB opens up a lot of possibilities. But from a raw performance perspective, it's actually going to pale compared to other Nvidia products.

replies(1): >>42632090 #

11. adrian_b ◴[07 Jan 25 14:21 UTC] No.42622559{3}[source]▶

>>42620598 #

The successor of NVIDIA Orin is named Thor and it is expected to be launched later this year.

It uses different Arm processor cores than Digits, i.e. Neoverse V3AE, the automotive-enhanced version of Neoverse V3 (which is the server core version of Cortex-X4). According to rumors, NVIDIA Thor might have 14 Neoverse V3AE cores in the base version and there is also a double-die version.

The GPU of NVIDIA Thor is also a Blackwell, but probably with a very different configuration than in NVIDIA Digits.

NVIDIA Thor, like Orin, is intended for high reliability applications, like in automotive or industrial environments, unlike NVIDIA Digits, which is made with consumer-level technology.

12. computably ◴[08 Jan 25 07:29 UTC] No.42631937{3}[source]▶

>>42620016 #

I agree they're not products that compete against each other. Unfortunately, the silly comparison has to be made, as less informed consumers are already claiming that the 128 GB RAM of Project Digits will obsolete workstation/server-class GPUs.

13. computably ◴[08 Jan 25 08:02 UTC] No.42632090{3}[source]▶

>>42622446 #

AI TOPS numbers for Blackwell / 5090 are probably for a niche numeric type like INT8 or INT4.

At FP32 (and FP16, assuming the consumer cards are still neutered), the 5090 apparently does ~105-107 TFLOPS, and the full GB202 ~125 TFLOPS. That means a non-neutered GB202-based card could hit ~250 TFLOPS of FP16, which lines up neatly with 1 PFLOP of FP4.
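The "lines up neatly" arithmetic can be checked with a quick sketch. It assumes the naive model used in the paragraph above, where peak throughput doubles every time the precision is halved; the function name is made up for illustration, and the ~125 TFLOPS starting point is the full-GB202 estimate quoted in the comment, not a confirmed spec.

```python
def scaled_tflops(fp32_tflops: float, bits: int) -> float:
    """Naive linear scaling: throughput doubles each time the bit width halves."""
    if bits <= 0 or bits > 32:
        raise ValueError("bits must be in (0, 32]")
    return fp32_tflops * (32 / bits)


# Full GB202 FP32 estimate from the comment (~125 TFLOPS).
for bits in (32, 16, 8, 4):
    print(f"FP{bits}: ~{scaled_tflops(125, bits):.0f} TFLOPS")
```

Under this assumption FP16 comes out at ~250 TFLOPS and FP4 at ~1000 TFLOPS, i.e. 1 PFLOP.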

In reality, FP4 is more-than-linearly efficient relative to FP32. They quoted FP4 and not FP8 / FP16 for a reason. I wouldn't be too surprised if it doesn't even support FP32, maybe even FP16. Plus, they likely cut RT cores and other graphics-related features, making for a smaller and therefore more power efficient chip, because they're positioning this as an "AI supercomputer" and this hardware doesn't make sense for most graphical applications.

I see no reason this product wouldn't come to market - besides the usual supply/demand. There's value for a small niche and particular price bracket: enthusiasts running large q4 models, cheaper but slower vs. dedicated cards (3x-10x price/VRAM) and price-competitive but much faster vs. Apple silicon. It's a good strategic move for maintaining Nvidia's hold on the ecosystem regardless of the sales revenue.