edit: While the title says "personal", Jensen did say this was aimed at startups and similar, so not your living room necessarily.
The only thing it really competes with is the Mac Studio for LocalLlama-type enthusiasts and devs. It isn't cheap enough to dent the used market, nor powerful enough to stand in for bigger cards.
However, we do know that it offers 1/4 the TOPS of the new 5090. It will be less powerful than the $600 5070, which of course it will be, given its power limitations.
The only really compelling value is that Nvidia starves its desktop cards of memory so severely. That's the small opening Apple found, even though Apple's FP4/FP8 performance is a world below what Nvidia is offering. So purely from that perspective this is a winning product, as 128GB opens up a lot of possibilities. But from a raw performance perspective, it's actually going to pale compared to other Nvidia products.
At FP32 (and FP16, assuming the consumer cards are still neutered), the 5090 apparently does ~105-107 TFLOPS, and the full GB202 ~125 TFLOPS. That means a non-neutered GB202-based card could hit ~250 TFLOPS of FP16, and if throughput keeps doubling each time precision is halved, that lines up neatly with 1 PFLOP of FP4.
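To spell out that arithmetic, here's a quick sketch. It assumes throughput simply doubles every time precision is halved; that linear rule and the `scaled_tflops` helper are my own illustration, not anything Nvidia has published, and the 125 TFLOPS input is just the rough full-GB202 figure from above.

```python
# Rough full-GB202 FP32 figure quoted above (approximate, not an official spec).
FULL_GB202_FP32_TFLOPS = 125


def scaled_tflops(fp32_tflops: float, bits: int) -> float:
    """Peak throughput at `bits` precision, ASSUMING ideal linear scaling:
    halving the operand width doubles the throughput."""
    return fp32_tflops * (32 / bits)


for bits in (32, 16, 8, 4):
    tflops = scaled_tflops(FULL_GB202_FP32_TFLOPS, bits)
    print(f"FP{bits}: ~{tflops:.0f} TFLOPS")
```

Under that assumption FP16 comes out at ~250 TFLOPS and FP4 at ~1000 TFLOPS, i.e. 1 PFLOP.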
In reality, FP4 is more-than-linearly efficient relative to FP32. They quoted FP4 and not FP8 / FP16 for a reason. I wouldn't be too surprised if it doesn't even support FP32, or maybe even FP16. Plus, they likely cut RT cores and other graphics-related features to make a smaller and therefore more power-efficient chip, because they're positioning this as an "AI supercomputer" and this hardware doesn't make sense for most graphical applications.
I see no reason this product wouldn't come to market - besides the usual supply/demand. There's value for a small niche and a particular price bracket: enthusiasts running large q4 models. It's cheaper but slower than dedicated cards (which cost 3x-10x more per GB of VRAM), and price-competitive with but much faster than Apple silicon. It's a good strategic move for maintaining Nvidia's hold on the ecosystem regardless of the sales revenue.