
623 points magicalhippo | 10 comments
Karupan ◴[] No.42619320[source]
I feel this is bigger than the 50-series GPUs. Given the craze around AI/LLMs, this could also eat into Apple's slice of the enthusiast AI dev segment before the M4 Max/Ultra Mac minis are released. I wish I'd held some Nvidia stock; they seem to have been doing everything right over the last few years!
replies(21): >>42619339 #>>42619433 #>>42619472 #>>42619544 #>>42619769 #>>42620175 #>>42620289 #>>42620359 #>>42620740 #>>42621569 #>>42621821 #>>42622149 #>>42622154 #>>42622259 #>>42622359 #>>42622567 #>>42622577 #>>42622621 #>>42622863 #>>42627093 #>>42627188 #
rbanffy ◴[] No.42622359[source]
This is something every company should make sure they have: an onboarding path.

Xeon Phi failed for a number of reasons, but one of them was avoidable: the lack of software optimised for it. Now we have Xeons and EPYCs, and MI300Cs with lots of efficient cores, but we could have been writing software tailored for that kind of hardware for 10 years now, and extracting performance from it would be a solved problem at this point. The same applies to Itanium: the very first thing Intel should have made sure of was good Linux support, which they could have had before the first silicon shipped. Itanium was well supported for a while, but it's long dead by now.

Similarly, Sun failed with SPARC, which also lacked an easy onboarding path after they gave up on workstations. They did get some things right: OpenSolaris ensured the OS remained relevant (it still is, even if a bit niche), and looking the other way on x86 Solaris helped people learn and train on it. Oracle Cloud could, at the very least, offer it on cloud instances. That would be nice.

Now we see IBM doing the same: there is no reasonable entry-level POWER machine that can compete in performance with a workstation-class x86 box. There is a small half-rack machine that can be mounted in a deskside case, and that's it. I don't know of any company planning to deploy new systems on AIX (much less IBM i, which is also POWER), or even Linux on POWER, because it's just too easy to build on other, competing platforms. You can get AIX, IBM i, and even IBM Z cloud instances from IBM Cloud, but it's not easy (I never found a "from-zero-to-ssh-or-5250-or-3270" tutorial for them), and I wonder if it's even possible. You can get Linux on Z instances, but there doesn't seem to be a way to get Linux on POWER, at least not from IBM (several HPC research labs still offer it).

replies(4): >>42622573 #>>42624071 #>>42625125 #>>42627663 #
nimish ◴[] No.42622573[source]
1000%. All these AI hardware companies will fail if they don't have this. You must have a cheap way to experiment and develop. Even if you only want to sell a $30,000 datacenter card, you still need a very low-cost way to play.

It's sad to see that big companies like Intel and AMD don't understand this, but then they've never come to terms with the fact that software killed the hardware star.

replies(2): >>42623471 #>>42623609 #
1. theptip ◴[] No.42623609[source]
Isn't the cloud GPU market covering this? I can run a model for $2/hr, or get an 8xH100 node if I need to play with something bigger.
replies(2): >>42624078 #>>42624873 #
2. rbanffy ◴[] No.42624078[source]
People tend to limit their usage when it's billed by the hour. You need some sort of desktop computer anyway, so if you spend the $3K this one costs, you get unlimited time with Nvidia's software stack, and when you do need to run on bigger metal, you pay the $2/hour.
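For scale, a quick back-of-envelope in Python (the only inputs are the two prices quoted in this thread; your hours and rates will differ):

    # Break-even between buying the $3K box and renting at $2/hour.
    # Both numbers come from this thread; adjust for your actual prices.
    box_price = 3000.0   # USD, the quoted price of the desktop box
    cloud_rate = 2.0     # USD per hour for a rented cloud GPU

    breakeven_hours = box_price / cloud_rate
    print(f"Break-even: {breakeven_hours:.0f} hours of cloud time")  # 1500 hours
    print(f"At 4 hours/day: about {breakeven_hours / 4:.0f} days")   # ~375 days

Past that point the box is effectively free compute (ignoring electricity and resale value).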
replies(1): >>42628927 #
3. johndough ◴[] No.42624873[source]
I have the skills to write efficient CUDA kernels, but $2/hr is 10% of my salary, so there's no way I'm renting any H100s. The electricity price for my own computer is already painful enough as it is, and I'm sure there are many Eastern European developers who are more skilled and get paid even less. This is a huge waste of resources, all due to NVIDIA's artificial market segmentation. Or maybe I'm just cranky because I want more VRAM for cheap.
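For what it's worth, here's a rough sketch of the electricity side; the wattage and kWh price are illustrative assumptions, not measurements:

    # Ballpark electricity cost of a GPU workstation under load.
    # All three inputs are assumptions; plug in your own numbers.
    gpu_watts = 450              # high-end consumer GPU at full tilt (assumed)
    rest_of_system_watts = 150   # CPU, RAM, fans, PSU losses (assumed)
    price_per_kwh = 0.30         # typical European residential rate (assumed)

    cost_per_hour = (gpu_watts + rest_of_system_watts) / 1000 * price_per_kwh
    print(f"~{cost_per_hour:.2f} per hour at the wall")  # ~0.18/hour

Painful over a month of full-load runs, but still an order of magnitude below the $2/hr rental rate.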
replies(1): >>42627542 #
4. rbanffy ◴[] No.42627542[source]
This has 128GB of unified memory. A similarly configured Mac Studio costs almost twice as much, and I'm not sure the GPU is in the same league (software-support-wise it isn't, but that's fixable).
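For a sense of what 128GB buys you, here's a crude weights-only estimate (the usual rule of thumb: bytes ≈ parameters × bytes per parameter; KV cache and activations need headroom on top, so treat these as lower bounds):

    # Weights-only memory footprint of LLMs at common precisions.
    def weights_gib(params_billion, bytes_per_param):
        return params_billion * 1e9 * bytes_per_param / 2**30

    for params in (8, 70, 180):
        for label, bpp in (("fp16", 2.0), ("q8", 1.0), ("q4", 0.5)):
            print(f"{params}B @ {label}: {weights_gib(params, bpp):6.1f} GiB")

So a 70B model fits comfortably at 8-bit (~65 GiB) and a 180B model squeezes in at 4-bit (~84 GiB), which is exactly the territory a 24GB consumer card can't touch.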

A real shame it's not running mainline Linux - I don't like their distro based on Ubuntu LTS.

replies(1): >>42640380 #
5. bmicraft ◴[] No.42628927[source]
3k is still very steep for anyone not on a Silicon Valley-like salary.
replies(1): >>42655317 #
6. seanmcdirmid ◴[] No.42640380{3}[source]
$4,799 for an M2 Ultra with 128GB of RAM, so not quite twice as much. I'm not sure what the benchmark comparison would be. It's $5,799 if you want the extra 16 GPU cores (60 vs 76).
replies(1): >>42646901 #
7. rbanffy ◴[] No.42646901{4}[source]
We'll need to look at benchmarks when the numbers come out. Software support is also important, and a Mac won't help you much if you're targeting CUDA.

I have to agree the desktop experience of the Mac is great, on par with the best Linuxes out there.

replies(1): >>42648555 #
8. seanmcdirmid ◴[] No.42648555{5}[source]
A lot of models are already optimized for Metal, especially Llama, DeepSeek, and Qwen. You still take a performance hit, but before this NVIDIA project was announced there was no other way to get that much VRAM for less than $5K. I'll definitely look at it closely if it turns out not to be vaporware.
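One way to guess how they'll compare: single-stream token generation is usually memory-bandwidth-bound, so tokens/s is roughly bounded by bandwidth divided by model size. The M2 Ultra's ~800 GB/s is Apple's published figure; the Nvidia box's bandwidth below is purely my assumption, since real specs aren't out:

    # Rough upper bound on decode speed for a memory-bandwidth-bound LLM:
    # every generated token has to stream the whole model through the GPU.
    def tokens_per_sec_bound(bandwidth_gb_s, model_gb):
        return bandwidth_gb_s / model_gb

    model_gb = 40   # e.g. a ~70B model at 4-bit quantization
    for name, bw in (("M2 Ultra, ~800 GB/s", 800),
                     ("Nvidia box, assumed 275 GB/s", 275)):
        print(f"{name}: ~{tokens_per_sec_bound(bw, model_gb):.0f} tok/s upper bound")

If the real bandwidth lands in that range, the Mac could still win on raw decode speed while losing badly on CUDA compatibility.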
replies(1): >>42655337 #
9. rbanffy ◴[] No.42655317{3}[source]
Yes. Most people make do with a generic desktop and an Nvidia GPU. What makes this machine attractive is the beefy GPU and the full Nvidia support for the whole AI stack.
10. rbanffy ◴[] No.42655337{6}[source]
They can't walk it back now without major backlash.

The one thing I wonder about is noise. That box is awfully small for the amount of compute it packs, and high-end Mac Studios are 50% heatsink. There isn't much space in this box for a quiet fan.