I mean, this is awfully close to being "Her" in a box, right?
I wonder how it would go as a productivity/tinkering/gaming rig? Could a GPU potentially be stacked in the same way an additional Digit can?
Surely a smaller market than gamers or datacenters.
Also, I don't particularly want my data to be processed by anyone else.
It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.
Also, macOS devices are not very good inference solutions. They are just believed to be by diehards.
I don't think Digits will perform well either.
If NVIDIA wanted you to have good performance on a budget, it would ship NVLink on the 5090.
They are good for single-batch inference and have very good tok/sec/user. ollama works perfectly on a Mac.
And we know why they won't ship NVLink anymore on prosumer GPUs: they control almost the entire segment and why give more away for free? Good for the company and investors, bad for us consumers.
100%
The people who prototype on a 3k workstation will also be the people who decide how to architect for a 3k GPU buildout for model training.
Plus, YouTube and Google Images are already full of AI-generated slop and people are already tired of it. "AI fatigue" among the majority of general consumers is a documented thing. Gaming fatigue is not.
I think it isn't about enthusiasts. To me it looks like Huang/NVDA is using the opening provided by the AI wave to push a small revolution further - until now the GPU was an add-on to the general computing core, onto which that core offloaded some computing. With AI, that offloaded computing becomes de facto the main computing, and Huang/NVDA is turning the tables by making the CPU just a small add-on to the GPU, with some general computing offloaded to that CPU.
With the CPU located that "close" and with unified memory, that would stimulate parallelization of a lot of general computing so it runs on the GPU, much faster that way, instead of on the CPU. Take a classic of enterprise computing - SQL databases: a lot of what they do, and with some work perhaps everything, can be executed on the GPU with a significant performance gain over the CPU. Why isn't it happening today? Loading/unloading data onto the GPU eats into performance, the complexity of offloading only some operations is very high in dev effort, etc. Streamlined development on a platform with unified memory will change that. That way Huang/NVDA may pull the rug out from under CPU-first platforms like AMD/INTC and end up owning both - the new AI computing as well as a significant share of classic enterprise computing.
Qwen 2.5 32B on openrouter is $0.16/million output tokens. At your 16 tokens per second, 1 million tokens is 17 continuous hours of output.
Openrouter will charge you 16 cents for that.
I think you may want to reevaluate which is the real budget choice here
Edit: elaborating, that extra 16GB of RAM on the Mac to hold the Qwen model costs $400, or equivalently 1,770 days of continuous output. All assuming electricity is free.
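For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch in Python (the $0.16/M price and 16 tok/s figure come from the comments above; the $400 RAM upgrade price is the commenter's own number):

    # Back-of-the-envelope: local RAM upgrade vs. paying per output token on OpenRouter.
    price_per_million = 0.16    # USD per million output tokens (figure quoted above)
    tokens_per_second = 16      # local generation speed cited above
    ram_upgrade_cost = 400.0    # USD for the extra 16GB to hold the model (commenter's figure)

    hours_per_million = 1_000_000 / tokens_per_second / 3600
    breakeven_millions = ram_upgrade_cost / price_per_million
    breakeven_days = breakeven_millions * hours_per_million / 24

    print(f"1M tokens at {tokens_per_second} tok/s ~= {hours_per_million:.1f} hours of output")
    # ~1,770 days if you round to 17 hours per million tokens, as the comment does
    print(f"${ram_upgrade_cost:.0f} ~= {breakeven_millions:.0f}M tokens ~= {breakeven_days:.0f} days of continuous output")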
On the other hand, with a $5000 MacBook Pro, I can easily load a 70B model and have a "full" MacBook Pro as a plus. I am not sure I fully understand the value of these cards for someone who wants to run personal AI models.
Plus you have fast interconnects, if you want to stack them.
I was somewhat attracted by the Jetson AGX Orin with 64 GB RAM, but this one is a no-brainer for me, as long as idle power is reasonable.
I have a bit of an interest in games too.
If I could get one platform for both, I could justify 2k maybe a bit more.
I can't justify that for just one half: running games on Mac, right now via Linux: no thanks.
And on the PC side, Nvidia consumer cards only go to 24GB, which is a bit limiting for LLMs, while being very expensive - and I only play games every few months.
Do I buy a MacBook with a silly amount of RAM when I only want to mess with images occasionally?
Or do I get a big Nvidia card, topping out at 24GB - still small for some LLMs - but at least I could occasionally play games on it?
Incredible fumble for me personally as an investor
And log everything too?
No. There's already too much porn on the internet, and AI porn is cringe and will get old very fast.
And if you truly did predict that Nvidia would own those markets and that those markets would be massive, you could have also bought Amazon, Google, or heck, even Bitcoin. Anything you touched in tech would have made you a millionaire, really.
It will be massive for research labs. Most academics have to jump through a lot of hoops to get to play with not just CUDA, but also GPUDirect/RDMA/Infiniband etc. If you get older/donated hardware, you may have a large cluster but not newer features.
No, they can’t. GPU databases are niche products with severe limitations.
GPUs are fast at massively parallel math problems; they aren't useful for all tasks.
Mac Pro [0] is a desktop with M2 Ultra and up to 192GB of unified memory.
Those Macs with unified memory are a threat he is immediately addressing. Jensen is a wartime CEO from the looks of it; he's not joking.
No wonder AMD is staying out of the high end space, since NVIDIA is going head on with Apple (and AMD is not in the business of competing with Apple).
How so?
Only 40% of gamers use a PC, a portion of those use AI in any meaningful way, and a fraction of those want to set up a local AI instance.
Then someone releases an uncensored, cloud based AI and takes your market?
Today. For reasons like the ones I mentioned.
>GPUs are fast at massively parallel math problems, they aren't useful for all tasks.
GPUs are fast at massively parallel tasks. Their memory bandwidth is about 10x that of a CPU, for example. So typical database operations that are massively parallel in nature, like join or filter, would run about that much faster.
The majority of computing can be parallelized and would thus benefit from being executed on the GPU (given unified memory at sizes practically usable for enterprise workloads, like 128GB).
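To make the join/filter point concrete, here is a minimal sketch of a bandwidth-bound column filter, assuming a CUDA-capable GPU with CuPy installed; the column size and threshold are made up for illustration:

    import numpy as np
    import cupy as cp  # assumes a CUDA-capable GPU with CuPy installed

    # A made-up 100M-row 'price' column standing in for a database column.
    price_cpu = np.random.rand(100_000_000).astype(np.float32)

    # CPU filter: a full scan, limited by host memory bandwidth.
    hits_cpu = price_cpu[price_cpu > 0.99]

    # GPU filter: the same scan, limited by (much higher) device memory bandwidth.
    # On unified-memory hardware, the explicit copy below is the step that goes away.
    price_gpu = cp.asarray(price_cpu)
    hits_gpu = price_gpu[price_gpu > 0.99]

    print(len(hits_cpu), int(hits_gpu.size))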
This is a genius move. I am more baffled by the insane form factor that can pack this much power inside a Mac Mini-esque body. For just $6000, two of these can run 400B+ models locally. That is absolutely bonkers. Imagine running ChatGPT on your desktop. You couldn’t dream about this stuff even 1 year ago. What a time to be alive!
Titanic - so about to hit an iceberg and sink?
No one goes to an Apple store thinking "I'll get a laptop to do AI inference".
About that... Not like there isn't a lot to be desired from the linux drivers: I'm running a K80 and M40 in a workstation at home and the thought of having to ever touch the drivers, now that the system is operational, terrifies me. It is by far the biggest "don't fix it if it ain't broke" thing in my life.
Xeon Phi failed for a number of reasons, but one where it didn't need to fail was the availability of software optimised for it. Now we have Xeons and EPYCs, and MI300C's with lots of efficient cores, but we could have been writing software tailored for those for 10 years now. Extracting performance from them would be a solved problem at this point. The same applies to Itanium - the very first thing Intel should have made sure it had was good Linux support. They could have had it before the first silicon was released. Itanium was well supported for a while, but it's long dead by now.
Similarly, Sun failed with SPARC, which also didn't have an easy onboarding path after they gave up on workstations. They did some things right: OpenSolaris ensured the OS remained relevant (it still is, even if a bit niche), and looking the other way on x86 Solaris helped people learn and train on it. Oracle Cloud could, at least, offer it on cloud instances. That would be nice.
Now we see IBM doing the same - there is no reasonable entry level POWER machine that can compete in performance with a workstation-class x86. There is a small half-rack machine that can be mounted on a deskside case, and that's it. I don't know of any company that's planning to deploy new systems on AIX (much less IBMi, which is also POWER), or even for Linux on POWER, because it's just too easy to build it on other, competing platforms. You can get AIX, IBMi and even IBMz cloud instances from IBM cloud, but it's not easy (and I never found a "from-zero-to-ssh-or-5250-or-3270" tutorial for them). I wonder if it's even possible. You can get Linux on Z instances, but there doesn't seem to be a way to get Linux on POWER. At least not from them (several HPC research labs still offer those).
The cutting edge will advance, and convincing bespoke porn of people's crushes/coworkers/bosses/enemies/toddlers will become a thing. With all the mayhem that results.
Performance is not amazing (roughly 4060 level, I think?) but in many ways it was the only game in town unless you were willing and able to build a multi-3090/4090 rig.
Suppose you're a content creator and you need an image of a real person or something copyrighted like a lot of sports logos for your latest YouTube video's thumbnail. That kind of thing.
I'm not getting into how good or bad that is; I'm just saying I think it's a pretty common use case.
(example: a thumbnail for a YT video about a video game, featuring AI-generated art based on that game. because copyright reasons, in my very limited experience Dall-E won't let you do that)
I agree that AI porn doesn't seem a real market driver. With 8 billion people on Earth I know it has its fans I guess, but people barely pay for porn in the first place so I reallllly dunno how many people are paying for AI porn either directly or indirectly.
It's unclear to me if AI generated video will ever really cross the "uncanny valley." Of course, people betting against AI have lost those bets again and again but I don't know.
Maybe (LP)CAMM2 memory will make model usage just cheap enough that I can have a hosting server for it and do my usual midrange gaming GPU thing before then.
They were propelled by the unexpected LLM boom. But plan 'A' was robotics, in which NVIDIA has invested a lot for decades. I think its time is about to come, with Tesla's humanoids at $20-30k and the Chinese already selling theirs for $16k.
Sad to see that big companies like Intel and AMD don't understand this, but then they've never come to terms with the fact that software killed the hardware star.
The fire-breathing 120W Zen 5-powered flagship Ryzen AI Max+ 395 comes packing 16 CPU cores and 32 threads paired with 40 RDNA 3.5 (Radeon 8060S) integrated graphics cores (CUs), but perhaps more importantly, it supports up to 128GB of memory that is shared among the CPU, GPU, and XDNA 2 NPU AI engines. The memory can also be carved up to a distinct pool dedicated to the GPU only, thus delivering an astounding 256 GB/s of memory throughput that unlocks incredible performance in memory capacity-constrained AI workloads (details below). AMD says this delivers groundbreaking capabilities for thin-and-light laptops and mini workstations, particularly in AI workloads. The company also shared plenty of gaming and content creation benchmarks.
[...]
AMD also shared some rather impressive results showing a Llama 70B Nemotron LLM AI model running on both the Ryzen AI Max+ 395 with 128GB of total system RAM (32GB for the CPU, 96GB allocated to the GPU) and a desktop Nvidia GeForce RTX 4090 with 24GB of VRAM (details of the setups in the slide below). AMD says the AI Max+ 395 delivers up to 2.2X the tokens/second performance of the desktop RTX 4090 card, but the company didn’t share time-to-first-token benchmarks.
Perhaps more importantly, AMD claims to do this at an 87% lower TDP than the 450W RTX 4090, with the AI Max+ running at a mere 55W. That implies that systems built on this platform will have exceptional power efficiency metrics in AI workloads.
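To put the 256 GB/s number in context, here is a rough, bandwidth-bound estimate of single-batch decode speed; the ~40 GB weight size (roughly 70B parameters at ~4-bit quantization) is an illustrative assumption, not a figure from the article:

    # Rough upper bound: every generated token streams the full set of weights once.
    bandwidth_gb_s = 256   # Strix Halo memory throughput quoted above
    weights_gb = 40        # ~70B parameters at ~4-bit quantization (assumption)

    print(f"~{bandwidth_gb_s / weights_gb:.1f} tok/s upper bound (ignores KV cache and compute)")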
I’m so tired of this recent obsession with the stock market. Now that retail is deeply invested, it is tainting everything, even here on a technology forum. I don’t remember people mentioning Apple stock every time Steve Jobs made an announcement in past decades. Nowadays it seems everyone is invested in Nvidia and just wants the stock to go up, and every product announcement is a means to that end. I really hope we get a crash so that we can get back to a saner relationship with companies and their products.
"The global gaming market size was valued at approximately USD 221.24 billion in 2024. It is forecasted to reach USD 424.23 billion by 2033, growing at a CAGR of around 6.50% during the forecast period (2025-2033)"
I needed an uncensored model in order to, guess what, make an AI draw my niece snowboarding down a waterfall. All the online services refuse on basis that the picture contains -- oh horrors -- a child.
"Uncensored" absolutely does not imply NSFW.
Much of the growth in gaming of late has come from exploitive dark patterns, and those dark patterns eventually stop working because users become immune to them.
If what you say is true, you were among the first 100 people on the planet doing this - which, btw, further supports my argument about how extremely rare that use case is for Mac users.
Strix Halo is a replacement for the high-power laptop CPUs from the HX series of Intel and AMD, together with a discrete GPU.
The thermal design power of a laptop CPU-dGPU combo is normally much higher than 120 W, which is the maximum TDP recommended for Strix Halo. The faster laptop dGPUs want more than 120 W only for themselves, not counting the CPU.
So any claims of being surprised that the TDP range for Strix Halo is 45 W to 120 W are weird, as if the commenter had never seen a gaming laptop or a mobile workstation laptop.
Windows has always been a barrier to hardware feature adoption for Intel. You had to wait 2 to 3 years, sometimes longer, for Windows to get around to providing hardware support.
Any OS optimizations in Windows had to go through Microsoft. So say you added some instructions, custom silicon, or whatever to speed up enterprise databases, or provided high-speed networking that needed some special kernel features, etc. - there was always Microsoft in the way.
Not just in the foot-dragging communication - getting the tech people aligned was a problem.
Microsoft would look at every single change: whether or not it would challenge their monopoly, whether or not it was in their business interest, whether or not it kept you, the hardware vendor, in a subservient role.
AMD/Intel have to work directly with Microsoft to ship new silicon that requires it.
0. https://www.macstadium.com/blog/m4-mac-mini-review
1. https://www.apple.com/mac/compare/?modelList=Mac-mini-M4,Mac...
They did not collapse, they moved to smartphones. The "free"-to-play gacha portion of the gaming market is so successful it is most of the market. "Live service" games are literally traditional game makers trying to grab a tiny slice of that market, because it's infinitely more profitable than making actual games.
>those dark patterns eventually stop working because users become immune to them.
Really? Slot machines have been around for generations and have not become any less effective. Gambling of all forms has relied on the exact same physiological response for millennia. None of this is going away without legislation.
Slot machines are not a growth market. The majority of people wised up to them literal generations ago, although enough people remain susceptible to maintain a handful of city economies.
> They did not collapse, they moved to smartphones
Agreed, but the dark patterns being used are different. The previous dark patterns became ineffective. The level of sophistication of psychological trickery in modern f2p games is far beyond anything Farmville ever attempted.
The rise of live service games also does not bode well for infinite growth in the industry, as there are only so many hours in the day for playing games, and even the evilest player manipulation techniques can only squeeze so much blood from a stone.
The industry is already seeing the failure of new live service games to launch, possibly analogous to what happened in the MMO market when there was a rush of releases after WoW. With the exception of addicts, most people can only spend so many hours a day playing games.
A real shame it's not running mainline Linux - I don't like their distro based on Ubuntu LTS.
I think this is a race that Apple doesn't know it's part of. Apple has something that happens to work well for AI, as a side effect of having a nice GPU with lots of fast shared memory. It's not marketed for inference.
I can't find the exact Youtube video, but it's out there.
Normally? Much higher than 120W? Those are some pretty abnormal (and dare I say niche?) laptops you're talking about there. Remember, that's not peak power - thermal design power is what the laptop should be able to power and cool pretty much continuously.
At those power levels, they're usually called DTR: desktop replacement. You certainly can't call it "just a laptop" anymore once we're in needs-two-power-supplies territory.
https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...
"The GB10 Superchip enables Project DIGITS to deliver powerful performance using only a standard electrical outlet. Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage. With the supercomputer, developers can run up to 200-billion-parameter large language models to supercharge AI innovation."
https://www.nvidia.com/en-us/data-center/grace-cpu-superchip...
"Grace is the first data center CPU to utilize server-class high-speed LPDDR5X memory with a wide memory subsystem that delivers up to 500GB/s of bandwidth "
As far as I can see, that is about 4x that of Zen 5.
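A rough sanity check on the 200-billion-parameter claim, assuming ~4-bit quantization and treating the 500GB/s Grace figure quoted above as an optimistic upper bound (the actual DIGITS memory bandwidth isn't given in these quotes):

    # Does a ~200B-parameter model fit in 128GB, and how fast might it decode?
    params_b = 200
    bytes_per_param = 0.5       # ~4-bit quantization (assumption)
    unified_memory_gb = 128
    bandwidth_gb_s = 500        # Grace LPDDR5X figure quoted above (upper bound)

    weights_gb = params_b * bytes_per_param
    print(f"weights ~= {weights_gb:.0f} GB, fits in {unified_memory_gb} GB: {weights_gb <= unified_memory_gb}")
    print(f"bandwidth-bound decode ~= {bandwidth_gb_s / weights_gb:.1f} tok/s (single batch)")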
We still schedule "bi-weekly" meetings.
We can't agree on which way charge goes in a wire.
Have you seen the y-axis on an economists chart?
I do not know what the proportion of gaming laptops and mobile workstations vs. thin-and-light laptops is. While obviously there must be many more light laptops, gaming laptops cannot be a niche product, because there are too many models offered by a lot of vendors.
My own laptop is a Dell Precision, so it belongs to this class. I would not call Dell Precision laptops a niche product, even if they are typically used only by professionals.
My previous laptop was some Lenovo Yoga that also belonged to this class, having a discrete NVIDIA GPU. In general, any laptop having a discrete GPU belongs to this class, because the laptop CPUs intended to be paired with discrete GPUs have a default TDP of 45 W or 55 W, while the smallest laptop discrete GPUs may have TDPs of 55 W to 75 W, but the faster laptop GPUs have TDPs between 100 W and 150 W, so the combo with CPU reaches a TDP around 200 W for the biggest laptops.
Given workload A, how much of the total runtime would JOIN or FILTER take compared to, for example, the storage engine layer? My gut feeling tells me not much, since to see the actual gain you'd need to be able to parallelize everything, including the storage engine.
IIRC all the startups building databases around GPUs failed to deliver in the last ~10 years. All of them are shut down if I am not mistaken.
This hardware is only good for current-generation "AI".
If that is true, their path to profitability isn't super rocky. Their path to achieving their current valuation may end up being trickier, though!
How about attaching SSD-based storage to NVLink? :) Nvidia does have the direct-to-memory tech and uses wide buses, so I don't see any issue with them direct-attaching arrays of SSDs if they feel like it.
>IIRC all the startups building databases around GPUs failed to deliver in the last ~10 years. All of them are shut down if I am not mistaken.
As I already said - the model of a database offloading some ops to a GPU with its own separate memory isn't feasible, and those startups confirmed it. Especially when the GPU has 8-16GB while main RAM can easily be 1-2TB with 100-200 CPU cores. With 128GB of unified memory like on the GB10, the situation looks completely different (that Nvidia allows only 2 to be connected by NVLink is just market segmentation, not a real technical limitation).
True passion for one's career is rare, despite the clichéd platitudes encouraging otherwise. That's something we should encourage and invest in regardless of the field.
Not sure it'd be competitive in price with other workstation-class machines. I don't know how expensive IBM's S1012 deskside is, but with only 64 threads it'd be a meh workstation.
In other words, and hypothetically, if you can make logical plan execution run 2x faster by rewriting the algorithms to use GPU resources, but physical plan execution remains bottlenecked by the storage engine, then the total gain is negligible.
But I guess there could be some use case where this could prove to be a win.
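That intuition is just Amdahl's law; a quick sketch with made-up fractions for how much of the runtime the GPU-friendly operators account for:

    # Amdahl's law: overall speedup when only part of the work gets faster.
    def overall_speedup(gpu_fraction: float, gpu_speedup: float) -> float:
        return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)

    # Made-up split: JOIN/FILTER are 30% of runtime, storage engine etc. the rest.
    print(overall_speedup(0.3, 10.0))   # ~1.4x overall despite a 10x faster JOIN/FILTER
    print(overall_speedup(0.9, 10.0))   # ~5.3x if nearly everything moves to the GPU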
I have to agree the desktop experience of the Mac is great, on par with the best Linuxes out there.
The one thing I wonder is noise. That box is awfully small for the amount of compute it packs, and high-end Mac Studios are 50% heatsink. There isn’t much space in this box for a silent fan.
Quote:
"It also supports up to 128GB of unified memory, so developers can easily interact with LLMs that have nearly 200 billion parameters."