
65 points fidotron | 1 comments
littlestymaar ◴[] No.43575210[source]
Re-using a comment I wrote some time ago:

Tenstorrent really needs to put more VRAM on their cards.

If Chinese companies can hack Nvidia GPUs with 48 or 96GB VRAM at a competitive price, surely Tenstorrent can too.

Variants of n300d at $2500 for 48GB and $3900 for 96GB would be instant hits.

~~24GB for $1500 simply isn't gonna do it.~~ (Old part of the comment, about the old n300, which can be updated to: 32GB for $1400 still isn't enough for success. There's some progress, but that's still too low considering it's exotic hardware that will lead to tons of compatibility issues.)
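
For a rough sense of scale, here is the price per GB implied by the figures in this comment. It's only a sketch: the 48GB and 96GB rows are the prices proposed above, not real products, and the updated figure is read as 32 GB (an assumption).

```python
# Capacity (GB) and price (USD) as quoted in the comment.
# The 48GB and 96GB entries are proposed price points, not shipping products.
# The "32GB" entry assumes the updated figure means 32 GB.
cards = {
    "24GB": (24, 1500),
    "32GB": (32, 1400),
    "48GB": (48, 2500),
    "96GB": (96, 3900),
}


def dollars_per_gb(capacity_gb, price_usd):
    """Price divided by memory capacity."""
    return price_usd / capacity_gb


price_per_gb = {name: dollars_per_gb(gb, usd) for name, (gb, usd) in cards.items()}

for name, value in price_per_gb.items():
    print(f"{name}: ${value:.2f}/GB")
```

On these numbers the proposed 96GB variant would be the cheapest per GB, slightly below the 32GB figure.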

replies(4): >>43575246 #>>43575326 #>>43575727 #>>43579793 #
bigyabai ◴[] No.43575727[source]
Dedicated memory isn't the issue. Increase DRAM on your card and your bandwidth goes down; increase the bandwidth and your price increases reciprocally. The solution isn't to just solder more memory anywhere it fits; these are well-paid engineers who are working to optimize a complex problem space. The Chinese board fluxers are working with a different class of hardware that regularly ships with dark silicon, binned hardware and die-chopped configurations.

You'll note that Apple didn't just immediately resume shipping systems with 1.5TB of RAM when they revised their own system architecture. It's taken them half a decade to recoup a third of that capacity at the VRAM-level speeds they require to unify the GPU and CPU's memory.

replies(1): >>43580765 #
1. littlestymaar ◴[] No.43580765[source]
> Dedicated memory isn't the issue.

To run large MoE models it is.
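
A back-of-the-envelope sketch of why: in a mixture-of-experts model only a few experts are used per token, but all of them have to sit in memory if you don't want to page weights over PCIe. The model sizes and the 4-bit quantization below are hypothetical, picked only for illustration.

```python
def weight_gib(params_billions, bits_per_param):
    """Memory needed to hold the given number of weights, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


# Hypothetical MoE model: illustrative numbers, not any real model's specs.
TOTAL_PARAMS_B = 100   # all experts combined
ACTIVE_PARAMS_B = 12   # parameters actually used for one token
BITS = 4               # assumed 4-bit quantization

resident = weight_gib(TOTAL_PARAMS_B, BITS)   # what the card has to hold
touched = weight_gib(ACTIVE_PARAMS_B, BITS)   # what one token reads

print(f"resident weights: {resident:.1f} GiB")
print(f"read per token:   {touched:.1f} GiB")
```

Capacity is set by the total parameter count, while the per-token read traffic is set by the much smaller active count, which is why a lot of memory matters here even at modest bandwidth.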

> Increase DRAM on your card and your bandwidth goes down

Why would it?

> You'll note that Apple didn't just immediately resume shipping systems with 1.5TB of RAM when they revised their own system architecture. It's taken them half a decade to recoup a third of that capacity at the VRAM-level speeds they require to unify the GPU and CPU's memory

I fail to see how a unified architecture on a general purpose CPU is a good illustration when we're discussing PCIe accelerator cards. The problems they face have little in common.