Tenstorrent Launches Blackhole Developer Products at Tenstorrent Dev Day

(tenstorrent.com)

65 points fidotron | 1 comments | 03 Apr 25 18:07 UTC

littlestymaar ◴[03 Apr 25 20:50 UTC] No.43575210[source]▶

>>43573310 (OP) #

Re-using a comment I wrote some time ago:

Tenstorrent really needs to put more VRAM on their cards.

If Chinese companies can hack Nvidia GPUs up to 48 or 96GB of VRAM at a competitive price, surely Tenstorrent can too.

Variants of n300d at $2500 for 48GB and $3900 for 96GB would be instant hits.

~~24GB for $1500 simply isn't gonna do it.~~ (That was the old part of the comment, about the old n300. It can be updated to: 32GB for $1400 still isn't enough for success. There's some progress, but that's still too low considering it's exotic hardware that will lead to tons of compatibility issues.)

replies(4): >>43575246 #>>43575326 #>>43575727 #>>43579793 #

bigyabai ◴[03 Apr 25 21:33 UTC] No.43575727[source]▶

>>43575210 #

Dedicated memory isn't the issue. Increase DRAM on your card and your bandwidth goes down; increase the bandwidth and your price goes up with it. The solution isn't to just solder more memory anywhere it fits; these are well-paid engineers working to optimize a complex problem space. The Chinese board fluxers are working with a different class of hardware that regularly ships with dark silicon, binned hardware and die-chopped configurations.

You'll note that Apple didn't just immediately resume shipping systems with 1.5TB of RAM when they revised their own system architecture. It's taken them half a decade to recoup a third of that capacity at the VRAM-level speeds they require to unify the GPU and CPU's memory.

replies(1): >>43580765 #

1. littlestymaar ◴[04 Apr 25 11:35 UTC] No.43580765[source]▶

>>43575727 #

> Dedicated memory isn't the issue.

To run large MoE models it is.

> Increase DRAM on your card and your bandwidth goes down

Why would it?

> You'll note that Apple didn't just immediately resume shipping systems with 1.5TB of RAM when they revised their own system architecture. It's taken them half a decade to recoup a third of that capacity at the VRAM-level speeds they require to unify the GPU and CPU's memory

I fail to see how a unified architecture on a general-purpose CPU is a good illustration when we're discussing PCIe accelerator cards. The problems they face have little in common.
