200 points simonw | 2 comments
amirhirsch ◴[] No.45664936[source]
I also use Claude Code to install CUDA, PyTorch, and HuggingFace models on my quad A100 machine. It shouldn't feel like debugging a 2000s Linux driver.

HuggingFace has incredible reach but poor UX, and PyTorch installs remain fragile. There’s real space here for a platform that makes this all seamless, maybe even something that auto-updates a local SSD with fresh models to try every day.

replies(1): >>45665698 #
nicman23 ◴[] No.45665698[source]
what. i mean firmware has a lot of failsafes but i would not trust it to do sysops for a 100k machine
replies(2): >>45665897 #>>45667818 #
1. amirhirsch ◴[] No.45667818[source]
It’s all running inside a Proxmox VM with IOMMU and GPU passthrough. It’s as safe as doing the same on any cloud system.

Also the machine is well north of 100K when you include the RF ADCs and DACs in there that run a radar.

Worst case, I have multiple.

replies(1): >>45691095 #
2. nicman23 ◴[] No.45691095[source]
iommu does not matter? the issue would be if one of those A100s fries.