←back to thread

468 points speckx | 3 comments | | HN request time: 0s | source
Show context
densh ◴[] No.45304632[source]
For anyone interested in playing with distributed systems, I'd really recommend getting a single machine with latest 16-core CPU from AMD and just running 8 virtual machines on it. 8 virtual machines, with 4 hyper threads pinned per machine, and 1/8 of total RAM per machine. Create a network between them virtually within your virtualization software of choice (such as Proxmox).

And suddenly you can start playing with distributed software, even though it's running on a single machine. For resiliency tests you can unplug one machine at a time with a single click. It will annihilate a Pi cluster in Perf/W as well, and you don't have to assemble a complex web of components to make it work. Just a single CPU, motherboard, m.2 SSD, and two sticks of RAM.

Naturally, using a high core count machine without virtualization will get you best overall Perf/W in most benchmarks. What's also important but often not highlighted in benchmarks in Idle W if you'd like to keep your cluster running, and only use it occasionally.

replies(6): >>45305155 #>>45305387 #>>45305468 #>>45305628 #>>45307364 #>>45313651 #
bee_rider ◴[] No.45305155[source]
Tangentially related: I really expected running old MPI programs on stuff like the AMD multi-chip workstation packages to become a bigger thing.
replies(1): >>45309479 #
1. le-mark ◴[] No.45309479[source]
I actually worked with some MPI code way back. What MPI programs are you referring to?
replies(2): >>45310410 #>>45311020 #
2. MathMonkeyMan ◴[] No.45310410[source]
I don't know, but when I was playing with finite difference code as an undergrad in Physics, all of the docs I could find (it was a while ago, though) assumed that I was going to use MPI to run a distributed workload across the university's supercomputer. My needs were less, so I just ran my Boost.Thread code on the four cores of one node.

What if you had a single server with a zillion cores in it? Maybe you could take some 15 year old MPI code and run it locally -- it'd be like a mini supercomputer with an impossibly fast network.

3. bee_rider ◴[] No.45311020[source]
I’m not thinking of one code in particular. Just, observing that in the multi-chiplet, even inside a CPU package we’re already talking over a sort of little internal network anyway. Might as well use code that was designed to run on a network, right?