
90 points sugarpimpdorsey | 3 comments
teekert ◴[] No.44775397[source]
Perhaps it is worth noting that all the supercomputers I know of (like the Dutch Snellius and the Finnish LUMI) are Slurm clusters with login nodes.

Bioinformaticians (among others) in, for example, University Medical Centers won’t get much more bang for the buck than on a well-managed Slurm cluster (i.e. one with GPU nodes, fat nodes, etc. to separate different compute loads). You buy the machines, and they are utilized close to 100% over their lifetime.

replies(4): >>44775708 #>>44775996 #>>44777261 #>>44784010 #
1. secabeen ◴[] No.44777261[source]
In HPC, the general rule of thumb is that if you can keep your machine busy more than 40% of the time, it will be cheaper to run on-prem than cloud.
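To make the arithmetic behind a break-even utilization concrete, here is a minimal sketch. The function name, the cost model (purchase price spread evenly over the lifetime plus a flat yearly operating cost), and every number in it are assumptions picked for illustration; none of them come from a vendor price list, and the result is not evidence for any particular threshold.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def break_even_utilization(purchase_price, lifetime_years,
                           annual_opex, cloud_price_per_hour):
    """Fraction of the time a machine must be busy for on-prem to cost
    the same as renting equivalent cloud capacity only while busy.

    Assumed model (hypothetical): on-prem cost per year is the purchase
    price divided by the lifetime, plus a flat annual operating cost.
    """
    on_prem_per_year = purchase_price / lifetime_years + annual_opex
    cloud_per_year_at_full_use = cloud_price_per_hour * HOURS_PER_YEAR
    return on_prem_per_year / cloud_per_year_at_full_use


# Made-up inputs: a 50,000 machine kept for 5 years, 4,000 per year in
# operating costs, and a comparable cloud instance at 4.00 per hour.
u = break_even_utilization(50_000, 5, 4_000, 4.0)
print(f"break-even utilization: {u:.0%}")  # prints "break-even utilization: 40%"
```

Above that fraction the on-prem machine is cheaper under this model; below it, paying per hour wins. Lengthening `lifetime_years` or lowering `annual_opex` pushes the break-even point down.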
replies(1): >>44784601 #
2. teekert ◴[] No.44784601[source]
I had 70% in mind, but it certainly sounds reasonable (do you have a source? That would be great for our management). In our university medical hospital we have a very hard time getting rid of "shadow IT" because a single 50K machine can process so much data (i.e. Next Generation Sequencing data) and can be amortized over 5 (probably 10!) years.

And then we aren't even talking about the EPD servers that are amortized in 4 years and can easily serve as compute nodes in the cluster for another 6 (the only problem is the bookkeepers, who just can't live with post-amortized hardware! What a world!!)

replies(1): >>44793288 #
3. secabeen ◴[] No.44793288[source]
I don't have any hard data; my role is HPC-adjacent rather than directly in it, so this is mostly what I've heard. One way to look at it is that most HPC operators are not charged for any of the following for their gear; the organization just provides it as part of the larger pool: Power, Cooling, Real Estate, Networking, Security, Silicon-Valley SRE salaries, 38% Cloud Vendor Profit Margin.

Of course, the organization will pay for some of those eventually, so it's not fully fair to leave them out of the IT costs. But there are also many ways in which non-profits don't pay those costs at the same level as cloud providers, either because their overall costs differ or because they provide a lesser level of capability. (As a quick example, cloud providers need extensive physical security for their datacenters. A hospital server needs a locked door, and can leverage the existing hospital security team for free.)

Cloud is great if your need is elastic, or if you have time-sensitive revenue that depends on your calculations. In non-profit research environments, that is often not the case. Users have compute they want done "eventually", but they don't really care if it's done in 1 hour or 4 hours; they have lots of other good work to do while the compute runs in the background.