
242 points by panrobo | 10 comments
1. efitz No.42055548
There are certain workloads that have never really been economical to run in the cloud. Cloud economics is based on multi-tenancy: if you have a lot of hardware that sits idle much of the time, the cloud may be economical for you, because the provider can share that capacity between you and other tenants.

Cloud is also good for episodic use of expensive, exotic systems like HPC and GPU fleets, if you don’t need them all the time; I call this serial multi-tenancy.

Cloud is not economical for massive storage, especially if you’re not willing to use backup solutions and accept reduced availability. For example, AWS S3 by default keeps multiple copies of uploaded data across Availability Zones; that is not comparable to typical on-premises RAID 1 or RAID 5. You can save money with reduced-redundancy storage, but then you take on more of the reliability burden yourself. Likewise, compute is cheap if you’re buying multi-tenant instances, but if you want dedicated instances or bare metal, then the economics aren’t nearly as attractive.

Cloud is also good for experimentation and rapid development: it’s so much faster to click a few buttons than to go through the hardware acquisition processes at many enterprises.

The companies that regret cloud due to financial concerns usually make two mistakes.

First, as noted above, they pay for premium services that are not directly comparable to on-prem, they run workloads in the cloud that are not cloud-economical, or both.

Second, they don’t constrain random usage enough. It is super easy for a developer doing some testing to run up thousands of dollars in charges. It’s even worse if they leave it running at the end of the day and go home; it’s still racking up hourly usage. And it’s downright ugly if they forget it and move on to something else. You have to be super disciplined: spin up only what you need, and turn it off as soon as you’re done with it.
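
One guardrail that goes a long way is a scheduled job that stops anything tagged as disposable at night. A minimal boto3 sketch, assuming instances carry an illustrative env=dev tag and credentials are already configured (the tag scheme and region are made up for the example):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # region is illustrative

    # Find running instances tagged as disposable dev machines.
    resp = ec2.describe_instances(
        Filters=[
            {"Name": "tag:env", "Values": ["dev"]},  # hypothetical tag scheme
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )

    ids = [
        inst["InstanceId"]
        for reservation in resp["Reservations"]
        for inst in reservation["Instances"]
    ]

    if ids:
        # Stopped instances stop accruing hourly compute charges
        # (attached EBS volumes still bill).
        ec2.stop_instances(InstanceIds=ids)
        print("Stopped:", ids)

Run something like this from cron or an EventBridge schedule at the end of the workday and the “forgot to turn it off” failure mode mostly disappears.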

replies(4): >>42055776 #>>42056476 #>>42057010 #>>42079821 #
2. cyberax No.42055776
> but if you want dedicated instances or bare metal

Multitenant instances on AWS statically partition the hardware (CPU, RAM, network), so tenants don't really share all that much. Memory bandwidth is probably the only really affected resource.

> Second, they don’t constrain random usage enough.

AWS now has billing alerts with per-hour resolution and automatic anomaly detection. There are third-party tools that do the same.
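
The classic low-tech version is a CloudWatch alarm on the EstimatedCharges billing metric. A minimal sketch (the alarm name, threshold, and SNS topic ARN are placeholders; billing metric collection has to be enabled in the account, and the metric only exists in us-east-1):

    import boto3

    cw = boto3.client("cloudwatch", region_name="us-east-1")

    cw.put_metric_alarm(
        AlarmName="estimated-charges-over-500",  # placeholder name/threshold
        Namespace="AWS/Billing",
        MetricName="EstimatedCharges",
        Dimensions=[{"Name": "Currency", "Value": "USD"}],
        Statistic="Maximum",
        Period=21600,  # the billing metric updates only a few times a day
        EvaluationPeriods=1,
        Threshold=500.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # hypothetical topic
    )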

replies(1): >>42057303 #
3. teyc No.42056476
Most enterprises already run VMware for on-prem virtualisation; it’s the antiquated provisioning process that makes spinning something up on prem so slow. And frequently these antiquated practices are carried to the cloud, negating any benefit.
replies(1): >>42057312 #
4. coredog64 No.42057010
S3 has two more cost-saving dimensions: how long you will commit to storing these exact bytes, and how long you are willing to wait to get them back. Either of those lets you reduce S3 costs without risking data loss from an AZ failure.

Both knobs are one parameter away in the API, as the sketch below shows.
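A boto3 sketch of both dimensions (bucket and key names are made up):

    import boto3

    s3 = boto3.client("s3")

    # Dimension 1: commit bytes to a colder storage class up front.
    # Deep Archive is the cheapest class, but objects must be restored
    # before they can be read.
    with open("backup.tar.zst", "rb") as f:
        s3.put_object(
            Bucket="example-archive-bucket",  # hypothetical bucket/key
            Key="backups/2024-11.tar.zst",
            Body=f,
            StorageClass="DEEP_ARCHIVE",
        )

    # Dimension 2: trade retrieval latency for cost. Bulk is the slowest
    # and cheapest restore tier.
    s3.restore_object(
        Bucket="example-archive-bucket",
        Key="backups/2024-11.tar.zst",
        RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Bulk"}},
    )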
5. efitz No.42057303
> Multitenant instances on AWS statically partition the hardware (CPU, RAM, network), so tenants don't really share all that much.

You are missing several points:

First, density. Cloud providers have huge machines that can run lots of VMs, and AWS in particular offloads hypervisor functionality to dedicated hardware (“Nitro”), so overhead is very low.

Cloud providers also don’t do “hardware” partitioning for many instance types. AWS sells “vCPUs” as the capacity unit; a vCPU is not necessarily a core, it may be time on a core.

Cloud providers can also over-provision: just as airlines can sell more seats than exist on a plane, cloud providers can sell more vCPUs than there are cores on a machine, betting (correctly) that the vast majority of instances will be idle most of the time, and they can manage noisy neighbors via live migration.
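
A toy model of why that bet works, treating each sold vCPU as independently busy with some probability (all the numbers here are invented for illustration):

    from math import comb

    def p_oversubscribed(cores=64, vcpus_sold=96, p_busy=0.25):
        # Probability that more than `cores` of the `vcpus_sold` vCPUs
        # want CPU at the same instant, under an independence assumption.
        return sum(
            comb(vcpus_sold, k) * p_busy**k * (1 - p_busy) ** (vcpus_sold - k)
            for k in range(cores + 1, vcpus_sold + 1)
        )

    print(f"{p_oversubscribed():.2e}")  # effectively zero at these numbers

Real fleets are burstier and more correlated than this, which is exactly why live migration exists as the escape hatch.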

And lots of other more esoteric stuff.

replies(1): >>42057550 #
6. efitz No.42057312
> And frequently these antiquated practices are carried to the cloud, negating any benefit.

I should have brought that up too. Airlifting your stuff to the cloud and expecting the cloud to run like your data center is a way to set yourself up for disappointment and expense. The cloud is something very different from your on-premises datacenter, and many things that make sense on prem do not make sense in the cloud.

7. cyberax No.42057550
> Cloud providers can also over-provision

But they don’t. AWS over-provisions only on the burstable T-family instances (e.g., T3 and T4g). The rest of the instance types don’t share cores or memory between tenants.

I know because I worked with the actual AWS hardware at Amazon :) AWS engineers have always been pretty paranoid about security, so they limit hardware sharing between tenants as much as possible. For example, AWS had been strictly limiting hyperthread and cache sharing even before Spectre/Meltdown.

AWS doesn’t actually charge any premium for the bare-metal instance types (the ones with “.metal” in the name). They just cost a lot because you’re renting an entire machine, the same hardware that would otherwise be subdivided into many individual VMs.

For example, c6g.metal is $2.1760 per hour and c6g.16xlarge is the same $2.1760; c6g.4xlarge is $0.5440.
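The per-vCPU price is linear all the way up. A quick sanity check on those numbers (vCPU counts from the c6g size chart):

    # On-demand prices quoted above (USD/hour).
    c6g_metal    = 2.1760
    c6g_16xlarge = 2.1760   # 64 vCPUs
    c6g_4xlarge  = 0.5440   # 16 vCPUs

    assert abs(4 * c6g_4xlarge - c6g_16xlarge) < 1e-9  # 4 x 4xlarge == 1 x 16xlarge
    assert c6g_metal == c6g_16xlarge                    # no bare-metal premium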

> And lots of other more esoteric stuff.

Not really. There were plans for more esoteric stuff, but anything more complicated than EC2 Spot doesn’t really have market demand.

Customers prefer stability. EC2 and other foundational services like EBS and VPC are carefully designed to stay stable if the AWS control plane malfunctions ("static stability").

replies(2): >>42072622 #>>42103961 #
8. efitz No.42072622
Also former AWS :-D
9. ravedave5 No.42079821
> it’s so much faster to click a few buttons than to go through the hardware acquisition processes at many enterprises.

My company’s on-prem requisition process used to be so horrifically bad that it forced ball-of-mud solutions, because nobody had time to wait 9 months for new servers. We also had scaling issues and couldn’t react in any timely manner. I feel like staying on prem is a penny-wise, pound-foolish approach.

10. BackBlast No.42103961
Seems par for the course that even AWS employees don’t understand their own pricing. I noticed the pricing similarity and tried deploying to .metal instances, and that’s when I got hit with additional charges.

If you turn on a .metal instance, your account gets billed (at least) $2/hr for the privilege in every region where you do so, a fact I didn’t learn until I had racked up more charges than expected. So many junk fees hide behind every checkbox on the platform.