
203 points mooreds | 1 comments
pinkmuffinere No.45956593
> During this time, us-east-1 was offline, and while we only run a limited amount of infrastructure in the region, we have to run it there because we have customers who want it there

> [Our service can only go down] five minutes and 15 seconds per year.

I don't have much experience in this area, so please correct me if I'm mistaken:

Don't these two quotes together imply that they have failed to deliver on their SLA for the subset of their customers that want their service in us-east-1? I understand the customers won't be mad at them in this case, since us-east-1 itself is down, but I feel like their title is incorrect. Some subset of their service is running on top of AWS. When AWS goes down, that subset of their service is down. When AWS was down, it seems like they were also down for some customers.
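
To make the arithmetic behind the second quote concrete, here is a minimal sketch. It assumes the "five minutes and 15 seconds per year" figure corresponds to a 99.999% availability target over a 365-day year; the quote only gives the time budget, so that percentage is an assumption. The function names and the one-hour outage are hypothetical, chosen only for illustration.

```python
# Sketch only: assumes the quoted budget corresponds to 99.999% availability
# measured over a non-leap (365-day) year. Names here are hypothetical.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_budget_seconds(availability: float) -> float:
    """Allowed downtime per year, in seconds, for an availability fraction."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def sla_met(availability: float, observed_downtime_seconds: float) -> bool:
    """True if the observed yearly downtime fits inside the budget."""
    return observed_downtime_seconds <= downtime_budget_seconds(availability)


budget = downtime_budget_seconds(0.99999)
minutes, seconds = divmod(round(budget), 60)
print(f"budget: {minutes} min {seconds} s")

# Hypothetical one-hour regional outage, counted only for the customers
# pinned to that region.
print("SLA met for pinned customers:", sla_met(0.99999, 60 * 60))
```

Under those assumptions, any outage longer than about five minutes that is visible to region-pinned customers would use up their entire yearly budget, which is the concern raised above.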

1. PaulRobinson No.45956738
The bulk of the article discusses their failover strategy: how they detect failures in a region, how they route requests to a backup region, and how they deal with the data consistency and cost issues that arise from doing so.
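
As a rough illustration of what "detect failures in a region and route to a backup" can look like, here is a minimal sketch. It is not the article's implementation: the consecutive-failure threshold, the `Region` class, the function names, and the choice of `us-west-2` as the backup are all assumptions made for the example, and it ignores the data consistency and cost questions entirely.

```python
# Sketch only: a hypothetical health-check-driven failover, not the
# article's actual mechanism. All names and thresholds are assumptions.
from dataclasses import dataclass

FAILURE_THRESHOLD = 3  # assumed: consecutive failed probes before failing over


@dataclass
class Region:
    name: str
    consecutive_failures: int = 0


def record_probe(region: Region, ok: bool) -> None:
    """Update the failure counter from one health probe result."""
    region.consecutive_failures = 0 if ok else region.consecutive_failures + 1


def is_healthy(region: Region) -> bool:
    """A region counts as healthy until it reaches the failure threshold."""
    return region.consecutive_failures < FAILURE_THRESHOLD


def pick_region(primary: Region, backup: Region) -> Region:
    """Route to the primary while it is healthy, otherwise to the backup."""
    return primary if is_healthy(primary) else backup


primary = Region("us-east-1")
backup = Region("us-west-2")  # hypothetical backup region

for _ in range(FAILURE_THRESHOLD):
    record_probe(primary, ok=False)
print(pick_region(primary, backup).name)  # routes to the backup

record_probe(primary, ok=True)
print(pick_region(primary, backup).name)  # routes back to the primary
```

Requiring several consecutive failures before switching is one common way to avoid flapping on a single dropped probe; whether the article does the same is not stated here.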