
1226 points bishopsmother | 1 comments
throwawaaarrgh ◴[] No.35051550[source]
I've been doing reliability stuff for nearly two decades. The one thing I am sure of is that there is no way to just engineer your way to reliability. That is to say, no person, no matter how smart, can just invent some whizbang engineering thing and suddenly have reliability.

Reliability is a thing that grows, like a plant. You start out with a new system or piece of software. It's fragile, small, weak. It is threatened by competing things and literal bugs and weather and the soil it's grown in and more. It needs constant care. Over time it grows stronger, and can eventually fend for itself pretty well. Sometimes you get lucky and it just grows fine by itself. And sometimes 50 different things conspire to kill it. But you have to be there monitoring it, finding the problems, learning how to prevent them. Every garden is a little different.

It doesn't matter what a company like Fly does technology-wise. It takes time and care and churning. Eventually they will be reliable. But the initial process takes a while. And every new piece of tech they throw in is another plant in the garden.

So the good news is, they can become really reliable. But the bad news is, it doesn't come fast, and the more new plants they put in the ground, the more concerns there are to address before the garden is self-sustaining.

replies(7): >>35051647 #>>35052736 #>>35052993 #>>35053029 #>>35053323 #>>35056046 #>>35056972 #
unxdfa ◴[] No.35053029[source]
You can make sensible assumptions that result in engineering gains though. Step around the problems, not through them.

For example, I have learned that the first step to reliability is removing as many HashiCorp products from your stack as possible. It appears I am not the only one.

replies(1): >>35055780 #
jen20 ◴[] No.35055780[source]
If you’ve been using them in ways explicitly called out as outside the design goals, then sure, removing any piece of technology will help you. I’m guessing that is not your actual problem though.
replies(1): >>35059894 #
1. unxdfa ◴[] No.35059894[source]
I would not assume that HashiCorp products necessarily meet their design goals, if I'm honest. Consul and Vagrant have been absolute shits, and Vault adds more complexity and unreliability to the problem domain and has a net negative ROI. I like the idea of their products but the reality is very different.