There isn't really any secret sauce to it in 2023. The techniques, processes, etc. have pretty much all been documented over the past 20 years.
But if you're wondering how AWS manages to be so good at it at such scale: hosting infrastructure is incredibly complicated, and AWS employs something like 100k people. Seemingly small AWS services employ more engineers than Fly.io.
That being said, my take is that what's happening at Fly.io is a lack of leadership. Clearly the right people are not in the right positions. I've worked infra at companies from 5 people to, well, Rackspace, and I'm having a hard time imagining so much time passing with, essentially, a piece of infra MIA and impacting users.