528 points by sealeck | 3 comments

anon3949494:
After all the chatter this week, I've come to the conclusion that Heroku froze at the perfect time for my 4-person company. All of these so-called "features" are exactly what we don't want or need.

1. Multi-region deployment only works if your database is globally distributed too. However, making your database globally distributed creates a set of new problems, most of which take time away from your core business.

2. File persistence is nice to have but not typically necessary; S3 works just fine.

It's easy to forget that most companies are a handful of people or just solo devs. At the same time, most money comes from the enterprise, so products that reach sufficient traction tend to shift their focus to serving the needs of these larger clients.

I'm really glad Heroku froze when it did. Markets always demand growth at all costs, and I find it incredibly refreshing that Heroku ended up staying in its lane. IMO it was and remains the best PaaS for indie devs and small teams.

tomjakubowski:
> Multi-region deployment only works if your database is globally distributed too. However, making your database globally distributed creates a set of new problems, most of which take time away from your core business.

Guess what? fly.io offers a turnkey distributed/replicated Postgres for just this reason. You use an HTTP header to route writes to the region hosting your primary.

https://fly.io/docs/getting-started/multi-region-databases/

You do still need to consider the possibility of read replicas being behind the primary when designing your application. If your design considers that from day 1, I think it takes less away from solving your business problems.

Alternatively, you can also just ignore all the multi-region stuff and deploy to one place, as if it were old-school Heroku :-)
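
A rough sketch of that pattern, assuming Flask and psycopg2 (Fly's proxy honors a fly-replay response header, and the linked guide has the app read the primary's region from a PRIMARY_REGION env var; the names here are illustrative, not the only way to wire it up):

    # When a write hits a read-only replica, ask Fly's proxy to replay
    # the whole request in the primary region instead.
    import os
    from flask import Flask, Response
    import psycopg2.errors

    app = Flask(__name__)
    PRIMARY_REGION = os.environ.get("PRIMARY_REGION", "sjc")  # assumption

    @app.errorhandler(psycopg2.errors.ReadOnlySqlTransaction)
    def replay_in_primary(_exc):
        resp = Response(status=409)
        resp.headers["fly-replay"] = f"region={PRIMARY_REGION}"
        return resp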

nickjj:
> Guess what? fly.io offers a turnkey distributed/replicated Postgres for just this reason. You use an HTTP header to route writes to the region hosting your primary.

Doesn't this take away a lot of the benefits of global distribution?

For example, if you pay Fly hundreds of dollars a month to distribute your small app across a few datacenters around the globe but your primary DB is in California, then everyone from the EU is going to see about 150-200ms of round-trip latency every time you write to your DB, because you can't get around the limitations of the speed of light.
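
A quick back-of-envelope check on that figure, using rough values for the distance and the speed of light in fiber:

    distance_km = 9_000        # California <-> central EU, great circle
    fiber_km_per_s = 200_000   # light in fiber travels at roughly 2/3 c
    one_way_ms = distance_km / fiber_km_per_s * 1000   # ~45 ms
    rtt_ms = 2 * one_way_ms                            # ~90 ms floor
    # Real paths aren't great circles and add routing overhead, so
    # 150-200 ms per cross-Atlantic DB write round trip is plausible.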

Now we're back to non-distributed latency every time you want to write to the DB, which is quite often in many types of apps. If you want to cache mostly-static read-only pages at the CDN level, you can do that with a number of services.

Fly has about 20 datacenters; hosting a smallish web app distributed across all of them will be over $200 / month before counting extra storage or bandwidth, just for the web app portion. Their Postgres pricing isn't clear, but a fairly small cluster is $33.40 / month for 2GB of memory and 40GB of storage. Based on their pricing page it sounds like that's the cost for one datacenter, so if you want read replicas in a bunch of other places it adds up. Before you know it you might be at $500 / month to host something with similar latency on DB writes as a $20 / month DigitalOcean server that you self-manage. Fly also charges $2 / month per Let's Encrypt wildcard cert, whereas that's free from Let's Encrypt directly.

manmal:
You don’t need to route every write to the primary though, only those writes that depend on other writes. Things like telemetry can be written on the edge instances. It depends on your application of course, but in many cases only a tiny fraction of all requests should need redirecting to the primary.

And why would you get 20 instances, all around the world right out of the gate? 6-7 probably do the job quite well, but maybe you don’t even need that many. Depending on where most of your customers are, you could get good results with 3-4 for most users.
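
Continuing the Flask sketch above, that split could look something like this (the route list is made up; anything not needing read-your-writes stays on the nearest replica):

    from flask import request

    NEEDS_PRIMARY = {"/checkout", "/settings"}  # writes that depend on prior writes

    @app.before_request
    def route_dependent_writes():
        in_primary = os.environ.get("FLY_REGION") == PRIMARY_REGION
        if request.path in NEEDS_PRIMARY and not in_primary:
            resp = Response(status=409)
            resp.headers["fly-replay"] = f"region={PRIMARY_REGION}"
            return resp  # Fly's proxy re-runs the request at the primary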

nickjj:
> You don’t need to route every write to the primary though, only those writes that depend on other writes.

Thanks, can you give an example of how that works? Did you write your own fork of Postgres, or are you using a third-party solution like BDR?

Also, do you have a few use cases where a write would depend on another write?

> 6-7 probably do the job quite well

You could, let's call it 5.

For a 2GB setup, would that be about $50 for the web app, $50 for the background workers, $160ish for Postgres and then $50 for Redis? We're still at $300+?

I was thinking maybe 5 background workers wasn't necessary, but frameworks like Rails push a bunch of things through a background worker where you'd still want low latency even though they happen in the background. It's not only things like sending an email, where a 2-second delay behind the scenes doesn't matter; it's also various Hotwire Turbo actions that render templates and modify records, where you'd want the results reflected in the web UI as soon as possible.

nicoburns:
For low-latency workers like that it might make sense to just run them on the same instance as the web servers.

nickjj:
Does Fly let you run multiple commands in separate Docker images? That's usually the pattern for running a web app + worker with Docker, as opposed to building an init system into the image and running two processes in one container (which goes against best practices). The Fly docs only mention the init-system approach and also try to talk you into running a separate VM[0] to keep your web app and worker isolated.

In either case I think the price still doubles, because both your web app and your worker need memory for a bunch of common setups like Rails + Sidekiq, Flask / Django + Celery, etc.

[0]: https://fly.io/docs/app-guides/multiple-processes/
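
For reference, the "init system in Docker" approach can be a very small supervisor; a sketch assuming a gunicorn web process and a Celery worker (commands illustrative, and a real init would also reap zombie processes):

    import subprocess, sys, time

    # Start both processes in one container...
    procs = [
        subprocess.Popen(["gunicorn", "app:app", "--bind", "0.0.0.0:8080"]),
        subprocess.Popen(["celery", "-A", "app.celery", "worker"]),
    ]

    # ...and exit if either one dies, so the platform restarts the VM.
    while True:
        for p in procs:
            if p.poll() is not None:
                for other in procs:
                    other.terminate()
                sys.exit(p.returncode or 1)
        time.sleep(1)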

nicoburns:
Not sure how Ruby works, but can you not run the workers and the web server in the same process? In our Node.js apps, this is as simple as importing a function and calling it.
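
In Python, the same-process version would be something like a queue drained by a background thread inside the web process (a toy sketch, not how Sidekiq or Celery actually work; you give up independent scaling and CPU isolation):

    import threading, queue

    tasks = queue.Queue()

    def worker():
        while True:
            fn, args = tasks.get()
            try:
                fn(*args)
            finally:
                tasks.task_done()

    threading.Thread(target=worker, daemon=True).start()

    # From a request handler (send_welcome_email is hypothetical):
    # tasks.put((send_welcome_email, ("user@example.com",)))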

nickjj:
Most of the popular background workers in Ruby run as a separate process (Sidekiq, Resque, GoodJob). The same goes for using Celery with Python. I'm not sure about PHP, but Laravel's docs mention running a separate command for the worker, so I'm guessing that's also a second process.

It's common to separate them, either due to language limitations or to let you scale your workers independently of your web apps, since in a lot of cases the workers do computationally intensive work and you need more of them than web app instances. Not just more replicas, but potentially a different class of compute resources too. Your web apps might be humming along with consistent memory / CPU usage while your workers need double or triple the memory and better CPUs.

nicoburns:
Yeah, it definitely makes sense to be able to scale workers and web processes separately. It just so happens that the app I work on for my day job is:

1. Fairly low traffic (requests per minute, not requests per second, except for very occasional bursts)

2. Has somewhat prematurely been split into 6 microservices (used to be 10, but I've managed to rein that back a bit!), which means that despite running on the smallest instances available we are rather over-provisioned. We could likely move up one instance size and run absolutely everything on a single machine rather than having 12 separate instances!

3. For the most part only really uses queued tasks to keep request latency low.

Probably what would make the most sense for us is to merge back into a monolith but continue to run web and worker processes separately, I guess. But in general, I think there is maybe a niche for running both together for apps with very small resource requirements.