Most active commenters

nickjj(6)
nicoburns(4)
tptacek(4)

Popular/hot comments

>>31392024 #

←back to thread

Fly.io: The reclaimer of Heroku's magic

(christine.website)

Show context

anon3949494 ◴[15 May 22 21:10 UTC] No.31391163[source]▶

>>31390506 (OP) #

After all the chatter this week, I've come to the conclusion that Heroku froze at the perfect time for my 4 person company. All of these so called "features" are exactly what we don't want or need.

1. Multi-region deployment only work if your database is globally distributed too. However, making your database globally distributed creates a set of new problems, most of which take time away from your core business.

2. File persistence is fine but not typically necessary. S3 works just fine.

It's easy to forget that most companies are a handful of people or just solo devs. At the same time, most money comes from the enterprise, so products that reach sufficient traction tend to shift their focus to serving the needs of these larger clients.

I'm really glad Heroku froze when it did. Markets always demand growth at all costs, and I find it incredibly refreshing that Heroku ended up staying in its lane. IMO it was and remains the best PaaS for indie devs and small teams.

replies(10): >>31391221 #>>31391236 #>>31391460 #>>31391578 #>>31391671 #>>31391717 #>>31391956 #>>31392086 #>>31392734 #>>31393610 #

tomjakubowski ◴[15 May 22 21:16 UTC] No.31391221[source]▶

>>31391163 #

> Multi-region deployment only work if your database is globally distributed too. However, making your database globally distributed creates a set of new problems, most of which take time away from your core business.

Guess what? fly.io offers a turnkey distributed/replicated Postgres for just this reason. You use an HTTP header to route writes to the region hosting your primary.

https://fly.io/docs/getting-started/multi-region-databases/

You do still need to consider the possibility of read replicas being behind the primary when designing your application. If your design considers that from day 1, I think it takes less away from solving your business problems.

Alternatively, you can also just ignore all the multi-region stuff and deploy to one place, as if it was old-school Heroku :-)

replies(1): >>31391321 #

nickjj ◴[15 May 22 21:27 UTC] No.31391321[source]▶

>>31391221 #

> Guess what? fly.io offers a turnkey distributed/replicated Postgres for just this reason. You use an HTTP header to route writes to the region hosting your primary.

Doesn't this take away a lot of the benefits of global distribution?

For example if you pay Fly hundreds of dollars a month to distribute your small app in a few datacenters around the globe but your primary DB is in California then everyone from the EU is going to have about 150-200ms round trip latency every time you write to your DB because you can't get around the limitations of the speed of light.

Now we're back to non-distributed latency times every time you want to write to the DB which is quite often in a lot of types of apps. If you want to cache mostly static read-only pages at the CDN level you can do this with a number of services.

Fly has about 20 datacenters, hosting a small'ish web app that's distributed across them will be over $200 / month without counting extra storage or bandwidth just for the web app portion. Their pg pricing isn't clear but a fairly small cluster is $33.40 / month for 2GB of memory and 40GB of storage. Based on their pricing page it sounds like that's the cost for 1 datacenter, so if you wanted read-replicas in a bunch of other places it adds up. Before you know it you might be at $500 / month to host something that will have similar latency on DB writes as a $20 / month DigitalOcean server that you self manage, Fly also charges you $2 / month per Let's Encrypt wildcard cert where as that's free from Let's Encrypt directly.

replies(5): >>31391373 #>>31391388 #>>31391477 #>>31392112 #>>31392717 #

1. manmal ◴[15 May 22 21:48 UTC] No.31391477[source]▶

>>31391321 #

You don’t need to route every write to primary though, but only those writes that have dependencies on other writes. Things like telemetry can be written in edge instances. Depends on your application of course, but in many cases that should be only a tiny fraction of all requests needing redirects to primary.

And why would you get 20 instances, all around the world right out of the gate? 6-7 probably do the job quite well, but maybe you don’t even need that many. Depending on where most of your customers are, you could get good results with 3-4 for most users.

replies(1): >>31391559 #

2. nickjj ◴[15 May 22 21:58 UTC] No.31391559[source]▶

>>31391477 (TP) #

> You don’t need to route every write to primary though, but only those writes that have dependencies on other writes.

Thanks, can you give an example of how that works? Did you write your own fork of Postgres or are you using a third party solution like BDR?

Also do you have a few use cases where you'd want writes being dependent on another write?

> 6-7 probably do the job quite well

You could, let's call it 5.

For a 2gb set up would that be about $50 for the web app, $50 for the background workers, $160ish for postgres and then $50 for Redis? We're still at $300+?

I was thinking maybe 5 background workers wasn't necessary but frameworks like Rails will put a bunch of things through a background worker where you would want low latency even if they're happening in the background because it's not only things like sending an email where it doesn't matter if it's delayed for 2 seconds behind the scenes. It's performing various Hotwire Turbo actions which render templates and modify records where you'd want to see those things reflected in the web UI as soon as possible.

replies(2): >>31391764 #>>31394018 #

3. nicoburns ◴[15 May 22 22:27 UTC] No.31391764[source]▶

>>31391559 #

For low-latency workers like that it might make sense to just run them on the same instance as the web servers.

replies(1): >>31392024 #

4. nickjj ◴[15 May 22 23:02 UTC] No.31392024{3}[source]▶

>>31391764 #

Does Fly let you run multiple commands in separate Docker images? That's usually the pattern on how to run a web app + worker with Docker, as opposed to creating an init system in Docker and running (2) processes in 1 container (this goes against best practices). The Fly docs only mention the approach of using an init system inside of your image and also tries to talk you into running a separate VM[0] to keep your web app + worker isolated.

In either case I think the price still doubles because both your web app and worker need memory for a bunch of common set ups like Rails + Sidekiq, Flask / Django + Celery, etc..

[0]: https://fly.io/docs/app-guides/multiple-processes/

replies(3): >>31392041 #>>31392290 #>>31394646 #

5. tptacek ◴[15 May 22 23:05 UTC] No.31392041{4}[source]▶

>>31392024 #

It sounds like you're asking if we offer some alternative between running multiple processes in a VM, and running multiple VMs for multiple processes. What's the third option you're looking for? Are you asking if you can run Docker inside a VM, and parcel that single VM out that way? You've got root in a full-fledged Linux VM, so you can do that.

replies(1): >>31392126 #

6. nickjj ◴[15 May 22 23:18 UTC] No.31392126{5}[source]▶

>>31392041 #

> Are you asking if you can run Docker inside a VM, and parcel that single VM out that way? You've got root in a full-fledged Linux VM, so you can do that.

On a single server VPS I'd use Docker Compose and up the project to run multiple containers.

On a multi-server set up I'd use Kubernetes and set up a deployment for each long running container.

On Heroku I'd use a Procfile to spin up web / workers as needed.

The Fly docs say if you have 1 Docker image you need to run an init system in the Docker image and manage that in your image, it also suggests not using 2 processes in 1 VM and recommends spinning up 1 VM per process.

I suppose I was looking for an easy solution to run multiple processes in 1 VM (in this case multiple Docker containers). The other 3 solutions are IMO easy because once you learn how they work you depend on the happy path of those tools using the built in mechanisms they support. In the Fly case, not even the docs cover how to do it other than rolling your own init system in Docker.

If you have root, can I run docker-compose up in a Fly VM? Will it respect things like graceful timeouts out of the box? Does it support everything Docker Compose supports in the context of that single VM?

replies(2): >>31392212 #>>31392380 #

7. tptacek ◴[15 May 22 23:31 UTC] No.31392212{6}[source]▶

>>31392126 #

The document you cited (I wrote it!) is entirely about the different ways to run multiple processes in 1 VM.

There's no reason I can see why you couldn't run a VM that itself ran Docker, and have docker-compose run at startup. I wouldn't recommend it? It's kind of a lot of mechanism for a simple problem. I'd just use a process supervisor instead. But you could do it, and maybe I'm wrong and docker-compose is good for this.

What you can't do is use docker-compose to boot up a bunch of different containers in different VMs on Fly.io.

replies(1): >>31394655 #

8. throwaway892238 ◴[15 May 22 23:42 UTC] No.31392290{4}[source]▶

>>31392024 #

It's interesting that their bash init uses fg %1. That may return only on the first process changing state, rather than either process exiting. It should probably use this instead:

  #!/usr/bin/env bash
  /app/server &
  /app/server -bar &
  wait -f -n -p app ; rc=$?
  printf "%s: Application '%s' exited: status '%i'\n" "$0" "$app" "$rc"
  exit $rc

replies(1): >>31392320 #

9. tptacek ◴[15 May 22 23:46 UTC] No.31392320{5}[source]▶

>>31392290 #

That looks a million times better than the horrible hack I wrote. Do you want credit for it when I fix the doc?

replies(1): >>31405815 #

10. mrkurt ◴[15 May 22 23:55 UTC] No.31392380{6}[source]▶

>>31392126 #

This is embarrassingly non obvious in the docs, but you can run workers/web just like you would on Heroku: https://community.fly.io/t/preview-multi-process-apps-get-yo...

Most people run workers in their primary region with the writable DB, then distribute their web/DB read replicas.

11. manmal ◴[16 May 22 06:01 UTC] No.31394018[source]▶

>>31391559 #

> Thanks, can you give an example of how that works?

I just noticed I formulated it wrong, my apologies. What I meant is that the replicating regions don’t need to wait for the primary writes to go through before they respond to clients. They will still be read-only Postgres replicas, and info could be shuttled to primary in a fire-and-forget manner, if that’s an option.

Whenever an instance notices that it‘s not primary, but it is currently dealing with a critical write, it can refuse to handle the request, and return a 409 with the fly-replay header that specifies the primary region. Their infra will replay the original request in the specified region.

> Did you write your own fork of Postgres or are you using a third party solution like BDR?

When using fly.io, the best option would probably be to use their postgres cluster service which supports read-only replicas (can take a few seconds for updates to reach replicas): https://fly.io/docs/getting-started/multi-region-databases/

> For a 2gb set up would that be about $50 for the web app, $50 for the background workers, $160ish for postgres and then $50 for Redis? We're still at $300+?

Maybe. A few thoughts:

- Why would you need 5 web workers, would one running on primary not be ideal? If you need so much compute for background work, then that’s not fly‘s fault, I guess.

- Not sure the Postgres read replicas would need to be as powerful as primary

- Crazy idea: Use SQLite (replicated with Litestream) instead of Redis and save 50 bucks

replies(1): >>31404535 #

12. nicoburns ◴[16 May 22 08:20 UTC] No.31394646{4}[source]▶

>>31392024 #

Not sure how Ruby works, but can you not run the workers and the web server in the same process? In our Node.js apps, this is as simple as importing a function and calling it.

replies(1): >>31395364 #

13. nicoburns ◴[16 May 22 08:23 UTC] No.31394655{7}[source]▶

>>31392212 #

I think docker-compose is pretty good at this. One advantage is that you get a development environment and a production setup in a singe config file.

I feel like this setup might make quite a lot of sense if you have a bunch of micro services that are small enough that they can share resources.

14. nickjj ◴[16 May 22 10:32 UTC] No.31395364{5}[source]▶

>>31394646 #

Most of the popular background workers in Ruby run as a separate process (Sidekiq, Resque, GoodJob). The same goes for using Celery with Python. I'm not sure about PHP but Laravel's docs mention running a separate command for the worker so I'm guessing that's also a 2nd process.

It's common to separate them due to either language limitations or to let you individually scale your workers vs your web apps since in a lot of cases you might be doing a lot of computationally intensive work in the workers and need more of them vs your web apps. Not just more in number of replicas but potentially a different class of compute resources too. Your wep apps might be humming along with a consistent memory / CPU usage but your workers might need double or triple the memory and better cpus.

replies(1): >>31397953 #

15. nicoburns ◴[16 May 22 14:35 UTC] No.31397953{6}[source]▶

>>31395364 #

Yeah, it definitely makes sense to be able to scale workers and web processes separately. It just so happens that they app I work on for my day job is:

1. Fairly low traffic (requests per minute not requests per second except very occasional bursts)

2. Has somewhat prematurely been split into 6 microservices (used to be 10, but I've managed to rein that back a bit!). Which means despite running on the smallest instances available we are rather over-provisioned. We could likely move up one instances size and run absolutely everything on the one machine rather than having 12 separate instances!

3. Is for the most part only really using queue-tasks to keep request latency low.

Probably what would make most sense for us is to merge back in to a monolith, but continue to run web and worker processes separately I guess. But in general, I there is maybe a niche for running both together for apps with very small resource requirements.

16. nickjj ◴[17 May 22 00:09 UTC] No.31404535{3}[source]▶

>>31394018 #

> Why would you need 5 web workers, would one running on primary not be ideal?

It's not ideal due to some frameworks using background jobs to handle pushing events through to your web UI, such as broadcasting changes over websockets with Hotwire Turbo.

The UI would update when that job completes and if you only have 1 worker then it's back to waiting 100-350ms to reach the primary worker to see UI changes based on your location which loses the appeal of global distribution. You might as well consider running everything on 1 DigitalOcean server for 15x less at this point and bypass the idea of global distribution if your goal was to reduce latency for your visitors.

> Crazy idea: Use SQLite (replicated with Litestream) instead of Redis and save 50 bucks

A number of web frameworks let you use Redis as a session, cache and job queue back-end with no alternatives (or having to make pretty big compromises to use a SQL DB as an alternative). Also, Rails depends on Redis for Action Cable, swapping that for SQLite isn't an option.

17. throwaway892238 ◴[17 May 22 03:43 UTC] No.31405815{6}[source]▶

>>31392320 #

Only if it's credited to either "IPBH" or "Some Bash-loving troll on Hacker News" (ninja edit, sry)

replies(1): >>31405913 #

18. tptacek ◴[17 May 22 04:03 UTC] No.31405913{7}[source]▶

>>31405815 #

Done!

↑