260 points scastiel | 15 comments
1. LVB ◴[] No.41882049[source]
I'm always curious what folks use for their database for things like this. Even though I like SQLite--a lot--my preference has become that the app is generally separate and mostly stateless. Almost always the data is the most important thing, so I like being able to expand/replace/trash the app infra at will with no worries.

Thought about maybe running a Postgres VPS, but I've enjoyed using neon.tech more than I expected (esp. the GUI and branching). I guess the thing that has crept in: speed/ease is really beating out my ingrained cheapness as I've gotten older and have less time. A SaaS DB has sped things up. Still don't like the monthly bills & variability though.

replies(3): >>41882319 #>>41882456 #>>41884198 #
2. mtlynch ◴[] No.41882319[source]
>Almost always the data is the most important thing, so I like being able to expand/replace/trash the app infra at will with no worries.

Have you used SQLite with Litestream? That's the beauty of it. You can blow away the app and deploy it somewhere else, and Litestream will just pull down your data and continue along as if nothing happened.

At the top of this post, I show a demo of attaching Litestream to my app, and then blowing away my Heroku instance and redeploying a clean instance on Fly.io, and Litestream ports all the data along with the new deployment:

https://mtlynch.io/litestream/
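For reference, the whole setup is small; it looks something like this (bucket name and paths are placeholders, and credentials come from the usual LITESTREAM_ACCESS_KEY_ID / LITESTREAM_SECRET_ACCESS_KEY environment variables):

    # /etc/litestream.yml -- continuously replicate the app's SQLite file to S3
    dbs:
      - path: /data/app.db
        replicas:
          - url: s3://your-bucket/app-db

    # on a fresh host, pull the latest replica down before starting the app
    litestream restore -if-replica-exists -o /data/app.db s3://your-bucket/app-db

    # then run the app under litestream so every change keeps streaming to S3
    litestream replicate -exec "./your-app"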

replies(3): >>41882996 #>>41883366 #>>41883489 #
3. j45 ◴[] No.41882456[source]
It's trivial to run MySQL (or a variant like Percona) or Postgres, with some minor caching, for simple apps.

I'm not sure what you're hitting that would go past the capacity of a small VPS.

An independent VPS for the DB makes sense, but if requests are reasonably cached you can get away with a single box (and beef up the backups), especially if it's something non-critical.

replies(1): >>41883032 #
4. LVB ◴[] No.41882996[source]
I'm currently using SQLite + Litestream with one app, though Litestream there is strictly a backup/safety net: rebuilding the server isn't automated, so if it came to standing the thing up anew I'd be doing it manually.

If anything, I'd probably end up looking at a dedicated PG VPS. I've started to get used to a few Postgres conveniences over SQLite, especially around datetimes, various extensions, and more sophisticated table alterations (without that infamous SQLite 12-step process). So that's been an evolution, too, compared to my always-SQLite days.
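For concreteness (made-up table, but this is the shape of it): a column type change is one statement in Postgres, while SQLite's documented procedure is a create-copy-drop-rename dance, roughly:

    # Postgres: alter in place
    psql mydb -c "ALTER TABLE events ALTER COLUMN payload TYPE jsonb USING payload::jsonb;"

    # SQLite: abbreviated version of the documented workaround
    sqlite3 app.db <<'SQL'
    PRAGMA foreign_keys=OFF;
    BEGIN;
    CREATE TABLE events_new (id INTEGER PRIMARY KEY, payload TEXT);
    INSERT INTO events_new (id, payload) SELECT id, payload FROM events;
    DROP TABLE events;
    ALTER TABLE events_new RENAME TO events;
    COMMIT;
    PRAGMA foreign_keys=ON;
    SQL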

5. LVB ◴[] No.41883032[source]
Definitely considering a dedicated Postgres VPS. I've not looked yet, but I'd like to locate a decent cookbook around this. I've installed Postgres on a server before for playing around, and it was easy enough. But there are a lot of settings, considerations around access and backups and updates, etc. I suspect these things aren't overly thorny, but some of the guides/docs can make it feel that way. We'll see, as it's an area of interest, for sure.
replies(1): >>41887406 #
6. Cheer2171 ◴[] No.41883366[source]
> No, my app never talks to a remote database server.

> It’s a simple, open-source tool that replicates a SQLite database to Amazon’s S3 cloud storage.

That was a very long walk to get to that second quote. And it makes the first quote feel deceptive.

replies(2): >>41884286 #>>41884605 #
7. rkwz ◴[] No.41883489[source]
This is a well written guide, thanks!
replies(1): >>41884669 #
8. jwells89 ◴[] No.41884198[source]
Spinning up a VPS for things like this is tempting to me too, but not having done significant backend work in over a decade my worry would be with administering it — namely keeping it up to date, secure, and configured correctly (initial setup is easy). What's the popular way of handling that these days?
replies(1): >>41886233 #
9. metabeard ◴[] No.41884286{3}[source]
S3 is acting as a backup, not a data store. The SQLite file is local to the app. This recent post discusses the tradeoffs in more detail and includes metrics. https://fractaledmind.github.io/2024/10/16/sqlite-supercharg...
10. mtlynch ◴[] No.41884605{3}[source]
Thanks for the feedback!

Can you share a bit more about why you feel it's deceptive?

The point I was trying to make is that database servers are relatively complex and expensive. S3 is still a server, but it's static storage, which is about as simple and cheap as it gets for a remote service.

Was it that I could have been clearer about the distinction? Or was the distinction clear, but it doesn't feel like a big difference?

replies(1): >>41885044 #
11. mtlynch ◴[] No.41884669{3}[source]
Cool, glad to hear it was useful!
12. ◴[] No.41885044{4}[source]
13. ggpsv ◴[] No.41886233[source]
Every case is different, but as a baseline: use Ubuntu or Debian with automatic security upgrades via unattended-upgrades[0]; harden ssh by allowing only pubkey authentication; disallow all public incoming connections in the firewall except https traffic if you're serving a public service, and route everything else (ssh, etc.) over WireGuard (Tailscale makes this easy). Use a webserver like nginx or caddy for TLS termination, serving static assets, and proxying requests to an application listening on localhost or the WireGuard interface.

[0]: https://wiki.debian.org/UnattendedUpgrades
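In concrete terms, a minimal version of that baseline looks something like this (Debian/Ubuntu with ufw; adjust to taste, and make sure you can reach the box over Tailscale before enabling the firewall):

    # automatic security updates
    apt install unattended-upgrades
    dpkg-reconfigure -plow unattended-upgrades

    # /etc/ssh/sshd_config: key-only logins
    PasswordAuthentication no
    PermitRootLogin prohibit-password

    # firewall: only https from the public internet, everything else over the tailscale interface
    ufw default deny incoming
    ufw allow 80/tcp    # only if you need ACME http-01 / redirects
    ufw allow 443/tcp
    ufw allow in on tailscale0
    ufw enable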

14. lucw ◴[] No.41887406{3}[source]
I went through this around a year ago. I wanted to run Postgres for Django apps, and I didn't want to pay the insane prices cloud providers charge for a replicated setup. I wanted a replicated setup on Hetzner VMs and full control over the backup process. I wanted the deployment to be done with Ansible, and I wanted my database servers to be effectively stateless: if you vaporize both my Hetzner Postgres VMs simultaneously, I lose at most one minute of data. (If I just lose the primary, I probably lose less than a second of data thanks to realtime replication.)

I'll be honest, it's not documented as well as it could be; some concepts, like the archive process and the replication setup, took me a while to understand. I also had trouble working out what roles the various tools played. Initially I thought I could roll my own backups, but I later deployed pgBackrest. I deployed and destroyed VMs countless times (my Ansible playbook does everything, from VM creation via the Proxmox / Hetzner APIs to installing Postgres and setting up replication).

What is critical is testing your backup and recovery. Start writing some data. Blow up your database infra. See if you can recover. You need a high degree of automation in your deployment in order to gain confidence that you won't lose data.

My deployment looks like this:

* two Postgres 16 instances, one primary, one replica (realtime replication)
* both on Debian 12 (the most stable platform for Postgres, according to my research)
* Ansible playbooks for initial deployment as well as failover
* archive file backups to rsync.net storage space (with zfs snapshots) every minute
* full backups using pgBackrest every 24 hrs, stored to rsync.net, Wasabi, and a Hetzner storage box
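Roughly, the key pieces look like this (simplified, not my actual config; hosts and paths are placeholders):

    # postgresql.conf on the primary
    wal_level = replica
    archive_mode = on
    archive_command = 'pgbackrest --stanza=main archive-push %p'

    # seed the replica from the primary; -R writes the standby settings
    pg_basebackup -h primary.internal -U replicator -D /var/lib/postgresql/16/main -R

    # /etc/pgbackrest/pgbackrest.conf
    [global]
    repo1-path=/backups/pgbackrest
    repo1-retention-full=2

    [main]
    pg1-path=/var/lib/postgresql/16/main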

As you can guess, it was kind of a massive investment and forced me to become a sysadmin / DBA for a while (though I went the devops route, with full Ansible automation and automated testing). I gained quite a bit of knowledge, which is great. But I'll probably have to re-design and seriously re-test at the next Postgres major release. Sometimes I wonder whether I should have just accepted the cost of cloud Postgres deployments.

replies(1): >>41889313 #
15. LVB ◴[] No.41889313{4}[source]
I've got a less robust version of this (also as Ansible -> Hetzner) that I've toyed with. I'm often tempted to take it further, but I've realized it's a distraction. I say that about myself, and not too negatively: I know I want to get some apps done, and the sysadmin-y stuff is fun and alluring, but it can chew up a lot of time.

Currently I'm viewing Neon's $19/month plan as acceptable for me (I just look at my Costco cart for comparison). Plus, I'm getting something for my money beyond not having to build it myself: branching. That has proved way handier than I'd expected as a solo dev, and I use it all the time. A DIY Postgres wouldn't have that, at least not as cleanly.

If charges go much beyond the $19 and it is still just me faffing about, I'll probably look harder at the DIY PG. OTOH if there is good real world usage and/or $ coming in, then it's easier to view Neon as just a cost of business (within reason).