SQLite concurrency and why you should care about it

(jellyfin.org)

mangecoeur ◴[01 Nov 25 15:14 UTC] No.45782293[source]▶

SQLite is a great bit of technology, but sometimes I read articles like this and think maybe they should have used Postgres. If you don’t specifically need the “one file portability” aspect of SQLite, or it’s not embedded (in which case you shouldn’t have concurrency issues), Postgres is easy to get running and solves these problems.

replies(11): >>45782439 #>>45782829 #>>45782906 #>>45782930 #>>45782932 #>>45783524 #>>45784757 #>>45784918 #>>45787275 #>>45788143 #>>45788886 #

thayne ◴[01 Nov 25 16:24 UTC] No.45782932[source]▶

>>45782293 #

Using postgres would make it significantly more complicated for Jellyfin users to install and set up Jellyfin. And then users would need to worry about migrating the databases when PostgreSQL has a major version upgrade. An embedded database like sqlite is a much better fit for something like Jellyfin.

replies(1): >>45783048 #

1. throwaway894345 ◴[01 Nov 25 16:36 UTC] No.45783048[source]▶

>>45782932 #

As a Jellyfin user, this hasn’t been my experience. I needed to do a fair bit of work to make sure Jellyfin could access its database no matter which node it was scheduled onto and that no more than one instance ever accessed the database at the same time. Jellyfin required by far more work to set up maintainably than any of the other applications I run, and it is also easily the least reliable application. This isn’t all down to SQLite, but it’s all down to a similar set of assumptions (exactly one application instance interacting with state over a filesystem interface).

replies(4): >>45784027 #>>45784408 #>>45785008 #>>45788178 #

2. thayne ◴[01 Nov 25 18:26 UTC] No.45784027[source]▶

>>45783048 (TP) #

Is running multiple nodes a typical way to run Jellyfin, though? I would expect that most Jellyfin users only run a single instance at a time.

replies(1): >>45784622 #

3. stormbeard ◴[01 Nov 25 19:09 UTC] No.45784408[source]▶

>>45783048 (TP) #

Jellyfin isn’t meant to be some highly available distributed system, so of course this happens when you try to operate it like one. The typical user is not someone trying to run it via K8s.

replies(1): >>45784611 #

4. throwaway894345 ◴[01 Nov 25 19:35 UTC] No.45784611[source]▶

>>45784408 #

Yeah, I agree, though making software that can run in a distributed configuration is a matter of following a few basic principles, and would be far less work than the effort the developers have spent trying to make SQLite work for their application.

The effort required to put an application on Kubernetes is a pretty good indicator of software quality. In other words, I can get a pretty good idea of how difficult a piece of software is to maintain in a single-instance configuration by trying to port it to Kubernetes.

5. throwaway894345 ◴[01 Nov 25 19:36 UTC] No.45784622[source]▶

>>45784027 #

Yes, but you have to go out of your way when writing software to make it so the software can only run on one node at a time. Or rather, well-architected software should require minimal, isolated edits to run in a distributed configuration (for example, replacing SQLite with a distributed SQLite).

replies(1): >>45786629 #

6. FrinkleFrankle ◴[01 Nov 25 20:24 UTC] No.45785008[source]▶

>>45783048 (TP) #

Care to share your setup?

7. thayne ◴[01 Nov 25 23:55 UTC] No.45786629{3}[source]▶

>>45784622 #

That's just not true. Distributed software is much more complicated and difficult than non-distributed software. Distributed systems have many failure modes that you don't have to worry about in non-distributed systems.

Now maybe you could have an abstraction layer over your storage layer that supports multiple data stores, including a distributed one. But that comes with tradeoffs, like being limited to the least common denominator of features of the data stores, and having to implement the abstraction layer for multiple data stores.
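A minimal sketch of the kind of abstraction layer described above, assuming a hypothetical key-value interface (`KeyValueStore`, `MemoryStore`, and `SqliteStore` are illustrative names, not anything from Jellyfin). It shows the tradeoff in miniature: the interface exposes only what every backend can do, and each backend has to be implemented separately.

```python
import sqlite3
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Hypothetical interface: only operations every backend can support."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class MemoryStore:
    """In-process backend, standing in for any second data store."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class SqliteStore:
    """Embedded SQLite backend behind the same interface."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self._conn.commit()

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def put(self, key: str, value: str) -> None:
        # The connection as a context manager commits on success, rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
            )


def remember_position(store: KeyValueStore, item_id: str, seconds: int) -> None:
    """Application code depends only on the interface, not on a backend."""
    store.put(f"position:{item_id}", str(seconds))
```

Anything backend-specific (SQL joins, transactions spanning several keys, full-text search) is out of reach through this interface unless every backend can provide it, which is the least-common-denominator cost.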

replies(1): >>45787727 #

8. throwaway894345 ◴[02 Nov 25 03:47 UTC] No.45787727{4}[source]▶

>>45786629 #

I’m a distributed systems architect. I design, build, and operate distributed systems.

> Distributed systems have many failure modes that you don't have to worry about in non-distributed systems.

Yes, but as previously mentioned, those failure modes are handled by abiding by a few simple principles. It’s also worth noting that multiprocess or multithreaded software has many of the same failure modes, including the one discussed in this post. Architecting systems as though they are distributed largely takes care of those failure modes as well, making even single-node software like Jellyfin more robust.

> Now maybe you could have an abstraction layer over your storage layer that supports multiple data stores, including a distributed one. But that comes with tradeoffs, like being limited to the least common denominator of features of the data stores, and having to implement the abstraction layer for multiple data stores.

Generally I just target storage interfaces that can be easily distributed—things like Postgres (or maybe dqlite?) for SQL databases or an object storage API instead of a filesystem API. If you build a system like it could be distributed one day, you’ll end up with a simpler, more modular system even if you never scale to more than one node (maybe you just want to take advantage of parallelism on your single node, as was the case in this blog post).

replies(1): >>45791978 #

9. heavyset_go ◴[02 Nov 25 06:15 UTC] No.45788178[source]▶

>>45783048 (TP) #

Jellyfin isn't a Netflix replacement, it's a desktop application that's a web app by necessity. Treat it like a desktop app and you won't have these issues.

replies(1): >>45790397 #

10. throwaway894345 ◴[02 Nov 25 13:56 UTC] No.45790397[source]▶

>>45788178 #

They have clients for nearly every device; it’s clearly intended to be a streaming media server.

replies(1): >>45790903 #

11. heavyset_go ◴[02 Nov 25 15:16 UTC] No.45790903{3}[source]▶

>>45790397 #

It's a local media library manager in the same vein as media servers that came before it that were intended to run on desktops and serve up content to consoles and whatever on your LAN back when that was the thing to do.

My point is to treat it like software from that lineage and you won't have a problem. Trying to treat it like something it's not, like a distributed web app, will lead to issues.

replies(1): >>45792426 #

12. thayne ◴[02 Nov 25 17:37 UTC] No.45791978{5}[source]▶

>>45787727 #

> just target storage interfaces that can be easily distributed—things like Postgres

But as I mentioned above, that makes the system more complicated for people who don't need it to be distributed.

Setting up separate db software, configuring the connection, handling separate updates, etc. is a lot more work for most users than Jellyfin just using a local embedded sqlite database. And it would probably make the application code more complicated as well.

replies(1): >>45792475 #

13. throwaway894345 ◴[02 Nov 25 18:45 UTC] No.45792426{4}[source]▶

>>45790903 #

It feels like we’re saying similar things. We both agree that its architecture makes it difficult to run with high availability, although I’ll point out that the issues documented in the article apply to single nodes and even on a single node it has pretty specific hardware requirements. I think we just disagree about whether “you have to hold it very carefully and it works just fine” is a good thing or not.

14. throwaway894345 ◴[02 Nov 25 18:52 UTC] No.45792475{6}[source]▶

>>45791978 #

> But as I mentioned above, that makes the system more complicated for people who don't need it to be distributed. Setting up separate db software, configuring the connection, handling separate updates, etc. is a lot more work for most users than Jellyfin just using a local embedded sqlite database.

You can package a Postgres database with your app just like SQLite. Users should not have to know that they are using Postgres, much less configure connections, handle updates, etc.

> And it would probably make the application code more complicated as well.

Not at all, this is an article about the hoops the application has to jump through to make SQLite behave well with parallel access. Postgres is designed for parallel access by default. It’s strictly simpler from the perspective of the application.
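For a sense of what the SQLite side involves, here is a rough sketch of two commonly used settings for multi-connection access: write-ahead logging and a busy timeout. This is an illustration using Python's standard `sqlite3` module, not Jellyfin's actual code; `open_connection`, `demo`, and the `plays` table are made-up names.

```python
import os
import sqlite3
import tempfile


def open_connection(path: str, busy_timeout_s: float = 5.0) -> sqlite3.Connection:
    """Open a connection configured for access alongside other connections."""
    # timeout: how long to wait on a locked database before raising an error.
    conn = sqlite3.connect(path, timeout=busy_timeout_s)
    # WAL lets readers proceed while one writer is active; writers still serialize.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def demo() -> int:
    """Write through two separate connections to one file and count the rows."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        first = open_connection(path)
        second = open_connection(path)

        first.execute("CREATE TABLE plays (item TEXT)")
        first.commit()

        # Each `with` block commits on success, keeping write locks short-lived.
        with first:
            first.execute("INSERT INTO plays VALUES ('a')")
        with second:
            second.execute("INSERT INTO plays VALUES ('b')")

        count = second.execute("SELECT COUNT(*) FROM plays").fetchone()[0]
        first.close()
        second.close()
        return count
```

Even with both settings there is only ever one writer at a time, so a long write transaction can still make other writers wait until the timeout expires.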
