Turso SQLite Offline Sync Public Beta

1. billconan ◴[31 Mar 25 15:30 UTC] No.43536188[source]▶

This sounds great, but I have some questions regarding data integrity and security.

If I build an offline-first app using Turso, will my client exchange data directly with the database, without a layer of backend APIs to guarantee data integrity and security? For example, certain db writes are only permitted for certain users, but when the db API is exposed, will that cause problems? A concrete example would be a forum where only moderators can remove users and posts. Say I build an offline-first forum: can a hacker tamper with the database on the filesystem and use the syncing feature to propagate the tampered data to the server?

replies(9): >>43536366 #>>43536534 #>>43536576 #>>43536993 #>>43537308 #>>43537313 #>>43537393 #>>43539446 #>>43540237 #

2. justanotheratom ◴[31 Mar 25 15:47 UTC] No.43536366[source]▶

>>43536188 (TP) #

That is a very crucial question. I am also interested in the answer.

Perhaps they have RLS type policies that are only modifiable on the server.

3. tracker1 ◴[31 Mar 25 16:02 UTC] No.43536534[source]▶

>>43536188 (TP) #

I'm pretty sure you'll have to write parts of your app against your own APIs that represent the owner of the db for a group.

With Turso, you would want a model that had, for example, a db per user and one per group. With the Turso model you want to think of something closer to sharding by hand, so that writes are secured per user or per group.

I could be wrong on this though. That's just my rough understanding.
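A minimal sketch of that "db per user, db per group" idea, in Python. Everything here is an assumption made for illustration: the naming scheme (`user-<id>`, `group-<name>`), the `Group` type, and the `may_sync_writes` check are hypothetical and are not part of Turso's API. The point is only that the server decides which database a given user may push writes to.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """A hypothetical group whose database only its owners may write to."""
    name: str
    owners: set[str] = field(default_factory=set)


def user_db(user_id: str) -> str:
    """Name of the database that only this user may write to (assumed scheme)."""
    return f"user-{user_id}"


def group_db(group: Group) -> str:
    """Name of the shared database for a group (assumed scheme)."""
    return f"group-{group.name}"


def may_sync_writes(user_id: str, db_name: str, groups: list[Group]) -> bool:
    """Allow write sync only to the user's own db or to a group db they own."""
    if db_name == user_db(user_id):
        return True
    return any(db_name == group_db(g) and user_id in g.owners for g in groups)
```

Under this scheme a regular member would read a group's database but push their own writes only to their personal database; anything destined for the group database would still go through an API acting as the group owner.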

4. thisislife2 ◴[31 Mar 25 16:05 UTC] No.43536576[source]▶

>>43536188 (TP) #

I'd have thought that in this day and age every developer would know by now the importance of sanitizing user input before a web application accepts it? Your doubt has given me some pause ...

replies(2): >>43536776 #>>43538162 #

5. wahnfrieden ◴[31 Mar 25 16:22 UTC] No.43536776[source]▶

>>43536576 #

No need to give a rude, condescending and unhelpful answer. There will always be people learning.

replies(1): >>43553116 #

6. krashidov ◴[31 Mar 25 16:40 UTC] No.43536993[source]▶

>>43536188 (TP) #

This is my problem with these local first libraries. What happens if there's some data that needs to live in a db that's separate from the replicated sqlite db?

What I would really love is a sync engine library that is agnostic of your database.

Haven't really seen one yet.

replies(1): >>43537249 #

7. vekker ◴[31 Mar 25 17:05 UTC] No.43537249[source]▶

>>43536993 #

Exactly. So many local first libs don't cover this that it makes me wonder if the applications I am typically working on are so fundamentally different from what the local-first devs are normally building?

Most apps have user data that needs to be (partially or fully) shielded from other users. Yet, most local-first libs neglect to explain how to implement this with their libraries, or sometimes it's an obscure page or footnote somewhere in their docs, as if this is just an afterthought...

replies(1): >>43539403 #

8. franciscop ◴[31 Mar 25 17:11 UTC] No.43537308[source]▶

>>43536188 (TP) #

The blog post doesn't even touch on write conflicts, which is the main reason I opened it (I was curious about how they solved them), so I'm not surprised there aren't many details about security etc.

9. refulgentis ◴[31 Mar 25 17:12 UTC] No.43537313[source]▶

>>43536188 (TP) #

You raise an interesting point that, along with the replies, compels me to note that all of this stuff is bespoke, and things that sound simple, like "I just want a good syncing library", are intractable in practice.

Ex. if I'm doing a document-based app, users can have at it, corrupt their own data all they want.

I honestly cannot wrap my mind around discussions re: SQLite x web dev, perhaps because I've been in mobile dev, but I don't even know what it'd mean to have an "offline-first forum" that syncs state: it's a global object with shared state rendered on the client.

When you set aside the implications introduced by using a hack scenario, a simpler question emerges: How would my clients sync the whole forum back to the cloud? Generally, my inclination is to handwave about users being able to make posts and have it "just work". After all, can't Turso help with simple scenarios like a posts table that has a date column? That makes it virtually conflict free...but my experience is that "virtually" bites you, hard.

replies(1): >>43537367 #

10. ◴[31 Mar 25 17:16 UTC] No.43537367[source]▶

>>43537313 #

11. nightowl_games ◴[31 Mar 25 17:18 UTC] No.43537393[source]▶

>>43536188 (TP) #

Honestly this is so simple and core to the idea that I literally just assume it's handled.

12. setr ◴[31 Mar 25 18:32 UTC] No.43538162[source]▶

>>43536576 #

If the database is local, your web app database access is local. It can be modified and changed by the user, unlike code hosted on the web server, and any sanitization can thus be bypassed.

Meaning the user has effectively direct access to the underlying local database. Which, if blindly and totally synced, gives the user effectively direct access to the central database.

I'd have thought that in this day and age every developer would know by now the importance of not trusting frontend validation in a web application? Your doubt has given me some pause.
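The scenario above can be reproduced with plain SQLite. This is a sketch using Python's standard `sqlite3` module with two in-memory databases standing in for the local file and the central server; the `posts` schema, the `app_delete_post` check, and the `naive_sync` function are made up for illustration and do not represent how Turso actually syncs.

```python
import sqlite3

SCHEMA = (
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, author TEXT, body TEXT, deleted INTEGER DEFAULT 0)"
)


def make_db() -> sqlite3.Connection:
    """Create an in-memory database holding a single post."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    db.execute("INSERT INTO posts (id, author, body) VALUES (1, 'alice', 'hello')")
    db.commit()
    return db


def app_delete_post(db: sqlite3.Connection, user_role: str, post_id: int) -> None:
    """Client-side rule: only moderators may delete. Runs on the user's machine."""
    if user_role != "moderator":
        raise PermissionError("only moderators can delete posts")
    db.execute("UPDATE posts SET deleted = 1 WHERE id = ?", (post_id,))
    db.commit()


def naive_sync(local: sqlite3.Connection, central: sqlite3.Connection) -> None:
    """Blindly copy every local row to central, with no authorization at all."""
    rows = local.execute("SELECT id, author, body, deleted FROM posts").fetchall()
    central.executemany(
        "INSERT OR REPLACE INTO posts (id, author, body, deleted) VALUES (?, ?, ?, ?)",
        rows,
    )
    central.commit()


local, central = make_db(), make_db()

# The app's own check stops a regular member...
try:
    app_delete_post(local, "member", 1)
    blocked_by_app = False
except PermissionError:
    blocked_by_app = True

# ...but the member owns the file and can write to it without the app.
local.execute("UPDATE posts SET deleted = 1 WHERE id = 1")
local.commit()

naive_sync(local, central)
central_deleted = central.execute("SELECT deleted FROM posts WHERE id = 1").fetchone()[0]
```

The client-side check fires as intended, yet the tampered row still reaches the central database, because nothing on the server side looked at who made the change.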

replies(1): >>43553099 #

13. ochiba ◴[31 Mar 25 20:15 UTC] No.43539403{3}[source]▶

>>43537249 #

It's definitely quite a hard engineering problem to solve if you try to cover a wide range of use cases, and then layer on top of that things like permissions/authorization and scalability.

14. ochiba ◴[31 Mar 25 20:19 UTC] No.43539446[source]▶

>>43536188 (TP) #

I am not sure about Turso but I've seen a few different approaches to this with other sync engine architectures:

1. At a database level: Using something like RLS in Postgres

2. At a backend level: The sync engine processes write operations via the backend API, where custom validation and authorization logic can be applied.

3. At a sync engine level: If the sync engine processes the write operations, there can be some kind of authorization layer similar to RLS enforced by the sync engine on the backend.
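As a rough illustration of approach 2, here is a sketch in Python where the backend receives a list of write operations from a client and applies each one only if it passes an authorization check. The `WriteOp` shape, the table names, and the "only moderators may delete" rule are assumptions chosen to match the forum example in the original question; no real sync engine's wire format is implied.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteOp:
    """One queued write. `user` must come from the authenticated session,
    never from the client payload itself."""
    user: str
    action: str  # "create_post" or "delete_post"
    post_id: int
    body: str = ""


def open_central(moderators: list[str]) -> sqlite3.Connection:
    """Create a stand-in central database with a moderator list."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
    db.execute("CREATE TABLE moderators (name TEXT PRIMARY KEY)")
    db.executemany("INSERT INTO moderators VALUES (?)", [(m,) for m in moderators])
    db.commit()
    return db


def is_moderator(db: sqlite3.Connection, user: str) -> bool:
    row = db.execute("SELECT 1 FROM moderators WHERE name = ?", (user,)).fetchone()
    return row is not None


def apply_ops(db: sqlite3.Connection, ops: list[WriteOp]) -> list[WriteOp]:
    """Apply authorized ops to the central db and return the rejected ones."""
    rejected = []
    for op in ops:
        if op.action == "create_post":
            db.execute(
                "INSERT INTO posts (id, author, body) VALUES (?, ?, ?)",
                (op.post_id, op.user, op.body),
            )
        elif op.action == "delete_post" and is_moderator(db, op.user):
            db.execute("DELETE FROM posts WHERE id = ?", (op.post_id,))
        else:
            # Unknown action, or a delete from a non-moderator.
            rejected.append(op)
    db.commit()
    return rejected
```

A real implementation would also need to tell the client which operations were rejected so that it can roll back its local copy; that part is left out here.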

15. aboodman ◴[31 Mar 25 21:35 UTC] No.43540237[source]▶

>>43536188 (TP) #

Yes, this is a central issue in sync. For most applications, sync engines just aren't useful without some solution. Of course you need to validate inputs, support fine-grained permissions, etc., as developers have done with web apps for eons.

In Replicache, we addressed this by making your application server responsible for writes:

https://doc.replicache.dev/concepts/how-it-works

By doing this, your server can implement any validation it wants. It can also interact with external systems, do notifications, etc. Anything you can do with a traditional API.
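To make the "server is responsible for writes" idea concrete, here is a generic sketch in Python. It is not Replicache's actual API: the `mutator` decorator, the `handle_push` function, the mutation format, and the `MODERATORS` set are all hypothetical. The client sends the name and arguments of each mutation, and the server re-runs its own implementation rather than trusting the state the client computed.

```python
from typing import Any, Callable

MODERATORS = {"mod"}  # hypothetical; a real app would look this up
MUTATORS: dict[str, Callable[..., None]] = {}


def mutator(name: str):
    """Register a server-side implementation for a named mutation."""
    def register(fn):
        MUTATORS[name] = fn
        return fn
    return register


@mutator("create_post")
def create_post(state: dict, user: str, args: dict[str, Any]) -> None:
    body = str(args["body"]).strip()
    if not body:
        raise ValueError("empty post")
    state[args["id"]] = {"author": user, "body": body}


@mutator("delete_post")
def delete_post(state: dict, user: str, args: dict[str, Any]) -> None:
    if user not in MODERATORS:
        raise PermissionError("only moderators can delete posts")
    state.pop(args["id"], None)


def handle_push(state: dict, user: str, mutations: list[dict]) -> list[str]:
    """Re-run each named mutation on the server; report one result per mutation."""
    results = []
    for m in mutations:
        try:
            fn = MUTATORS.get(m["name"])
            if fn is None:
                raise KeyError(f"unknown mutator {m['name']}")
            fn(state, user, m["args"])
            results.append("ok")
        except (KeyError, ValueError, PermissionError) as exc:
            results.append(f"rejected: {exc}")
    return results
```

Because the server-side functions are ordinary code, they are also a natural place for the side effects mentioned above, such as notifications or calls to external systems.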

In our new sync engine, Zero (https://zerosync.dev), we're adding this same ability soon (like this week) under the name custom mutators:

https://bugs.rocicorp.dev/issue/3045

This has been a hard project, but is really critical to use sync engines for anything serious.

replies(1): >>43540551 #

16. isaachinman ◴[31 Mar 25 22:14 UTC] No.43540551[source]▶

>>43540237 #

Happy user of Replicache. You and the team got it right.

17. thisislife2 ◴[02 Apr 25 02:13 UTC] No.43553099{3}[source]▶

>>43538162 #

"any sanitization can thus be bypassed." - Then you are obviously not doing it properly. It should also be obvious nobody is talking about frontend validation when talking about syncing a database.

replies(1): >>43571339 #

18. thisislife2 ◴[02 Apr 25 02:16 UTC] No.43553116{3}[source]▶

>>43536776 #

It wasn't an answer - it was a comment adding to his question expressing my surprise that developers still are making this kind of mistake.

replies(1): >>43553743 #

19. wahnfrieden ◴[02 Apr 25 04:44 UTC] No.43553743{4}[source]▶

>>43553116 #

I know

20. setr ◴[03 Apr 25 15:42 UTC] No.43571339{4}[source]▶

>>43553099 #

So when you say “sanitize user input”, you meant “store unsanitized/unvalidated user input in the local DB, and then sanitize it on sync to the central server”? You’ll need a hook into the syncing process to do that.

Perhaps something like “a layer of backend APIs to guarantee data integrity and security”?

This is a sync between a local database (read: on the user’s machine) and a central one (read: on your fancy server). The whole point of introducing a local database is to make database writes happen locally… on the frontend. Everything related to the app, including database writes, is happening on the user’s machine. The only time you have a backend that you actually own and control is on database sync between local and central.