
637 points neilk | 14 comments
1. chacham15 ◴[] No.43554350[source]
So, if I understand correctly, the consistency model is essentially git's: you have a local copy, make changes to it, and then when it's time to "push" you can get a conflict, at which point you "rebase" or "merge".

The problem here is that there is no way to cleanly detect a conflict. The documentation talks about pages which have changed, but a page changing isn't a good indicator of conflict. A conflict can also arise from a read. E.g.

Update Customer Id: "UPDATE Customers SET id='bar' WHERE id='foo'; UPDATE Orders SET customerId='bar' WHERE customerId='foo';"

Add Customer Purchase: "SELECT id FROM Customers WHERE email='blah'; INSERT INTO Orders(customerId, ...) VALUES('foo', ...);"

If the update transaction gets committed first and the pages for the Orders table are full (i.e. inserting causes a new page to be allocated), these two operations don't have any page conflicts, but the result is incorrect.

In order to fix this, you would need to track the pages read during the transaction in which the write occurred, but that could easily end up being the whole table if the updated column isn't part of an index (and thus requires a table scan).
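To make the failure mode concrete, here's a toy sketch (page names and layout invented, not Graft's actual structures) of why comparing *write* sets alone misses this read-write conflict:

```python
def pages_conflict(tx_a_writes, tx_b_writes):
    """Naive detector: flags a conflict only when two transactions
    wrote to the same page."""
    return bool(tx_a_writes & tx_b_writes)

def read_write_conflict(tx_a_reads, tx_b_writes):
    """Sound detector: also flags when one transaction *read* a page
    that the other wrote."""
    return bool(tx_a_reads & tx_b_writes)

# "Update Customer Id" rewrites the Customers page and the existing Orders page.
update_tx = {"reads": {"customers_p1", "orders_p1"},
             "writes": {"customers_p1", "orders_p1"}}

# "Add Customer Purchase" reads the Customers page but, because orders_p1
# was full, inserts into a freshly allocated page orders_p2.
insert_tx = {"reads": {"customers_p1"},
             "writes": {"orders_p2"}}

# Write sets are disjoint, so a page-diff check sees no conflict...
assert not pages_conflict(update_tx["writes"], insert_tx["writes"])
# ...but the insert read customers_p1, which the update rewrote.
assert read_write_conflict(insert_tx["reads"], update_tx["writes"])
```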

replies(2): >>43554511 #>>43554646 #
2. fulafel ◴[] No.43554511[source]
In git, rebase isn't a sound operation either: the merge is heuristic, and you're liable to get conflicts or silent mismerges.

Some simple examples: https://www.caktusgroup.com/blog/2018/03/19/when-clean-merge...

3. ncruces ◴[] No.43554646[source]
They address this later on.

If strict serializability is not possible, because your changes are based on a snapshot that is already invalid, you can either replay (your local transactions are not durable, but system-wide you regain serializability) or merge (degrading to snapshot isolation).

As long as local unsynchronized transactions retain the page read set, and look for conflicts there, this should be sound.
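As a rough sketch of that commit decision (all names invented; the real protocol may differ):

```python
def resolve(base_version, server_version, local_read_pages, server_dirty_pages):
    """Decide what to do when pushing local transactions.

    Returns 'commit' (snapshot still current: strict serializability),
    'replay' (read-write conflict: rerun local txs on the new snapshot),
    or 'merge' (disjoint pages: merging degrades to snapshot isolation).
    """
    if base_version == server_version:
        return "commit"   # nothing happened upstream since our snapshot
    if local_read_pages & server_dirty_pages:
        return "replay"   # we read a page the server has since changed
    return "merge"        # no overlap in the read set: merge is sound-ish

assert resolve(5, 5, {"p1"}, set()) == "commit"
assert resolve(4, 5, {"p1"}, {"p1"}) == "replay"
assert resolve(4, 5, {"p1"}, {"p2"}) == "merge"
```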

replies(2): >>43555813 #>>43556414 #
4. fauigerzigerk ◴[] No.43555813[source]
What I find hard to imagine is how the app should respond when synchronisation fails after locally committing a bunch of transactions.

Dropping them all is technically consistent, but it may be unsafe depending on the circumstances. E.g. a doctor records an urgent referral, but then the tx fails because admin staff has concurrently updated the patient's phone number or whatever. Automatically replaying is unsafe because consistency cannot be guaranteed.

Manual merging may be the only safe option in many cases. But how can the app reconstitute the context of those failed transactions so that users can review and revise? At the very least it would need access to a transaction ID that can be linked back to a user level entity, task or workflow. I don't think SQLite surfaces transaction IDs. So this would have to be provided by the Graft API I guess.
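Something like this app-side journal is what I have in mind (entirely hypothetical; SQLite and, as far as I know, Graft don't currently provide it):

```python
import uuid

class TxJournal:
    """Hypothetical app-side journal: tags every local transaction with an
    id and the user-level task it belongs to, so that transactions which
    fail to synchronise can be surfaced for manual review."""

    def __init__(self):
        self.entries = {}

    def record(self, task, payload):
        """Remember the user-level context before running the transaction."""
        tx_id = str(uuid.uuid4())
        self.entries[tx_id] = {"task": task, "payload": payload,
                               "state": "pending"}
        return tx_id

    def mark_failed(self, tx_id):
        self.entries[tx_id]["state"] = "needs_review"

    def needs_review(self):
        """Everything the user must re-examine after a failed sync."""
        return [(e["task"], e["payload"])
                for e in self.entries.values() if e["state"] == "needs_review"]

journal = TxJournal()
tx = journal.record("urgent referral", {"patient": 17})
journal.mark_failed(tx)   # sync rejected this transaction
assert journal.needs_review() == [("urgent referral", {"patient": 17})]
```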

replies(1): >>43558296 #
5. bastawhiz ◴[] No.43556414[source]
> your local transactions are not durable

This manifests itself to the user as just data loss, though. You do something, it looks like it worked, but then it goes away later.

replies(1): >>43556755 #
6. ncruces ◴[] No.43556755{3}[source]
From the description, you can reapply transactions. How the system handles it (how much is up to the application, how much is handled in Graft) I have no idea.
replies(1): >>43559031 #
7. NickM ◴[] No.43558296{3}[source]
> What I find hard to imagine is how the app should respond when synchronisation fails after locally committing a bunch of transactions... Manual merging may be the only safe option in many cases.

Yeah, exactly right. This is why CRDTs are popular: they give you well-defined semantics for automatic conflict resolution, and save you from having to implement all that stuff from scratch yourself.

The author writes that CRDTs "don't generalize to arbitrary data." This is true, and sometimes it may be easier to write your own custom app-specific conflict resolution logic than to massage your data to fit preexisting CRDTs, but doing that is extremely tricky to get right.

It seems like the implied tradeoff being made by Graft is "you can just keep using the same data formats you're already using, and everything just works!" But the real tradeoff is that you're going to have to write a lot of tricky, error-prone conflict resolution logic. There's no such thing as a free lunch, unfortunately.

replies(1): >>43559305 #
8. bastawhiz ◴[] No.43559031{4}[source]
What does that mean though? How can you possibly reapply a failed transaction later? The database itself can't possibly know how to reconcile that (if it did, it wouldn't have been a failure in the first place). So it has to be done by the application, and that isn't always possible. There is still always the possibility of unavoidable data loss.

"Consistency" is really easy, as it turns out, if you allow yourself to simply drop any inconvenient transactions at some arbitrary point in the future.

replies(1): >>43562926 #
9. fauigerzigerk ◴[] No.43559305{4}[source]
The problem I have with CRDTs is that while being conflict-free in a technical sense they don't allow me to express application level constraints.

E.g, how do you make sure that a hotel room cannot be booked by more than one person at a time or at least flag this situation as a constraint violation that needs manual intervention?

It's really hard to get anywhere close to the universal usefulness and simplicity of centralised transactions.

replies(2): >>43560002 #>>43571422 #
10. NickM ◴[] No.43560002{5}[source]
Yeah, this is a limitation, but if you have hard constraints like that to maintain, you probably should be using some sort of centralized transactional system to avoid e.g. booking the same hotel room to multiple people in the first place. Even with perfect conflict resolution, you don't want to tell someone their booking is confirmed and then later have to say "oh, sorry, never mind, somebody else booked that room and we just didn't check that at the time."

But this isn't a problem specific to CRDTs, it's a limitation with any database that favors availability over consistency. And there are use cases that don't require these kinds of constraints where these limitations are more manageable.

replies(1): >>43560429 #
11. fauigerzigerk ◴[] No.43560429{6}[source]
I agree, hotel booking is not a great example.

I think CRDTs would be applicable to a wider range of applications if it were possible to specify soft constraints, so that after merging your changes you could query the CRDT for a list of constraint violations that need to be resolved.
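Something like this toy state-based CRDT (a grow-only set of bookings; entirely made up, not a real library) is what I mean: merges never conflict, and the constraint check is a separate query over the merged state.

```python
class BookingSet:
    """Grow-only set of (room, guest) bookings with a soft-constraint query.

    merge() is set union: commutative, associative, and idempotent, so
    replicas converge in any order. Double-bookings are not prevented;
    they are reported afterwards by violations()."""

    def __init__(self, bookings=()):
        self.bookings = set(bookings)

    def merge(self, other):
        self.bookings |= other.bookings

    def violations(self):
        """Rooms claimed by more than one guest, for manual resolution."""
        rooms = {}
        for room, guest in self.bookings:
            rooms.setdefault(room, set()).add(guest)
        return {room: guests for room, guests in rooms.items()
                if len(guests) > 1}

# Two offline replicas book the same room; the merge itself is clean,
# but the soft constraint flags room 101 for manual intervention.
a = BookingSet({("101", "alice")})
b = BookingSet({("101", "bob")})
a.merge(b)
assert a.violations() == {"101": {"alice", "bob"}}
```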

12. kikimora ◴[] No.43562926{5}[source]
This! Solving merge conflicts in git is quite hard. Building an app so that it has a UI and use cases for merging every operation is just unrealistic. Perhaps if you limit yourself to certain domains (CRDTs, turn-based games, data silos modified by only one customer) it can be useful. I doubt it works in the general case.
replies(1): >>43563887 #
13. bastawhiz ◴[] No.43563887{6}[source]
The only situation I can think of where it's always safe is if the order that you apply changes to the state never matters:

- Each action increments or decrements a counter

- You have a log of timestamps of actions stored as a set

- etc.

If you can't model your changes to the data store as an unordered set of actions and have that materialize into state, you will have data loss.

Consider a scenario with three clients which each dispatch an action. If action 1 sets value X to true, action 2 sets it to true, and action 3 sets it to false, you have no way to know whether X should be true or false. Even with timestamps, unless you have a centralized writer you can't possibly know whether some/none/all of the timestamps that the clients used are accurate.

Truly a hard problem!
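The counter case is the easy one precisely because the fold is order-independent; a quick sketch (toy code, not any particular CRDT library) checking every delivery order:

```python
import itertools
from collections import Counter

def materialize(actions):
    """Fold an unordered bag of (key, delta) actions into state.
    Addition is commutative, so delivery order cannot matter."""
    total = Counter()
    for key, delta in actions:
        total[key] += delta
    return dict(total)

acts = [("x", +1), ("x", +1), ("x", -1)]

# Every permutation of the three actions yields the same state...
states = {tuple(sorted(materialize(list(p)).items()))
          for p in itertools.permutations(acts)}
assert len(states) == 1
assert materialize(acts) == {"x": 1}

# ...which is exactly what fails for "set X to true/false": last-writer-wins
# depends on an ordering that offline clients don't share.
```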

14. fatherzine ◴[] No.43571422{5}[source]
"How do you make sure that a hotel room cannot be booked by more than one person at a time" Excellent question! You don't. Instead, assuming a globally consistent transaction ordering, eg Spanner's TrueTime, but any uuid scheme suffices, it becomes a tradeoff between reconciliation latency and perceived unreliability. A room may be booked by several persons at a time, but eventually only one of them will win the reconciliation process.

    A: T.uuid3712[X] = reserve X
    ...
    B: T.uuid6214[X] = reserve X  // eventually loses to A because of uuid ordering
    ...
    A<-T.uuid6214[X]: discard T.uuid6214[X]
    ...
    B<-T.uuid3712[X]: discard T.uuid6214[X], B.notify(cancel T.uuid6214[X])
    -----
    A wins, B discards
The engineering challenge becomes to reduce the reconciliation latency window to something tolerable to users. If the reconciliation latency is small enough, then a blocking API can completely hide the unreliability from users.
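The reconciliation rule itself is tiny; a sketch of the scheme above (invented names, lowest transaction uuid wins, mirroring the A/B example):

```python
def reconcile(claims):
    """Every replica applies the same deterministic rule (lowest uuid wins),
    so all replicas converge on the same winner without coordination."""
    winner = min(claims, key=lambda c: c["uuid"])
    losers = [c for c in claims if c is not winner]
    return winner, losers

# Two concurrent reservations for room X, as in the example above.
claims = [{"uuid": "3712", "client": "A"},
          {"uuid": "6214", "client": "B"}]
winner, losers = reconcile(claims)
assert winner["client"] == "A"                       # A wins
assert [l["client"] for l in losers] == ["B"]        # B discards, gets notified
```

The blocking-API trick is then just: hold the caller's response until this function has run over every claim that could land inside the reconciliation window.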