How to do distributed locking (2016)

(martin.kleppmann.com)

244 points yusufaytas | 3 comments | 20 Oct 24 10:38 UTC | HN request time: 0.759s | source

Show context

jmull ◴[20 Oct 24 12:46 UTC] No.41895002[source]▶

This overcomplicates things...

* If you have something like what the article calls a fencing token, you don't need any locks.

* The token doesn't need to be monotonically increasing, just a passive unique value that both the client and storage have.

Let's call it a version token. It could be monotonically increasing, but a generated UUID, which is typically easier, would work too. (Technically, it could even be a hash of all the data in the store, though that's probably not practical.) The logic becomes:

(1) client retrieves the current version token from storage, along with any data it may want to modify. There's no external lock, though the storage needs to retrieve the data and version token atomically, ensuring the token is specifically for the version of the data retrieved.

(2) client sends the version token back along with any changes.

(3) Storage accepts the changes if the current token matches the one passed with the changes and creates a new version token (atomically, but still no external locks).

Now, you can introduce locks for other reasons (hopefully goods ones... they seem to be misused a lot). Just pointing out they are/should be independent of storage integrity in a distributed system.

(I don't even like the term lock, because they are temporary/unguaranteed. Lease or reservation might be a term that better conveys the meaning.)

replies(6): >>41895192 #>>41895264 #>>41895382 #>>41895448 #>>41895475 #>>41895513 #

wh0knows ◴[20 Oct 24 13:18 UTC] No.41895192[source]▶

>>41895002 #

This neglects the first reason listed in the article for why you would use a lock.

> Efficiency: Taking a lock saves you from unnecessarily doing the same work twice (e.g. some expensive computation). If the lock fails and two nodes end up doing the same piece of work, the result is a minor increase in cost (you end up paying 5 cents more to AWS than you otherwise would have) or a minor inconvenience (e.g. a user ends up getting the same email notification twice).

I think multiple nodes doing the same work is actually much worse than what’s listed, as it would inhibit you from having any kind of scalable distributed processing.

replies(2): >>41895289 #>>41895299 #

1. karmakaze ◴[20 Oct 24 13:37 UTC] No.41895289[source]▶

>>41895192 #

As mentioned in the article, a non-100%-correct lock can be used for efficiency purposes. So basically use an imperfect locking mechanism for efficiency and a reliable one for correctness.

replies(1): >>41895335 #

2. jmull ◴[20 Oct 24 13:47 UTC] No.41895335[source]▶

>>41895289 (TP) #

> and a reliable one for correctness

To be clear, my point is don't use distributed locking for correctness. There are much better options.

Now, the atomicity I mention implies some kind of internal synchronization mechanism for multiple requests, which could be based on locks, but those would be real, non-distributed ones.

replies(1): >>41895657 #

3. ◴[20 Oct 24 14:40 UTC] No.41895657[source]▶

>>41895335 #

↑