
240 points by yusufaytas | 1 comment
jmull ◴[] No.41895002[source]
This overcomplicates things...

* If you have something like what the article calls a fencing token, you don't need any locks.

* The token doesn't need to be monotonically increasing, just an opaque unique value that both the client and the storage hold.

Let's call it a version token. It could be monotonically increasing, but a generated UUID, which is typically easier, would work too. (Technically, it could even be a hash of all the data in the store, though that's probably not practical.) The logic becomes:

(1) client retrieves the current version token from storage, along with any data it may want to modify. There's no external lock, though the storage needs to retrieve the data and version token atomically, ensuring the token is specifically for the version of the data retrieved.

(2) client sends the version token back along with any changes.

(3) Storage accepts the changes if the current token matches the one passed with the changes and creates a new version token (atomically, but still no external locks); if the tokens don't match, the write is rejected and the client re-reads and retries.
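For concreteness, here's a minimal sketch of that flow in Python. Everything here is illustrative: the in-memory VersionedStore class stands in for the real storage service, whose only job is to make the token check and the write atomic on its own side (the internal lock below simulates that atomicity; it is not a lock held by any client).

```python
import threading
import uuid

class VersionedStore:
    """In-memory stand-in for the storage service (names are illustrative)."""

    def __init__(self, data=None):
        self._internal = threading.Lock()  # simulates the store's own atomicity,
                                           # not a lock held by any client
        self._data = data
        self._token = uuid.uuid4().hex     # version token for the current data

    def read(self):
        # Step 1: return the data and its version token atomically.
        with self._internal:
            return self._data, self._token

    def write(self, new_data, token):
        # Step 3: accept the write only if the caller's token is still current,
        # then mint a new token. Returns the new token, or None if rejected.
        with self._internal:
            if token != self._token:
                return None                # someone else wrote in the meantime
            self._data = new_data
            self._token = uuid.uuid4().hex
            return self._token


# Client side (steps 1-3): read, modify, write back with the token, retry on conflict.
store = VersionedStore({"count": 0})
while True:
    data, token = store.read()
    updated = {**data, "count": data["count"] + 1}
    if store.write(updated, token) is not None:
        break  # write accepted
    # otherwise the token was stale: re-read and try again
```

No client ever holds a lock across the read-modify-write; a stale token simply means the write is refused and the client retries against the newer version.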

Now, you can introduce locks for other reasons (hopefully good ones... they seem to be misused a lot). Just pointing out that they are, or should be, independent of storage integrity in a distributed system.

(I don't even like the term "lock", because these locks are temporary and unguaranteed. "Lease" or "reservation" might be a term that better conveys the meaning.)

1. cnlwsu ◴[] No.41895448[source]
You're describing compare-and-swap, which is a good solution. You're pushing the complexity down to the database, and remember this is distributed locking. When you have a single database it's simple, until the database crashes and leaves you not knowing which of your CAS writes took effect. In major systems that demand high availability and multi-datacenter backups, this becomes pretty complicated, with node-failure scenarios that break the approach as well. Usually some form of Paxos-based transaction log is used. Never assume there is an easy solution in distributed systems… it just always sucks
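One partial mitigation for the "which writes took effect?" problem is to have the client mint the new version token itself, so a re-read after a timeout can reveal whether its write landed. A rough sketch, assuming a hypothetical client API exposing read() -> (data, token) and write(data, expected_token, new_token) that raises when the outcome is unknown; this only disambiguates the single-store case and does nothing for replication, which is where the Paxos-style log comes in:

```python
import uuid

def cas_with_recovery(store, new_data, observed_token):
    """Attempt a conditional write; if the outcome is unknown (e.g. the
    connection dropped mid-write), re-read to figure out what happened.
    `store` is a hypothetical client exposing read() -> (data, token) and
    write(data, expected_token, new_token)."""
    proposed = uuid.uuid4().hex            # client mints the new token up front
    try:
        store.write(new_data, observed_token, proposed)
        return "applied"
    except (ConnectionError, TimeoutError):
        _, current = store.read()
        if current == proposed:
            return "applied"               # our write landed before the failure
        if current == observed_token:
            return "not applied"           # nothing changed; safe to retry
        return "unknown"                   # another writer intervened; the caller
                                           # must re-read the data and reconcile
```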