Launch HN: Regatta Storage (YC F24) – Turn S3 into a local-like, POSIX cloud FS

1. weinzierl ◴[18 Nov 24 18:18 UTC] No.42175235[source]▶

The title says POSIX but then it talks about NFS. So, what is it? Does it guarantee all POSIX semantics or not?

2. huntaub ◴[18 Nov 24 18:21 UTC] No.42175277[source]▶

You are correct that NFS is not, strictly speaking, POSIX compliant to the letter of the law, due to its caching behavior. This is an NFSv3 file system, so it shares those semantics. The point I'm trying to emphasize is that the file system supports standard file operations which either aren't possible through other FUSE adapters or can't be performed efficiently on S3 (such as append, rename, and symbolic links), which provides broad compatibility with file-based applications.
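For illustration, here is a minimal Python sketch of the three operations mentioned above (append, rename, symbolic link). It is not taken from the product: the function name `exercise_file_ops` is hypothetical, and a temporary local directory stands in for wherever the file system would be mounted.

```python
import os
import tempfile


def exercise_file_ops(root: str) -> str:
    """Append to a file, rename it, then read it back through a symlink."""
    original = os.path.join(root, "log.txt")
    renamed = os.path.join(root, "log.renamed.txt")
    link = os.path.join(root, "log.link")

    # Append: two separate opens in append mode, each adding to the end.
    for line in ("first\n", "second\n"):
        with open(original, "a") as f:
            f.write(line)

    # Rename: one call on a file system, where an object store would
    # typically need a copy followed by a delete.
    os.rename(original, renamed)

    # Symbolic link pointing at the renamed file.
    os.symlink(renamed, link)

    # Reading through the link returns everything that was appended.
    with open(link) as f:
        return f.read()


if __name__ == "__main__":
    # Assumption: a temp directory stands in for the real mount point.
    with tempfile.TemporaryDirectory() as root:
        print(exercise_file_ops(root), end="")
```

Pointing `root` at an actual mount would run the same calls against the networked file system; nothing in the sketch is specific to NFS.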

replies(1): >>42175728 #

3. weinzierl ◴[18 Nov 24 19:03 UTC] No.42175728[source]▶

>>42175277 #

Which is nice and useful, of course, but there are a ton of things that can't reliably be done with that (like running any database that comes to mind), which makes it important to be precise here.

replies(1): >>42175820 #

4. huntaub ◴[18 Nov 24 19:13 UTC] No.42175820{3}[source]▶

>>42175728 #

Is there something specific that you worry about when running a database on a networked file system? I would imagine that any database which is correctly fsync'ing its data to the write-ahead log should work just fine.
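To make "correctly fsync'ing the write-ahead log" concrete, here is a minimal Python sketch of the pattern: append a record, then call `fsync` before reporting success. The function name `append_durably` and the file layout are hypothetical, and this is not how any particular database implements its log.

```python
import os
import tempfile


def append_durably(path: str, record: bytes) -> None:
    """Append one record and return only after fsync has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        # os.write may write fewer bytes than requested, so loop.
        view = memoryview(record)
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the OS (and, over NFS, the server) to make the data durable
        # before the caller treats the record as committed.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Assumption: a temp directory stands in for the database directory.
    with tempfile.TemporaryDirectory() as root:
        wal = os.path.join(root, "wal.log")
        append_durably(wal, b"BEGIN;1\n")
        append_durably(wal, b"COMMIT;1\n")
        with open(wal, "rb") as f:
            print(f.read())
```

The sketch leaves out fsyncing the parent directory after the file is first created, which a real log would also need. Whether `fsync` actually reaches non-volatile storage depends on how the server is configured.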

replies(1): >>42189101 #

5. weinzierl ◴[19 Nov 24 23:16 UTC] No.42189101{4}[source]▶

>>42175820 #

First of all, databases don't support running on NFS. It is an unsupported configuration.

The deeper reason is that the consistency guarantees from NFS (close-to-open consistency) are a lot weaker than what you get from POSIX.

replies(1): >>42192720 #

6. huntaub ◴[20 Nov 24 11:00 UTC] No.42192720{5}[source]▶

>>42189101 #

I don’t know if I agree. For example, Postgres has this [1] to say about using NFS as the backing store. I think that part of the challenge is that there are so many implementation details that differ between NFS servers, and many configuration options that teams can fiddle with (Postgres specifically calls out “async” as dangerous). Close-to-open semantics are actually stronger than what something like XFS offers (because XFS isn’t required to flush data to disk on file close), and databases should be fsyncing their write-ahead logs from the application layer. Like I said, though, this doesn’t mean that every configuration of NFS will work (async, for example, means that NFS servers won’t actually write to non-volatile storage on fsync, which is of course dangerous for any application).

[1] https://www.postgresql.org/docs/current/creating-cluster.htm...