Launch HN: Regatta Storage (YC F24) – Turn S3 into a local-like, POSIX cloud FS

587 points huntaub | 1 comments | 18 Nov 24 16:49 UTC | HN request time: 0.214s | source

Hey HN, I’m Hunter the founder of Regatta Storage (https://regattastorage.com). Regatta Storage is a new cloud file system that provides unlimited pay-as-you-go capacity, local-like performance, and automatic synchronization to S3-compatible storage. For example, you can use Regatta to instantly access massive data sets in S3 with Spark, Pytorch, or pandas without paying for large, local disks or waiting for the data to download.

Check out an overview of how the service works here: https://www.youtube.com/watch?v=xh1q5p7E4JY, and you can try it for free at https://regattastorage.com after signing up for an account. We wanted to let you try it without an account, but we figured that “Hacker News shares a file system and S3 bucket” wouldn’t be the best experience for the community.

I built Regatta after spending nearly a decade building and operating at-scale cloud storage at places like Amazon’s Elastic File System (EFS) and Netflix. During my 8 years at EFS, I learned a lot about how teams thought about their storage usage. Users frequently told me that they loved how simple and scalable EFS was, and -- like S3 -- they didn’t have to guess how much capacity they needed up front.

When I got to Netflix, I was surprised that there wasn’t more usage of EFS. If you looked around, it seemed like a natural fit. Every application needed a POSIX file system. Lots of applications had unclear or spikey storage needs. Often, developers wanted their storage to last beyond the lifetime of an individual instance or container. In fact, if you looked across all Netflix applications, some ridiculous amount of money was being spent on empty storage space because each of these local drives had to be overprovisioned for potential usage.

However, in many cases, EFS wasn’t the perfect choice for these workloads. Moving workloads from local disks to NFS often encountered performance issues. Further, applications which treated their local disks as ephemeral would have to manually “clean up” left over data in a persistent storage system.

At this point, I realized that there was a missing solution in the cloud storage market which wasn’t being filled by either block or file storage, and I decided to build Regatta.

Regatta is a pay-as-you-go cloud file system that automatically expands with your application. Because it automatically synchronizes with S3 using native file formats, you can connect it to existing data sets and use recently written file data directly from S3. When data isn’t actively being used, it’s removed from the Regatta cache, so you only pay for the backing S3 storage. Finally, we’re developing a custom file protocol which allows us to achieve local-like performance for small-file workloads and Lustre-like scale-out performance for distributed data jobs.

Under the hood, customers mount a Regatta file system by connecting to our fleet of caching instances over NFSv3 (soon, our custom protocol). Our instances then connect to the customer’s S3 bucket on the backend, and provide sub-millisecond cached-read and write performance. This durable cache allows us to provide a strongly consistent, efficient view of the file system to all connected file clients. We can perform challenging operations (like directory renaming) quickly and durably, while they asynchronously propagate to the S3 bucket.

We’re excited to see users share our vision for Regatta. We have teams who are using us to build totally serverless Jupyter notebook servers for their AI researchers who prefer to upload and share data using the S3 web UI. We have teams who are using us as a distributed caching layer on top of S3 for low-latency access to common files. We have teams who are replacing their thin-provisioned Ceph boot volumes with Regatta for significant savings. We can’t wait to see what other things people will build and we hope you’ll give us a try at regattastorage.com.

We’d love to get any early feedback from the community, ideas for future direction, or experiences in this space. I’ll be in the comments for the next few hours to respond!

Show context

memset ◴[18 Nov 24 17:31 UTC] No.42174697[source]▶

>>42174204 (OP) #

This is honestly the coolest thing I've seen coming out of YC in years. I have a bunch of questions which are basically related to "how does it work" and please pardon me if my questions are silly or naive!

1. If I had a local disk which was 10 GB, what happens when I try to contend with data in the 50 GB range (as in, more that could be cached locally?) Would I immediately see degradation, or thrashing, at the 10 GB mark?

2. Does this only work in practice on AWS instances? As in, I could run it on a different cloud, but in practice we only really get fast speeds due to running everything within AWS?

3. I've always had trouble with FUSE in different kinds of docker environments. And it looks like you're using both FUSE and NFS mounts. How does all of that work?

4. Is the idea that I could literally run Clickhouse or Postgres with a regatta volume as the backing store?

5. I have to ask - how do you think about open source here?

6. Can I mount on multiple servers? What are the limits there? (ie, a lambda function.)

I haven't played with the so maybe doing so would help answer questions. But I'm really excited about this! I have tried using EFS for small projects in the past but - and maybe I was holding it wrong - I could not for the life of me figure out what I needed to get faster bandwidth, probably because I didn't know how to turn the knobs correctly.

replies(1): >>42174791 #

huntaub ◴[18 Nov 24 17:39 UTC] No.42174791[source]▶

>>42174697 #

Wow, thanks for the nice note! No questions are silly, and I'll also note that we now have a docs site (https://docs.regattastorage.com) and feel free to email me (hleath [at] regattastorage.com) if I don't fully address your questions.

> If I had a local disk which was 10 GB, what happens when I try to contend with data in the 50 GB range (as in, more that could be cached locally?) Would I immediately see degradation, or thrashing, at the 10 GB mark?

We don't actually do caching on your instance's disk. Instead, data is cached in the Linux page cache (in memory) like a regular hard drive, and Regatta provides a durable, shared cache that automatically expands with the working set size of your application. For example, if you were trying to work with data in the 50 GiB range, Regatta would automatically cache all 50 GiB -- allowing you to access it with sub-millisecond latency.

> Does this only work in practice on AWS instances? As in, I could run it on a different cloud, but in practice we only really get fast speeds due to running everything within AWS?

For now, yes -- the speed is highly dependent on latency -- which is highly dependent on distance between your instance and Regatta. Today, we are only in AWS, but we are looking to launch in other clouds by the end of the year. Shoot me an email if there's somewhere specifically that you're interested in.

> I've always had trouble with FUSE in different kinds of docker environments. And it looks like you're using both FUSE and NFS mounts. How does all of that work?

There are a couple of different questions bundled together in this. Today, Regatta exposes an NFSv3 file system that you can mount. We are working on a new protocol which will be mounted via FUSE. However, in Docker environments, we also provide a CSI driver (for use with K8s) and a Docker volume plugin (for use with just Docker) that handles the mounting for you. We haven't released these publicly yet, so shoot me an email if you want early access.

> Is the idea that I could literally run Clickhouse or Postgres with a regatta volume as the backing store?

Yes, you should be able to run a database on Regatta.

> I have to ask - how do you think about open source here?

We are in the process of open sourcing all of the client code (CSI driver, mount helper, FUSE), but we don't have plans currently to open source the server code. We see the value of Regatta in managing the infrastructure so you don't have to, and if we release it via open-source, it would be difficult to run on your own.

> Can I mount on multiple servers? What are the limits there? (ie, a lambda function.)

Yes, you can mount on multiple servers simultaneously! We haven't specifically stress-tested the number of clients we support, but we should be good for O(100s) of mounts. Unfortunately, AWS locks down Lambda so we can't mount arbitrary file systems in that environment specifically.

> efs performance

Yes, the challenge here is specifically around the semantics of NFS itself and the latency of the EFS service. We think we have a path to solving both of these in the next month or two.

replies(4): >>42175084 #>>42177548 #>>42179172 #>>42179460 #

memset ◴[18 Nov 24 18:07 UTC] No.42175084[source]▶

>>42174791 #

Thank you for the detailed answers! Honestly, this project inspires me to work on infrastructure problems.

So you are saying that regatta's own SaaS infrastructure provides the disk caching layer. So you all make sure the pipe between my AWS instance and your servers are very fast and "infinitely scalable", and then the sync to S3 happens after the fact.

replies(1): >>42175112 #

huntaub ◴[18 Nov 24 18:09 UTC] No.42175112[source]▶

>>42175084 #

That's exactly right!

replies(1): >>42183374 #

gregw2 ◴[19 Nov 24 13:48 UTC] No.42183374[source]▶

>>42175112 #

So Regatta has an in memory cache? Does the posix disk write only suceed when the data is in more than one availability zone?

replies(1): >>42184421 #

1. huntaub ◴[19 Nov 24 15:23 UTC] No.42184421[source]▶

>>42183374 #

Hey there! Today, we are replicating cache data within a single availability zone, but we’re working on a multi-availability zone product. If you have a need for multi-AZ, please shoot me an email at hleath [at] regattastorage.com, I’d love to learn more

↑