←back to thread

573 points huntaub | 4 comments | | HN request time: 1.205s | source

Hey HN, I’m Hunter the founder of Regatta Storage (https://regattastorage.com). Regatta Storage is a new cloud file system that provides unlimited pay-as-you-go capacity, local-like performance, and automatic synchronization to S3-compatible storage. For example, you can use Regatta to instantly access massive data sets in S3 with Spark, Pytorch, or pandas without paying for large, local disks or waiting for the data to download.

Check out an overview of how the service works here: https://www.youtube.com/watch?v=xh1q5p7E4JY, and you can try it for free at https://regattastorage.com after signing up for an account. We wanted to let you try it without an account, but we figured that “Hacker News shares a file system and S3 bucket” wouldn’t be the best experience for the community.

I built Regatta after spending nearly a decade building and operating at-scale cloud storage at places like Amazon’s Elastic File System (EFS) and Netflix. During my 8 years at EFS, I learned a lot about how teams thought about their storage usage. Users frequently told me that they loved how simple and scalable EFS was, and -- like S3 -- they didn’t have to guess how much capacity they needed up front.

When I got to Netflix, I was surprised that there wasn’t more usage of EFS. If you looked around, it seemed like a natural fit. Every application needed a POSIX file system. Lots of applications had unclear or spikey storage needs. Often, developers wanted their storage to last beyond the lifetime of an individual instance or container. In fact, if you looked across all Netflix applications, some ridiculous amount of money was being spent on empty storage space because each of these local drives had to be overprovisioned for potential usage.

However, in many cases, EFS wasn’t the perfect choice for these workloads. Moving workloads from local disks to NFS often encountered performance issues. Further, applications which treated their local disks as ephemeral would have to manually “clean up” left over data in a persistent storage system.

At this point, I realized that there was a missing solution in the cloud storage market which wasn’t being filled by either block or file storage, and I decided to build Regatta.

Regatta is a pay-as-you-go cloud file system that automatically expands with your application. Because it automatically synchronizes with S3 using native file formats, you can connect it to existing data sets and use recently written file data directly from S3. When data isn’t actively being used, it’s removed from the Regatta cache, so you only pay for the backing S3 storage. Finally, we’re developing a custom file protocol which allows us to achieve local-like performance for small-file workloads and Lustre-like scale-out performance for distributed data jobs.

Under the hood, customers mount a Regatta file system by connecting to our fleet of caching instances over NFSv3 (soon, our custom protocol). Our instances then connect to the customer’s S3 bucket on the backend, and provide sub-millisecond cached-read and write performance. This durable cache allows us to provide a strongly consistent, efficient view of the file system to all connected file clients. We can perform challenging operations (like directory renaming) quickly and durably, while they asynchronously propagate to the S3 bucket.

We’re excited to see users share our vision for Regatta. We have teams who are using us to build totally serverless Jupyter notebook servers for their AI researchers who prefer to upload and share data using the S3 web UI. We have teams who are using us as a distributed caching layer on top of S3 for low-latency access to common files. We have teams who are replacing their thin-provisioned Ceph boot volumes with Regatta for significant savings. We can’t wait to see what other things people will build and we hope you’ll give us a try at regattastorage.com.

We’d love to get any early feedback from the community, ideas for future direction, or experiences in this space. I’ll be in the comments for the next few hours to respond!

Show context
krawczstef ◴[] No.42174389[source]
Does this compete with Minio?
replies(1): >>42174419 #
1. huntaub ◴[] No.42174419[source]
I don't think so, I see them as complementary. MinIO is great when you have downstream applications which speak the S3 API that need acceleration of that data. Regatta is designed for applications which speak file semantics (think, application logging, storing corpuses of training data, or state) that doesn't run on the S3 API. Regatta actually supports MinIO as an S3-compatible backend for your file system!
replies(2): >>42174615 #>>42177108 #
2. mbreese ◴[] No.42174615[source]
I think it’s more analogous to Minio’s discontinued proxy mode. This is where you’d talk to minio locally (using whatever interface/protocol) and it would act as a local cage for S3 objects. If you wrote to it, it would propagate the changes up to S3 proper (or whomever using the S3 protocol).

I believe they stopped supporting that mode because they didn’t want to keep chasing every S3 protocol change. However, if you’re just using S3, and not trying to masquerade as S3, this problem becomes easier.

3. Jugurtha ◴[] No.42177108[source]
I think it's complementary as well, even more so after MinIO deprecating its Gateway and Filesystem modes a couple of years ago. MinIO is "S3 compatible" object storage, so technically, MinIO users should be able to use your product to have a file-system like experience on their buckets and objects, although you're using IAM and there might be a need either for your client to handle pure S3 credentials, either for a third-party plugin to your client to do that. It could be a good opportunity to piggyback on MinIO's userbase.

We had built an MLOps platform[0] a few years ago and enabled users to use their S3 buckets in a "file system like" manner. This made it possible for them not to have to know or write S3 specific code in their Jupyter notebooks as most people in the industry did with boto3, which also forced them to write code (say using TensorFlow) in a certain way for training to consume the files, err, objects. It was a mess, and we removed that for notebooks that could run the same way on a laptop or on the platform, even with the shell kernel so people could explore objects like files. MLFlow could work on a filesystem or on S3, but it had no authentication, so we built around that to know which user/experiment produced which artifact.

MinIO had a Gateway that was deprecated. We didn't use it much and they didn't have an admin client at the time, so I rolled one up to orchestrate the thing.

One way I did it that hook into users' compute and storage as opposed to offering storage/compute was for two reasons:

- Organizations already had their data somewhere with established policies. Getting them to move that data is very hard (CISO, CTO, IT, legal, engineers). Friction would have been huge.

- Organizations already had budgeted compute and storage, they may have had contracts/discounts/credits with cloud providers and it didn't make sense to ask them to make a decision on budgeting for another solution.

- A design principle of having the product being able to die without leaving the users scrambling to exfil/migrate data.

One way to do it was to handle FUSE, and your mileage may vary (s3fs-fuse, goofys, etc). Amazon has released Mountpoint last year[1], and one question you'll get asked is why use Regatta when I could use Mountpoint?

Less friction for engineers and execs.

In any way, congratulations on the launch, man!

[0]: https://web.archive.org/web/20230325150132/https://iko.ai/

[1]: https://aws.amazon.com/blogs/aws/mountpoint-for-amazon-s3-ge...

replies(1): >>42177486 #
4. huntaub ◴[] No.42177486[source]
We are finding a lot of success in the ML Ops space for exactly this reason. I also completely agree that enterprise customers want to keep their data where they can govern and audit it (often in S3). We're excited about the possibility to allow folks to access and use that data while it stays in S3 for primary storage.

I agree around the questions with Mountpoint, and we're solving a very different set of problems than Mountpoint. Mountpoint, for example, isn't designed to be used with all file applications and lacks support for things like appends to existing files, random writes, renames, and symbolic links. On the other hand, Regatta supports POSIX semantics and can work with nearly all file based applications.