
573 points huntaub | 3 comments

Hey HN, I’m Hunter, the founder of Regatta Storage (https://regattastorage.com). Regatta Storage is a new cloud file system that provides unlimited pay-as-you-go capacity, local-like performance, and automatic synchronization to S3-compatible storage. For example, you can use Regatta to instantly access massive data sets in S3 with Spark, PyTorch, or pandas without paying for large, local disks or waiting for the data to download.

Check out an overview of how the service works here: https://www.youtube.com/watch?v=xh1q5p7E4JY, and you can try it for free at https://regattastorage.com after signing up for an account. We wanted to let you try it without an account, but we figured that “Hacker News shares a file system and S3 bucket” wouldn’t be the best experience for the community.

I built Regatta after spending nearly a decade building and operating at-scale cloud storage at places like Amazon’s Elastic File System (EFS) and Netflix. During my 8 years at EFS, I learned a lot about how teams thought about their storage usage. Users frequently told me that they loved how simple and scalable EFS was, and -- like S3 -- they didn’t have to guess how much capacity they needed up front.

When I got to Netflix, I was surprised that there wasn’t more usage of EFS. If you looked around, it seemed like a natural fit. Every application needed a POSIX file system. Lots of applications had unclear or spiky storage needs. Often, developers wanted their storage to last beyond the lifetime of an individual instance or container. In fact, if you looked across all Netflix applications, some ridiculous amount of money was being spent on empty storage space because each of these local drives had to be overprovisioned for potential usage.

However, in many cases, EFS wasn’t the perfect choice for these workloads. Workloads moved from local disks to NFS often ran into performance issues. Further, applications which treated their local disks as ephemeral would have to manually “clean up” leftover data in a persistent storage system.

At this point, I realized that there was a missing solution in the cloud storage market which wasn’t being filled by either block or file storage, and I decided to build Regatta.

Regatta is a pay-as-you-go cloud file system that automatically expands with your application. Because it automatically synchronizes with S3 using native file formats, you can connect it to existing data sets and use recently written file data directly from S3. When data isn’t actively being used, it’s removed from the Regatta cache, so you only pay for the backing S3 storage. Finally, we’re developing a custom file protocol which allows us to achieve local-like performance for small-file workloads and Lustre-like scale-out performance for distributed data jobs.

Under the hood, customers mount a Regatta file system by connecting to our fleet of caching instances over NFSv3 (soon, our custom protocol). Our instances then connect to the customer’s S3 bucket on the backend, and provide sub-millisecond cached-read and write performance. This durable cache allows us to provide a strongly consistent, efficient view of the file system to all connected file clients. We can perform challenging operations (like directory renaming) quickly and durably, while they asynchronously propagate to the S3 bucket.
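To make the write-back idea above concrete, here is a toy sketch in Python. It is not Regatta's implementation: the class name, the method names, and the use of a plain dict as a stand-in for the S3 bucket are all assumptions made for illustration. It only shows the general pattern of acknowledging writes and renames in a cache while a queue of pending operations is applied to the backend later.

```python
from collections import deque


class WriteBackCache:
    """Toy write-back cache in front of an object store (hypothetical, for illustration)."""

    def __init__(self, backend):
        self.backend = backend   # plain dict standing in for the S3 bucket
        self.cache = {}          # path -> bytes, the view that clients read
        self.pending = deque()   # operations not yet applied to the backend

    def write(self, path, data):
        # The write is visible as soon as the cache holds it; the backend catches up later.
        self.cache[path] = data
        self.pending.append(("put", path, data))

    def read(self, path):
        if path not in self.cache:
            # Cache miss: fetch from the backend and keep a copy.
            self.cache[path] = self.backend[path]
        return self.cache[path]

    def rename_dir(self, old, new):
        # Prefixes are expected to end with "/". Only cached entries are renamed here;
        # a real system would also have to track entries that are not in the cache.
        for path in [p for p in self.cache if p.startswith(old)]:
            self.cache[new + path[len(old):]] = self.cache.pop(path)
        self.pending.append(("rename", old, new))

    def flush(self):
        # Apply queued operations to the backend in the order they were issued.
        while self.pending:
            op = self.pending.popleft()
            if op[0] == "put":
                _, path, data = op
                self.backend[path] = data
            else:
                _, old, new = op
                for path in [p for p in self.backend if p.startswith(old)]:
                    self.backend[new + path[len(old):]] = self.backend.pop(path)
```

In this sketch a directory rename is a single queued operation from the client's point of view, while the per-object moves in the backend happen only when `flush()` runs.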

We’re excited to see users share our vision for Regatta. We have teams who are using us to build totally serverless Jupyter notebook servers for their AI researchers who prefer to upload and share data using the S3 web UI. We have teams who are using us as a distributed caching layer on top of S3 for low-latency access to common files. We have teams who are replacing their thin-provisioned Ceph boot volumes with Regatta for significant savings. We can’t wait to see what other things people will build and we hope you’ll give us a try at regattastorage.com.

We’d love to get any early feedback from the community, ideas for future direction, or experiences in this space. I’ll be in the comments for the next few hours to respond!

Melonotromo ◴[] No.42185334[source]
Your price point is very bad. The overprovisioning statement in your post indicated that you would be a 'cheap' alternative, but 100 GB for $5?

I'm also not sure that it's a good architecture to have your servers between me and my S3. If I'm on one cloud provider, the traffic between their S3 compatible solution and my infrastructure is most of the time in the same cloud provider. And if not, I will for sure have a local cache rcloning the stuff from left to right.

I also don't get your calculator at all.

replies(1): >>42185428 #
1. huntaub ◴[] No.42185428[source]
Thanks for the feedback. If price is the single blocker for teams to try the product, I'd love to discuss more. Please send me an email at hleath [at] regattastorage.com.

> If I'm on one cloud provider, the traffic between their S3 compatible solution and my infrastructure is most of the time in the same cloud provider

This is exactly right, and it's why we're working to deploy our infrastructure to every major cloud. We don't want customers paying egress costs or cross-cloud latency to use Regatta.

> I also don't get your calculator at all.

This could probably use a bit more explanation on the website. We're comparing to the usage of local devices. We find that, most often, teams will only use 15% of the EBS volumes that they've purchased (over a monthly time period). This means that instead of paying $0.125/GiB-mo of storage (like io2 offers), they're actually paying $0.833/GiB-mo of actual bytes stored ($0.125/15%). Whereas on Regatta, they're only paying for what they use -- which is a combination of our caching layer ($0.20) and S3 ($0.025). That averages out closer to $0.10/GiB stored, depending on the amount of data that you use.
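The arithmetic above can be sketched in a few lines of Python. The first function reproduces the $0.125/15% figure directly. The second is only one possible reading of the blended price: it assumes all data is billed at the S3 rate and that some "hot" fraction is additionally billed at the cache rate. That billing model, the function names, and the 37.5% hot fraction are assumptions for illustration, not published pricing rules.

```python
def cost_per_used_gib(list_price, utilization):
    """Effective price per GiB actually stored when only part of a volume is used."""
    return list_price / utilization


def blended_cost(cache_price, s3_price, hot_fraction):
    """Assumed model: every GiB pays the S3 rate, the hot fraction also pays the cache rate."""
    return s3_price + hot_fraction * cache_price


# Provisioned block storage at 15% utilization, using the prices quoted above.
ebs_effective = cost_per_used_gib(0.125, 0.15)

# Under the assumed model, a 37.5% hot fraction lands near the $0.10 figure.
regatta_blended = blended_cost(0.20, 0.025, 0.375)

print(f"EBS effective: ${ebs_effective:.3f}/GiB-mo")
print(f"Blended:       ${regatta_blended:.3f}/GiB-mo")
```

The blended figure moves with the hot fraction, which matches the remark that the average depends on the amount of data you actively use.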

replies(1): >>42185478 #
2. Melonotromo ◴[] No.42185478[source]
What is then your initial latency if I start an AI job 'fresh'? You still need to hit the backend, right? How long do you then keep this data in your cache?

Btw. while your experience works well for Netflix, in my company (also very big), we have LoBs and while different teams utilize their storage in a different way, none of us are aligned on a level that we would benefit directly from your solution.

From a pure curiosity point of view: Do you have already enough customers which have savings? What are their use cases? The size of their setups?

replies(1): >>42186239 #
3. huntaub ◴[] No.42186239[source]
> What is then your initial latency if I start an AI job 'fresh'? You still need to hit the backend, right? How long do you then keep this data in your cache?

That's correct, and it's something that we can tune if there's a specific need. For AI use cases specifically, we're working on adding functionality to "pre-load" the cache with your data. For example, you would be able to call an API that says "I'm about to start a job and I need this directory on the cache". We would then be able to fan out our infrastructure to download that data very quickly (think hundreds of GiB/s) -- much faster than any individual instance could download the data. Then your job would be able to access the data set at low-latency. Does that sound like it would make sense for you?
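As a rough sketch of the fan-out idea, here is how a pre-load step could look in Python. This is not the Regatta API, which is described above as still in progress: `preload`, its parameters, and the `fetch` callable are hypothetical names, and threads on one machine stand in for a fleet of instances.

```python
from concurrent.futures import ThreadPoolExecutor


def preload(keys, fetch, cache, workers=8):
    """Fetch every key in parallel and store the results in `cache` (hypothetical helper)."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so they line up with keys.
        for key, data in zip(keys, pool.map(fetch, keys)):
            cache[key] = data
    return len(keys)


# Usage with a dict standing in for the backing bucket.
bucket = {"dataset/part-0": b"aaa", "dataset/part-1": b"bbb"}
warm = {}
loaded = preload(sorted(bucket), bucket.__getitem__, warm, workers=2)
```

The point of the sketch is only that many workers pull disjoint objects at once, so the aggregate throughput can exceed what a single client could download on its own.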

> Btw. while your experience works well for Netflix, in my company (also very big), we have LoBs and while different teams utilize their storage in a different way, none of us are aligned on a level that we would benefit directly from your solution.

I'm not totally sure what you mean here. I don't anticipate that a large organization would have to buy in to Regatta 100% in order to get benefits. In fact, this is the reason why we are so intent on having a serverless product that "scales to 0". That would allow each of your teams to independently try Regatta without needing to spend hundreds of thousands of dollars on Day 1 for the entire company.

> From a pure curiosity point of view: Do you have already enough customers which have savings? What are their use cases? The size of their setups?

These are pretty intimate details about the business, and I don't think I can share very specific data. However, yes -- we do have customers who are realizing massive savings (50%+) over their existing setups.