What's important here with S3FS is that it supports (1) a fuse client - it just makes life so much easiter - and (2) NVMe storage - so that training pipelines aren't Disk I/O bound (you can't always split files small enough and parallel reading/writing enough to a S3 object store).
Disclaimer: i worked on HopsFS. HopsFS adds tiered storage - NVMe for recent data and S3 for archival.
[ref]: https://www.hopsworks.ai/post/scalable-metadata-the-new-bree...