←back to thread

621 points sebg | 1 comments | | HN request time: 0.218s | source
Show context
londons_explore ◴[] No.43716756[source]
This seems like a pretty complex setup with lots of features which aren't obviously important for a deep learning workload.

Presumably the key necessary features are PB's worth of storage, read/write parallelism (can be achieved by splitting a 1PB file into say 10,000 100GB shards, and then having each client only read the necessary shards), and redundancy

Consistency is hard to achieve and seems to have no use here - your programmers can manage to make sure different processes are writing to different filenames.

replies(2): >>43717054 #>>43717178 #
sungam ◴[] No.43717054[source]
I wonder whether it may have been originally developed for the quantitive hedge fund
replies(2): >>43717714 #>>43724356 #
1. ammo1662 ◴[] No.43724356[source]
Yes, as I mentioned in other comments. The 3FS was designed in 2019. You can check [0] (Chinese)

[0] https://www.high-flyer.cn/blog/3fs/