←back to thread

204 points mfiguiere | 1 comments | | HN request time: 0.199s | source
Show context
GauntletWizard ◴[] No.43540328[source]
When I had the brief displeasure of working on HDFS at Facebook, we took a series of customer meetings to figure out how to get our oldest customers to upgrade their clusters. I was in a meeting with the photos team about what their requirements were and what was blocking them from upgrading, and they were very frank - they asked if the upgrade preserved the internal struct types associated with blocks on the disc servers. They didn't actually use hdfs as a file system, they allocated 1 GB files with zero replication, then used deep reflection to find the extent that comprised them on the discful storage servers, then built their own archival backup file system on top of that. I was horrified. The some of the older hats on the team were less surprised, having had some inkling of what was going on, even though they clearly didn't understand the details. Others considered it tantamount to sacrilege.

I think about this a lot. What they had built was probably actually the best distributed file system within Facebook. It was similarly structured to unraid, and had good availability, durability, and space saving properties, but the approach to engineering was just so wrong headed in my opinion that I couldn't stomach it. Talking about it with other Java programmers within facebook, nobody seemed to mind. Final was just a hint after all.

replies(2): >>43540831 #>>43541532 #
1. dapperdrake ◴[] No.43541532[source]
This approach is surprisingly (or unsurprisingly) one of the most robust.