
144 points ksec | 7 comments
msgodel ◴[] No.44466535[source]
The older I get the more I feel like anything other than the ExtantFS family is just silly.

The filesystem should do files, if you want something more complex do it in userspace. We even have FUSE if you want to use the Filesystem API with your crazy network database thing.

replies(3): >>44466685 #>>44466895 #>>44467306 #
yjftsjthsd-h ◴[] No.44466895[source]
I mean, I'd really like some sort of data error detection (and ideally correction). If a disk bitflips one of my files, ext* won't do anything about it.
replies(3): >>44467338 #>>44468600 #>>44469211 #
timewizard ◴[] No.44467338[source]
> some sort of data error detection (and ideally correction).

That's pretty much built into most mass storage devices already.

> If a disk bitflips one of my files

The likelihood and consequences of this occurring are, in many situations, not worth the overhead of adding additional ECC on top of what the drive does.

> ext* won't do anything about it.

What should it do? Blindly hand you the data without any indication that there's a problem with the underlying block? Without an fsck what mechanism do you suppose would manage these errors as they're discovered?

replies(3): >>44467434 #>>44467818 #>>44468075 #
1. ars ◴[] No.44467818[source]
> The likelihood .. of this occurring

That's one error per 10^14 bits read for a consumer drive. That's just 12TB. A heavy user (lots of videos or games) would see a bit flip a couple of times a year.
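
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes the datasheet figure means "one error per 10^14 bits read" and that errors scale linearly with data read; the 25TB/year read volume is a made-up illustration, not a number from this thread.

```python
# Sketch only: treats the spec as "one error per N bits read" (an assumption).
BITS_PER_BYTE = 8
TB = 10**12  # decimal terabyte, as drive vendors count it


def tb_per_expected_error(bits_per_error: float) -> float:
    """Terabytes read, on average, per one read error."""
    return bits_per_error / BITS_PER_BYTE / TB


def expected_errors(tb_read: float, bits_per_error: float) -> float:
    """Expected number of errors after reading tb_read terabytes."""
    return tb_read * TB * BITS_PER_BYTE / bits_per_error


if __name__ == "__main__":
    # 10^14 bits works out to 12.5 decimal TB, which the comment rounds to 12TB.
    print(tb_per_expected_error(1e14))
    # Hypothetical heavy user reading 25TB per year.
    print(expected_errors(25, 1e14))
```

So "a couple times a year" corresponds to reading roughly 25TB a year under this reading of the spec.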

replies(3): >>44468204 #>>44469358 #>>44469681 #
2. magicalhippo ◴[] No.44468204[source]
I do monthly scrubs on my NAS, I have 8 14-20TB drives that are quite full.

According to that 10^14 metric I should see read errors just about every month. Except I have just about zero.
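
A back-of-the-envelope version of that expectation, as a sketch: it assumes 17TB of data actually read per drive per scrub (a hypothetical midpoint of 14-20TB, since the drives are "quite full"), one error per 10^14 bits read, and independent errors so a Poisson model applies. None of those assumptions are measured values.

```python
import math

BITS_PER_TB = 8 * 10**12  # decimal terabytes


def expected_errors_per_scrub(drives: int, tb_per_drive: float,
                              bits_per_error: float) -> float:
    """Expected read errors in one full scrub of all drives."""
    return drives * tb_per_drive * BITS_PER_TB / bits_per_error


def prob_zero_errors(expected: float) -> float:
    """Poisson probability of seeing no errors given the expected count."""
    return math.exp(-expected)


if __name__ == "__main__":
    # 8 drives, assumed 17TB read from each, spec of 1 error per 1e14 bits.
    per_scrub = expected_errors_per_scrub(8, 17, 1e14)
    print(f"expected errors per scrub: {per_scrub:.2f}")
    print(f"chance of a clean scrub:   {prob_zero_errors(per_scrub):.2e}")
```

Under these assumptions a single clean scrub would already be very unlikely, so years of clean scrubs suggest the spec is a conservative bound rather than a typical rate.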

Current disks are ~4 years old and run 24/7, and excluding a bad cable incident I've had a single case of a read error (recoverable, thanks ZFS).

I suspect those URE numbers are made by the manufacturers figuring out they can be sure the disk will do 10^14, but they don't actually try to find the real number because 10^14 is good enough.

replies(2): >>44469199 #>>44474491 #
3. ars ◴[] No.44469199[source]
If you are using enterprise drives those are 10^16, so that might explain it.
replies(1): >>44469334 #
4. magicalhippo ◴[] No.44469334{3}[source]
Fair, newest ones are, but two of my older current drives are IronWolfs 16TB which are 10^15 in the specs[1], and they've been running for 5.4 years. Again without any read errors, monthly scrubs, and of course daily use.

And before that I have been using 8x WD Reds 3TB for 6-7 years, which have 10^14 in the specs[2], and had the same experience with those.

Yes smaller size, but I ran scrubbing on those biweekly, and over so many years?

[1]: https://www.seagate.com/files/www-content/datasheets/pdfs/ir...

[2]: https://documents.westerndigital.com/content/dam/doc-library...

5. Dylan16807 ◴[] No.44469358[source]
I'm not really sure how you're supposed to interpret those error rates. The average read error probably has a lot more than 1 flipped bit, right? And if the average error affects 50 bits, then you'd expect 50x fewer errors? But I have no idea what the actual histogram looks like.
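
The "50x fewer" guess can be written down as a sketch. It assumes the spec counts individual bad bits and that each error event corrupts a fixed number of bits; both are assumptions, and the 50-bit figure is just the hypothetical from the comment, not a measured value.

```python
def expected_error_events(bits_read: float, bits_per_bad_bit: float,
                          bad_bits_per_event: float) -> float:
    """Expected number of distinct error events.

    bits_per_bad_bit: spec figure, read as bits read per single bad bit.
    bad_bits_per_event: assumed average bad bits in one error event.
    """
    expected_bad_bits = bits_read / bits_per_bad_bit
    return expected_bad_bits / bad_bits_per_event


if __name__ == "__main__":
    bits_read = 1e15  # hypothetical amount of data read
    single = expected_error_events(bits_read, 1e14, 1)
    clustered = expected_error_events(bits_read, 1e14, 50)
    print(single, clustered, single / clustered)
```

If the spec instead counts failed sector reads rather than bad bits, this scaling does not apply, which is why the actual histogram matters.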
6. timewizard ◴[] No.44469681[source]
Is that raw error rate or uncorrected error rate?
7. ryao ◴[] No.44474491[source]
> I suspect those URE numbers are made by the manufacturers figuring out they can be sure the disk will do 10^14, but they don't actually try to find the real number because 10^14 is good enough.

I am inclined to agree. However, I have one thought to the contrary. When a mechanical drive is failing, you tend to have debris inside the drive hitting the platters, causing damage that creates more debris, accelerating the drive’s eventual death, with read errors becoming increasingly common while it happens. When those are included in averages, the 10^14 might very well be accurate. I have not done any rigorous analysis to justify this thought and I do not have the data to be able to do that analysis. It is just something that occurs to me that might justify the 10^14 figure.