144 points ksec | 1 comments
msgodel ◴[] No.44466535[source]
The older I get the more I feel like anything other than the ExtantFS family is just silly.

The filesystem should do files; if you want something more complex, do it in userspace. We even have FUSE if you want to use the filesystem API with your crazy network database thing.

replies(3): >>44466685 #>>44466895 #>>44467306 #
yjftsjthsd-h ◴[] No.44466895[source]
I mean, I'd really like some sort of data error detection (and ideally correction). If a disk bitflips one of my files, ext* won't do anything about it.
replies(3): >>44467338 #>>44468600 #>>44469211 #
timewizard ◴[] No.44467338[source]
> some sort of data error detection (and ideally correction).

That's pretty much built into most mass storage devices already.

> If a disk bitflips one of my files

The likelihood and consequences of this occurring are, in many situations, not worth the overhead of adding additional ECC on top of what the drive does.

> ext* won't do anything about it.

What should it do? Blindly hand you the data without any indication that there's a problem with the underlying block? Without an fsck what mechanism do you suppose would manage these errors as they're discovered?

replies(3): >>44467434 #>>44467818 #>>44468075 #
1. yjftsjthsd-h ◴[] No.44468075[source]
On your first couple of points: I trust hardware less than you do.

> What should it do? Blindly hand you the data without any indication that there's a problem with the underlying block?

Well, that's what it does now, and I think that's a problem.

> Without an fsck what mechanism do you suppose would manage these errors as they're discovered?

Linux can fail a read, and IMHO it should do so if it cannot return correct data. (I support the ability to override this and tell it to give you the corrupted data, but certainly not by default.) On ZFS, if a read fails its checksum, the filesystem first tries to get a valid copy (e.g. from a mirror, or from a second copy if you've set copies=2). If the error can't be recovered, the file read fails and the system reports and records the failure. At that point the user should probably run a full scrub (which for our purposes should probably count as fsck) and restore the affected file(s) from backup. (Or possibly go buy a new hard drive, depending on the extent of the problem.) I would consider that ideal.
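
To make the read path above concrete, here is a rough Python sketch of "verify, fall back to a redundant copy, otherwise fail the read". It is an illustration only and not how ZFS is actually implemented: `checksum` and `verified_read` are hypothetical names, SHA-256 stands in for whatever checksum the filesystem uses, and the "copies" are plain byte strings rather than blocks on separate devices.

```python
import errno
import hashlib


def checksum(data: bytes) -> bytes:
    # Stand-in checksum (assumption: SHA-256 chosen only for illustration).
    return hashlib.sha256(data).digest()


def verified_read(copies, expected):
    """Return (data, bad_indices) for the first copy matching `expected`.

    `copies` is a list of candidate byte strings (e.g. one per mirror side).
    `bad_indices` lists the copies that failed verification before a good
    one was found, so a caller could report or repair them.
    Raises OSError(EIO) if no copy verifies, instead of returning bad data.
    """
    bad_indices = []
    for index, data in enumerate(copies):
        if checksum(data) == expected:
            return data, bad_indices
        bad_indices.append(index)
    raise OSError(errno.EIO, "checksum mismatch on every available copy")


if __name__ == "__main__":
    good = b"important file contents"
    expected = checksum(good)
    # Simulate a single bit flip in the first copy.
    flipped = bytes([good[0] ^ 0x01]) + good[1:]

    data, bad = verified_read([flipped, good], expected)
    print("recovered from redundant copy; bad copies:", bad)

    try:
        verified_read([flipped], expected)
    except OSError as err:
        print("read failed:", err)
```

The point of the sketch is the default: with no good copy available the caller gets an I/O error, not silently corrupted bytes. An explicit override that returns the bad data anyway would be a separate, opt-in path.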