The filesystem should do files; if you want something more complex, do it in userspace. We even have FUSE if you want to use the Filesystem API with your crazy network database thing.
That's pretty much built into most mass storage devices already.
> If a disk bitflips one of my files
The likelihood and consequence of this occurring are, in many situations, not worth the overhead of adding additional ECC on top of what the drive does.
> ext* won't do anything about it.
What should it do? Blindly hand you the data without any indication that there's a problem with the underlying block? Without an fsck what mechanism do you suppose would manage these errors as they're discovered?
> That's pretty much built into most mass storage devices already.
And ZFS has shown that it is not sufficient (at least for some use cases; perhaps less of a big deal for 'residential' users).
> The likelihood and consequence of this occurring are, in many situations, not worth the overhead of adding additional ECC on top of what the drive does.
Not worth it to whom? Not having the option available at all is the problem. I can do a zfs set checksum=off pool_name/dataset_name if I really want that extra couple percentage points of performance.
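For example, a minimal sketch (pool and dataset names here are placeholders):

    # Trade integrity checking for a little performance (rarely advisable)
    zfs set checksum=off tank/scratch

    # Confirm the property took effect
    zfs get checksum tank/scratch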
> Without an fsck what mechanism do you suppose would manage these errors as they're discovered?
Depends on the data involved: if it's part of the file system tree metadata, there are often multiple copies even for a single disk on ZFS. So instead of the kernel consuming corrupted data and potentially panicking (or going off into the weeds), it can find a correct copy elsewhere.
If you're in a fancier configuration with some level of RAID, then there could be other copies of the data, or it could be rebuilt through ECC.
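A sketch of how that plays out in practice (pool name and device paths are hypothetical): a scrub walks every block, verifies it against its checksum, and rewrites any bad copies from a good replica.

    # Mirrored pool: every block has a copy on each disk (hypothetical devices)
    zpool create tank mirror /dev/sda /dev/sdb

    # Verify all blocks against their checksums; repair bad copies from the mirror
    zpool scrub tank

    # Show any checksum errors that were found (and repaired)
    zpool status -v tank

The same repair also happens on a normal read that hits a bad block, so a scrub isn't strictly required to benefit.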
With ext*, LVM, and mdadm, no such possibility exists because there are no checksums at any of those layers (perhaps if you glom on dm-integrity?).
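If you did glom on dm-integrity, the standalone setup looks roughly like this (device name is a placeholder, and the format step destroys existing data). Note that dm-integrity by itself only detects corruption; you'd still want md RAID on top of the integrity devices if you want the stack to repair it:

    # Write per-sector checksum metadata to the device (destroys existing data)
    integritysetup format /dev/sdX --integrity sha256

    # Map it; reads that fail checksum verification return an I/O error
    integritysetup open /dev/sdX integ0 --integrity sha256

    # Then stack the usual layers on top, e.g.:
    mkfs.ext4 /dev/mapper/integ0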
And with ZFS one can set copies=2 on a per-dataset basis (perhaps just for /home?), and get multiple copies strewn across the disk: won't save you from a drive dying, but could save you from corruption.
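Something like (again, names are placeholders):

    # Store two copies of every block in this dataset
    zfs set copies=2 tank/home

    # Applies to newly written data only; existing blocks keep one copy until rewritten
    zfs get copies tank/home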
I looked at that (dm-integrity), in hopes of being able to protect my data. Unfortunately, I considered this something of a fatal flaw:
> It uses journaling for guaranteeing write atomicity by default, which effectively halves the write speed.