
144 points ksec | 8 comments
criticalfault ◴[] No.44466573[source]
I've been following this for a while now.

Kent is in the wrong. If I held a lead position in development, I would kick Kent off the team.

It's one thing to challenge things. What Kent is doing is something completely different. It is obvious he introduced a feature, not just a bugfix.

If the rules say that rc1 and later get only bugfixes, then it is absolutely clear what happens to a feature. Tolerating this once or twice is OK, but Kent does this all the time, testing Linus.

Linus is absolutely in the right to kick this out and it's Kent's fault if he does so.

replies(8): >>44466668 #>>44467387 #>>44467968 #>>44468790 #>>44468966 #>>44469158 #>>44470642 #>>44470736 #
bgwalter ◴[] No.44467968[source]
bcachefs is experimental and Kent writes in the LWN comments that nothing would get done if he didn't develop it this way. Filesystems are a massive undertaking and you can have all the rules you want. It doesn't help if nothing gets developed.

It would be interesting to know how strict the rules are in the Linux kernel for other people. Other projects have nepotistic structures where some developers can do what they want but others cannot.

Anyway, if Linus had developed the kernel with this kind of strictness from the beginning, maybe it wouldn't have taken off. I don't see why experimental features should follow the rules for stable features.

replies(3): >>44468097 #>>44471052 #>>44471394 #
yjftsjthsd-h ◴[] No.44468097[source]
If it's an experimental feature, then why not let changes go into the next version?
replies(1): >>44468133 #
bgwalter ◴[] No.44468133[source]
That is a valid objection, but I still think that for some huge and difficult features the month-long pauses imposed by release cycles are absolutely detrimental.

Ideally they'd be developed outside the kernel until they are perfect, but Kent addresses this in his LWN comment: There is no funding/time to make that ideal scenario possible.

replies(3): >>44468166 #>>44468709 #>>44473730 #
Analemma_ ◴[] No.44468709[source]
This position seems so incoherent. If it’s so experimental, why is it in the mainline kernel? And why are fixes so critical they can’t wait for a merge window? Who is using an “experimental” filesystem for mission-critical work that also has to be on untested bleeding-edge code?

Like the sibling commenter, I suspect the word “experimental” is being used here to try and evade the rules that, somehow, every other part of the kernel manages to do just fine with.

replies(2): >>44468887 #>>44470216 #
koverstreet ◴[] No.44468887[source]
No, you have to understand that filesystems are massive (decade+) projects, and one of the key things you have to do with anything that big, that has to work that perfectly, is a very gradual rollout: starting with the more risk-tolerant users and gradually expanding to a wider and wider set of users.

We're very far along in that process now, but it's still marked as experimental because it is not quite ready for widespread deployment by just anyone. 6.16 is getting damn close, though.

That means a lot of our users now are people getting it from distro kernels, who often have never compiled a kernel before - nevertheless, they can and do report bugs.

And no matter where you are in the rollout, when you get bug reports you have to fix them and get the fixes out to users in a timely manner so that they can keep running, keep testing and reporting bugs.

It's a big loss if a user has to wait 3 months for a bugfix - they'll get frustrated and leave, and a big part of what I do is building a community that knows how the system works, how to help debug, and how to report those bugs.

A very common refrain I get is "it's experimental <expletive deleted>, why does it matter?" - and, well, the answer is getting fixes out in a timely manner matters just as much if not more if we want to get this thing done in a timely manner.

replies(7): >>44469116 #>>44469832 #>>44470468 #>>44471432 #>>44472307 #>>44472645 #>>44476867 #
1. orbisvicis ◴[] No.44469116[source]
Isn't this the point of DKMS, to decouple module code from kernel code?
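(For readers unfamiliar with the mechanism being suggested here: DKMS keeps a module's source under /usr/src and rebuilds it against each installed kernel, driven by a small dkms.conf. A minimal sketch; the module name, version, and paths below are illustrative, not the actual bcachefs packaging:)

```shell
# Hypothetical /usr/src/bcachefs-1.0/dkms.conf for an out-of-tree
# filesystem module. Field names are standard DKMS; values are examples.
PACKAGE_NAME="bcachefs"
PACKAGE_VERSION="1.0"
BUILT_MODULE_NAME[0]="bcachefs"
DEST_MODULE_LOCATION[0]="/kernel/fs"
AUTOINSTALL="yes"   # rebuild automatically when a new kernel is installed

# Typical one-time registration and install (run as root):
#   dkms add -m bcachefs -v 1.0
#   dkms install -m bcachefs -v 1.0
```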
replies(2): >>44469218 #>>44469533 #
2. koverstreet ◴[] No.44469218[source]
Well, my hope when bcachefs was merged was for it to be a real kernel community project.

At the time it looked like that could happen - there was real interest from Redhat prior to merging. Sadly Redhat's involvement never translated into much code, and while upstreaming did get me a large influx of users - many of which have helped enormously with the QA and stabilization effort - the drama and controversies have kept developers away, so on the whole it's meant more work, pressure and stress for me.

So DKMS wouldn't be the worst route, at this point. It would be a real shame though, this close to taking the experimental label off, and an enormous hassle for users and distributions.

But it's not my call to make, of course. I just write code...

3. webstrand ◴[] No.44469533[source]
DKMS is an awful user experience; it's an easy way to render a system unbootable. I hope Linus doesn't force existing users, like me, down that path. It's why I avoid zfs, however good it may be.
replies(3): >>44470382 #>>44470523 #>>44472183 #
4. mroche ◴[] No.44470382[source]
DKMS isn't a "fire and forget" kind of tool, but it comes reasonably close most of the time. I would say it's a far cry from awful, though.
replies(1): >>44473979 #
5. yjftsjthsd-h ◴[] No.44470523[source]
One of my machines runs root on ZFS via DKMS. I will grant that it is annoying, and it used to be worse, but I don't think it's been quite as bad as all that for a very long time. I would also argue that it's more acceptable for testing actively developed stuff that's getting the bugs worked out in order to work towards mainlining.

That said, I vaguely seem to recall that bcachefs ended up involving changes to other parts of the kernel to better support it; if that's true then DKMS is likely to be much more painful if not outright impossible. It's fine to compile one module (or even several) against a normal kernel, but the moment you have to patch parts of the "main" kernel it's gonna get messy.
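(The distinction matters mechanically: kbuild lets a self-contained module compile against an unmodified kernel's headers, but it has no way to carry patches to the kernel itself. A minimal out-of-tree module Makefile, with a hypothetical module name, shows the shape of what DKMS automates:)

```make
# Kbuild recipe for a self-contained out-of-tree module ("demo_fs" is
# a hypothetical name). This only works while the module sticks to
# exported kernel interfaces; changes to core kernel code cannot be
# shipped this way, which is the limitation described above.
obj-m := demo_fs.o

KDIR ?= /lib/modules/$(shell uname -r)/build

all:
	$(MAKE) -C $(KDIR) M=$(CURDIR) modules

clean:
	$(MAKE) -C $(KDIR) M=$(CURDIR) clean
```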

6. krageon ◴[] No.44472183[source]
ZFS should be avoided because it has too many dumb complete-failure states (having run it in a real production storage environment), not because it's distributed via DKMS.
replies(1): >>44474783 #
7. webstrand ◴[] No.44473979{3}[source]
I think my problem is that it's just close enough to being fire-and-forget that I forget how to do the recovery when it misfires. It usually seems to crop up when I'm on vacation or something and I don't have my tools.
8. cyberpunk ◴[] No.44474783{3}[source]
I’ve run racks and racks of it in prod also. What are these dumb complete failure states you mean?