
216 points | ksec
Volundr No.45076237
Damn. I was enjoying not having to deal with the fun of ZFS and DKMS, but it seems like bcachefs will now be in the same boat: either dealing with DKMS and occasional breakage, or sticking with a kernel version that slowly gets more and more out of date.
replies(4): >>45076451 #>>45076664 #>>45076776 #>>45076818 #
WD-42 No.45076818
The article says that bcachefs is not being removed from the mainline kernel. This looks mostly like a workaround so that Linus and other kernel devs don't have to deal with Kent directly.
replies(2): >>45077114 #>>45077151 #
tux3 No.45077151
It's complicated; no one really knows what "externally maintained" entails at the moment. Linus is not exactly poised to pull directly from Kent, and there is no solution lined up.

Both Linus and Kent drive a hard bargain, and it's not as simple as finding someone else to blindly forward bcachefs patches. At the first sign of conflict, the poor person in the middle would have no power, no way to make anyone back down, and we'd be back to square one.

It's in limbo, and there is still time, but if left to bitrot it will be removed eventually.

replies(1): >>45077281 #
immibis No.45077281
That person would be accountable to Linus, but not to Kent.
replies(1): >>45077372 #
tux3 No.45077372
Unfortunately, there's also nothing they can do if Kent says no. Say there's a disagreement over a patch that touches something outside fs/bcachefs: that person can't exactly write their own patches incorporating the feedback, and they're not going to fork and maintain their own patch series. They'd be stuck between a rock and a hard place, and that gets us back to a deadlock.

The issue is that I have never seen Kent back down a single time. Kent will explain in detail why the rules are bullshit and don't apply in this particular case, every single time, without any room for compromise.

If the only problem was when to send patches, that would be one thing. But disagreements over patches aren't just a timing problem that can be routed around.

replies(1): >>45078945 #
koverstreet No.45078945
The key thing here is that I've never challenged Linus's authority on patches outside fs/bcachefs/; I've quietly respun pull requests for that on more than one occasion.

The point of contention here was a patch within fs/bcachefs/, which was repair code to make sure users didn't lose data.

If we can't have clear boundaries and delineations of responsibility, there really is no future for bcachefs in the kernel; my core mission is a rock solid commitment to reliability and robustness, including being responsive to issues users hit, and we've seen repeatedly that the kernel process does not share those priorities.

replies(2): >>45079197 #>>45082574 #
tux3 No.45079197
You may be right, but I think looking at it through the lens of who has authority and can impose their decision still illustrates the point I'm trying to make.

To some extent drawing clear boundaries is good as a last resort when people cannot agree, but it can't be the main way to resolve disagreements. Thinking in terms of who owns what and has the final say is not the same as trying to understand the requirements from the other side to find a solution that works for everyone.

I don't think the right answer is to blindly follow whatever Linus or other people say; I don't mean you should automatically back down without technical reasons just because authority says so. But I notice I can't remember an email where concessions were made, or attempts to find a middle ground by understanding the other side. Maybe someone can find counterexamples.

But this idea of using ownership to decide who has more authority and can impose their vision can't be the only way to collaborate. It really is uncompromising.

replies(1): >>45079247 #
koverstreet No.45079247
> To some extent drawing clear boundaries is good as a last resort when people cannot agree, but it can't be the main way to resolve disagreements. Thinking in terms of who owns what and has the final say is not the same as trying to understand the requirements from the other side to find a solution that works for everyone.

Agreed 100%. In an ideal world, we'd be sitting down together, figuring out what our shared priorities are, and working from there.

Unfortunately, that hasn't been possible, and I have no idea what Linus's priorities are, except that they definitely aren't a bulletproof filesystem and safeguarding user data; his response to journal_rewind demonstrated that quite definitively.

So that's where we're at, and given the history with other local filesystems I think I have good reason not to concede. I don't want to see bcachefs run off the rails, but given all the times I've talked about process and the way I'm doing things I think that's exactly what would happen if I started conceding on these points. It's my life's work, after all.

You'd think bcachefs's track record (e.g. bug tracker, syzbot) and the response it gets from users would be enough, but apparently not, sadly. But given the way the kernel burns people out and outright ejects them, not too surprising.

replies(2): >>45079299 #>>45079421 #
magicalhippo No.45079421
> Unfortunately, that hasn't been possible, and I have no idea what Linus's priorities are, except that they definitely aren't a bulletproof filesystem and safeguarding user data

Remarks like this come across as extremely patronizing, as you completely ignore what the other party says and instead project your own conclusions about the other person's motives and beliefs.

> his response to journal_rewind demonstrated that quite definitively

No, it did not do that in any way, shape or form. You had multiple other perfectly valid options to help the affected users besides getting that code shipped in the kernel there and then. Getting it shipped in the kernel was merely a convenience.

If bcachefs were established and stable, it would be a different matter. But it's an experimental file system; by definition, data loss is to be expected, even if recovery is preferable.

replies(1): >>45079458 #
koverstreet No.45079458
No, bcachefs-tools wasn't an option because the right way to do this kind of repair is to first do a dry run test repair and mount, so you can verify with your eyes that everything is back as it should be.

If we had the FUSE driver done, that would have worked, though. It's still not completely ideal, because we're at the mercy of distros to get -tools updates out in a timely manner, and they're not always as consistent with that as they are with the kernel (most are good, though).

Just making it available in a git repo was not an option because lots of bcachefs users are getting it from their distro kernel and have never built a kernel before (yes, I've had to help users with building kernels for the first time; it's slow and we always look for other options), and even if you know how, if your primary machine is offline the last thing you want to have to do is build a custom rescue image with a custom kernel.

And there was really nothing more special about this than any other bugfix, besides needing to use a new option (which is also something that occasionally happens with hotfixes).

Bugs are just a fact of life, every filesystem has bugs and occasionally has to get hotfixes out quickly. It's just not remotely feasible or sane to be coming up with our own parallel release process for hotfixes.

replies(1): >>45079860 #
magicalhippo No.45079860
That you or the user dislike some of the downsides does not invalidate an option.

I will absolutely agree with you that merging that repair code would have been vastly preferable for you and the users. And again, if bcachefs were mature and stable, I absolutely think users should get a way to repair ASAP.

But bcachefs is currently experimental, and thus one can reasonably expect users to be prepared to deal with the consequences of that. Hence the kernel team, with Linus at the top, should be able to assume this when making decisions.

If you have users who are not prepared for this, you have a different problem and should figure out how to fix that ASAP. The best fix would probably be to dissuade such users from installing it in the first place. In any case, not doing something to prevent that scenario would be a disservice to those users.

replies(1): >>45080136 #
koverstreet No.45080136
bcachefs has had active users, with real data that they want to protect, since before it was merged.

A lot of the bcachefs users are using it explicitly because they've been burned by btrfs and need something more reliable.

I am being much, much more conservative about removing the experimental label than has been past practice, but I have been very explicit that while it may not be perfect yet and users should expect some hiccups, I support it like any other stable production filesystem.

That's been key to getting it stabilized: setting high expectations. Users know that if they find a critical bug, it's going to be top priority.

replies(1): >>45080448 #
magicalhippo No.45080448
Given the bug fixes and changes, the experimental flag seems quite appropriate to me. That's not a bad thing.

However, it was put in the kernel as experimental. That carries with it implications.

As such, while it's very commendable that you wish to support the experimental bcachefs as if it were production ready, you cannot reasonably impose that wish upon the rest of the kernel.

That said, I think you and your small team are doing a commendable job, and I strongly hope you succeed in making bcachefs feature complete and production ready. And I say that as someone who really, really likes ZFS and runs it on my Linux boxes.

replies(1): >>45085756 #
koverstreet No.45085756
All I need to support bcachefs is for the same rules to apply as they are to every other subsystem.
replies(1): >>45086543 #
magicalhippo No.45086543
From what I have read and recall, the same rules do apply.

Rather, the disagreement seems to be over what constitutes a feature and what constitutes a bugfix.

As I recall, your view is that the repair code is part of the bugfix. However, Linus deemed it a feature, and thus applied the "no new features outside the merge window" rule.

I think Linus is correct here and you are wrong. New code made to repair flaws that previously could not be repaired is definitely a new feature of the repair tool.

On the other hand, I am sympathetic to your argument that this is, after all, an experimental filesystem with different needs than, say, a stable hardware driver, and as I recall the repair tool changes were entirely contained in the bcachefs subtree. As such, the worst they could do was fail compilation on certain platforms, which had already happened previously.

Personally, I would have dropped the bugfix-vs-feature debate and focused on trying to get Linus to allow the repair code in as a new feature. From what I recall of what Linus said, you had already burned some goodwill with the previous kernel compilation failure, but perhaps he could have changed his stance if you had worked with him.

replies(1): >>45087195 #
koverstreet No.45087195
New features go in during RCs all the time.

The hard rule you're thinking of doesn't exist; it's all risk vs. reward.