
215 points ksec | 3 comments
tarruda ◴[] No.45077138[source]
Since the existing bcachefs driver will not be removed, and the problem is the bcachefs developer not following the rules, I wonder if someone else could take on the role of pulling bcachefs changes into the mainline, while also following the merge window rules.
replies(1): >>45078845 #
koverstreet ◴[] No.45078845[source]
No, the problem wasn't about following the rules.

The patch that kicked off the current conflict was the 'journal_rewind' patch; we recently (6.15) had the worst bug in the entire history upstream - it was taking out entire subvolumes.

The third report got me a metadata dump with everything I needed to debug the issue, thank god, and now we have a great deal of hardening to ensure a bug like this can never happen again. Subsequently, I wrote new repair code, which fully restored the filesystem of the 3rd user hit by the bug (first two had backups).

Linus then flipped out because it was listed as a 'feature' in the pull request; it was only listed that way to make sure that users would know about it if they were affected by the original bug and needed it. Failure to maintain your data is always a bug for a filesystem, and repair code is a bugfix.

In the private maintainer thread, and even in public, things went completely off the rails, with Linus and Ted basically asserting that they knew better than I do which bcachefs patches are regression risks (seriously), a page-and-a-half rant from Linus on how he doesn't trust my judgement, and a whole lot more.

There have been many repeated arguments like this over bugfixes.

The thing is, since then I started perusing pull requests from other subsystems, and it looks like I've actually been more conservative with what I consider a critical bugfix (and send outside the merge window) than other subsystems. The _only_ thing that's been out of the ordinary with bcachefs has been the volume of bugfixes - but that's exactly what you'd expect to see from a new filesystem that's stabilizing rapidly and closing out user bug reports - high volume of pure bugfixing is exactly what you want to see.

So given that, I don't think having a go-between would solve anything.

replies(6): >>45079059 #>>45079670 #>>45080227 #>>45081254 #>>45082752 #>>45083951 #
nirava ◴[] No.45079059[source]
To list down the current state of things:

1. Regardless of whether correct or not, it's Linus that decides what's a feature and what's not in Linux. Like he has for the last however many decades. Repair code is a feature if Linus says it is a feature.

2. Being correct comes second to being agreeable in human-human interactions. For example, dunking on x file system does not work as a defense when the person opposite you is a x file system maintainer.

3. Rules are rules, and generally don't have to be "correct" to be enforced in an organization.

I think your perceived "unfairness" might make sense if you just thought of these things as un-workaroundable constraints, just like the fact that SSDs wear out over time.

replies(2): >>45079083 #>>45081078 #
koverstreet ◴[] No.45079083[source]
When rules and authority start to take precedence over making sure things work, things have gone off the rails and we're not doing engineering anymore.
replies(4): >>45079758 #>>45080373 #>>45081150 #>>45083228 #
ranger_danger ◴[] No.45079758[source]
I think this attitude is exactly why this happened. I would have done the same thing.

Do you argue with your school teachers that your book report shouldn't be due on Friday because it's not perfect yet?

I read several of your response threads across different websites. The most interesting to me was LWN, about the debian tools, where an actual psychologist got involved.

All the discussions seem to show the same issue: You disagree with policies held by people higher up than you, and you struggle with respecting their decisions and moving on.

Instead you keep arguing about things you can't change, and that leads people to getting frustrated and walking away from you.

It really doesn't matter how "right" you may be... not your circus, not your monkeys.

replies(2): >>45080026 #>>45081188 #
charcircuit ◴[] No.45080026[source]
Your analogy fails to account for the fact that after "Friday" bug fixes are still allowed. A file system losing your files sounds like a bug to me.

Edit since you expanded your post:

>The most interesting to me was LWN, about the debian tools, where an actual psychologist got involved.

To me the comment was patronizing, implying it was purely due to bad communication on Kent's end, and it shows how immature the people running these operating systems are. They put priority on processes over the end user.

>respecting their decisions and moving on.

When this causes real pain for end users, it validates that the decision was wrong.

> really doesn't matter how "right" you may be... not your circus

It does because it causes reputational damage for bcachefs. Even beyond reputational damage, delivering a good product to end users should be a priority. In my opinion projects as big as Debian causing harm to users should be called out instead of ignored. Otherwise, practices like replacing dependencies out from underneath programs can become standard.

replies(2): >>45080576 #>>45084011 #
wavemode ◴[] No.45080576[source]
You still seem to be arguing that shipping the change was the "right" thing to do. But that's not what's in dispute. Rather, it is that if what you think is right and what the person who makes the rules thinks is right are in disagreement, the adult thing to do is not to simply disregard the rules (and certainly not repeatedly, after being warned not to).

This is the difference between being smart and being wise. If the point of all this grandstanding was that it's so incredibly and vitally important for these patches to get into the kernel, well, guess what: now, due to all this drama, this part of the kernel is going to go unmaintained entirely. Is that good for the users? Did that help our stated goal in any way? No.

replies(1): >>45080659 #
charcircuit ◴[] No.45080659[source]
>the adult thing to do is not to simply disregard the rules

The adult thing is to do best by the users. Critical file system bugs are worth blocking the release of any serious operating system in the real world as there is serious user impact.

>Is that good for the users?

I think it's complicated. It could allow for a faster release schedule for bug fixes which can allow for addressing file system issues faster.

replies(2): >>45080980 #>>45081134 #
1. saubeidl ◴[] No.45081134[source]
I don't think getting the FS kicked out of the kernel is doing best by the users.

Good engineering requires long term thinking.

replies(1): >>45081788 #
2. procaryote ◴[] No.45081788[source]
There's more than bcachefs in the kernel. If dealing with bcachefs takes an inordinate amount of time and effort, dropping it is the right move.

I don't know the situation well enough to review where they drew the line, but there definitely should be a line somewhere.

replies(1): >>45081815 #
3. saubeidl ◴[] No.45081815[source]
That was my point exactly.