214 points ksec | 89 comments
tarruda ◴[] No.45077138[source]
Since the existing bcachefs driver will not be removed, and the problem is the bcachefs developer not following the rules, I wonder if someone else could take on the role of pulling bcachefs changes into the mainline, while also following the merge window rules.
replies(1): >>45078845 #
1. koverstreet ◴[] No.45078845[source]
No, the problem wasn't following the rules.

The patch that kicked off the current conflict was the 'journal_rewind' patch; we recently (6.15) had the worst bug in the entire history upstream - it was taking out entire subvolumes.

The third report got me a metadata dump with everything I needed to debug the issue, thank god, and now we have a great deal of hardening to ensure a bug like this can never happen again. Subsequently, I wrote new repair code, which fully restored the filesystem of the 3rd user hit by the bug (first two had backups).

Linus then flipped out because it was listed as a 'feature' in the pull request; it was only listed that way to make sure that users would know about it if they were affected by the original bug and needed it. Failure to maintain your data is always a bug for a filesystem, and repair code is a bugfix.

In the private maintainer thread, and even in public, things went completely off the rails, with Linus and Ted basically asserting that they knew better than I do which bcachefs patches are regression risks (seriously), and a page and a half rant from Linus on how he doesn't trust my judgement, and a whole lot more.

There have been many repeated arguments like this over bugfixes.

The thing is, since then I started perusing pull requests from other subsystems, and it looks like I've actually been more conservative with what I consider a critical bugfix (and send outside the merge window) than other subsystems. The _only_ thing that's been out of the ordinary with bcachefs has been the volume of bugfixes - but that's exactly what you'd expect to see from a new filesystem that's stabilizing rapidly and closing out user bug reports - high volume of pure bugfixing is exactly what you want to see.

So given that, I don't think having a go-between would solve anything.

replies(6): >>45079059 #>>45079670 #>>45080227 #>>45081254 #>>45082752 #>>45083951 #
2. nirava ◴[] No.45079059[source]
To list down the current state of things:

1. Regardless of whether correct or not, it's Linus that decides what's a feature and what's not in Linux. Like he has for the last however many decades. Repair code is a feature if Linus says it is a feature.

2. Being correct comes second to being agreeable in human-human interactions. For example, dunking on x file system does not work as a defense when the person opposite you is a x file system maintainer.

3. Rules are rules, and generally don't have to be "correct" to be enforced in an organization.

I think your perceived "unfairness" might make sense if you just thought of these things as un-workaroundable constraints, just like the fact that SSDs wear out over time.

replies(2): >>45079083 #>>45081078 #
3. koverstreet ◴[] No.45079083[source]
When rules and authority start to take precedence over making sure things work, things have gone off the rails and we're not doing engineering anymore.
replies(4): >>45079758 #>>45080373 #>>45081150 #>>45083228 #
4. immibis ◴[] No.45079670[source]
The problem was that you weren't following the rules.

The rules were clear about the right time to merge things so they get in the next version, and if you don't, they will have to get in the version after that. I don't know the specific time since I'm not a kernel developer, but there was one.

Linus is trying to run the release cycle on a strict schedule, like a train station. You are trying to delay the train so that you can load more luggage on, instead of just waiting for the next train. You are not doing this once or twice in an emergency, but you are trying to delay every single train. Every single train, *you* have some "emergency" which requires the train to wait just for you. And the station master has gotten fed up and kicked you out of the station.

How can it be an emergency if it happens every single time? You need to plan better, so you will be ready before the train arrives. No, the train won't wait for you just because you forgot your hairbrush, and it won't even wait for you to go home and turn your oven off, even though that's really important. You have to get on the next train instead, but you don't understand that other people have their own schedules instead of running according to yours.

If it happened once, okay - shit happens. But it happens every time. Why is that? They aren't mad at you because of this specific feature. They are mad at you because it happens every time. It seems like bcachefs is not stable. Perhaps it really was an emergency just the one time you're talking about, but that means it either was an emergency all the other times and your filesystem is too broken to be in the kernel, or it wasn't an emergency all the other times and you chose to become the boy who cried wolf. In either case Linus's reaction is valid.

replies(1): >>45079883 #
5. ranger_danger ◴[] No.45079758{3}[source]
I think this attitude is exactly why this happened. I would have done the same thing.

Do you argue with your school teachers that your book report shouldn't be due on Friday because it's not perfect yet?

I read several of your response threads across different websites. The most interesting to me was LWN, about the debian tools, where an actual psychologist got involved.

All the discussions seem to show the same issue: You disagree with policies held by people higher up than you, and you struggle with respecting their decisions and moving on.

Instead you keep arguing about things you can't change, and that leads people to getting frustrated and walking away from you.

It really doesn't matter how "right" you may be... not your circus, not your monkeys.

replies(2): >>45080026 #>>45081188 #
6. koverstreet ◴[] No.45079883[source]
It's a bugfix, and bugfixes are allowed at any time - weighing regression risk against where we're at in the cycle. It was a very high severity bug, low regression risk for the fix, and we were at rc3.
replies(3): >>45081258 #>>45082589 #>>45082820 #
7. charcircuit ◴[] No.45080026{4}[source]
Your analogy fails to account for the fact that after "Friday" bug fixes are still allowed. A file system losing your files sounds like a bug to me.

Edit since you expanded your post:

>The most interesting to me was LWN, about the debian tools, where an actual psychologist got involved.

To me the comment was patronizing, implying it was purely due to bad communication from Kent's end, and it shows how immature the people running these operating systems are. They put priority on processes over the end user.

>respecting their decisions and moving on.

When this causes real pain for end users, it validates that the decision was wrong.

> really doesn't matter how "right" you may be... not your circus

It does because it causes reputational damage for bcachefs. Even beyond reputational damage, delivering a good product to end users should be a priority. In my opinion projects as big as Debian causing harm to users should be called out instead of ignored. Else it can lead to practices like replacing dependencies out from underneath programs to become standard practice.

replies(2): >>45080576 #>>45084011 #
8. tarruda ◴[] No.45080227[source]
It is not good when politics get in the way of good engineering.

Regardless of differing points of view on the situation, I think everyone can agree that bcachefs being actively updated on Linus tree is a good thing, right?

If you were able to work at your own pace, and someone else took the responsibility of pulling your changes at a pace that satisfies Linus, wouldn't that solve the problem of Linux having a good modern/CoW filesystem?

replies(2): >>45080385 #>>45081209 #
9. lokar ◴[] No.45080373{3}[source]
There are important differences between small scale (individual or a few people) engineering and larger scale engineering.

For many humans to work together over time on something very complex is hard. Structure and process are required. And sometimes they come at the expense of what some might call “pure” engineering. But they are the right trade offs to optimize for the actual goal.

If you can’t accept that, stick to solo projects.

10. koverstreet ◴[] No.45080385[source]
At this time, I don't think so.

We were never able to get any sane and consistent policy on bugfixes, and I don't have high hopes that anyone else will have better luck. The XFS folks have had their own issues with interference, leading to burnout - they're on their third maintainer, and it's really not good for a project to be cycling through maintainers and burning people out, losing consistency of leadership and institutional knowledge.

And I'm still seeing Linus lashing out at people on practically a weekly basis. I could never ask anyone else to have to deal with that.

I think the kernel community has some things they need to figure out before bcachefs can go back in.

replies(3): >>45081722 #>>45082275 #>>45082811 #
11. wavemode ◴[] No.45080576{5}[source]
You still seem to be arguing that shipping the change was the "right" thing to do. But that's not what's in dispute. Rather it is that, if what you think is right and what the person who makes the rules thinks is right are in disagreement, the adult thing to do is not to simply disregard the rules (and certainly not repeatedly, after being warned not to).

This is the difference between being smart and being wise. If the point of all this grandstanding was that it's so incredibly and vitally important for these patches to get into the kernel, well, guess what: now, due to all this drama, this part of the kernel is going to go unmaintained entirely. Is that good for the users? Did that help our stated goal in any way? No.

replies(1): >>45080659 #
12. charcircuit ◴[] No.45080659{6}[source]
>the adult thing to do is not to simply disregard the rules

The adult thing is to do best by the users. Critical file system bugs are worth blocking the release of any serious operating system in the real world as there is serious user impact.

>Is that good for the users?

I think it's complicated. It could allow for a faster release schedule for bug fixes which can allow for addressing file system issues faster.

replies(2): >>45080980 #>>45081134 #
13. nirava ◴[] No.45080980{7}[source]
> The adult thing is to do best by the users

Best by users in the long term is predictable processes. "RC = pure bug fixes" is a battle tested, dependable rule, absence of which causes chaos.

> Critical file system bugs are worth blocking the release

"Experimental" label EXACTLY to prevent this stuff from blocking release. Do you not know that bcachefs is experimental? This is an example of another rule which helps predictability.

replies(1): >>45081206 #
14. quotemstr ◴[] No.45081078[source]
> Being correct comes second to being agreeable in human-human interactions

Prioritizing agreeableness above correctness is the reason the space shuttle Challenger blew up.

The bcachefs fracas is interesting and important because it's like a stain making some damn germ's organelles visible: it highlights a psychological division in tech and humanity in general between people who prioritize

1) deferring to authority, reading the room, knowing your place

and people who prioritize

2) insisting on your concept of excellence, standing up against a crowd, and speaking truth to power.

I am disturbed to see the weight position #1 has accumulated over the past decade or two. These people argue that Linus could be arbitrarily wrong and Overstreet arbitrarily right and it still wouldn't matter because being nice is critical to the success of a large scale project or something.

They get angry because they feel comfort in understanding their place in a social hierarchy. Attempts to upend that hierarchy in the name of what's right create cognitive dissonance. The rule-followers feel a tension they can relieve only by ganging up and asserting "rules are rules and you need to follow them!" --- whether or not, at the object level, a) there are rules, b) the rules are beneficial, and c) the rules are applied consistently. a, b, and c are exactly those object-level does-the-o-ring-actually-work-when-cold considerations that the rule-following, rule-enforcing kind of person rejects in favor of a reality built out of words and feelings, not works and facts.

They know it, too. They need Overstreet and other upstarts to fail: the failure legitimizes their own timid acquiescence to rules that make no sense. If other people are able to challenge rules and win, the #1 kind of person would have to ask himself serious and uncomfortable questions about what he's doing with his life.

It's easier and psychologically safer to just tear down anyone trying to do something new or different.

The thing is, all technological progress depends on the #2 people winning in the end. As Feynman talked about when diagnosing this exact phenomenon as the root cause of the Challenger disaster, mother nature (who appears to have taken on corrupting filesystems as a personal hobby of hers) does not care one bit about these word games or how nice someone is. The only thing that matters when solving a problem of technology is whether something works.

I think a lot of people in tech have entirely lost sight of this reality. I can't emphasize enough how absurd it is to state "[b]eing correct comes second to being agreeable in human-human interactions" and how dangerously anti-technology, anti-science, anti-civilization, and anti-human this poison mindset is.

replies(4): >>45081112 #>>45083278 #>>45083304 #>>45083792 #
15. nolist_policy ◴[] No.45081112{3}[source]
Citation needed.
replies(1): >>45081198 #
16. saubeidl ◴[] No.45081134{7}[source]
I don't think getting the FS kicked out of the kernel is best by the users.

Good engineering requires long term thinking.

replies(1): >>45081788 #
17. motorest ◴[] No.45081150{3}[source]
> When rules and authority start to take precedence over making sure things work, (...)

Didn't Linus lambast you for "lack of testing and collaboration before submitting patches", to the point the patches you were trying to push weren't even building?

https://ostechnix.com/linus-torvalds-expresses-frustration-w...

replies(1): >>45082917 #
18. motorest ◴[] No.45081188{4}[source]
> All the discussions seem to show the same issue: You disagree with policies held by people higher up than you, and you struggle with respecting their decisions and moving on.

I think it's less subtle than that. The straw that broke the camel's back was quite literally abuse towards other kernel developers.

https://lwn.net/Articles/999197/

replies(2): >>45082906 #>>45082920 #
19. quotemstr ◴[] No.45081198{4}[source]
No.
replies(1): >>45084848 #
20. charcircuit ◴[] No.45081206{8}[source]
This was a bug fix. My point is that there will always be bugs in the kernel so not all bugs are worth blocking a release, but losing data is worth blocking the release for.

>"Experimental" label EXACTLY to prevent this stuff from blocking release

In practice bcachefs is used in production with real users. If the experimental label prevents critical bug fixes from making it into the kernel then it would be better to just remove that label.

replies(2): >>45082368 #>>45082462 #
21. motorest ◴[] No.45081209[source]
> Regardless of differing points of view on the situation, I think everyone can agree that bcachefs being actively updated on Linus tree is a good thing, right?

I think bcachefs is not the problem. The problem seems to be the sole maintainer who is notoriously abusive and apparently unable to work with other kernel developers.

I'm sure if another maintainer came along, one that wasn't barred for being abusive towards other maintainers, there would be no problem getting the project back in.

22. Szpadel ◴[] No.45081254[source]
you might end up with the best filesystem in the world that no one will use. you sacrificed long-term sustainability for a short-term win.

even if it were shipped in a similar way to zfs, no one would use it for anything more important than a homelab.

why? with this attitude you cannot be taken seriously, and that implies many risks about what you might come up with in the future. another risk is that you are the sole developer of this filesystem; that also makes it unacceptable to consider using bcachefs seriously.

my advice would be: consider expanding the team to have a few developers who are able to contribute. learn to control your pride for the good of the whole project. working with (and coordinating) other developers could help you better understand the upstream kernel community. and given that chance, you could delegate dealing with upstream to someone else with better diplomatic skills, in a way that would be more beneficial for the whole project in the long term.

23. motorest ◴[] No.45081258{3}[source]
> It's a bugfix, and bugfixes are allowed at any time (...)

I'm afraid you sound like you're trying to gaslight everyone in the thread.

https://news.itsfoss.com/linux-kernel-bcachefs-drop/

24. motorest ◴[] No.45081722{3}[source]
> We were never able to get any sane and consistent policy on bugfixes, and I don't have high hopes that anyone else will have better luck.

This reads an awful lot like blatant gaslighting.

It's quite public that you were kicked out not only because of abusive behavior towards other kernel developers but also because you kept ignoring any and all testing and QA guardrails, to the point you tried to push patches that failed to build.

From the very public discussion, you should sit out any discussion on bugfixes and testing because, while you are voicing strong opinions on high quality bars, the evidence suggests you were following none.

25. procaryote ◴[] No.45081788{8}[source]
There's more than bcachefs in the kernel. If dealing with bcachefs takes an inordinate amount of time and effort, dropping it is the right move.

I don't know the situation well enough to review where they drew the line, but there definitely should be a line somewhere.

replies(1): >>45081815 #
26. saubeidl ◴[] No.45081815{9}[source]
That was my point exactly.
27. tarruda ◴[] No.45082275{3}[source]
Keep in mind that bcachefs’s adoption and eventual mainstream acceptance are not contingent on Linus accepting your contributions or on you “removing the experimental label.” What matters is eliminating the barriers that prevent users from trying it, and that is far easier when bcachefs is an upstream filesystem—something that allows more distributions to offer it as an installer option.

> And I'm still seeing Linus lashing out at people on practically a weekly basis. I could never ask anyone else to have to deal with that.

This is a bit off‑topic, but I wouldn’t be so quick to judge how well Linus is doing his job; no one else in the world has his responsibilities.

At this point, any new kernel contributor should be familiar with Linus and have come to accept, or at least tolerate, his ways.

> I think the kernel community has some things they need to figure out before bcachefs can go back in.

Fair enough. It may be better to let things cool off while giving bcachefs more time to reach a stable state before attempting to reintegrate it into Linux development. I hope you won’t give up, because Linux needs this.

Since bcachefs is your project and you seem to enjoy working on it, it wouldn’t be a stretch to say that you need this too, right? Don’t let ego get in the way of achieving your goals.

28. motorest ◴[] No.45082368{9}[source]
> This was a bug fix.

I'm not sure exactly what you are talking about, and I'm not sure you do either. The discussion that preceded bcachefs being dropped from the Linux kernel mainline involved an attempt to sneak new features into an RC, sidestepping testing and QA work, which was followed up by yet more egregious behavior from the maintainer.

https://www.phoronix.com/news/Linux-616-Bcachefs-Late-Featur...

replies(1): >>45082428 #
29. charcircuit ◴[] No.45082428{10}[source]
>sneak new features into an RC

To solve a bug with the filesystem that people in the wild were hitting. Just as Linus has said in the past that there is a blurry line between security fixes and bug fixes, there is a blurry line between filesystem bugs and recovery features.

If you read the email it is clear that the full feature has more work needed and this is more of a basic implementation to address bugs that people hit in the wild.

replies(1): >>45082954 #
30. dijksterhuis ◴[] No.45082462{9}[source]
> In practice bcachefs is used in production with real users. If the experimental label prevents critical bug fixes from making it into the kernel then it would be better to just remove that label.

alternative perspective: those users have knowingly and willingly put experimental software into production. it was their choice, they were informed of the risk and so the consequences and responsibility are theirs.

it’s like signing up to take some experimental medicine, and then complaining no-one told me about the side-effect of persistent headaches.

that doesn’t stop anyone from being user-centric in their approach, e.g. call me if you notice any symptoms and i’ll come round your house to examine you.

… as long as everyone is clear about the fact it is experimental and the boundaries/limitations that apply, e.g. there will be certain persistent headache medicines that cannot be prescribed to you, or it might take longer for them to work because you’re on an experimental medicine.

replies(1): >>45083163 #
31. JackSlateur ◴[] No.45082589{3}[source]
Reading https://lore.kernel.org/all/4xkggoquxqprvphz2hwnir7nnuygeybf...

It is not a bugfix, and you know it :(

If you are not acting in bad faith, I suggest you read Wittgenstein

He did a lot of work on the idea of language, which basically boils down to the fact that words have no intrinsic meaning : the meaning of a word is the meaning that a given population gives to that word

So in your case, you may be right about the meaning of the word "bugfix" in some population, but you must translate and use the meaning of that word in the "kernel" population

The dictionary is a lie .. :)

replies(2): >>45083122 #>>45083638 #
32. mort96 ◴[] No.45082752[source]
It's so sad to see an excellent engineer such as yourself, building what seems like an excellent filesystem that has the potential to be better than everything else available for Linux for many use cases, completely fail to achieve your goals because you lack the people skills to navigate working as a part of a team under a technical leader. Every comment and e-mail I've seen from you has demonstrated an impressive lack of understanding with regard to why you're being treated as you are.

You don't have to agree with all other maintainers on everything, but if you're working on Linux (or any other major project that's owned, run and developed by other people), you need to have the people skills to at a minimum avoid pissing everyone else off. Or you need to delegate the communication work to someone with those skills. It's a shame you don't.

replies(2): >>45083129 #>>45083235 #
33. thoroughburro ◴[] No.45082811{3}[source]
This sort of misrepresentation of your public behavior will only trash your reputation further. I encourage anyone who reads this to actually look up the mailing list threads. It’s very illuminating.
34. thoroughburro ◴[] No.45082820{3}[source]
Do you really want to slip from being difficult to work with to being a liar? Be careful.
35. ranger_danger ◴[] No.45082906{5}[source]
I think that abuse falls under "struggling" to respect their decisions, but yes I agree that was a big part of it.
36. koverstreet ◴[] No.45082917{4}[source]
Linus has broken the build more recently than I have. (In the time since bcachefs went upstream, we've both done that once, that I've seen).

Linus doesn't seem to believe in automated testing. He just seems to think that there's no way I could QA code as quickly as I do, but that's because I've invested heavily in automated testing and building up a community of people doing very good testing and QA work; bcachefs's automated testing is the best of any upstream filesystem that I've seen (there's a whole cluster of machines dedicated to this), and I have people running my latest branch on a daily basis.

Nearly all of the collaboration just happens on IRC.

For big changes I wait for explicit acks from testers that they've run it and things look good; a lot of people read and review my code too, it's just typically less formal than the rest of the kernel.

replies(2): >>45083105 #>>45083428 #
37. koverstreet ◴[] No.45082920{5}[source]
You might want to read the full story on that one.
replies(1): >>45082968 #
38. motorest ◴[] No.45082954{11}[source]
> To solve a bug with the filesystem that people in the wild were hitting.

So you acknowledge that this last episode involved trying to push new features into an RC.

As it was made abundantly clear, not only is the point of RC branches to only get tiny bugfixes after testing, the feature work that was presented was also untested and risked introducing major regressions.

All these red flags were repeatedly raised in the mailing list by multiple kernel maintainers. Somehow you're ignoring all the feedback, warnings, and complaints raised by Linux kernel maintainers, and instead you've opted to try to gaslight the thread.

replies(1): >>45083260 #
39. motorest ◴[] No.45082968{6}[source]
> You might want to read the full story on that one.

I read the full story. Everyone else can do the same. Somehow it seems you opt to skip it and prefer to be deeply invested in creating an alternative reality.

40. righthand ◴[] No.45083105{5}[source]
Yeah but you don’t get to make the calls. Linus does and your “well kernel daddy does it too” and “actually I’m doing it better than my critics understand” don’t play well with the kernel daddy (or really any bdfl). Do you not see your comment as dismissive?

All your comments are dismissive of the criticisms so far and you’re shrugging your shoulders as to why.

It’s great you’re able to reason and defend yourself but Linux as a whole is larger than you and refusing to submit to their ways will make technology move nowhere.

41. jeltz ◴[] No.45083122{4}[source]
> - New option: journal_rewind [...]

> - Some new btree iterator tracepoints, for tracking down some livelock-ish behaviour we've been seeing in the main data write path.

Yeah, how are these two things bug-fixes? Especially the first one should not be merged late.

42. nullc ◴[] No.45083129[source]
> minimum avoid pissing everyone else off

Which also, at times, means appeasing people even when you are confident that they are wrong because you need their cooperation in the future. In a large complicated system, being able to work together is often more important to the system's reliability, performance, etc. than being as right as possible.

Plus even when you're confident you are in the right you might still be in the wrong. After all, the people you are disagreeing with are also superbly competent and they believe they're in the right just as you do. There can be hills worth dying on, but they ought to be very rare.

replies(2): >>45083423 #>>45083592 #
43. koverstreet ◴[] No.45083163{10}[source]
Again: the elephant in the room is that a lot of bcachefs users are using it explicitly because they have lost a lot of data on btrfs, and they've found it to be more trustworthy.

This puts us all in a shitty situation. I want the experimental label to come off at the right time - when every critical bug is fixed and it's as trustworthy as I can reasonably make it, when I know according to the data I have that everyone is going to have a good experience - but I have real users who need this thing and need to be supported.

There is _no reason_ to interpret the experimental label in the way that you're saying; you're advocating that reliability for the end user be deprioritized versus every other filesystem.

But deprioritizing reliability is what got us into this mess.

replies(1): >>45083763 #
44. nullc ◴[] No.45083228{3}[source]
Collaborative projects don't work on pure engineering. There are significant resource management components that basically amount to therapy, psychiatry, and side show entertainment because the most critical resources are human minds.

Excellent engineering management largely isolates engineers from having to deal with this non-engineering stuff (except for the subset that is specifically for their own personal benefit)-- but open source tends to radically flatten organizations that produce software, such that every contributor must also be their own manager to a great degree.

In a well run project you don't necessarily have to be good at or even interested in all the more socially oriented components of the project organization. But if you're not you must be willing to let someone else handle that stuff and go along with their judgements even if they seem suboptimal from the narrower perspective you've adopted. If you can't then from a "collaborative development as a system" view you're a faulty component that doesn't provide the right interface for the system's requirements (and are gonna get removed!). :)

Another way to look at it is that it would be ideal if every technical element were optimal at all times. In small systems with well understood requirements this can be possible or at least close to possible. But in big complex and poorly scoped systems it's just not possible: We have imperfect information, there are conflicting requirements, we have finite time, and so on. The system as a whole will always be far from perfect. If anyone tried to make it all perfect it would just fail to make progress, deadlock, or otherwise. The management of the project is always trying to balance the imperfections. They know that their decisions are often making things worse for a local concern, but they do so with belief that over time the decisions result in a better system overall. Linux has a good reputation in large part due to a long history of making good decisions about the flaws to accept or even introduce, which issues to gloss over vs debate to death.

replies(1): >>45085657 #
45. koverstreet ◴[] No.45083235[source]
I get a ton of comments like this.

Pointing the finger at the skills I lack and my inability, while ignoring the wider picture of the kernel burning out maintainers and not doing well on filesystems.

It's wearying.

replies(3): >>45083324 #>>45083339 #>>45084384 #
46. koverstreet ◴[] No.45083260{12}[source]
No, I'm sorry but you're simply wrong.

bcachefs has a ton of QA, both automated testing and a lot of testers who run my latest branch and whom I work with on a daily basis. The patch was well tested; it was for codepaths that we have good regression tests for, it was algorithmically simple, and it worked perfectly to recover a filesystem from the original bug report, and it performed flawlessly again not long after.

I've explained my testing and QA on the lists multiple times.

You, like the other kernel maintainers in that thread, are making wild assertions despite having no involvement with the project.

replies(2): >>45083367 #>>45085121 #
47. koverstreet ◴[] No.45083278{3}[source]
Thanks, I've been struggling to put this into words.

When you're working on the core technology we all depend on, correctness is not optional.

replies(1): >>45083355 #
48. nirava ◴[] No.45083304{3}[source]
Ugh, this is a lot of words for nothing.

1. I laid down what I perceived as the state of things. The generalizations I drew from observing the system that is Linux development. Nowhere have I prescribed that kent "follow" my ideas. Simply that he can use these to try to understand the unfairness he feels.

2. Your anarcho-individualistic development ideas sound good in theory, but if they ever worked in practice we might have seen it be more widespread than it is today in team sizes > 3.

You should also note that if the O-ring is labelled experimental and there's an expectation of failure, its development and testing will not stop the launch. The shuttle leaves when it leaves; it won't wait for the experimental O-ring to be done to your liking.

replies(1): >>45083536 #
49. rastignack ◴[] No.45083324{3}[source]
How about acknowledging you’ve been too sharp with words, apologizing, and attempting to move forward?

I know other people in the kernel frequently make the same mistake as you do on mailing lists. But two wrongs do not make a right.

50. mort96 ◴[] No.45083339{3}[source]
You get a ton of comments like this because it's true. There are real problems in the kernel, I've seen how hostile it can be to people who are just trying to do the right thing and upstream their changes etc. But your case isn't that. Your behavior would get you in trouble at any job where you have to follow rules set by other people. Your refusal to treat your part of the kernel as anything other than your personal pet project has destroyed your project's potential.

If this was a month or two ago, I would've written something vaguely optimistic here about how you could still turn this around somehow, about what lessons you could learn and move forward with. But that ship has sailed. Your project is no longer the promising next generation filesystem which could replace ext4 as the default choice. Your role is now that of the developer of some small out-of-tree filesystem for a small group of especially interested users. Nobody wanted this for you, including myself. But you have refused to listen to anyone's advice, so now you're here.

replies(2): >>45083580 #>>45083898 #
51. nullc ◴[] No.45083355{4}[source]
Linux is not correct. Linux has never been correct. Linux will never be correct. An incorrect belief that it is correct can only make it less correct.

You must know this when it comes to your own work. Why isn't bcachefs written in augmented rust with dependent types and formal correctness proofs for every line of code? How could there ever be a data losing bug if you had a formal proof that the file system could never lose data? Wouldn't that be more correct?

Turns out when some strong/broad notion of correctness isn't (practically) possible it is, in fact, very optional.

Good project management is all about managing resources and balancing tradeoffs. Sometimes this means making or allowing some things to be worse for the benefit of something else or in adherence to a process with a proven track record. Almost every choice makes something less correct than it could be-- with a goal of slowly inching towards a more perfect state overall in the long run.

It's also beneficial to rock the boat a bit at times, people can be wrong, processes can need improvement-- but there is a correct level, timing, and approach to achieve the best benefit. I expect that the kind of absolute approach you seem to have adopted in comments is unlikely to be successful at effecting beneficial change.

replies(1): >>45083642 #
52. motorest ◴[] No.45083367{13}[source]
> No, I'm sorry but you're simply wrong.

It sounds like you have a hard time coping with reality.

https://www.phoronix.com/news/Linux-616-Bcachefs-Late-Featur...

I repeat: it sounds an awful lot like you are trying to gaslight this thread. Not cool.

When this fact was again explicitly pointed out to you by Linus himself, you even tried to bullshit Linus and try to move the goalpost with absurd claims about how somehow it was ok to force untested and unreviewed features into a RC because somehow you know better about what users want or need as if it was some kind of justification for you to skip testing and proper release processes.

You need to set aside some time for introspection because you sound like you are your own worst enemy. And those you interact with seem to be fed up and had enough of these stunts.

replies(1): >>45084654 #
53. mort96 ◴[] No.45083423{3}[source]
Exactly. An extremely important part of working in some hierarchical organizational structure, be that as a Linux kernel developer or as an employee at a company, is the ability to disagree with a superior's decision yet acquiesce and go along with it. Good organizations leave room for disagreement, but there always comes a point where someone in a leadership position has made a final decision and the time for debate is over.
54. motorest ◴[] No.45083428{5}[source]
> Linus has broken the build more recently than I have.

Even taking your claims at face value (which from this thread alone is a heck of a leap) I'm baffled by the way you believe this holds any relevance.

I mean, the kernel project has in place a quality assurance process designed to minimize the odds of introducing problems when preparing a release. You were caught purposely ignoring any QA process in place and trying to circumvent the whole quality assurance process and sneak into a RC features that were untested and unverified.

There is a QA process, and you purposely decided to ignore it and plow away. And then your best argument for purposely ignoring any semblance of QA is that others may or may not have broken a build before?

Come on, man. You know better than this. How desperate are you to avoid any accountability to pull these gaslighting stunts?

replies(2): >>45083771 #>>45091388 #
55. quotemstr ◴[] No.45083536{4}[source]
> Simply that he can use these to try to understand the unfairness he feels.

You're suggesting he deal with unfairness by internalizing it as virtue? That's how to make people who cheer at other people's failures.

> Your anarcho-individualistic development ideas sound good in theory

Thanks for illustrating my point. No project, >3 or <= 3, has ever made any new technology by adopting as a tenet that social agreement inside the project is more important than correctly modeling the world outside it, and you're suggesting I'm using inefficiently agreeable-sounding words to express it.

56. sc68cal ◴[] No.45083580{4}[source]
Kent has gotten this same feedback across practically every single platform that has discussed his issues. He is unable to take critique and will instead just continue to argue and be combative, therefore proving yet again why he is in this situation in the first place.
57. motorest ◴[] No.45083592{3}[source]
> Which also, at times, means appeasing people even when you are confident that they are wrong because you need their cooperation in the future.

Being unwilling to follow basic QA processes in preparation of a release candidate, and then doubling down by attacking the release engineer with claims the QA process doesn't apply to you because you know better, is something that is far more serious than lacking basic soft skills. It's a fireable offense in most companies.

replies(1): >>45083918 #
58. throwup238 ◴[] No.45083638{4}[source]
I think you mean Wittgenstein, though I wouldn’t recommend Philosophical Investigations as an entrypoint.
59. quotemstr ◴[] No.45083642{5}[source]
You're staking out quite the postmodernist position there. All models are wrong, so who's to say that Alice's data corruption is worse than Bob's man page typo? The important thing is we stick to process with a proven track record, right?

I don't buy it. Object level considerations do matter. Alice's bug really is worse than Bob's. That "proven track record" shouldn't apply to Alice, and insisting that it does for the sake of process, in a way indifferent to the facts of the situation, is just a pretext for doing primate social hierarchy deference rituals in a situation in which they're producing a worse outcome and everyone knows it.

replies(1): >>45083778 #
60. rob_c ◴[] No.45083763{11}[source]
>users are using it explicitly because they have lost a lot of data on btrfs

PLEASE, honestly, EDUCATE THESE USERS. This is still marked experimental for numerous reasons regardless of the 'planned work for 6.18'. Users who can't suffer any data loss and are repeating their mistake of using btrfs shouldn't be using a non-default/standard/hardened filesystem, period.

replies(1): >>45084241 #
61. koverstreet ◴[] No.45083771{6}[source]
Please, tell us about these wonderful QA processes the kernel has.
62. nullc ◴[] No.45083778{6}[source]
> Object level considerations do matter.

They do. And Kent expressed them and the linux kernel maintainers are amply qualified to hear out and make a call. I don't see a reason to think they were indifferent to the facts, they just weren't convinced by them. If they were they could have just said, "okay we think that this does qualify as a bugfix".

My understanding is the change in dispute wasn't over fixing the corruption-introducing bug, but rather adding automated repair for cases where the corruption had already happened. I could easily see taking a position of "sad for people who are already corrupt, they can get their workaround out of tree for now" (or heck, even forever depending on the scale of the impact).

Anyone who has been around for a while has seen their share of 'ate the horse to catch the spider to catch the fly to...' dance, of course the patch author is convinced that their repair is correct. They're almost always convinced of that or they don't submit it, so that carries little information. Because of this there is a strong preference for obviously minimal code in any kind of fix. Minimizing user suffering is important, but we also know every line of code comes with risk. The fact that the risk is not measurable on a case by case basis doesn't make it any less real.

replies(1): >>45084453 #
63. rob_c ◴[] No.45083792{3}[source]
> Prioritizing agreeableness above correctness is the reason the space shuttle Challenger blew up.

Oh dear lord no. That is not even what _any_ of the actual investigations suggested.

Woke agreeableness is bad, but it wasn't getting along at the water cooler that led to Challenger.

64. koverstreet ◴[] No.45083898{4}[source]
That's because it is my project, and my responsibility.

I can't be bowing to the demands of any one person; I have to balance the wants and needs of everyone and prioritize shipping something that works above all else.

Repeatedly we've seen that those priorities are not shared, unfortunately.

Arguments are just as heated as they ever were, but now instead of arguing over the actual issues - does this work, are we doing this right - people jump to arguing over language and conduct and demanding apologies or calling for people to be expelled.

But my core mission is just shipping a reliable trustworthy filesystem, and that's what I'm going to stick to.

replies(2): >>45084209 #>>45087471 #
65. nullc ◴[] No.45083918{4}[source]
> It's a fireable offense in most companies.

In a company there are other employees who have your success as part of their job function. People to train you, to talk you down off a ledge, people to step in and guard you against misunderstanding or criticisms. People to advocate for you or send you home before a dispute crosses a point of no return. You're also paid to be there, to put up with the company's BS. The project isn't yours, and it's not usually your reputation that's hurt when the company wants to make a decision you don't agree with and it goes poorly.

The context is so different, I don't think it's really comparable.

replies(1): >>45084515 #
66. whatagreatboy ◴[] No.45083951[source]
Was there any attempt at making rules for experimental features looser than other filesystems? That seems to be the biggest bottleneck here.
replies(1): >>45084532 #
67. ranger_danger ◴[] No.45084011{5}[source]
I'm convinced this account is an alt of koverstreet, possibly just to get around the posting delays.

You seem careful not to refer to any of his decisions as your own, but the writing style and inability to respect authority is still there.

replies(1): >>45084682 #
68. mort96 ◴[] No.45084209{5}[source]
> I can't be bowing to the demands of any one person

This right here is the core of the issue. When you're working as a part of a larger organizational structure, you have to bow down to your boss. When your software is a part of the kernel, it's not your project anymore; it's just one part of Linus's project. You're a contributor, not a leader. Just like I would not control Bcachefs's development process even if I contributed some small but important part to it, you do not control Linux's development process even though you contributed some small but important part to it.

Your core mission is evidently not shipping a reliable trustworthy filesystem. You say that, but your actions speak louder than your words. You know just as well as I do that a filesystem being in-tree rather than out-of-tree makes it significantly more reliable and trustworthy, which is why you chose to get Bcachefs merged into the kernel in the first place. Instead of working within the well-defined boundaries that are necessary to keep Bcachefs in the kernel, you've repeatedly pushed against those boundaries, belittled fellow maintainers, and in general worked hard to make yourself a persona non grata within the kernel community. The predictable outcome is that continued development of Bcachefs will have to happen out-of-tree, and your users won't gain the major reliability and trustworthiness benefits of using an in-tree filesystem. People will warn against using Bcachefs as their root filesystem, since every kernel upgrade will now carry some risk that DKMS or whatever mechanism is used to install the out-of-tree Bcachefs kernel module doesn't work with the new kernel.

And, to be honest, it doesn't matter whether or not you're "right" or "wrong" here. Maybe you're completely correct about absolutely everything and Linus, Greg, Ted, Miguel, Sasha, Josef, and everyone else involved are stupid and don't understand what it takes to develop reliable software. So what? They're your colleagues, some of them are your bosses. Everyone on Hacker News could take your side here and think you've been mistreated, it doesn't help. You'd still be thrown out of the kernel. You'd still be failing your users by not maintaining a good enough relationship with your colleagues and bosses to stay in-tree. You could be completely right on every technical matter and it does not matter.

If you play your cards right, you could maybe end up in a situation where you run the Bcachefs project entirely out-of-tree, with yourself as the supreme leader who doesn't bow down to the demands of anyone, with your own development and release process; and then someone else takes responsibility for pushing your code into the upstream kernel, following Linus's rules. They would dissect your releases and backport bug fixes while leaving out important features, in accordance with Linus's rules. Time will tell if you can find anyone to do that. And time will tell if you possess the humility necessary to let someone else ultimately control the experience most of your users will have.

replies(1): >>45084391 #
69. koverstreet ◴[] No.45084241{12}[source]
No, really. People aren't losing data on bcachefs. We still have minor hiccups that do affect usability, and I put a lot of effort into educating users about where we're at and what to expect.

In the past I've often told people who wanted to migrate off of btrfs "check back in six months", but I'm not now because 6.16 is looking amazingly solid; all the data I have says that your data really is safer on bcachefs than btrfs.

I'm not advocating for people to jump from ext4/xfs/zfs, that needs more time.

replies(1): >>45084861 #
70. p_ing ◴[] No.45084384{3}[source]
When everyone else is the problem, you're the problem.

Consider that when working in teams.

71. koverstreet ◴[] No.45084391{6}[source]
Linus isn't my boss, though.

Am I paid by him or the Linux foundation? No.

Has he ever contributed to bcachefs in any way, is he in any way responsible for making sure that it works properly? No.

The only sense in which he has authority is that he can decide whether or not to pull it into his tree, but that's a two way relationship.

replies(2): >>45084512 #>>45090028 #
72. quotemstr ◴[] No.45084453{7}[source]
Thanks for the thoughtful reply.

> I don't see a reason to think they were indifferent to the facts

I don't think the Linux people thought of themselves as indifferent to facts. Nor do I think they were, not at first. Most people imagine themselves as fair-minded truth-seekers. When stakes are low, they usually act like it. It's only under pressure that people reveal whether they're more committed to PR or progress.

The shitty thing about this situation is that as the dispute escalated, the technical merits of change faded from relevance. (Linus even pulled the corruption repair work in the end!) The argument transformed into a dispute over power, pride, and personalities. Linus's commitment to technical excellence was tested. It failed. Consequently, Linux will lack a cutting-edge filesystem.

I don't even object to Linus being BDFL of Linux. Somebody has to make decisions. I think Linus was wrong to reject the corruption fix patch, but he could plausibly have been right. He had an opportunity to explain his patch rejection in such a way that Overstreet would have understood it as final but also felt heard and valued. Overstreet would have been upset, and justifiably so, but by the next merge window both sides would have cooled down and progress would have resumed.

It's when Linus banned Overstreet and bcachefs from the project that he departed irrecoverably from defensibility. Linus might think he's punishing Overstreet for his intransigence by blocking his work, but Linus is actually taking his frustration out on every Linux user instead. Overstreet's ban is rooted in primate power psychology, not technical trade-offs, and it makes everyone lose.

Technical leaders who ostracize brilliant but difficult people forever cap the amount of progress we can make in the fight against the limits of nature. They're neglecting their responsibilities as leaders to harness difficult people. It's not an easy job, but being a leader shouldn't be.

Linus took the easy way out and banned the brilliant troublemaker. He should be ashamed.

> the risk is not measurable on a case by case basis

It often is. That's why when I'm on the Linus side of a case like this, I try to avoid saying "no" and instead say "yes, if". Sometimes my counterparty pulls out an "if" that convinces me.

73. mort96 ◴[] No.45084512{7}[source]
I have said my piece. It's not me you have to convince.
74. motorest ◴[] No.45084515{5}[source]
> In a company there are other employees who have your success as part of their job function.

Yes, and they enforce basic release processes to ensure you don't break releases by skipping QA processes or introducing untested and unverified features in release candidates.

And you sure as hell don't have prima donna developers stay on the payroll for long if they start throwing tantrums and personal attacks towards fellow engineers when they are asked to follow the release process or are called out for trying to sneak untested changes into mission-critical components.

75. koverstreet ◴[] No.45084532[source]
That does seem to be one of the big disconnects, yes.

In the past I've argued that I do need a relatively free hand and to be able to move quickly, and explained my reasoning: we've been at the stage of stabilization where the userbase is fairly big, and when someone reports a bug we really need to work with them and fix it in a timely manner in order to keep them testing and reporting bugs. When someone learns the system well enough to report a bug, that's an investment of time and effort on their part, and we don't want to lose that by having them get frustrated and leave.

IOW: we need to prioritize working with the user community, not just the developer community.

All that's been ignored though, and the other kernel maintainers seem to just want to ratchet down harder and harder and harder on strictness.

At this point, we're past the bulk of stabilization, and I've seen (to my surprise) that I've actually been stricter with what I consider a critical fix than other subsystems.

So this isn't even about needing special rules for experimental; this is just about having sane and consistent rules, at all.

replies(1): >>45085965 #
76. koverstreet ◴[] No.45084654{14}[source]
The changes weren't untested or unreviewed, and they've performed flawlessly on quite a few occasions since then.

Sorry, the only person gaslighting here is you.

77. koverstreet ◴[] No.45084682{6}[source]
You think I have an alt with higher karma than my actual account? :)
78. ranger_danger ◴[] No.45084848{5}[source]
Claims made without evidence can be dismissed without evidence.
replies(1): >>45084906 #
79. trueismywork ◴[] No.45084861{13}[source]
You're arguing in circles. Either bcachefs is experimental, and hence needs a lot of changes and tools to make sure that users don't lose data, and hence the fixes are not critical and users can use a custom branch. Or it is stable and the only thing users need is actual bug fixes. Not new tools in an RC3.

Don't compare bcachefs with btrfs for stability. Compare it with ext4. (And don't compare anecdotal data, compare the process.)

replies(1): >>45087245 #
80. quotemstr ◴[] No.45084906{6}[source]
[flagged]
replies(1): >>45089286 #
81. majorchord ◴[] No.45085121{13}[source]
As a rule, strong feelings about issues do not emerge from deep understanding.
82. koverstreet ◴[] No.45085657{4}[source]
yes, which is why I've been saying for years the kernel community needs to get better at actual conflict resolution...
83. trueismywork ◴[] No.45085965{3}[source]
I have seen your work and have some experience in kernel development. I think the situation is bad for everyone involved: you and linux. I would suggest trying to restart the conversation only focused on experimental feature changes.

Specifically, I think there should be an effort to have a label (doesn't matter which one, hidden behind something like "icantbelieveitsnotbcachefs") under which you (and not just you, but anyone who wants to contribute changes to experimental features) are allowed to push changes all the time.

That was already working for btrfs and will probably work for bcachefs too.

Your argument about reducing feedback time can be a good argument in general. You shouldn't approach this as "I'm right, allow me to push code", but start a different conversation about quick testing of experimental code with minimal friction. And make a case in general for Linux to have this system.

84. koverstreet ◴[] No.45087245{14}[source]
So, are we agreeing that btrfs isn't fit for purpose, then?
replies(1): >>45088872 #
85. nylonstrung ◴[] No.45087471{5}[source]
I really want bcachefs to succeed.

Please just swallow your pride like all the other maintainers.

This is so self-defeating and it's disappointing the drama has overshadowed the project

86. trueismywork ◴[] No.45088872{15}[source]
I don't understand your question. Are you going somewhere with this?
87. tomhow ◴[] No.45089286{7}[source]
> Building an argument without citing a source is called "thinking for yourself". You should try it.

You've been on HN well and truly long enough to know this is not an acceptable way to comment here.

88. jjaksic ◴[] No.45090028{7}[source]
Linus is not your boss in the sense that he pays you and can tell you what to do day to day, but he is your "boss" in the sense that he's the one who ultimately approves your work (which includes both your code and your conduct).

It's a "two-way street" (you can walk away as much as he can), but you need to understand that this is not an equal relationship. It might have been if Linux did not yet have a file system and you were the only person who could build one. If that was the case, Linus might have swallowed his pride for the good of the project. But as it stands, Linux has existed for some 35 years without Bcachefs and can continue to do so. So the key stakeholders have simply decided that with this amount of friction, despite the technical advantages, it's not worth the trouble (yes, it is sad).

Very realistically and bluntly, you have 3 options: a) Learn to work with Linus and other kernel maintainers in ways that are compatible with their work and processes. Remember, you're on their territory, and when in Rome, do as the Romans do. b) Keep and develop Bcachefs out of kernel. That way you can stay on your own turf and work on your terms, but Bcachefs is going to be a much less attractive and viable option, let alone become the default fs. c) Have someone else do the integration work and collaboration with the kernel team.

These are your options, but a combination is also an option. I would probably recommend starting with option b first and finish the bulk of the work out of tree. Then once it's all done and ready to ship, try to get it into the kernel as politely and timely as possible (you shouldn't need any late commits this time). Continue developing new and experimental functionality out of tree to keep the number of PRs (and thus possible causes of friction) low. I don't know if at any point you'd want to ask someone else to interface with the kernel team. I think it's far preferable if you can learn to do it yourself. Knowing how to work with people (including difficult people) is an incredibly important and useful skill in almost anyone's career. Maybe find a communication/diplomacy mentor? I've never heard anyone complain about your engineering skills, but this is really holding you back.

Again and as always, thank you for your hard work. I wish you all the best and hope that some day we can all use Bcachefs by default.

89. rcxdude ◴[] No.45091388{6}[source]
I would also like to know what the QA process is, because all I can see is basically 'linus pulls in changes in the merge window, checks that the basic stuff builds, then releases the RCs and some people do some checks in some way, varying from users on the bleeding edge, some people doing manual verification on specific hardware and use-cases, and maybe some automatic tests and analysis that are not really documented anywhere, and the end result is some bug reports'. Is there anything more co-ordinated than that? Like some description of what is tested and how, or an explicit green indication that those tests have actually happened and a policy on what would hold up a release?