    207 points by sebg | 16 comments
    1. derefr ◴[] No.45309542[source]
    CRAN’s approach here sounds like it has all the disadvantages of a monorepo without any of the advantages.

    In a true monorepo — the one for the FreeBSD base system, say — if you make a PR that updates some low-level code, then the expectation is that you 1. compile the tree and run all the tests (so far so good), 2. update the high-level code so the tests pass (hmm), and 3. include those updates in your PR. In a true centralized monorepo, a single atomic commit can effect a vertical-slice change through a dependency and all of its transitive dependents.

    I don’t know what the equivalent would be in distributed “meta-monorepo” development à la CRAN, but it’s not what they’re currently doing.

    (One hypothetical approach I could imagine is that a dependency major-version release of a package can ship with AST-rewriting code migrations, which automatically push “dependency-computed” PRs to the dependents’ repos, while also applying those same patches as temporary forced overlays onto releases of the dependent packages until the related PRs get merged. So your dependents’ tests still have to pass before you can release your package — but you can iteratively update things on your end until those tests do pass, and then trigger a simultaneous release of your package and your dependent packages. It’s then in your dependents’ court to modify + merge your PR to undo the forced overlay, asynchronously, as they wish.)
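
    To make that concrete, here’s a minimal sketch of the kind of mechanical migration such a release could ship — in Python’s ast module, since the idea is language-agnostic (a real CRAN implementation would rewrite R parse trees instead). The frobnicate/frobnicate_v2 API is invented for illustration:

        import ast

        class FrobnicateMigration(ast.NodeTransformer):
            """Rewrite calls to the old (invented) API onto its replacement."""

            def visit_Call(self, node):
                self.generic_visit(node)  # rewrite any nested calls first
                if isinstance(node.func, ast.Name) and node.func.id == "frobnicate":
                    node.func.id = "frobnicate_v2"
                    # Assume the v2 API requires an explicit strict= flag.
                    node.keywords.append(
                        ast.keyword(arg="strict", value=ast.Constant(value=True)))
                return node

        tree = ast.parse("result = frobnicate(data)")
        tree = ast.fix_missing_locations(FrobnicateMigration().visit(tree))
        print(ast.unparse(tree))  # result = frobnicate_v2(data, strict=True)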

    replies(5): >>45309883 #>>45310322 #>>45310479 #>>45310852 #>>45312230 #
    2. joek1301 ◴[] No.45309883[source]
    > One hypothetical approach I could imagine, is that a dependency major-version release of a package can ship with AST-rewriting-algorithm code migrations

    Jane Street has something similar called a "tree smash" [1]. When someone makes a breaking change to their internal dialect of OCaml, they also push a commit updating the entire company monorepo.

    It's not explicitly stated whether such migrations happen via AST rewrites, but one can imagine leveraging the existing compiler infrastructure to do that.

    [1]: https://signalsandthreads.com/future-of-programming/#3535

    replies(1): >>45312172 #
    3. chii ◴[] No.45310322[source]
    > In a true monorepo ...

    Ideally, yes. However, such a monorepo can become increasingly complex as the software being maintained grows larger and larger (and/or more and more people work on it).

    You end up with massive changes, which might eventually become something that a single person cannot realistically hold in their head. Not to mention clashes: you will have people making contradictory/conflicting changes, and there will have to be some sort of external resolution mechanism (or the "default" one, which is first come, first served).

    Of course, you could "manage" this complexity by introducing API boundaries/layers, where changes to those APIs are deemed important enough not to happen too often. But that simply means you're a monorepo in name only — not too different from having separate repos with versioned artefacts behind a defined API boundary.

    replies(2): >>45311198 #>>45313885 #
    4. skybrian ◴[] No.45310479[source]
    Yes, it's nice when you can update arbitrarily distant files in a single commit. But when an API is popular enough to be used by dozens of independent projects, this is no longer practical. Even in a monorepo, you'll still need to break the change up: add the new API, gradually migrate the usages, and then delete the old API.
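    Sketched in Python with an invented load()/load_v2() pair, that incremental dance usually looks something like this — the old entry point survives as a deprecation shim until the last caller has migrated:

        import warnings

        def load_v2(path, *, encoding="utf-8"):
            """New API: encoding is explicit and keyword-only."""
            with open(path, encoding=encoding) as f:
                return f.read()

        def load(path):
            """Old API, kept as a shim while callers migrate; deleted last."""
            warnings.warn("load() is deprecated; use load_v2()",
                          DeprecationWarning, stacklevel=2)
            return load_v2(path)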
    replies(1): >>45310810 #
    5. vasvir ◴[] No.45310810[source]
    Yes.

    Also, the other problem with a big monorepo is that nothing ever dies. Let's say you have a library and there are 1000 client programs or other libraries built on your API. Some of them are pretty popular and some of them are fringe.

    However, when you are changing the API, they all have the same weight: you have to fix them all. In the non-monorepo case, the fringe clients will eventually die, or their maintainers will invest in them and update them. It's like capitalism vs communism with central planning and all.

    replies(1): >>45311165 #
    6. boris ◴[] No.45310852[source]
    There is a parallel with database transactions: it's great if you can do everything in a single database/transaction (an atomic monorepo commit). But that only scales so far (in both dimensions: single database and single transaction). You can try distributed transactions (multiple coordinated commits), but those also have limits. The next step is eventual consistency, which would be equivalent to releasing a new version of the component while preserving the old one, with dependents eventually migrating to it at their own pace.
    replies(1): >>45311313 #
    7. malkia ◴[] No.45311165{3}[source]
    If the monorepo is built and tested by a single build system (Bazel, Buck, etc.), then it can use the dependency graph to find leaf targets with no users: for example, a library plus its tests that nothing else depends on (granted, it might be something new that just popped up, still in early development).
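
    As a toy sketch of that check, in Python over an invented target graph (a real build system would derive this from BUILD files, e.g. via bazel query):

        # Toy target graph: each target maps to the targets it depends on.
        deps = {
            "//app:main":   ["//lib:core", "//lib:net"],
            "//lib:core":   [],
            "//lib:net":    ["//lib:core"],
            "//lib:legacy": ["//lib:core"],  # nothing depends on this one
        }

        # Invert the graph: who uses each target?
        users = {target: set() for target in deps}
        for target, ds in deps.items():
            for d in ds:
                users[d].add(target)

        # Targets with no users are dead-code candidates (modulo entry points).
        roots = {"//app:main"}  # assumed entry points, never reported
        print([t for t, u in users.items() if not u and t not in roots])
        # -> ['//lib:legacy']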

    Bazel has the concept of visibility: while you are developing something in the tree, you can explicitly say who may use it (like a trial version).

    But the point is: if something is built, it must be tested. Coverage should catch what is built but not tested, and also what is built and tested but not really used much.

    But why remove it, if it takes no time to build & test? And if it starts taking more time to test, it's usually on your team to stand up your own testing env rather than rely on the general presubmit/preflight one. And since the last capacity planning only gave you so much budget, you'll soon ask yourselves: do we really need this piece of code & its tests?

    I mean, it's not perfect; there will always be something churning away, using time & money, but until it's a pretty big problem it won't go away automatically (yet).

    replies(1): >>45311995 #
    8. rafaelmn ◴[] No.45311198[source]
    >Of course, you could "manage" this complexity by introducing API boundaries/layers, where changes to those APIs are deemed important enough not to happen too often. But that simply means you're a monorepo in name only — not too different from having separate repos with versioned artefacts behind a defined API boundary.

    You have visibility into who is using what, and you still get to do an atomic update commit even if the commit touches multiple boundaries; I would say that's a big difference. I hated working with shared repos in big companies.

    9. awesome_dude ◴[] No.45311313[source]
    Doesn't that rely on the code being able to work in both states?

    I mean, to use a different metaphor, an incremental rollout is all fine and dandy until the old code discovers that it cannot work with the state generated by the new code.

    replies(2): >>45311650 #>>45312980 #
    10. immibis ◴[] No.45311650{3}[source]
    Yes, it does.
    11. gmueckl ◴[] No.45311995{4}[source]
    Dead code in a huge monorepo is more costly than just build and test time. It's also noise when searching through code. One thing to realize is that deleting dead code from the tree doesn't destroy anything, because it's still in the repo history and can be restored from there.
    replies(1): >>45312532 #
    12. swiftcoder ◴[] No.45312172[source]
    This is more or less how Facebook developed PHP -> Hack on the fly. Each new language feature would be patched in, and at the same time, a whole-monorepo transform would be run to adopt the feature. Pretty neat, if a logistical nightmare.
    13. xg15 ◴[] No.45312230[source]
    I agree, more automated tools for API migration would be a good next step, but I think that's missing the point a bit.

    Read the actionable part of the "dependency error" mail again:

    > Please reply-all and explain: Is this expected or do you need to fix anything in your package? If expected, have all maintainers of affected packages been informed well in advance? Are there false positives in our results?

    This is not a hard fail and a demand that you go back and rewrite your package. It's also not a demand that you go out on your own and write pull requests for all the dependent packages.

    The only strict requirement is to notify the dependents and explain the reason for the change. Depending on the nature of the change, it's then something the dependents can easily fix themselves — or, if they can't, you will likely get feedback on what you'd have to change in your package to make the migration feasible.

    In the end, it's a request for developers to get up and talk to their users and figure out a solution together, instead of just relying on automation and deciding everything unilaterally. It's sad that this is indeed a novel concept.

    (And hey, as a side effect: if breaking changes suddenly have a cost for the author, this might create momentum to actually develop those automated migration systems. In a traditional package repository, no one might even have seen the need for them in the first place.)

    14. yorwba ◴[] No.45312532{5}[source]
    Hence why Google has Sensenmann to reap dead code: https://testing.googleblog.com/2023/04/sensenmann-code-delet...
    15. stetrain ◴[] No.45312980{3}[source]
    Yes, but depending on the code you’re working on that may be the case anyway even with a monorepo.

    For example, a web API that talks to a database but is deployed with more than one instance, where instances receive rolling updates to the new version to avoid any downtime. There will be overlapping requests to both old and new code at the same time.

    Or if you want to do a trial deployment of the new version to 10% of traffic for some period of time.

    Or if it’s a mobile or desktop installed app that talks to a server where you have to handle people using the previous version well after you’ve rolled out an update.
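
    The usual defense is to make the new code tolerate both shapes of state during the rollout window. A tiny Python sketch, with invented field names:

        def read_user(record: dict) -> dict:
            """Normalize a record written by either the old or the new version.

            Old writers store a single "name"; new writers (hypothetically)
            split it into "first_name"/"last_name". During a rolling update,
            both shapes coexist in the database.
            """
            if "first_name" in record:  # written by the new version
                first, last = record["first_name"], record.get("last_name", "")
            else:                       # written by the old version
                first, _, last = record["name"].partition(" ")
            return {"first_name": first, "last_name": last}

        assert read_user({"name": "Ada Lovelace"}) == \
            read_user({"first_name": "Ada", "last_name": "Lovelace"})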

    16. ec109685 ◴[] No.45313885[source]
    They don’t have to be massive changes. You can release the feature with backwards compatibility, then gradually update the dependents and remove the old interface.