
Jujutsu for everyone

(jj-for-everyone.github.io)
434 points by Bogdanp | 24 comments
1. Ericson2314 ◴[] No.45084874[source]
I want to read "Jujutsu for Git experts"

For example, will the committing of conflicts (a good idea, I agree) mess up my existing git rerere?

Also I agree that the staged vs unstaged distinction is stupid and should be abolished, but I do like intentionally staging "the parts of the patch I like" while I work with git add -p. Is there a lightweight way to have such a 2-patch-deep patch set with JJ that won't involve touching timestamps unnecessarily, causing extra rebuilds with stupid build systems?

replies(3): >>45085487 #>>45088024 #>>45088309 #
2. mdaniel ◴[] No.45085487[source]
> Also I agree that the staged vs unstaged distinction is stupid

...

> I do like intentionally staging "the parts of the patch I like" while I work with git add -p

is a mysterious perspective to me. I guess with enough $(git worktree && git diff && vi && git apply) it'd be possible to achieve the staging behavior without formally staging anything, but yikes

I just checked and it seems that Mercurial 7.1 still doesn't believe in $(hg add -p), so presumably that 'worktree' silliness is the only way to interactively add work in their world

replies(2): >>45088621 #>>45093843 #
3. steveklabnik ◴[] No.45088024[source]
1. You wouldn’t use git rerere with jj, so that’s sort of a non sequitur.

2. You treat @ (the working copy) like the staging area, @- (the parent of the working copy) as the commit you’re working on, and then use jj squash -i (“interactive”) to move the parts of the diff you want from @ into @-.
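
A rough sketch of that flow (the exact hunks are whatever you pick in the interactive TUI):

    jj new           # start a fresh working-copy change (@) on top of the commit you're building (@-)
    # ...edit files...
    jj squash -i     # interactively move the hunks you want from @ down into @-
    jj diff -r @-    # review what has been "staged" so far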

replies(2): >>45088172 #>>45089207 #
4. Ericson2314 ◴[] No.45088172[source]
If I use JJ in my existing git clone, and then use the occasional git command, I don't want rerere to be messed up
replies(1): >>45088356 #
5. sfink ◴[] No.45088309[source]
Yes, and this is the most commonly used workflow these days with jj (the "squash workflow"). You have a top commit, which is also your working directory, and you make changes freely. To "stage" something, you squash it down into the next commit (all changes, or interactively selected changes with -i aka --interactive).

This generalizes to using a whole stack of "stages", by doing 'squash --into' to select the patch to put the changes into if it's not just the next one down.
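
A minimal sketch of that stacked variant (the change ID is just a placeholder):

    jj squash -i                     # "stage" hunks into the commit directly below @
    jj squash -i --into <change-id>  # or send them to a deeper "stage" in the stack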

replies(1): >>45103341 #
6. steveklabnik ◴[] No.45088356{3}[source]
In general you don’t want to mix git-mutating commands with jj commands. I believe that there might be a way to get jj to resync its understanding of the world, but I’m not sure what it is off the top of my head.

In rerere’s case specifically, I’d expect you’d just be using jj’s rebase, so it shouldn’t be needed, though of course want and need are different things.
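
On the resync point, the subcommands I believe exist (worth verifying against jj's docs):

    jj git import   # pull ref changes made with plain git back into jj's view of the repo
    jj git export   # write jj's ref changes back out to the underlying git repo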

replies(1): >>45093882 #
7. sunshowers ◴[] No.45088621[source]
In Mercurial you'd do hg commit -i and squash further changes down incrementally via hg amend -i, similar to Jujutsu.
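
A minimal sketch of that flow (hg amend assumes the amend extension is enabled):

    hg commit -i    # interactively commit selected hunks
    # ...more edits...
    hg amend -i     # interactively fold further hunks into that same commit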

(The first thing about Jujutsu that was earth-shattering for me was learning that jj amend is an alias for jj squash. I swore aloud for several minutes when I first learned that.)

replies(1): >>45088687 #
8. mdaniel ◴[] No.45088687{3}[source]
How would anyone possibly know that? https://mercurial-scm.org/help/commands/commit says no such thing but I also recognize that I am obviously a fool for thinking that one should add the whole file but only commit part of it

I guess that also places the burden upon the user to... I dunno, go through that whole TUI dance again, if one wishes to amend one more line in the file?

In some sense, I do recognize it's like showing up to an emacs meeting and bitching about how it doesn't do things the vim way, but seriously, who came up with that mental model for committing only part of the working directory?

replies(1): >>45089053 #
9. sunshowers ◴[] No.45089053{4}[source]
Well, git add is super overloaded, because it lets you add untracked files (especially with -N), or all or parts of tracked files to the staging area. Mercurial is a different system with different primitives, where each command tends to do one thing, and add is only meant to operate on untracked files.

I strongly prefer JJ's approach of simply doing away with the concept of untracked files, though note that this is one of the features that is designed around developers having NVMe drives these days. It wouldn't have been possible to scan the working copy with every command back in 2004.

replies(1): >>45093932 #
10. didibus ◴[] No.45089207[source]
I think the issue for me will be that IDEs tend to show the diff against HEAD, and I'm not even sure you can configure them otherwise.
replies(1): >>45089655 #
11. steveklabnik ◴[] No.45089655{3}[source]
I’m not a big “VCS in an IDE” person, but visualjj works for VS Code, and there are other plugins too, I hear. If your IDE isn’t configurable and assumes certain things then it might not work, yeah.
replies(1): >>45096027 #
12. Ericson2314 ◴[] No.45093843[source]
It's a bit tongue-in-cheek; I'm saying

"I know this thing is bad, and it shouldn't exist, but I'm also personally used to it right now, and it does have some perverse silver linings"

13. Ericson2314 ◴[] No.45093882{4}[source]
Mmm I see. Getting out of sync sounds very bad!

https://lore.kernel.org/git/CAESOdVAspxUJKGAA58i0tvks4ZOfoGf... I hope this happens, because then it seems like far less state would be needed on the jj side.

replies(1): >>45097291 #
14. Ericson2314 ◴[] No.45093932{5}[source]
It doesn't really depend on NVMe, that's just the OS sucking.

The right way has always been FUSE, so that version control knows about every change as it happens. Push, not pull (or poll).

With FUSE passthrough, maybe this won't even be slow!

replies(1): >>45094158 #
15. sunshowers ◴[] No.45094158{6}[source]
> It doesn't really depend on NVMe, that's just the OS sucking.

I've spent so much of my professional career profiling source control access patterns. With a hot cache, the OS VFS layer dominates, but the moment you hit disk, the disk dominates, unless the disk is NVMe (or, back in the day, PCIe flash storage). Further compounding this is the use of a naive LRU cache on some OSes, which means that once the cache size is exceeded, linear scans absolutely destroy performance.

> FUSE

So you might think that, but FUSE turns out to be very hard to do correctly and performantly. I was on the source control team at Facebook, and EdenFS took many years to become stable and performant enough. (It was solving a harder problem though, which was to fetch files lazily.)

I believe Microsoft tried using a FUSE equivalent for the Windows repo for a while, but gave up at some point.

replies(1): >>45103417 #
16. didibus ◴[] No.45096027{4}[source]
I don't mean the VCS UI; I mean that in the project view and when viewing source files, the editor highlights added, removed, and changed lines, and if you click on one it normally shows the diff for that code block.

Generally IDEs will be diffing against HEAD or the staging area, and it's not very configurable.

replies(1): >>45096651 #
17. steveklabnik ◴[] No.45096651{5}[source]
Yeah, visualjj does that stuff. But for sure, if your IDE isn’t configurable there’s not much that can be done.
18. steveklabnik ◴[] No.45097291{5}[source]
It definitely will be nice, but I’m not super heartened by the conversation that ensued.

It also doesn’t super change jj itself, in the sense that it still needs to support other backends, but it does simplify the git backend.

replies(1): >>45103324 #
19. Ericson2314 ◴[] No.45103324{6}[source]
I did not read the whole thread, but I didn't encounter anyone saying "no", just tepid bike-shedding.
20. Ericson2314 ◴[] No.45103341[source]
Ah, squashing the top commit in a git rebase will touch the working tree more than necessary. But jj might just not do that?
replies(1): >>45107119 #
21. Ericson2314 ◴[] No.45103417{7}[source]
We're still talking about different things here. I'm saying the entire "VCS scans the file system to sync state" approach is the wrong algorithm. It's unnecessary work because there are two sources of truth.

Forget the constant factors of FUSE, and imagine an in-kernel git implementation. If you have a Merkle CoW filesystem, then when you modify child files (ignoring journals), you need to update parent directories on disk anyway; that is a great time to recompute VCS hashes too.

"git status" is, if the journal is flushed and hashes are up to date, always an O(1) operation.

replies(1): >>45111070 #
22. sfink ◴[] No.45107119{3}[source]
Oh, I missed this part. I think jj is better here in at least one scenario.

Specifically, I believe the scenario you're talking about is:

    change file1
    build, producing binary.out
    squash the change down (leaving your working copy unmodified)
    rebuild
If the squash updates the timestamp on file1, then the rebuild will redo the compilation steps that use file1 as input.

When I test it out, it looks like doing a whole-file squash with jj does not update the timestamp. Hm... I guess even a partial squash doesn't update the current contents; let me try that too... yes, again jj does not touch the file or update its timestamp.

So it looks like it does do what you want, if I'm understanding things correctly.
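
A rough way to reproduce that check (GNU stat shown; the path is just illustrative):

    stat -c %y file1   # note the mtime
    jj squash          # fold the working-copy change into its parent
    stat -c %y file1   # an unchanged mtime means no spurious rebuild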

replies(1): >>45127380 #
23. sunshowers ◴[] No.45111070{8}[source]
You might be interested in how this problem was solved by our team at Meta, in EdenFS (https://github.com/facebook/sapling/blob/main/eden/fs/docs/O...) and Watchman: https://github.com/facebook/watchman.

What you're describing is reasonably similar to EdenFS, except EdenFS runs in userspace.

Watchman layers a consistent view of file metadata on top of inotify (etc), as well as providing stateless queries on top of EdenFS. It acts as a unified interface over regular filesystems as well as Eden that provides file lstat info and hashes over a Unix domain socket.

Back in the day, Watchman sped up status queries by over 5x for a repo with hundreds of thousands of files: https://engineering.fb.com/2014/01/07/core-infra/scaling-mer... I worked directly on this and co-wrote this blog post.

In truth, getting these two components working to the standard expected by developers was a very difficult systems problem with a ton of event ordering and cache invalidation concerns. (With EdenFS, in particular, I believe there was machine learning involved to detect prefetch patterns.) For smaller repos, it is much simpler to do linear scans. Since it is really fast on modern hardware anyway, it is also the right thing to do, following the maxim of doing the simplest thing that works.

24. Ericson2314 ◴[] No.45127380{4}[source]
Yeah, that makes sense! With git reset HEAD^ --soft followed by git commit --amend, git will do the same thing, but it won't bother to do that with rebases.
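
Spelled out, the git sequence being compared is roughly (a sketch of squashing into the parent without touching the working tree, not a full jj equivalent):

    git reset --soft HEAD^   # un-commit, leaving the changes staged; working-tree files untouched
    git commit --amend       # fold them into what is now HEAD, again without rewriting files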