
Dolt is Git for data

(www.dolthub.com)
358 points timsehn | 3 comments
1. tgb ◴[] No.22736930[source]
An example use case that "git for data" seems to break: storing data for medical research where the participants are allowed to withdraw from the study after the fact. Their data must then be deleted retroactively from the entire history, not just from the head commit. I don't know of a good methodology for dealing with this at all; it breaks backups, for example.

The problem extends beyond medical research due to privacy laws like the GDPR. A participant or user must be able to delete their data, not merely hide it, so as to protect themselves from data breaches. Suggestions welcome.

replies(2): >>22736959 #>>22739368 #
2. kspacewalk2 ◴[] No.22736959[source]
In principle, you should be able to 'rewrite history' in the same way you can already do with git. It is clunky to remove a file from all versions using git itself, but easy using tools like BFG [0].

[0] https://rtyley.github.io/bfg-repo-cleaner

3. zachmu ◴[] No.22739368[source]
You can rebase to change the history. As with git, if you do this, everyone with a clone will need to clone a fresh copy, as they can no longer merge with the remote HEAD.
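
To illustrate why a rewritten history forces fresh clones, here is a minimal toy sketch in Python. It is not Dolt's or git's actual storage format; the names `commit_id`, `build_history`, and `purge`, and the sample rows, are hypothetical. It only assumes that each commit id is a hash over its parent's id plus its own contents, which is the general idea behind content-addressed history.

```python
import hashlib
import json


def commit_id(parent_id, rows):
    """Hash a commit from its parent's id and its row snapshot (toy model)."""
    payload = json.dumps({"parent": parent_id, "rows": sorted(rows)})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_history(snapshots):
    """Return the commit ids for a linear chain of row snapshots."""
    ids = []
    parent = None  # the first commit has no parent
    for rows in snapshots:
        parent = commit_id(parent, rows)
        ids.append(parent)
    return ids


def purge(snapshots, participant):
    """Rewrite every snapshot so that no row for `participant` remains."""
    return [[r for r in rows if r[0] != participant] for rows in snapshots]


# Hypothetical study data: (participant, measurement) rows per commit.
original = [
    [("alice", 1), ("bob", 2)],
    [("alice", 1), ("bob", 2), ("carol", 3)],
    [("alice", 1), ("bob", 5), ("carol", 3)],
]
rewritten = purge(original, "bob")

old_ids = build_history(original)
new_ids = build_history(rewritten)

# Compare the old and new id of each commit in the chain.
for old, new in zip(old_ids, new_ids):
    print(old[:8], "->", new[:8], "changed" if old != new else "same")
```

In this toy model, every commit from the first one containing the withdrawn participant onward gets a new id, because each id depends on its parent's id. A clone that still holds the old ids shares no later commits with the rewritten remote, and it also still holds the deleted rows, so the deletion is only complete once old clones and backups are discarded too.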