←back to thread

334 points gjvc | 1 comments | | HN request time: 0.202s | source
Show context
cosmic_quanta ◴[] No.31847838[source]
That looks awesome. One of the listed use-cases is 'time-travel': https://dolthub.com/blog/2021-03-09-dolt-use-cases-in-the-wi...

I wish we could use this at work. We're trying to predict time-series stuff. However, there's a lot of infrastructure complexity which is there to ensure that when we're training on data from years ago, that we're not using data that would be in the future from this point (future data leaking into the past).

Using Dolt, as far as I understand it, we could simply set the DB to a point in the past where the 'future' data wasn't available. Very cool

replies(5): >>31847959 #>>31848014 #>>31849805 #>>31849874 #>>31859003 #
1. Cthulhu_ ◴[] No.31859003[source]
I've (partially / POC) implemented time travel in a SQLite database; the TL;DR is that whenever you create a table, you add a second, identical or nearly-identical table with a `_history` suffix; the history table has a valid from and valid to. Then you add a trigger on the primary table that, on update or on delete, makes a copy of the old values into the history table, setting the 'valid_to' column to the current timestamp.

The reason I used a separate table is so that you don't have to compromise or complicate the primary table's constraints, indices and foreign keys; the history table doesn't really need those because it's not responsible for data integrity.

Anyway, once you have that, you can run queries with a `where $timestamp is between start_date and end_date` condition, which will also allow you to join many tables at a certain point in time. To also be able to get the latest version, you can use a union query (iirc).

I'm sure there's a lot of caveats there though. What I should do is take some time in the weekend and work on that POC, publish it for posterity / blog post fuel.