Build your own SQLite in Rust, Part 5: Evaluating queries

(blog.sylver.dev)

1. bfrog ◴[20 Feb 25 00:41 UTC] No.43109740[source]▶

This is a really cool set of articles, and while it’s not going to replace SQLite, it’s fantastic to see the pieces needed to do SQL with SQLite’s file format.

replies(1): >>43109976 #

2. rockwotj ◴[20 Feb 25 01:18 UTC] No.43109976[source]▶

>>43109740 #

A SQLite alternative in Rust is a thing, though:

https://github.com/tursodatabase/limbo

replies(3): >>43110098 #>>43110247 #>>43113755 #

3. MobiusHorizons ◴[20 Feb 25 01:38 UTC] No.43110098{3}[source]▶

>>43109976 #

Perhaps instead of `successor` you could say `rust fork` or `alternative`. Successor implies the original is dead, deprecated, or no longer going to be used, which is very far from the truth.

4. mmiao ◴[20 Feb 25 02:00 UTC] No.43110213[source]▶

>>43108614 (OP) #

Making a basic database that works is not hard, but making a robust one is really hard. What makes SQLite shine is its extensive test suite. I still don't understand the motivation of Limbo, as without years of hardwork on test you can't say you are correct, and why I should pay for it...

replies(3): >>43110449 #>>43110974 #>>43115007 #

5. koakuma-chan ◴[20 Feb 25 02:08 UTC] No.43110247{3}[source]▶

>>43109976 #

Can't wait to get my Blazingly Fast™ Managed Cloud SQLite for only 0.0001$ per virtual compute second.

replies(1): >>43110728 #

6. pornel ◴[20 Feb 25 02:38 UTC] No.43110449[source]▶

>>43110213 #

At this point it's too early to even worry about correctness; it doesn't work yet.

But the years of work put into the existing project to make it robust don't mean the exact same years have to be spent on the reimplementation:

- there's been work spent on discovering the right architecture and evolving the db format. A new impl can copy the end result.

- hard lessons have been learned about dealing with bad disks, filesystems, fsync, flaky locks, etc. A new impl can learn from the solutions without having to rediscover them the hard way.

- C projects spend some time on compatibility with C compilers, OSes, and tweaking build scripts, which are relatively easy in Rust.

Testing will need a clever solution. Maybe they'll buy access to the official test suite? Maybe they'll use the original SQLite to fuzz and compare results?
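
The last idea, using the original SQLite as an oracle, is usually called differential testing. A minimal sketch in Python, assuming the reimplementation exposes an `execute(sql).fetchall()` interface like the standard `sqlite3` module; here a second `sqlite3` connection stands in for it, and the table, the query generator, and the function names are all made up for illustration:

```python
import random
import sqlite3


def make_db():
    """Create an in-memory database with a small, fixed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(i, i * i % 7) for i in range(20)],
    )
    return conn


def random_query(rng):
    """Generate a simple random SELECT; a real fuzzer would cover far more SQL."""
    col = rng.choice(["a", "b"])
    op = rng.choice(["<", ">", "=", "<=", ">="])
    val = rng.randint(0, 20)
    # ORDER BY a unique column so both engines must return rows in the same order.
    return f"SELECT a, b FROM t WHERE {col} {op} {val} ORDER BY a"


def differential_test(reference, candidate, n=100, seed=0):
    """Run the same queries on both engines and collect any differing results."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    mismatches = []
    for _ in range(n):
        sql = random_query(rng)
        expected = reference.execute(sql).fetchall()
        actual = candidate.execute(sql).fetchall()
        if expected != actual:
            mismatches.append((sql, expected, actual))
    return mismatches


reference = make_db()
candidate = make_db()  # stand-in: the reimplementation would be plugged in here
mismatches = differential_test(reference, candidate)
```

Any entry in `mismatches` is a query where the two engines disagree, together with both result sets, which makes for a ready-made bug report.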

replies(1): >>43111451 #

7. pigbearpig ◴[20 Feb 25 03:22 UTC] No.43110728{4}[source]▶

>>43110247 #

I think you’re looking for LiteFS

https://fly.io/blog/introducing-litefs/

replies(1): >>43111086 #

8. n42 ◴[20 Feb 25 04:05 UTC] No.43110974[source]▶

>>43110213 #

Something being hard doesn't mean it's not worth trying.

It's up to the developers to decide if the outcome is worth the effort.

Maybe they're right, maybe they're wrong, but either way – if they succeed, we're all better off for it, aren't we?

9. wongarsu ◴[20 Feb 25 04:21 UTC] No.43111086{5}[source]▶

>>43110728 #

Also Cloudflare D1, announced just a couple months earlier:

https://blog.cloudflare.com/introducing-d1/

10. Animats ◴[20 Feb 25 05:10 UTC] No.43111337[source]▶

>>43108614 (OP) #

I wish someone would finish a decent database in Rust. At least get it to 1.0 stable and go on from there.

- Limbo: "Limbo is a work-in-progress..."

- Sled [1]. Not sure what's going on there. Last release 3 years ago, but a constant stream of "alpha" versions that never get released.

SQLite with Rust bindings seems to be the go-to system. Depending on C packages is often a headache when cross-compiling, though.

[1] https://crates.io/crates/sled/

replies(6): >>43111479 #>>43111692 #>>43111934 #>>43112102 #>>43113368 #>>43138877 #

11. crabmusket ◴[20 Feb 25 05:32 UTC] No.43111451{3}[source]▶

>>43110449 #

The Limbo team seems to be leaning heavily into deterministic simulation testing (DST) and one of the cofounders on a recent podcast was very enthusiastic about the benefits of the approach.

https://github.com/tursodatabase/limbo/tree/main/simulator

https://changelog.com/podcast/626

12. koakuma-chan ◴[20 Feb 25 05:36 UTC] No.43111479[source]▶

>>43111337 #

I recommend `rusqlite` or `heed` (`lmdb` wrapper, key-value database). I think `sled` is abandoned.

13. tyushk ◴[20 Feb 25 06:18 UTC] No.43111692[source]▶

>>43111337 #

SurrealDB [1] is a fairly complete database written in Rust. I've used it for fairly small web apps and it felt comfortable to work with coming from MongoDB.

[1] https://github.com/surrealdb/surrealdb?tab=readme-ov-file

replies(1): >>43111801 #

14. Tanjreeve ◴[20 Feb 25 06:41 UTC] No.43111801{3}[source]▶

>>43111692 #

SurrealDB is still really coy about its performance and what it's good or not good at, which makes it hard to adopt for a major data project. There are lots of features but no real indication as to whether I could scale them for a dataset of billions of records. I've had my fingers burnt too many times before by products with a big table of tick-box features, none of which are really usable (e.g. geospatial data comes up for me a lot).

Either you need to make it easy and zero-friction to adopt, like DuckDB, and let people find out for themselves in an hour or two, or you need to provide some sort of benchmarks + evidence that it isn't going to die on its arse the moment you put larger-than-memory amounts of data in.

Nearly all of these projects work fine for in-memory-size datasets, but only finding out after you've put major effort into adoption + integration isn't really easy for someone working with data when you have something battle-tested like Postgres et al.

replies(1): >>43131421 #

15. fooker ◴[20 Feb 25 07:08 UTC] No.43111934[source]▶

>>43111337 #

Starting projects is fun and easy. Finishing is not.

16. n_plus_1_acc ◴[20 Feb 25 07:40 UTC] No.43112102[source]▶

>>43111337 #

InfluxDB is great.

17. noirscape ◴[20 Feb 25 11:08 UTC] No.43113368[source]▶

>>43111337 #

The problem that Rust runs into is that the community settled on wanting enforced semver extremely quickly, combined with an overall tendency to really not want to do post-1.0 releases if they can help it.

The result is that most Rust crates sit on the 0.x version forever (since anything before 1.0 in semver is pretty much a free-for-all), even though they're probably going to be perfectly usable in most cases with the same library breakage you expect in other languages.
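
For what it's worth, Cargo's default (caret) requirements do give 0.x versions some structure: the leftmost non-zero component is treated as the breaking one, so `0.2.3` accepts `0.2.9` but not `0.3.0`. A rough sketch of that rule in Python, ignoring pre-release tags and build metadata; the function names are made up for illustration and this is not Cargo's actual implementation:

```python
def parse(version):
    """Split 'major.minor.patch' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_range(requirement):
    """Return (lower, upper) bounds; upper is exclusive."""
    major, minor, patch = parse(requirement)
    if major > 0:
        upper = (major + 1, 0, 0)   # 1.2.3 -> below 2.0.0
    elif minor > 0:
        upper = (0, minor + 1, 0)   # 0.2.3 -> below 0.3.0
    else:
        upper = (0, 0, patch + 1)   # 0.0.3 -> below 0.0.4
    return (major, minor, patch), upper


def is_compatible(requirement, candidate):
    """True if `candidate` satisfies the caret requirement `requirement`."""
    lower, upper = caret_range(requirement)
    return lower <= parse(candidate) < upper
```

So a crate that stays on 0.x still gets automatic patch-level updates, but every minor bump counts as breaking.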

IMO it just shows another dent in thinking that a simple string can confer any information about the API contracts on offer. Semver is a neat indicator, but practically that's all it is.

18. usrbinbash ◴[20 Feb 25 12:10 UTC] No.43113755{3}[source]▶

>>43109976 #

"alternative" implies that this is either on a comparable level of battle-testedness and feature completeness as SQLite, or that it solves problems that SQLite has, or that it delivers substantial, or at least noticeable, advantages over SQLite.

So, which of these points apply?

replies(1): >>43115907 #

19. mamcx ◴[20 Feb 25 14:25 UTC] No.43115007[source]▶

>>43110213 #

> without years of hardwork on test you can't say you are correct

I used to have the myth in my head that RDBMS development is like cryptography or kernels: something not meant for humans to do.

Now that I work on one (http://spacetimedb.com), I see that it is hard, but doable.

Certainly EASIER than working on ERPs, which is what I have done for more than 20 years (now THAT is what is insane). Literally, building an RDBMS is more relaxing than building ERPs.

But what I have learned is that what makes an RDBMS truly hard is not the usual suspect, ACID.

You don't need 'years of hardwork' to achieve it and to prove it is right (only some insanity when you need to target very weird OSes/architectures and deal with faulty hardware).

What is truly hard and will eat your time is SQL. SQL is hard. It is *bad*. It is *very hard* to optimize, and so anemic that most of your time is spent on the query optimizer (which has a hard time because of SQL more than anything else). And because SQL is bad, I tell you, it leads to insane stuff like a single query with 1000 joins, and people have no means to produce good, optimal designs (what you do have, like CTEs and all that, are poor band-aids), so now you get a bad input and have fun.

But if the RDBMS sticks to the relational part and you can make a better language to interface with it, then it is far easier. The query optimizer is still a big part of your time, but if you control the input language you can make your life easier.

replies(1): >>43115969 #

20. sejje ◴[20 Feb 25 15:33 UTC] No.43115907{4}[source]▶

>>43113755 #

I don't think it implies any of that.

I think it means you might use it in similar situations. Like when you want an in-memory database.

replies(1): >>43117934 #

21. whstl ◴[20 Feb 25 15:37 UTC] No.43115969{3}[source]▶

>>43115007 #

This is interesting.

I completely get why SQL is important, and as a user I love it as a language, but as an application writer I have a very complicated relationship with it. It requires either inline SQL (which carries its own security risks and causes a bit of redundancy), or we gotta use complex abstractions on top of it, like ORMs. Tests also require spinning up a database or some extra machinery.
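
To make the inline-SQL risk concrete, here is a small sketch using Python's standard `sqlite3` module. The `users` table and the two helper functions are hypothetical; the point is only the difference between interpolating input into the query string and binding it as a parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])


def find_unsafe(name):
    """Builds the query by string interpolation: input can rewrite the SQL."""
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()


def find_safe(name):
    """Binds the input as a parameter: it is only ever treated as a value."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"
unsafe_rows = find_unsafe(payload)  # the WHERE clause becomes always-true
safe_rows = find_safe(payload)      # no user is literally named like the payload
```

With the interpolated version the payload matches every row; with the bound parameter it matches none.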

It feels like we're making our lives more complicated by "requiring" a database to have SQL.

I would be totally OK with adopting a non-SQL relational database with a more structured API in greenfield projects. (Btw, I will definitely be checking out your company.)

22. usrbinbash ◴[20 Feb 25 17:56 UTC] No.43117934{5}[source]▶

>>43115907 #

No, I might not. In these similar situations, I will use SQLite, unless someone can point out one of the items listed above as a reason to use something else.

replies(1): >>43120386 #

23. NoahKAndrews ◴[20 Feb 25 21:23 UTC] No.43120386{6}[source]▶

>>43117934 #

Just because you don't like it doesn't make it not an alternative. Even if it's objectively worse, it's still an alternative.

replies(1): >>43125614 #

24. usrbinbash ◴[21 Feb 25 09:18 UTC] No.43125614{7}[source]▶

>>43120386 #

> Just because you don't like it

Read my post again. Sympathy or lack thereof doesn't factor into this equation. This is about battle-testedness and features. I don't make technical decisions based on emotions.

25. mattturck ◴[21 Feb 25 18:55 UTC] No.43131421{4}[source]▶

>>43111801 #

Benchmarking is here: https://surrealdb.com/blog/beginning-our-benchmarking-journe...

replies(1): >>43148376 #

26. anarki8 ◴[22 Feb 25 13:39 UTC] No.43138877[source]▶

>>43111337 #

redb is what you are looking for

https://github.com/cberner/redb

27. Tanjreeve ◴[23 Feb 25 10:46 UTC] No.43148376{5}[source]▶

>>43131421 #

It's great that they got some out two weeks ago. It has taken a couple of years to get here.