SQLite File Format Viewer

(sqlite-internal.pages.dev)
272 points ilumanty | 12 comments
jchw ◴[] No.43684392[source]
This really would've come in handy when I was debugging my own SQLite parser a couple weeks ago.

One thing that initially confused me was how exactly the pages worked w.r.t. the first page on disk... I misunderstood the SQLite documentation in different ways, but it's really rather simple: the very first page just holds the 100-byte file header at its start, which pushes down the rest of the data and leaves that page with less usable space than the other pages. You can see that illustrated clearly if you click into the first page of a database using this tool: the database header comes first, then the page header.
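To make that layout concrete, here is a minimal Python sketch that reads the first 108 bytes of a database file: the 100-byte file header, then the 8-byte B-tree page header that follows it on page 1. The helper names (`make_demo_db`, `inspect_first_page`) are made up for illustration, and the sketch assumes page 1 is a leaf page, which holds for a small database but not for one with a large schema.

```python
import os
import sqlite3
import struct
import tempfile


def make_demo_db():
    """Create a tiny throwaway database and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute("INSERT INTO t (name) VALUES ('hello')")
    con.commit()
    con.close()
    return path


def inspect_first_page(path):
    """Return a few fields from the file header and the page-1 B-tree header."""
    with open(path, "rb") as f:
        file_header = f.read(100)  # database file header: bytes 0..99
        page_header = f.read(8)    # on page 1 the B-tree header starts at byte 100

    (page_size,) = struct.unpack(">H", file_header[16:18])
    if page_size == 1:             # the stored value 1 means 65536 bytes
        page_size = 65536

    # Leaf page header: type, first freeblock, cell count,
    # start of cell content area, fragmented free bytes.
    page_type, _freeblock, cell_count, content_start, _frag = struct.unpack(
        ">BHHHB", page_header
    )
    return {
        "magic": file_header[:16],
        "page_size": page_size,
        "page_type": page_type,    # 0x0D is a leaf table B-tree page
        "cell_count": cell_count,
        "content_start": content_start,
    }


if __name__ == "__main__":
    demo_path = make_demo_db()
    print(inspect_first_page(demo_path))
    os.remove(demo_path)
```

On every page after the first, the same B-tree header would sit at offset 0 of the page instead of offset 100.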

This tool will undoubtedly come in handy for anyone who has to deal with SQLite data structures directly, for whatever reason, especially since the SQLite documentation is a bit terse at times.

replies(3): >>43684613 #>>43688207 #>>43692316 #
1. hinkley ◴[] No.43684613[source]
I really want a data format that is effectively binary JSON. What subset of SQLite's features would give you a compact data set, either read-only or updatable, but with better searchability than a streaming parser?
replies(6): >>43684710 #>>43685626 #>>43685952 #>>43687507 #>>43688023 #>>43690668 #
2. jchw ◴[] No.43684710[source]
If you want to maintain the properties that SQLite has for read use cases, you'll need to replicate a couple of features. At the very least, you'll probably want the format to still be page-based with a BTree structure. You really could get away with just using the SQLite format if you didn't mind the weirdness; a functional SQLite parser that can read tables would not be a significant amount of code. I think, though, that if you want to read the schema as SQLite understands it, you'd need to interpret the CREATE TABLE syntax, which would make it a bit more complex for sure. Otherwise, you can read tables and columns themselves relatively easily, and the values are all stringified.
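As a rough illustration of how small such a reader can be, here is a Python sketch that decodes the schema table rows straight from page 1 of a file. It is a sketch under loud assumptions, not a full parser: page 1 must be a leaf table page, no record may spill onto overflow pages, and text is assumed to be UTF-8. All function names are invented for this example. Note that the `sql` column comes back as raw CREATE TABLE text, which is the part you would still have to interpret yourself.

```python
import os
import sqlite3
import struct
import tempfile


def make_demo_db():
    """Create a tiny throwaway database and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    con.commit()
    con.close()
    return path


def read_varint(buf, pos):
    """Decode a SQLite varint: up to 9 bytes, big-endian, 7 bits per byte."""
    value = 0
    for i in range(9):
        byte = buf[pos + i]
        if i == 8:  # the ninth byte contributes all 8 bits
            return (value << 8) | byte, pos + 9
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos + i + 1


def decode_value(serial_type, buf, pos):
    """Decode one record column according to its serial type."""
    if serial_type == 0:
        return None, pos
    if serial_type in (8, 9):  # the constants 0 and 1, stored in zero bytes
        return serial_type - 8, pos
    if 1 <= serial_type <= 6:
        size = (1, 2, 3, 4, 6, 8)[serial_type - 1]
        return int.from_bytes(buf[pos:pos + size], "big", signed=True), pos + size
    if serial_type == 7:
        return struct.unpack(">d", buf[pos:pos + 8])[0], pos + 8
    if serial_type >= 12:
        size = (serial_type - 12) // 2
        raw = bytes(buf[pos:pos + size])
        # Even serial types are BLOBs, odd ones are text (UTF-8 assumed here).
        return (raw if serial_type % 2 == 0 else raw.decode("utf-8")), pos + size
    raise ValueError(f"reserved serial type {serial_type}")


def read_schema(path):
    """Return the schema rows (type, name, tbl_name, rootpage, sql) from page 1."""
    with open(path, "rb") as f:
        (page_size,) = struct.unpack(">H", f.read(100)[16:18])
        page_size = 65536 if page_size == 1 else page_size
        f.seek(0)
        page = f.read(page_size)

    base = 100  # the page header follows the file header on page 1
    if page[base] != 0x0D:
        raise ValueError("this sketch only handles a leaf table page")
    (cell_count,) = struct.unpack(">H", page[base + 3:base + 5])

    rows = []
    for i in range(cell_count):
        # Cell pointers are offsets from the start of the page.
        (cell,) = struct.unpack(">H", page[base + 8 + 2 * i:base + 10 + 2 * i])
        _payload_size, pos = read_varint(page, cell)
        _rowid, pos = read_varint(page, pos)
        record_start = pos
        header_size, pos = read_varint(page, pos)
        header_end = record_start + header_size
        serial_types = []
        while pos < header_end:
            serial_type, pos = read_varint(page, pos)
            serial_types.append(serial_type)
        values = []
        for serial_type in serial_types:
            value, pos = decode_value(serial_type, page, pos)
            values.append(value)
        rows.append(tuple(values))
    return rows
```

Reading an ordinary table works the same way once you follow `rootpage` to the right page, with interior pages and overflow chains as the main extra work.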
replies(1): >>43685643 #
3. Retr0id ◴[] No.43685626[source]
sqlite itself supports a binary encoding of JSON: https://sqlite.org/jsonb.html
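For anyone who wants to try it, a small Python sketch using the standard `sqlite3` module follows. It assumes the linked SQLite library is 3.45.0 or newer, which is where `jsonb()` was added, and returns `None` otherwise. The function name `jsonb_round_trip` is made up for this example.

```python
import json
import sqlite3


def jsonb_round_trip(doc):
    """Encode doc as JSONB inside SQLite and decode it back to text JSON.

    Returns (blob, decoded_doc), or None when the linked SQLite library
    is older than 3.45.0 and has no jsonb() function.
    """
    if sqlite3.sqlite_version_info < (3, 45, 0):
        return None
    text = json.dumps(doc)
    con = sqlite3.connect(":memory:")
    blob, back = con.execute(
        "SELECT jsonb(?), json(jsonb(?))", (text, text)
    ).fetchone()
    con.close()
    return blob, json.loads(back)


if __name__ == "__main__":
    print(jsonb_round_trip({"a": 1, "b": [1, 2]}))
```

The blob is SQLite's internal representation, so it is mainly useful for storing JSON in a column and querying it with the JSON functions, not as a general interchange format.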
replies(1): >>43685896 #
4. hinkley ◴[] No.43685643[source]
Yeah, if I wasn’t clear, I’m talking about a minimal file that SQLite can still open read-only without errors, not a third-party implementation. Though there might be a few tweaks that would allow SQLite to be a bit more lenient: for instance, missing metadata that can be assumed, or maybe B-tree nodes exceeding the usual load factor.
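For reference, opening a file strictly read-only from Python looks roughly like the sketch below, which uses SQLite's URI filename syntax with `mode=ro`. The helper names are invented for this example, and the demo database is an ordinary one, not the stripped-down file described above.

```python
import os
import pathlib
import sqlite3
import tempfile


def make_demo_db():
    """Create a tiny throwaway database and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute("INSERT INTO t (name) VALUES ('hello')")
    con.commit()
    con.close()
    return path


def open_read_only(path):
    """Open an existing database file read-only via a file: URI."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


if __name__ == "__main__":
    demo_path = make_demo_db()
    con = open_read_only(demo_path)
    print(con.execute("SELECT name FROM t").fetchall())
    try:
        con.execute("INSERT INTO t (name) VALUES ('nope')")
    except sqlite3.OperationalError as exc:
        print("write rejected:", exc)
    con.close()
    os.remove(demo_path)
```

Reads go through as usual, while any write attempt on the connection is rejected with an `OperationalError`.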
5. hinkley ◴[] No.43685896[source]
When I said binary JSON I didn’t mean literal JSON. I meant “common denominator interchange format”. JSON itself is too chatty by far and has dismal performance for queries. So you’re better off asking a specific question and getting a larger document that could answer many questions that you don’t yet have. For CDNs things like this matter a lot.
replies(1): >>43689093 #
6. cwmma ◴[] No.43685952[source]
Parquet or some other column oriented data format is probably closest to what you want without getting into indexing your flat files or similar
7. w10-1 ◴[] No.43687507[source]
MongoDB's BSON?
replies(1): >>43697578 #
8. 79a6ed87 ◴[] No.43688023[source]
Have you tried MessagePack[0]?

0: https://msgpack.org/index.html

replies(1): >>43697567 #
9. ◴[] No.43689093{3}[source]
10. ◴[] No.43690668[source]
11. hinkley ◴[] No.43697567[source]
I would probably just use BSON or gRPC. As I clarified elsewhere, I meant JSON as an analogy. I want something that can be scanned and queried cheaply.
12. hinkley ◴[] No.43697578[source]
Mongo sits on a throne of lies and I will never condone anyone using it for any purpose except to make a joke.