
73 points rockeetterark | 1 comments

As the creator of TerarkDB (acquired by ByteDance in 2019), I have developed ToplingDB in recent years.

ToplingDB is a fork of RocksDB in which we have replaced almost all components with more efficient alternatives (db_bench shows ToplingDB is about 8x faster than RocksDB):

* MemTable: SkipList is replaced by CSPP (Crash Safe Parallel Patricia trie), which is 8x faster.

* SST: BlockBasedTable is replaced by ToplingZipTable, which is built on searchable compression algorithms. It is very small and fast, typically taking less than 1μs per lookup:

  * Keys/indexes are compressed using NestLoudsTrie (a multi-layer nested LOUDS succinct trie).

  * Values in an SST are compressed together, with a better compression ratio than zstd, and a single value can be decompressed on its own at 1GB/sec.

  * BlockCache is no longer needed, so double caching (BlockCache & PageCache) is avoided.
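
The idea of compressing values together while still being able to decompress a single value can be illustrated with a shared preset dictionary. This is only a sketch and not ToplingZipTable's actual algorithm: it swaps in Python's standard `zlib` with a `zdict` as a stand-in, and the helper names (`build_shared_dictionary`, `compress_value`, `decompress_value`) and sample records are hypothetical.

```python
import zlib


def build_shared_dictionary(values, max_size=32 * 1024):
    """Naive shared dictionary: concatenate the values and keep the tail.

    Assumption for illustration only; a real engine would train a far
    better dictionary. zlib's window is 32 KiB, so more would not help.
    """
    return b"".join(values)[-max_size:]


def compress_value(value, dictionary):
    """Compress one value on its own, referencing the shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(value) + compressor.flush()


def decompress_value(blob, dictionary):
    """Decompress one value without touching any other value."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical sample values that share a lot of structure.
values = [
    b'{"user_id": 1001, "status": "active", "region": "us-east-1"}',
    b'{"user_id": 1002, "status": "active", "region": "us-west-2"}',
    b'{"user_id": 1003, "status": "closed", "region": "us-east-1"}',
]
dictionary = build_shared_dictionary(values)
blobs = [compress_value(v, dictionary) for v in values]

# Random access: only the requested blob is decompressed.
second = decompress_value(blobs[1], dictionary)
```

Because every value is compressed against the same dictionary, redundancy across values is exploited, yet each blob remains independently decodable, which is the property a block cache would otherwise have to compensate for.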

Other hotspots are also improved:

* Flushing the MemTable to L0 is omitted, which greatly reduces write amplification and is very friendly to large (GB-scale) MemTables

  * The MemTable serves as an index from key to "value position in the WAL"

  * Since WAL file content is almost always in the page cache, value content can be accessed efficiently via mmap

  * When a flush happens, the MemTable is dumped as an SST and the WAL is treated as a blob file

    * The CSPP MemTable uses integer indexes instead of physical pointers, so the in-memory format is exactly the same as the in-file format

* Prefix cache for searching candidate SSTs and prefix cache for scanning by iterators

  * Fixed-length key prefixes are cached in an array, which is binary searched as a uint array

* Distributed compaction (a superior replacement for RocksDB remote compaction)

  * Gracefully supports MergeOperator, CompactionFilter, PropertiesCollector...

  * Works out of the box, significantly reducing development effort

  * Very easy to share compaction service on spot instances for many DB nodes
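
The "MemTable as an index into the WAL" idea from the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ToplingDB's implementation: a plain `dict` stands in for the CSPP trie, the record layout (two little-endian length fields, then key and value) is invented for the example, and the class name `WalIndexedMemTable` is hypothetical.

```python
import mmap
import os
import struct

# Assumed record layout for this sketch: key length, value length, key, value.
_HEADER = struct.Struct("<II")


class WalIndexedMemTable:
    """Keeps values only in the WAL; memory holds key -> (offset, length)."""

    def __init__(self, wal_path):
        self._wal = open(wal_path, "a+b")  # append-only log, also readable
        self._end = os.fstat(self._wal.fileno()).st_size
        self._index = {}  # stand-in for the CSPP trie

    def put(self, key, value):
        record = _HEADER.pack(len(key), len(value)) + key + value
        self._wal.write(record)
        self._wal.flush()  # hand the bytes to the OS so the page cache has them
        value_offset = self._end + _HEADER.size + len(key)
        self._index[key] = (value_offset, len(value))
        self._end += len(record)

    def get(self, key):
        position = self._index.get(key)
        if position is None:
            return None
        offset, length = position
        if length == 0:
            return b""
        # Read the value straight out of the WAL through a memory mapping.
        # Mapping on every call keeps the sketch short; a real engine would
        # keep the mapping alive.
        with mmap.mmap(self._wal.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]

    def close(self):
        self._wal.close()
```

Since the value bytes are written exactly once (to the log), a flush in this model only needs to persist the index, and the log file itself can be kept as the blob file holding the values.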

Useful bonus features:

* Configuration via JSON/YAML: almost all features can be configured

* Optional embedded WebView: shows DB structures in a web browser, with refreshing pages that play like an animation

* Online update of DB configs over HTTP

MySQL integration: ToplingDB has been integrated into MySQL as MyTopling, which is forked from MyRocks with great improvements, similar to ToplingDB's improvements over RocksDB:

* WBWI (WriteBatchWithIndex): as with the MemTable, SkipList is replaced with CSPP, making it 20x faster (a larger speedup than for the MemTable).

* LockManager & LockTracker: 10x faster

* Encoding & Decoding: 5x faster

* Others ....

MyRocks has many disadvantages compared to InnoDB, while MyTopling outperforms InnoDB in almost all aspects, excluding feature differences.

We have created ~100 PRs for RocksDB, of which ~40 were accepted. Our PRs are mostly "small" changes, since big changes are unlikely to be accepted.

ToplingDB has been deployed in numerous production environments.

Everyone is welcome to use ToplingDB & MyTopling and to join the discussion at https://github.com/topling/toplingdb/discussions

esafak ◴[] No.44435437[source]
A distributed KV-store plus a relational layer makes it a competitor to NewSQL databases like TiDB, which is also based on Facebook's RocksDB.

It doesn't look like it's very actively developed: https://github.com/topling/toplingdb/pulse/monthly

To the OP who's developing it: I suggest polishing your README. Provide a simple installation tutorial, maybe a trial offering like tidbcloud.com, and comparative benchmark results, since you advertise your performance.

replies(3): >>44436913 #>>44437443 #>>44461562 #
1. jauntywundrkind ◴[] No.44436913[source]
It's quite active. They just aren't using GitHub pull requests in their workflow, which is what GitHub Pulse measures. https://github.com/topling/toplingdb/commits/memtable_as_log...