An R-tree implementation for RocksDB
2017-02-14 22:35
It's long been my plan to implement an R-tree on top of RocksDB. Now there is a first version of it.
Getting started
Checkout the source code from my RocksDB rtree-table fork on Github, build RocksDB and the R-tree example.
git clone https://github.com/vmx/rocksdb.git
cd rocksdb
make static_lib
cd examples
make rtree_example
If you run the example it should output augsburg
:
$ ./rtree_example
augsburg
For more information about how to use the R-tree, see the Readme file of the project.
Implementation
The nice thing about LSM-trees is that the index data structures can be bulk loaded. For now for my R-tree it's just a simple bottom up building with a fixed node size (default is 4KiB). The data is pre-sorted by the low value of the first dimension. This means that data has a total order, hence also sorted results based on the first dimension. The idea is based on the paper On Support of Ordering in Multidimensional Data Structures by Filip Křižka, Michal Krátký, Radim Bača.
The tree is far from optimal, but it is a good starting point. Currently only doubles are supported. In the future I'd like to support integers, fixed size decimals and also strings.
If you have a look at the source code and cringe because of the coding style, feel free to submit pull requests (my current C++ skills are sure limited).
Next steps
Currently it's a fork of RocksDB which surely isn't ideal. I've already mentioned it in last year's FOSS4G talk about the R-tree in RocksDB (warning: autoplay) that there are several possibilities:
- Best (unlikely): Upstream merge
- Good: Add-on without additional patches
- Still OK: Be an easy to maintain fork
- Worst case: Stay a fork
I hope to work together with the RocksDB folks to find a way to make such extensions easily possible with no (or minimal) code changes. Perhaps having stable interfaces or classes that can easily be overloaded.