GeoCouch: The future is now
2010-05-03 22:35
Update: This blog entry is outdated and kept for historical reasons. Please always check for newer blog posts. The up-to-date information on how to install and use GeoCouch can be found in its README.
An idea has become reality. Exactly two years after the blog post with the initial vision, a new version of GeoCouch is finished. It's a huge step forward: for the first time, the dependencies have been narrowed down to CouchDB itself. No more Python, no more SpatiaLite; it's pure Erlang. GeoCouch is tightly integrated with CouchDB, so you'll get all the nice features you love about CouchDB.
Current implementation
Thanks to the feedback after FOSS4G 2009 and the "GeoCouch: The future" blog entry, it was clear that people prefer a simple, yet powerful and tightly integrated approach rather than having too many external dependencies (which was a showstopper for quite a few people).
I implemented an R-tree (I call it vtree, as the implementation is subject to change a lot) from scratch. I haven't used the existing R-tree implementation available on GitHub for three reasons: I needed something to learn Erlang with, it doesn't contain tests or examples, and it is always a good idea to implement a data structure yourself to understand the details and problems. My implementation is far from perfect, but it works well enough for now. The vtree is implemented as an append-only data structure, just as CouchDB's B-trees are. Currently it doesn't support bulk insertion.
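To illustrate the two ideas mentioned above, here is a minimal JavaScript sketch. It is not the vtree code (which is written in Erlang), and it is deliberately not a full R-tree: there is only a single node and no splitting. It only shows how a minimum bounding rectangle lets a search skip a whole node, and how an append-only insert produces a new root while leaving the old one untouched. All names (`insert`, `search`, `enclose`, and so on) are made up for this example.

```javascript
// Boxes are [minX, minY, maxX, maxY]; a point is a box with zero extent.
const pointBox = ([x, y]) => [x, y, x, y];

// Smallest box that encloses both a and b.
const enclose = (a, b) => [
  Math.min(a[0], b[0]), Math.min(a[1], b[1]),
  Math.max(a[2], b[2]), Math.max(a[3], b[3]),
];

// True if the two boxes overlap or touch.
const intersects = (a, b) =>
  a[0] <= b[2] && b[0] <= a[2] && a[1] <= b[3] && b[1] <= a[3];

// Append-only insert: builds a new root and never modifies the old one.
function insert(root, id, point) {
  const entry = { id, mbr: pointBox(point) };
  if (root === null) return { mbr: entry.mbr, entries: [entry] };
  return {
    mbr: enclose(root.mbr, entry.mbr),
    entries: [...root.entries, entry],
  };
}

// Bounding box search: skip the node entirely if its MBR misses the query.
function search(root, bbox) {
  if (root === null || !intersects(root.mbr, bbox)) return [];
  return root.entries
    .filter((e) => intersects(e.mbr, bbox))
    .map((e) => e.id);
}

const v1 = insert(null, "oakland", [-122.270833, 37.804444]);
const v2 = insert(v1, "augsburg", [10.898333, 48.371667]);
console.log(search(v2, [0, 0, 180, 90])); // only "augsburg" matches
console.log(v1.entries.length);           // the old root still holds 1 entry
```

A real R-tree splits nodes once they grow too large and nests them several levels deep, so the pruning check pays off at every level instead of just once.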
If you want to know details on how to create your own indexer, have a look at my Indexer tutorial.
Feature set
Following the "Release early, release often" philosophy, currently only points can be inserted, and the only supported query is a bounding box search. Other geometries should follow soon.
Using GeoCouch
GeoCouch is now hosted on GitHub. Giving GeoCouch a go is easy:
git clone http://github.com/vmx/couchdb.git
cd couchdb
./bootstrap
./configure
make dev
./utils/run
Trying the spatial features once it's up and running is easy as well. Just add a spatial property and a named function to your Design Document as you would do for show or list functions:
function(doc) {
    if (doc.loc) {
        emit(doc._id, {
            type: "Point",
            coordinates: [doc.loc[0], doc.loc[1]]
        });
    }
};
All you need to do is emit GeoJSON as the value (remember that Point is the only supported geometry at the moment). The key is currently ignored.
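If you'd like to check what your function emits before putting it into a Design Document, a small stand-alone sketch like the following can help. It is only an approximation of what happens on the server: `runSpatial` and the local `emit` are hypothetical stand-ins written for this example, not part of GeoCouch or CouchDB.

```javascript
// The spatial function as a string, the way it is stored in a Design Document.
const source = `function(doc) {
  if (doc.loc) {
    emit(doc._id, {
      type: "Point",
      coordinates: [doc.loc[0], doc.loc[1]]
    });
  }
}`;

// Hypothetical harness: evaluates the function with a stand-in emit()
// that simply collects everything the function emits.
function runSpatial(fnSource, docs) {
  const emitted = [];
  const emit = (key, geometry) => emitted.push({ key, geometry });
  const fn = new Function("emit", `return (${fnSource});`)(emit);
  docs.forEach((doc) => fn(doc));
  return emitted;
}

const docs = [
  { _id: "oakland", loc: [-122.270833, 37.804444] },
  { _id: "augsburg", loc: [10.898333, 48.371667] },
  { _id: "no-location" }, // skipped, because it has no loc property
];

console.log(JSON.stringify(runSpatial(source, docs), null, 2));
```

Documents without a `loc` property emit nothing, so they never end up in the spatial index.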
curl -X PUT http://127.0.0.1:5984/places
curl -X PUT -d '{"spatial":{"points":"function(doc) {\n if (doc.loc) {\n emit(doc._id, {\n type: \"Point\",\n coordinates: [doc.loc[0], doc.loc[1]]\n });\n }};"}}' http://127.0.0.1:5984/places/_design/main
Before a bounding box query can return anything, you need to insert Documents that contain a location.
curl -X PUT -d '{"loc": [-122.270833, 37.804444]}' http://127.0.0.1:5984/places/oakland
curl -X PUT -d '{"loc": [10.898333, 48.371667]}' http://127.0.0.1:5984/places/augsburg
And finally you can make a bounding box request:
curl -X GET 'http://localhost:5984/places/_design/main/_spatial/points/%5B0,0,180,90%5D'
This one should return only augsburg:
{"query1":[{"id":"augsburg","loc":[10.898333,48.371667]}]}
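Why only augsburg? The sketch below redoes the check by hand in JavaScript. It assumes the four numbers in the request are ordered west, south, east, north (the same order GeoJSON uses for bounding boxes); `inBbox` is a helper made up for this example and not part of GeoCouch.

```javascript
// Assumed bbox order: [west, south, east, north].
function inBbox(point, bbox) {
  const [x, y] = point;
  const [west, south, east, north] = bbox;
  return x >= west && x <= east && y >= south && y <= north;
}

const places = {
  oakland: [-122.270833, 37.804444],
  augsburg: [10.898333, 48.371667],
};

const bbox = [0, 0, 180, 90];
const hits = Object.keys(places).filter((id) => inBbox(places[id], bbox));

console.log(hits); // only "augsburg": Oakland's longitude is west of 0
```

The box covers everything north of the equator and east of the prime meridian, so Oakland, with its negative longitude, falls outside.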
Next steps
The development of GeoCouch was quite slow in the past, but it is picking up speed now, as my diploma thesis (comparable to a master's thesis) will be about GeoCouch. Additionally, Couchio kindly supports the development.
The next steps are (in no particular order):
- Better R-tree (better splitting algorithm, bulk operations)
- Supporting more geometries
- Polygon search
- Improving CouchDB's plugin capabilities
Thanks
I'd like to thank all the people who kept me motivated over the past two years with their tremendous feedback. Special thanks go to Jan Lehnardt for getting me onto the Couch, Cameron Shorter for introducing me to the geospatial open source business, and all the people from Couchio for the great two weeks in Oakland.