vmx

the blllog.

GeoCouch: Geospatial queries with CouchDB

2008-10-26 22:35

Notice: This blog post is outdated, please move on :)

Update (2009-09-19): There's a new GeoCouch release. More information at GeoCouch: New release 0.10.0.

After almost six months of silence I finally managed to get a prototype done (thanks Jan for keeping me motivated).

What do you get?

You get some code to play around with, enough to get a rough idea of what such a geospatial extension for CouchDB could look like. The code base isn’t polished yet, but it’s good enough to get out the door. The current version supports only one geometry type (POINT) and one operation (a bounding box search).

As CouchDB doesn’t yet allow intersecting view results with results gathered from an external service, the bounding box search returns plain text: document IDs and their coordinates.

How does it work?

GeoCouch consists of two parts, the indexer and the query processor. Both communicate with CouchDB through stdin/stdout.

Indexer (geostore)

In order to make the indexer understand which fields in the document contain geometries, a special design document is needed. As soon as a database has such a document, the database is geo-enabled and the indexer will store the geometries in a spatial index, which is a SpatiaLite database at the moment.

Every time a database in CouchDB is altered (create, delete, update), the indexer gets notified and acts accordingly to keep the spatial index in sync with CouchDB.
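To give a rough idea of such a notification loop, here is a minimal sketch. It assumes that CouchDB writes one JSON object per line with a `type` and a `db` field; the function name and the two callbacks are made up for this example and are not GeoCouch's actual code.

```python
import json

def handle_notifications(lines, reindex, drop_index):
    """Dispatch database change notifications, one JSON object per line.

    Assumption: each line looks like {"type": "updated", "db": "geodata"}.
    `reindex` and `drop_index` are hypothetical callbacks taking a db name.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        event = json.loads(line)
        db, kind = event.get("db"), event.get("type")
        if kind in ("created", "updated"):
            reindex(db)      # bring the spatial index up to date
        elif kind == "deleted":
            drop_index(db)   # the database is gone, so drop its index

# Example run with canned input instead of a real stdin stream
seen = []
handle_notifications(
    ['{"type": "created", "db": "geodata"}\n',
     '{"type": "updated", "db": "geodata"}\n',
     '{"type": "deleted", "db": "geodata"}\n'],
    reindex=lambda db: seen.append(("reindex", db)),
    drop_index=lambda db: seen.append(("drop", db)),
)
print(seen)
```

In a real setup the first argument would be `sys.stdin`, so the loop keeps running for as long as CouchDB keeps the process alive.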

Query processor (geoquery)

Processing queries with an external service is possible thanks to Paul Joseph Davis’ excellent external2 CouchDB branch. It allows queries to CouchDB to be passed along to an external service.

At the moment the result is simply the output of this service, which is plain text in our case. In the future the external service will return only document IDs, which will be passed back to the view. The result will then be the intersection of the view’s document IDs and the document IDs the external service returned.
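The planned intersection could look roughly like the following sketch. It is not something CouchDB or GeoCouch does today; the row layout (a dictionary with an `id` key) and the function name are assumptions made for illustration.

```python
def intersect_results(view_rows, external_ids):
    """Keep only the view rows whose document ID the external service returned."""
    wanted = set(external_ids)  # set lookup keeps the filter linear
    return [row for row in view_rows if row["id"] in wanted]

# Hypothetical view rows and hypothetical IDs from the external service
view_rows = [
    {"id": "myhome", "key": "home", "value": None},
    {"id": "office", "key": "work", "value": None},
]
print(intersect_results(view_rows, ["myhome", "somewhere-else"]))
```

The order of the view rows is preserved, so any sorting done by the view survives the spatial filter.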

How do I use it?

When everything is installed correctly it’s quite easy to get started.

Setting things up

  • Create a new database named geodata (could be anything).
  • Add a document named myhome. There you’ll store all the information about your home, including the coordinates. As we are only interested in a bounding box search, it’s enough to have a location:
    {
      "_id": "myhome",
      "_rev": "3358484250",
      "location": [ 151.208333, -33.869444 ]
    }
  • Add as many other documents like this as you like. Make sure all of them have a field called location with the coordinates as an array. As with the database, the name of the field could be anything, but it has to be the same in all documents.
  • Now we come to the interesting part, the special design document that geo-enables the database. The document has to be named “_design/_geocouch”. After creation it also needs some special fields and will look like this:

    {
      "_id": "_design/_geocouch",
      "_rev": "610069068",
      "srid": 4326,
      "loc": {
        "type": "POINT",
        "x": "location[0]",
        "y": "location[1]"
      }
    }

    The coordinate system that should be used is specified by an SRID. If you don’t know which value to use for srid, use 4326. It’s assumed that all geometries in your document belong to the same coordinate system.

    The other field describes where to find the geometry in the documents. The name you choose will be used for the bounding box queries; I’ve chosen loc. It defines the type (POINT) and where to find the x/y coordinates (this will probably be changed to lat/lon in the future).

    The way to specify where to find the field is comparable to XPath, but much simpler. As JSON consists of nested dictionaries and arrays, you get an element of an array by its index (e.g. location[0] is the first element of an array called location). If it is a dictionary, you specify the property separated by a dot (e.g. location.x is a property named x within another one called location). Paths can of course be nested much deeper, and they always start at the root of the document (e.g. bike.stolen.found[0]).
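The path syntax described above can be resolved with very little code. The following is a minimal sketch, not the parser GeoCouch ships; `resolve_path` and `extract_point` are names invented for this example, and malformed paths are not validated.

```python
import re

# A path is a sequence of property names and [index] parts.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")

def resolve_path(doc, path):
    """Follow a path such as 'bike.stolen.found[0]' from the document root."""
    current = doc
    for name, index in _TOKEN.findall(path):
        if name:
            current = current[name]        # dictionary property
        else:
            current = current[int(index)]  # array element
    return current

def extract_point(doc, geom_def):
    """Read the x/y coordinates using the paths from the design document."""
    return resolve_path(doc, geom_def["x"]), resolve_path(doc, geom_def["y"])

doc = {"_id": "myhome", "location": [151.208333, -33.869444]}
loc = {"type": "POINT", "x": "location[0]", "y": "location[1]"}
print(extract_point(doc, loc))
```

A missing field raises a `KeyError` or `IndexError` here; an indexer would probably want to catch those and skip the document.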

Bounding box search

And finally you can make a bounding box search. Simply browse a URL like this one (this is a bounding box that encloses the whole world):

http://localhost:5984/geodata/_external/geo?q={"geom":"loc","bbox":[-180,-90,180,90]}

The expected result is:

myhome 151.208333 -33.869444
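To make the query and its result more concrete, here is a small sketch that builds a percent-encoded query URL and applies a bounding box filter in plain Python. It assumes the bbox order is [min x, min y, max x, max y] with inclusive edges; the function names are made up for this example, and the real work in GeoCouch is done by SpatiaLite rather than by code like this.

```python
import json
from urllib.parse import quote

def build_query_url(base, geom, bbox):
    """Build the _external URL with the JSON query percent-encoded."""
    query = json.dumps({"geom": geom, "bbox": bbox}, separators=(",", ":"))
    return "%s?q=%s" % (base, quote(query))

def in_bbox(x, y, bbox):
    # Assumption: bbox is [min_x, min_y, max_x, max_y], edges inclusive.
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y

def bbox_search(points, bbox):
    """points maps document ID -> (x, y); returns one 'id x y' line per hit."""
    lines = []
    for doc_id, (x, y) in sorted(points.items()):
        if in_bbox(x, y, bbox):
            lines.append("%s %s %s" % (doc_id, x, y))
    return "\n".join(lines)

world = [-180, -90, 180, 90]
points = {"myhome": (151.208333, -33.869444)}
url = build_query_url("http://localhost:5984/geodata/_external/geo", "loc", world)
print(url)
print(bbox_search(points, world))
```

Depending on what geoquery expects, the value of q may need an extra pair of quotes around the JSON, as one of the comments below points out.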

Requirements

You’d like to give it a try? Here is a list of the software and the versions I used to get it working on my system, but others might work as well. GeoCouch includes installation/configuration instructions.

Download GeoCouch

Get GeoCouch now! It’s new, it’s free (MIT licensed).

What’s next?

The current version is meant to play with: many things are not possible yet, and many things need to be improved. But with the power of SpatiaLite (and the underlying libraries) that shouldn’t be too hard.

Therefore I hope this will only be the start and will lead to a discussion on what should be done and what other things might be possible. I’d love to hear your use cases for a geospatially enabled CouchDB.

Categories: en, CouchDB, Python, geo

Comments

  1. Sean Gillies:
    2008-10-28 07:40:22

    Cool. Format your points as GeoJSON, maybe? See http://geojson.org/geojson-spec.html#id2.

  2. adk:
    2008-10-29 20:02:19

    could you please tell me how to create a document by the name _design/_geocouch?

    thanks

  3. Volker Mische:
    2008-11-01 05:17:22

    Just as any other document. If you use the web interface, just click on "Create Document..." and name it "_design/_geocouch".

  4. Andrew Turner:
    2008-11-01 16:34:01

    This is really cool - and totally agree with @Sean, GeoJSON support for the query would be terrific.

    And maybe an OpenSearch-Geo template for bonus points!

  5. Volker Mische:
    2008-11-02 05:54:23

    Andrew,
    I haven't thought that much about what the queries will look like, thus I really appreciate your recommendations.

  6. Scott Raymond:
    2008-11-06 07:13:38

    Thanks for this! I'm very keen on getting it working, and I think I'm close, but I have hit a snag. I'm able to run the geoquery.py and geocouch.py scripts without error. But when I add the configuration lines to local.ini and restart, couchdb dies after about 30 seconds. Here's the couch.log that shows an apparent problem with the script: http://pastie.org/308530

    I'm at a loss for what the problem might be. Any advice?

  7. Scott Raymond:
    2008-11-06 09:02:09

    Nevermind... my problem was simply that the couchdb user didn't have permission to run the scripts.

    One other thing that you might clarify: the example URL you posted for the bounding box search doesn't work as-is, because geoquery.py expects the "q" param to be a string that can be eval'd as JSON. So the URL needs to be formatted like this: http://localhost:5984/geodata/_external/geo?q="{\"geom\":\"loc\",\"bbox\":[-180,-90,180,90]}"

    Other than that, everything seems to be working for me. Thanks again!


By Volker Mische
