<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Kukkaisvoima version 7" -->
<rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
>
<channel>
<title>vmx: CouchDB</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi</link>
<description>Blog of Volker Mische</description>
<pubDate>Wed, 14 Jul 2010 12:33:42 +0200</pubDate>
<lastBuildDate>Wed, 14 Jul 2010 12:33:42 +0200</lastBuildDate>
<generator>http://23.fi/kukkaisvoima/</generator>
<language>en</language>
<item>
<title>How I met CouchDB
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/how-i-met-couchdb%3A2010-07-14%3Aen%2CCouchDB</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/how-i-met-couchdb%3A2010-07-14%3Aen%2CCouchDB#comments</comments>
<pubDate>Wed, 14 Jul 2010 12:33:42 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/how-i-met-couchdb%3A2010-07-14%3Aen%2CCouchDB/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>It was a Saturday in late April 2008. I was sitting at my laptop in my 5m²
room down under, chatting with some German people I had been chatting with for
about 8 years by that time. Suddenly I discovered that
<a href="http://jan.prima.de/">Jan</a> was there, whom I hadn't talked to
for years. When I asked why he was there, he replied that he wanted to brag
about his apache.org email address. This is how I found out about
<a href="http://couchdb.apache.org/">CouchDB</a>.
</p>
<p>After several long discussions with Jan I finally wrapped my head around
the document-oriented concept. I was blown away: it was exactly what I would
have liked to use on so many occasions during my one-year internship at a
geospatial company. CouchDB wasn't ready for that, though: I needed spatial indexing.
One week later I
<a href="http://vmx.cx/cgi-bin/blog/index.cgi/couchdb-and-geodata%3A2008-05-03%3Aen%2Cgeo%2CCouchDB">had
a first idea of what such an extension might look like</a>.</p>

<p>And only 2 years later I'm really involved in CouchDB and people actually
<a href="http://pdxapi.com/">start</a>
<a href="http://simonmetson.posterous.com/import-a-bunch-of-geo-location-data-into-geoc">using</a>
<a href="http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future-is-now:2010-05-03:en,CouchDB,Python,Erlang,geo">GeoCouch</a>
:) I'd like to use this blog post to thank the developers and the whole
community, it's been a great time and the
<a href="irc://irc.freenode.net/#couchdb">IRC channel</a> just kicks ass.
You all helped to make CouchDB 1.0 possible!</p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/how-i-met-couchdb%3A2010-07-14%3Aen%2CCouchDB/feed/</wfw:commentRss>
</item>
<item>
<title>GeoCouch talk in Augsburg
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-vortrag-in-augsburg%3A2010-07-07%3Ade%2CCouchDB%2CGeoCouch%2CErlang%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-vortrag-in-augsburg%3A2010-07-07%3Ade%2CCouchDB%2CGeoCouch%2CErlang%2Cgeo#comments</comments>
<pubDate>Wed, 07 Jul 2010 12:09:42 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>de</category>
<category>CouchDB</category>
<category>GeoCouch</category>
<category>Erlang</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-vortrag-in-augsburg%3A2010-07-07%3Ade%2CCouchDB%2CGeoCouch%2CErlang%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>As part of the diploma students' colloquium (Diplomandencolloquium) of the
<a href="http://www.geo.uni-augsburg.de/lehrstuehle/humgeo/">Lehrstuhl für
Humangeographie und Geoinformatik</a> (Chair of Human Geography and
Geoinformatics), I will give a talk about GeoCouch at the
<a href="http://www.uni-augsburg.de/">University of Augsburg</a> on
19 July 2010 at 17:30 (room 2125). The exact title is:
</p>
<p>GeoCouch: Eine Erweiterung für CouchDB zur Abfrage räumlicher Daten
(GeoCouch: An extension for CouchDB for querying spatial data)
</p>
<p>The talk is aimed at geographers, so it won't go too deep into the details
of the implementation. No prior knowledge of CouchDB is required either. So
anyone who wants to know more about CouchDB and GeoCouch is cordially invited.
Afterwards I'll of course be available for questions.
</p>
<p>I have no idea how big the CouchDB community around Augsburg is, but if
anyone accepts this invitation, there's nothing to be said against a small
CouchDB/GeoCouch/NoSQL meetup afterwards. It's best to drop me a mail,
because if a few people are sure to come, others will certainly consider it
as well.</p>

<p><em>Note for Planet CouchDB readers: the talk itself will be in
German.</em></p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-vortrag-in-augsburg%3A2010-07-07%3Ade%2CCouchDB%2CGeoCouch%2CErlang%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>Bolsena hacking event
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/bolsena-hacking-event-2010%3A2010-06-11%3Aen%2CCouchDB%2CJavaScript%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/bolsena-hacking-event-2010%3A2010-06-11%3Aen%2CCouchDB%2CJavaScript%2Cgeo#comments</comments>
<pubDate>Fri, 11 Jun 2010 22:02:24 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>JavaScript</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/bolsena-hacking-event-2010%3A2010-06-11%3Aen%2CCouchDB%2CJavaScript%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>The <a href="http://wiki.osgeo.org/wiki/Bolsena_Code_Sprint_2010">OSGeo hacking event in Bolsena/Italy</a> was great. Many interesting people
sitting the whole day in front of their laptops, surrounded by beautiful
scenery and nice warm sunny weather. It gets even better when you get meat for
lunch and dinner.</p>

<p>I had the chance to tell people a bit more about
<a href="http://couchdb.apache.org/">CouchDB</a> and
<a href="http://github.com/couchapp/couchapp/">Couchapps</a>.</p>

<p>One project I hadn't heard much about before was
<a href="http://deegree.org/">deegree</a>. They build the whole stack of OGC services
you could imagine. For me it was of interest that they have a
blob storage in their upcoming 3.0 release. The data
isn't flattened into SQL tables but stored as blobs. This sounds like a good use
case for a CouchDB backend in the future.</p>

<p>I was working with <a href="http://wiki.osgeo.org/wiki/User:Simonp">Simon Pigot</a> on a
<a href="http://geonetwork-opensource.org/">GeoNetwork</a>
re-implementation
based on CouchDB using Couchapp. We got the basics working: putting an XML
document into the database, editing it and returning the new document, as well
as fulltext indexing with
<a href="http://github.com/rnewson/couchdb-lucene">couchdb-lucene</a>.
The next steps are improving the JSON to XML mapping and integrating spatial
search based on <a href="http://github.com/vmx/couchdb">GeoCouch</a>.</p>

<p>The event was really enjoyable, thanks <a href="http://couch.io/">Couchio</a> for
sponsoring the trip, thanks <a href="http://www.ticheler.net/">Jeroen</a> for
organizing it, and thanks to all the other hackers who made it such an awesome event.
Hope to see you next year!</p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/bolsena-hacking-event-2010%3A2010-06-11%3Aen%2CCouchDB%2CJavaScript%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>FOSS4G 2010: I'm speaking
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2010-im-speaking%3A2010-05-21%3Aen%2CGeoCouch%2CCouchDB%2CErlang%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2010-im-speaking%3A2010-05-21%3Aen%2CGeoCouch%2CCouchDB%2CErlang%2Cgeo#comments</comments>
<pubDate>Fri, 21 May 2010 14:11:59 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>GeoCouch</category>
<category>CouchDB</category>
<category>Erlang</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2010-im-speaking%3A2010-05-21%3Aen%2CGeoCouch%2CCouchDB%2CErlang%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<div class="figure">
  <a href="http://2010.foss4g.org">
    <img src="http://2010.foss4g.org/images/logo_145x90_speaking.jpg"
    alt="FOSS4G Conference - I'm Speaking!"  width="145" height="90"/>
 </a>
</div>

<p>I did it! I'll speak at the <a href="http://2010.foss4g.org/">FOSS4G
Conference 2010</a> (Free and Open Source Software for Geospatial Conference),
6th–9th September in Barcelona about “GeoCouch: A spatial index for CouchDB”.
As soon as the abstract is available online I'll link to it. Hope to see you
there!
</p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2010-im-speaking%3A2010-05-21%3Aen%2CGeoCouch%2CCouchDB%2CErlang%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>GeoCouch: The future is now
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future-is-now%3A2010-05-03%3Aen%2CCouchDB%2CPython%2CErlang%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future-is-now%3A2010-05-03%3Aen%2CCouchDB%2CPython%2CErlang%2Cgeo#comments</comments>
<pubDate>Mon, 03 May 2010 10:29:55 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>Python</category>
<category>Erlang</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future-is-now%3A2010-05-03%3Aen%2CCouchDB%2CPython%2CErlang%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>An idea has become reality. Exactly two years after the
<a href="http://vmx.cx/cgi-bin/blog/index.cgi/couchdb-and-geodata:2008-05-03:en,geo,CouchDB">blog post with the initial vision</a>,
a new version of GeoCouch is finished. It's a huge step forward. For the first time,
the dependencies are narrowed down to <a href="http://couchdb.apache.org">CouchDB</a>
itself. No <a href="http://www.python.org">Python</a>,
no <a href="http://www.gaia-gis.it/spatialite/">SpatiaLite</a> any longer, it's pure
<a href="http://www.erlang.org">Erlang</a>. GeoCouch is tightly integrated with
CouchDB, so you'll get all the nice features you love about CouchDB.</p>

<h3>Current implementation</h3>

<p>Thanks to the feedback after the FOSS4G 2009 and
<a href="http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future:2009-12-20:en,CouchDB,Python,geo">"GeoCouch: The future" blog entry</a>,
it was clear that people prefer a simple, yet powerful and tightly integrated
approach, rather than having too many external dependencies (which was a
showstopper for quite a few people).</p>

<p>I implemented an R-tree (I call it vtree as the implementation is
subject to change a lot) from scratch. The reasons why I haven't used
the <a href="http://github.com/cchandler/RTreeCouchDB">already existing R-tree implementation available at
Github</a> are that I needed
something to learn Erlang with, that it doesn't contain tests or examples, and
that it is always a good idea to implement a data structure yourself
to understand the details/problems. My implementation is far from
being perfect but works well enough for now. The vtree is implemented
as an append-only data structure just as CouchDB's B-trees
are. Currently it doesn't support bulk insertion.</p>
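
<p>To illustrate the general idea behind an R-tree, here is a minimal Python
sketch. It is not GeoCouch's Erlang vtree code, and all function names are made
up for illustration. Every node stores the minimum bounding rectangle (MBR) of
what is below it, so a bounding box search can skip whole subtrees whose MBR
does not intersect the query.</p>

```python
# Hypothetical two-level R-tree sketch; not the actual vtree implementation.

def mbr(boxes):
    """Return the minimum bounding rectangle (minx, miny, maxx, maxy) of boxes."""
    minxs, minys, maxxs, maxys = zip(*boxes)
    return (min(minxs), min(minys), max(maxxs), max(maxys))

def intersects(a, b):
    """True if the rectangles a and b overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def point(x, y):
    """A point is stored as a rectangle with zero extent."""
    return (x, y, x, y)

def make_leaf(entries):
    """entries: list of (doc_id, box) tuples."""
    return {"mbr": mbr([box for _, box in entries]), "entries": entries}

def make_inner(children):
    """An inner node only keeps the MBR of its children."""
    return {"mbr": mbr([child["mbr"] for child in children]), "children": children}

def search(node, query):
    """Collect the IDs of all entries whose box intersects the query box."""
    if not intersects(node["mbr"], query):
        return []  # prune the whole subtree
    if "entries" in node:
        return [doc_id for doc_id, box in node["entries"] if intersects(box, query)]
    found = []
    for child in node["children"]:
        found.extend(search(child, query))
    return found

tree = make_inner([
    make_leaf([("oakland", point(-122.270833, 37.804444))]),
    make_leaf([("augsburg", point(10.898333, 48.371667))]),
])
result = search(tree, (0, 0, 180, 90))
print(result)
```

<p>A real implementation additionally has to decide how to split nodes that
get too full, which is where most of the complexity lies.</p>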

<p>If you want to know details on how to create your own indexer, have a look at
my <a href="http://vmx.cx/couchdb/tutorial/indexer.html">Indexer tutorial</a>.</p>

<h3>Feature set</h3>

<p>Following the "Release early, release often" philosophy, currently only points
can be inserted and the only supported query is a bounding box search.
Other geometries should follow soon.</p>

<h3>Using GeoCouch</h3>

<p>GeoCouch is now <a href="http://github.com/vmx/couchdb/tree/geocouch">hosted
at Github</a>. Giving GeoCouch a go is easy:</p>

<pre><code>git clone http://github.com/vmx/couchdb.git
cd couchdb
./bootstrap
./configure
make dev
./utils/run
</code></pre>

<p>Trying the spatial features once it's up and running is easy as well. Just add
a <code>spatial</code> property and a named function to your Design Document as you
would do for
<a href="http://wiki.apache.org/couchdb/Formatting_with_Show_and_List">show or list functions</a>:</p>

<pre><code>function(doc) {
    if (doc.loc) {
        emit(doc._id, {
            type: "Point",
            coordinates: [doc.loc[0], doc.loc[1]]
        });
    }
};
</code></pre>

<p>All you need to do is emit <a href="http://geojson.org">GeoJSON</a> as the value
(remember that <code>Point</code> is the only supported geometry at the moment); the
key is currently ignored.</p>

<pre><code>curl -X PUT http://127.0.0.1:5984/places
curl -X PUT -d '{"spatial":{"points":"function(doc) {\n    if (doc.loc) {\n        emit(doc._id, {\n            type: \"Point\",\n            coordinates: [doc.loc[0], doc.loc[1]]\n        });\n    }};"}}' http://127.0.0.1:5984/places/_design/main
</code></pre>
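
<p>Escaping the function by hand for the command line is error-prone. As a
minimal sketch (assuming Python 3 and only its standard library; the variable
names are made up for illustration), the same request body can be generated
with <code>json.dumps</code>:</p>

```python
import json

# The spatial function from above, as plain JavaScript source.
spatial_function = """function(doc) {
    if (doc.loc) {
        emit(doc._id, {
            type: "Point",
            coordinates: [doc.loc[0], doc.loc[1]]
        });
    }
};"""

# Design document with one spatial function named "points".
design_doc = {"spatial": {"points": spatial_function}}

# json.dumps takes care of escaping the quotes and newlines.
body = json.dumps(design_doc)
print(body)
```

<p>The printed string can then be used as the body of the <code>PUT</code>
request to <code>/places/_design/main</code>.</p>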

<p>Before a bounding box query can return anything, you need to insert Documents
that contain a location.</p>

<pre><code>curl -X PUT -d '{"loc": [-122.270833, 37.804444]}' http://127.0.0.1:5984/places/oakland
curl -X PUT -d '{"loc": [10.898333, 48.371667]}' http://127.0.0.1:5984/places/augsburg
</code></pre>

<p>And finally you can make a bounding box request:</p>

<pre><code>curl -X GET 'http://localhost:5984/places/_design/main/_spatial/points/%5B0,0,180,90%5D'
</code></pre>

<p>This one should return only <code>augsburg</code>:</p>

<pre><code>{"query1":[{"id":"augsburg","loc":[10.898333,48.371667]}]}
</code></pre>
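
<p>Why only <code>augsburg</code>? The following Python sketch builds the
request URL and applies the same bounding box test locally to the two documents
inserted above. It assumes that the bounding box is given as west, south, east,
north and that <code>loc</code> is longitude, latitude; the function names are
made up for illustration.</p>

```python
from urllib.parse import quote

def bbox_url(base, bbox):
    """Build a bounding box URL in the form used above."""
    raw = "[" + ",".join(str(value) for value in bbox) + "]"
    # Keep the commas readable, percent-encode the square brackets.
    return base + "/" + quote(raw, safe=",")

def in_bbox(loc, bbox):
    """True if loc = [x, y] lies inside bbox = (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bbox
    return minx <= loc[0] <= maxx and miny <= loc[1] <= maxy

places = {
    "oakland": [-122.270833, 37.804444],
    "augsburg": [10.898333, 48.371667],
}
bbox = (0, 0, 180, 90)

url = bbox_url("http://localhost:5984/places/_design/main/_spatial/points", bbox)
matches = [doc_id for doc_id, loc in places.items() if in_bbox(loc, bbox)]
print(url)
print(matches)
```

<p>Oakland has a negative longitude, so it falls outside a box that starts at
0.</p>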

<h3>Next steps</h3>

<p>The development of GeoCouch was quite slow in the past, but it is picking up
speed as my diploma thesis (comparable to a master's thesis) will be
about GeoCouch. Additionally <a href="http://www.couch.io/">Couchio</a> kindly
supports the development.</p>

<p>The next steps are (in no particular order):</p>

<ul>
<li>Better R-tree (better splitting algorithm, bulk operations)</li>
<li>Supporting more geometries</li>
<li>Polygon search</li>
<li>Improving CouchDB's plugin capabilities</li>
</ul>

<h3>Thanks</h3>

<p>I'd like to thank all the people that kept me motivated over the past two
years with their tremendous feedback. Special thanks go to
<a href="http://jan.prima.de/">Jan Lehnardt</a> for getting me onto the Couch,
<a href="http://cameronshorter.blogspot.com/">Cameron Shorter</a> for introducing me
to the geospatial open source business and all the people from
<a href="http://www.couch.io/">Couchio</a> for the
great two weeks in Oakland.</p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future-is-now%3A2010-05-03%3Aen%2CCouchDB%2CPython%2CErlang%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>GeoCouch: The future
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future%3A2009-12-20%3Aen%2CCouchDB%2CPython%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future%3A2009-12-20%3Aen%2CCouchDB%2CPython%2Cgeo#comments</comments>
<pubDate>Sun, 20 Dec 2009 16:37:21 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>Python</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future%3A2009-12-20%3Aen%2CCouchDB%2CPython%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p><a href="http://gitorious.org/geocouch/">GeoCouch</a> started as a <a href="/cgi-bin/blog/index.cgi/geocouch-geospatial-queries-with-couchdb:2008-10-26:en,CouchDB,Python,geo">proof of concept</a> and was heavily rewritten for the <a href="/cgi-bin/blog/index.cgi/geocouch-new-release-0.10.0:2009-09-19:en,CouchDB,Python,geo">0.10 release</a>. As more and more people got interested, I got feedback that showed me what people really want/need. And now it's time to determine the future of GeoCouch. It's your chance to shape the future. In this blog entry I'll explain my ideas for the future, but I'm more than happy to get further ideas/complaints from you. So please check if my ideas match your use cases for GeoCouch.
</p>
<h3>Stripping it down</h3>
<p>GeoCouch needs an external spatial index. At the moment I use <a href="http://www.gaia-gis.it/spatialite/">SpatiaLite</a> for it, but a <a href="http://postgis.refractions.net/">PostGIS</a> backend would easily be possible. My initial idea was that it is better to use the existing power of spatial databases, rather than reinventing the wheel. I thought I could use all the power they have, that I could even use them for complex analytics, but I can't. As I only store the geometries, I need to “ask” CouchDB for the attributes (no, I don't want to store attributes in my spatial index).
<!--This would be possible, but I'll explain the “analytics use-case” later.-->
</p>
<p>If I don't use the full power of the spatial databases, but only a small fraction, there might be a better solution. Therefore I propose that GeoCouch will use a simple spatial index for storing the geometries, not a full-blown spatial database. I haven't decided yet which one it'll be, but I'm seriously thinking about moving this part to Erlang (I know that quite a few people would love that move).
</p>
<p>You will lose functionality like reprojection. The spatial index won't know anything about projections. So GeoCouch won't be projection-aware anymore, but your application still can be. For example, if you want to return your data in a different projection than it was stored in, you do the transformation after you've queried GeoCouch.
</p>
<p>You would also lose fancy things for geometries, like boolean operations on them. But this is something I'd call complex analytics, and not simple querying.
</p>
<p>GeoCouch would only support three simple queries: bounding box search, polygon search and radius/distance search. If you want to search within a union of polygons, let's say all countries of the European Union, you would simply perform the union operation before you query GeoCouch.
</p>

<h3>Complex analytics</h3>
<p>What I call “complex analytics” is things like: “return all apple trees that are located within a 10km range around buildings that are over 100m high, but only in countries with a population of over 50 million people”. This is not possible with GeoCouch alone, as you would need the attribute values as well. Those are stored in CouchDB, so you would need to request them. All GeoCouch supports is a simple: give me all IDs within a bounding box/polygon/radius.
</p>
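<p>A sketch of the two steps such a query needs, using plain Python
dictionaries as stand-ins for the spatial index and for CouchDB. All data and
names here are invented for illustration; nothing of this is GeoCouch code.
First the spatial index returns IDs only, then the attributes are looked up per
ID and filtered.</p>

```python
# Stand-in for the spatial index: ID -> (x, y). Geometries only, no attributes.
spatial_index = {
    "tree-1": (10.90, 48.37),
    "tree-2": (10.95, 48.40),
    "tree-3": (-122.27, 37.80),
}

# Stand-in for CouchDB: ID -> document with attributes.
documents = {
    "tree-1": {"species": "apple"},
    "tree-2": {"species": "pear"},
    "tree-3": {"species": "apple"},
}

def ids_in_bbox(index, bbox):
    """Step 1: the spatial query returns IDs only."""
    minx, miny, maxx, maxy = bbox
    return [doc_id for doc_id, (x, y) in index.items()
            if minx <= x <= maxx and miny <= y <= maxy]

def filter_by_attribute(ids, docs, key, value):
    """Step 2: fetch each document and filter on one of its attributes."""
    return [doc_id for doc_id in ids if docs[doc_id].get(key) == value]

candidates = ids_in_bbox(spatial_index, (0, 0, 180, 90))
apple_trees = filter_by_attribute(candidates, documents, "species", "apple")
print(apple_trees)
```

<p>The second step is the part that would live in a layer above the spatial
index.</p>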

<h3>Conclusion</h3>
<p>Simple requests are needed for everyday use, thus they should be incredibly fast. Complex analytics don't necessarily need to handle thousands of requests per second; in most cases they don't even need to be processed in real time. I'd like to see some layer built on top of GeoCouch, so CouchDB can even be used for analytics (which is a thing I wanted to have right from the start).
</p>
<p>This means that GeoCouch will mainly be for high-performance and massive projects that need some simple spatial bits, which is what I think the majority of users need.
</p>
<p>If you think you really need only those simple queries, but you want them to be fast, or if you think this is wrong and you need dynamic reprojection, I can only invite you to leave a comment below or drop a mail to <a href="mailto:volker.mische@gmail.com">volker.mische@gmail.com</a>. Thanks.
</p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-the-future%3A2009-12-20%3Aen%2CCouchDB%2CPython%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>FOSS4G 2009: “Geodata and CouchDB” presentation is online
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2009-presentation-is-online%3A2009-11-17%3Aen%2CCouchDB%2CPython%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2009-presentation-is-online%3A2009-11-17%3Aen%2CCouchDB%2CPython%2Cgeo#comments</comments>
<pubDate>Tue, 17 Nov 2009 11:48:43 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>Python</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2009-presentation-is-online%3A2009-11-17%3Aen%2CCouchDB%2CPython%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>As the final wrap-up of <a href="http://2009.foss4g.org/">FOSS4G 2009</a>,
<a href="http://2009.foss4g.org/presentations/#presentation_78">my presentation
on “Geodata and CouchDB”</a> is available online in several formats. It should
also be of interest to people who are new to CouchDB, as large parts of the
talk are an introduction to CouchDB.
</p>
<ul>
  <li>The raw slides
<a href="/blog/2009-11-17/geodata-and-couchdb.pdf">as PDF</a> (licensed under
<a href="http://creativecommons.org/licenses/by/3.0/de/">CC-BY-3.0-de</a>).</li>
  <li>The slides with comments
<a href="/blog/2009-11-17/geodata-and-couchdb.htm">as HTML</a> (licensed under
<a href="http://creativecommons.org/licenses/by/3.0/de/">CC-BY-3.0-de</a>).</li>
  <li>The <a href="http://www.fosslc.org/drupal/node/595">slides with audio</a>
(<a href="http://blip.tv/file/2795979">or at blip.tv</a>). It’s the
recording of the actual talk at the conference. Thanks
<a href="http://georaffe.org/">Alex</a> and
<a href="http://www.fosslc.org/">FOSSLC</a> for recording it (licensed under
<a href="http://creativecommons.org/licenses/by-sa/3.0/">CC-BY-SA-3.0</a>).</li>
</ul>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/foss4g-2009-presentation-is-online%3A2009-11-17%3Aen%2CCouchDB%2CPython%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>Benchmarking is not easy
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/benchmarking-is-not-easy%3A2009-09-23%3Aen%2CCouchDB%2CPython%2CTileCache%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/benchmarking-is-not-easy%3A2009-09-23%3Aen%2CCouchDB%2CPython%2CTileCache%2Cgeo#comments</comments>
<pubDate>Wed, 23 Sep 2009 17:39:06 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>Python</category>
<category>TileCache</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/benchmarking-is-not-easy%3A2009-09-23%3Aen%2CCouchDB%2CPython%2CTileCache%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>There are so many ways to have a play with
<a href="http://couchdb.apache.org">CouchDB</a>. This time I thought about using
CouchDB as a <a href="http://tilecache.org/">TileCache</a> storage. 
It sounded easy, and it was.
</p>

<h3>What is a tilecache</h3>
<p>Everyone knows <a href="http://maps.google.com/">Google Maps</a> and its
small images, called <em>tiles</em>. Rendering those tiles for the whole world
for every zoom level can be quite time consuming, therefore you can render
them on demand and cache them once they are rendered. This is the business of
a tilecache.
</p>
<p>You can use the tilecache as a proxy to a remote tile server as well, that's
what I did for this benchmark.</p>

<h3>Coding</h3>
<p><a href="/blog/2009-09-23/Couchdb.py">The implementation</a> looks quite
similar to the
<a href="http://svn.tilecache.org/trunk/tilecache/TileCache/Caches/Memcached.py">memcache
one</a>. I haven't implemented locking as I was just after something working,
not a full-fledged backend.
</p>
<p>When I finished coding, it was time to find out how it performs. That should
be easy, as there's a tilecache_seeding script bundled with TileCache to fill
the cache. So you fill the cache, then you switch the remote server off and
test how long it takes if all requests are hits without any misses (i.e. all
tiles are in your cache and don't need to be requested from a remote server).
</p>
<p>The two contestants for the benchmark are the CouchDB backend and the one
that stores the tiles directly on the filesystem.</p>

<h3>Everyone loves numbers</h3>
<p>We keep it simple and measure the time for seeding with
<a href="http://www.gnu.org/software/time/">time</a>. How long will it take to
request 780 tiles? The first number is the average (in seconds), the one in
brackets the standard deviation.
</p>
<ul>
  <li><p>Filesystem:</p>
<pre>
real 0.35 (0.04)
user 0.16 (0.02)
sys  0.05 (0.01)
</pre>
  </li>
  <li><p>CouchDB:</p>
<pre>
real 3.03 (0.18)
user 0.96 (0.05)
sys  0.21 (0.03)
</pre>
  </li>
</ul>
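<p>Figures in the “average (standard deviation)” form used above can be
produced with Python's <code>statistics</code> module. This is only a sketch:
the run times below are made-up sample values, not my measurements, and it
assumes the sample standard deviation.</p>

```python
import statistics

def summarize(samples):
    """Format samples as 'average (standard deviation)' with two decimals."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation
    return "%.2f (%.2f)" % (mean, stdev)

# Made-up wall-clock times in seconds, for illustration only.
real_times = [3.0, 3.2, 2.8, 3.1, 2.9]
summary = summarize(real_times)
print("real " + summary)
```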
<p>Let's say CouchDB is 10 times slower than the file system based cache. Wow,
CouchDB really sucks! Why would you use it as tile storage? Although you could:
</p>
<ul>
  <li>easily store metadata with every tile, like a date when it should
expire.</li>
  <li>keep a history of tiles and show them as "travel through time layers"
in your mapping application</li>
  <li>easily replicate the tiles to other servers</li>
</ul>
<p>You just don't want such a slow hog. And those
<a href="http://wiki.apache.org/couchdb/People_on_the_Couch">CouchDB
people</a> try to tell me that CouchDB would be fast. Pha!</p>

<h3>Really??</h3>
<p>You might already wonder where the details are: the software version
numbers, the specification of the system and all that stuff. These things are
missing for a good reason. This benchmark just isn't right, even if I
added these details. The problem lies some layers deeper.
</p>
<p>This benchmark is way too far away from real-life usage. You would request
many more tiles and not the same 780 ones with every run. When I was
benchmarking the filesystem cache, all tiles were already in the system's
cache, therefore it was <em>that</em> fast.
</p>
<p>Simple solution: clear the system cache and run the tests again. Here are
the results after an <code>echo 3 &gt; /proc/sys/vm/drop_caches</code>:</p>
<ul>
  <li><p>Filesystem:</p>
<pre>
real 8.36 (0.71)
user 0.29 (0.04)
sys  0.18 (0.03)
</pre>
  </li>
  <li><p>CouchDB:</p>
<pre>
real 6.64 (0.15)
user 1.13 (0.07)
sys  0.29 (0.06)
</pre>
  </li>
</ul>
<p>Wow, the CouchDB cache is faster than the filesystem cache. Too nice to be
true. The reason is simple: loading the CouchDB database file, thus one file
access on the disk, is way faster than 780 accesses.
</p>

<h3>Does it really matter?</h3>
<p>Let's take the first benchmark. Even if CouchDB were that much slower,
isn't it perhaps <em>fast enough</em>? Even with those numbers (ten times
slower than the filesystem cache) your cache could take 250
requests per second. If a user requests 9 tiles per second, that would be
about 25 users at the same time. With every user staying 2 minutes on the map,
it would mean 18&#160;000 users per day. Not bad.
</p>
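<p>The estimate can be retraced with a few lines of Python. This is a sketch:
that the 250 requests per second come from 780 tiles in 3.03 seconds is an
assumption, and the rounding follows the text.</p>

```python
# Numbers taken from the text above.
tiles = 780
couchdb_seconds = 3.03           # "real" time of the first CouchDB run
tiles_per_user_per_second = 9
visit_minutes = 2

# Raw throughput of the first CouchDB run.
requests_per_second = tiles / couchdb_seconds

# The text rounds this down to 250 requests per second ...
users_at_once = 250 / tiles_per_user_per_second

# ... and then to about 25 users at the same time.
concurrent_users = 25
visits_per_slot_per_day = 24 * 60 // visit_minutes
users_per_day = concurrent_users * visits_per_slot_per_day

print(round(requests_per_second))
print(round(users_at_once, 1))
print(users_per_day)
```

<p>Both rounding steps go downwards, so the 18&#160;000 users per day are a
cautious figure.</p>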
<p>Additionally you gain some nice things you won't have with other
caches (as outlined above). And if you really need more performance you could
always dump the tiles to the filesystem with a cron job.
</p>

<h3>Conclusion</h3>
<ol>
  <li>Benchmarking is not easy, but easy to get wrong.</li>
  <li>Slow might be fast enough.</li>
  <li>Read more about benchmarking on
<a href="http://jan.prima.de/plok/archives/176-Caveats-of-Evaluating-Databases.html">Jan's
blog</a>.</li>
</ol>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/benchmarking-is-not-easy%3A2009-09-23%3Aen%2CCouchDB%2CPython%2CTileCache%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>GeoCouch: New release (0.10.0)
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-new-release-0.10.0%3A2009-09-19%3Aen%2CCouchDB%2CPython%2Cgeo</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-new-release-0.10.0%3A2009-09-19%3Aen%2CCouchDB%2CPython%2Cgeo#comments</comments>
<pubDate>Sat, 19 Sep 2009 14:26:45 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>Python</category>
<category>geo</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-new-release-0.10.0%3A2009-09-19%3Aen%2CCouchDB%2CPython%2Cgeo/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>It has been way too long since the initial release, but it’s finally here:
a new release of GeoCouch. For all first time visitors, GeoCouch is an
extension for <a href="http://couchdb.apache.org/">CouchDB</a> to support
geo-spatial queries like bounding box or polygon searches.
</p>
<p>I'll keep this blog entry relatively short and only outline the highlights and
requirements for the new release as GeoCouch finally has a real home at
<a href="http://gitorious.org/geocouch/">http://gitorious.org/geocouch/</a>.
Feel free to contribute to the wiki or fork the source.
</p>

<h3>Highlights</h3>
<ul>
  <li>Many geometries
<a href="http://gitorious.org/geocouch/pages/GeometryDefinition">are
supported</a>: points, lines, polygons (using Shapely).</li>
  <li>Queries are largely along the lines of the
<a href="http://www.opensearch.org/Specifications/OpenSearch/Extensions/Geo/1.0/Draft_1">OpenSearch-Geo
extension draft</a>. Currently
<a href="http://gitorious.org/geocouch/pages/Queries">supported</a> are
bounding box and polygon searches.</li>
  <li>Adding new backends (in addition to SpatiaLite) is easily possible.</li>
</ul>

<h3>Requirements</h3>
<ul>
  <li><a href="http://www.kernel.org/">Linux 2.6.26</a></li>
  <li><a href="http://couchdb.apache.org/">CouchDB 0.10.0</a></li>
  <li><a href="http://www.python.org/">Python 2.6.0</a></li>
  <li><a href="http://code.google.com/p/couchdb-python/">couchdb-python 0.6.x (0.6.0 doesn't work)</a></li>
  <li><a href="http://trac.gispython.org/lab/wiki/Shapely">Shapely 1.0.12</a></li>
  <li><a href="http://code.google.com/p/apsw/">APSW - Another Python SQLite Wrapper 3.5.9-r2</a></li>
  <li><a href="http://www.gaia-gis.it/spatialite/">SpatiaLite 2.3.1</a></li>
</ul>
<p>Other versions might work.</p>

<h3>Download</h3>
<p>If you don’t like Git, you can
<a href="/geocouch/downloads/geocouch-0.10.0.tar.bz2">download GeoCouch 0.10.0
here</a>.
</p>
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-new-release-0.10.0%3A2009-09-19%3Aen%2CCouchDB%2CPython%2Cgeo/feed/</wfw:commentRss>
</item>
<item>
<title>CouchDB: Returning all design documents with Python
</title>
<link>http://vmx.cx/cgi-bin/blog/index.cgi/couchdb-all-design-docs%3A2009-08-21%3Aen%2CCouchDB%2CPython</link>
<comments>http://vmx.cx/cgi-bin/blog/index.cgi/couchdb-all-design-docs%3A2009-08-21%3Aen%2CCouchDB%2CPython#comments</comments>
<pubDate>Fri, 21 Aug 2009 20:57:16 +0200</pubDate>
<dc:creator>Volker Mische</dc:creator>
<category>en</category>
<category>CouchDB</category>
<category>Python</category>
<guid isPermaLink="false">http://vmx.cx/cgi-bin/blog/index.cgi/couchdb-all-design-docs%3A2009-08-21%3Aen%2CCouchDB%2CPython/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[

<p>I just wanted to get all design documents of a
<a href="http://couchdb.apache.org/">CouchDB</a> database with
<a href="http://code.google.com/p/couchdb-python/">couchdb-python</a>. I
couldn’t find any hints on how to do it, and it took longer to find out than expected.
Hence this blog entry; perhaps it saves someone a few minutes of research.
</p>
<p>
  <pre>
<code>from couchdb.client import Server
couch_server = Server('http://localhost:5984/')
for designdoc in couch_server['yourdatabase']\
        .view('_all_docs', startkey='_design', endkey='_design0'):
    print('designdoc: %s' % designdoc)
</code></pre>
</p>
<p><strong>Update:</strong> even simpler with slicing:</p>
<p>
  <pre>
<code>from couchdb.client import Server
couch_server = Server('http://localhost:5984/')
for designdoc in couch_server['yourdatabase']\
        .view('_all_docs')['_design':'_design0']:
    print('designdoc: %s' % designdoc)
</code></pre>
</p>
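<p>Why <code>_design0</code> as the end key? All design documents have IDs that
start with <code>_design/</code>, and <code>0</code> is the character right
after <code>/</code> in ASCII. The following sketch shows the effect with plain
Python string comparison on made-up document IDs. CouchDB uses its own
collation rules, so this is only an approximation of what the server does.</p>

```python
# Hypothetical document IDs; design documents start with "_design/".
doc_ids = ["Zebra", "_design/main", "_design/other", "augsburg", "oakland"]

startkey = "_design"
endkey = "_design0"  # "0" is the character right after "/" in ASCII

# Keep every ID that sorts between the two keys.
design_ids = sorted(doc_id for doc_id in doc_ids
                    if startkey <= doc_id <= endkey)
print(design_ids)
```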
]]></content:encoded>
<wfw:commentRss>http://vmx.cx/cgi-bin/blog/index.cgi/couchdb-all-design-docs%3A2009-08-21%3Aen%2CCouchDB%2CPython/feed/</wfw:commentRss>
</item>
</channel>
</rss>
