vmx

the blllog.

FOSS4G 2023

2023-07-22 21:50

Finally, after missing one virtual and one in-person global FOSS4G, I had the chance to attend a global FOSS4G conference in person again. Thanks to Protocol Labs for sending me. This year it was in Prizren, Kosovo. I’m a bit late with this post, but that’s because I did some hiking in Albania right after the conference.

The organization and venue

Wow. It’s been my favourite venue of all the FOSS4Gs I’ve been to so far. The exhibition hall was a great place to hang out, combined with the excellent idea of a 24h bar. I’m not sure if it was used at all times, but definitely for more than 20h a day. Outside, there was plenty of space and tables to hang out, and very close by another set of tables that formed the “work area”, which was another great place to hang out, with enough power sockets and shade for the hot days.

The main stage was an open-air stage with enough seating for everyone. For the gala dinner it was converted into a stage with an excellent live band, plus the usual big round tables.

For me, the best part was that even the accommodation was on-site. The barracks of the former military base, which now serve as student dorms, were our home for a week. Pretty spartan, but at a conference I don’t really spend much time in my room; I mostly just need a place to sleep.

Having everything, the talks, exhibition, social events and accommodations on-site makes it easy to maximize the time for socializing, which for me is the number one reason to attend a conference.

Everything was well organized, and it was great to see so many volunteers around.

The talks

I didn’t really pre-select the talks I went to. I rather joined others wherever they were going, or listened to recommendations. Often I just stayed for the rest of the slot to see what else was there. My favourite talks were:

  • Smart Maps for the UN and All - keeping web maps open: For me, it was the first time I saw someone other than me speaking at a FOSS4G about using IPFS. It’s great to see that it is gaining traction for the offline use case, where it just makes a lot of sense. UN Smart Maps is part of the UN OpenGIS initiative and features a wide range of things, even an AI chatbot called TRIDENT that transforms free-text questions into Overpass API calls. Try TRIDENT out yourself; when you open the developer console, you can see the resulting Overpass API calls.
  • Offline web map server “UNVT Portable”: This talk went into more detail about using Raspberry Pis to have map data stored in IPFS for offline use. It’s very similar to what I envision; the only difference is that I’d also like to keep the storage in the browser. But I surely see a future where those efforts are combined: a small, easy-to-deploy server, with in-browser copies of subsets of the data, so that you can work completely offline in the field. The original UNVT Portable repository doesn’t use IPFS, but Smart Maps Bazaar, which seems to be its successor, does.
  • B6, Diagonal’s open source geospatial analysis engine: A presentation of the B6 tool for geospatial analysis for urban planning. It has a beautiful interface. I really like the idea of doing things directly on the map in a notebook-style way, where you perform steps one after another.
  • Elephant in the room: A talk about how many resources computations take, and whether we always need them. It’s very hard, often impossible, to find out how environmentally friendly some cloud services are. One of the conclusions was that cheaper providers likely use less power and hence harm the environment less. I would prefer better metrics (price misses things like the economies of scale of large providers), but I agree that this might be the best one we currently have. And I also hope there will be more economic pressure to save resources.
  • There was a closing keynote from Kyoung-Soo Eom, who talked about his long journey in open source GIS, but also about his history with Kosovo, where he was on a mission in 1999. Quite inspiring.

My talk

My talk about Collaborative mapping without internet connectivity was about a browser-based offline-first prototype that uses IPFS to enable replication to other peers. The project is called Colleemap and is dual-licensed under the MIT and Apache 2.0 licenses. Although I had tried the demo a bazillion times beforehand, it sadly didn’t work during my talk. Though, trying it later with various people, I was able to get 4 peers connected once. I even saw it working on a Windows machine, so it really works cross-platform.

For the future I hope to work more closely with the people from the UN OpenGIS initiative; it would be great to combine it with their Raspberry Pi based prototype.

Things I’ve learnt

The Sentinel-2 satellite imagery is available from multiple sources, directly from the Copernicus Open Access Hub or through cloud providers like AWS, Azure or Google Cloud. From the cloud providers you only get the level-2 data. They might use the original level-2 data, do their own atmospheric correction based on the level-1 data, or even re-encode the data. So it’s hard to tell which kind of data you actually get.

As far as I know (please let me know if I’m wrong), there isn’t any mirror of the full level-1c data. You can only get it through the Copernicus Open Access Hub, where the older images are stored in the long-term archive on tape and it can take up to 24h until the data is available for download (if it works at all).

Ideally, there would be a mirror of the full level-1c data (where ESA would provide checksums of their files) and a level-2 version where the exact process is openly published, so that you can verify how it was created. The problem is the storage cost. The current level-2 data is about 25 PiB, which leads to storage costs of over $500k USD a month if you stored it on AWS S3 Standard at current pricing (I used $0.021 per GB).
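
If you want to check the math yourself, here is a quick back-of-the-envelope calculation (assuming the pricing’s “GB” is billed as GiB per month; adjust if AWS meters it differently):

# 25 PiB = 25 * 1024 * 1024 GiB, at $0.021 per GiB-month
echo '25 * 1024 * 1024 * 0.021' | bc
# => 550502.400 USD per month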

Final thoughts

It was great to meet Gresa and Valmir from the local organizing committee before the FOSS4G, in March at the OSGeo German language chapter conference FOSSGIS in Berlin. That made it easy for me to connect to the event right from the start. If there’s one thing future FOSS4Gs should adopt, it’s the cheap on-site (or close-by) accommodation. Shared bathrooms are also much easier to accept if you know that everyone in the accommodation is from the conference. We had something similar with the BaseCamp in Bonn during the FOSS4G 2016 and the international code sprint in 2018 during the FOSSGIS conference, where the whole place was rented for the duration of the events.

Though, of course, I also missed some of my longtime FOSS4G friends I hadn’t seen in a long time. I hope you’re all doing well and will meet again soon.

Categories: en, IPFS, conference, geo

Video uploads for an online conference

2021-06-12 16:35

This blog post should give some insight into what happens behind the scenes in the preparation of an online conference, and I also hope that some of the scripts I created might be useful for others as well. We were using pretalx for the submissions and Seafile for the video uploads. Both systems are accessed through their HTTP APIs.

This year’s FOSSGIS 2021 conference was a pure online conference, though it had the same format as every year: three days of conference, with four tracks in parallel. That adds up to about 100 talks. I joined the organizing team about 10 weeks before the conference took place. The task sounded easy: the speakers should be able to upload their talks prior to the conference, so that less could go wrong during the conference itself.

All scripts are available at https://github.com/vmx/conference-tools licensed under the MIT License.

The software

The speakers submitted their talks through pretalx, a conference management system I highly recommend. It is open source and has an active community. I’ve worked on/with it over the past few years to make it suitable for OSGeo conferences. The latest addition is the public community voting plugin, which has been used for the FOSS4G 2021 as well as this conference. pretalx has a great HTTP API to get data out of the system. It doesn’t yet have much support for manipulating the data, but pull requests are welcome.

For storing the video files, Seafile was used. I hadn’t had any prior experience with it. It took me a bit to figure out that the Python API is for local access only and that the public API is a pure HTTP API. You can clearly see that the API is tailored to their own web interface and not really designed for third-party usage. Nonetheless, this guarantees that everything that can be done through the web UI can also be done via the HTTP API.

My scripts are heavily based on command line tools like b2sum, curl, cut, jq and jo, hence a lot of shell is used. For more complex data manipulation, like merging data, I use Python.

The task

The basic task is providing pre-recorded videos for a conference that were uploaded by the speakers themselves. The actual finer grained steps are:

  • Sending the speakers upload links
  • Looking through the videos to make sure they are good
  • Re-organizing the files so they can be played back according to the schedule
  • Making the final files easily downloadable
  • Creating a schedule which lists the live/pre-recorded talks

In Seafile you can create directories and make them publicly available so that people can upload files. Once uploaded, you won’t see what else is in that directory. In order to be able to easily reference the uploaded videos back to the corresponding talk, it was important to create one dedicated directory per talk, as you won’t know which filenames people will use for their videos.

The speakers will receive an email containing dedicated upload links for each of their talks. See the email_upload_links directory for all the scripts that are needed for this step.

pretalx

First you need to get all the talks. In pretalx that’s easy: go to your conference’s submissions endpoint, e.g. https://pretalx.com/api/events/democon/submissions/. We only care about the accepted talks, which can be done by selecting a filter. If you access it through curl, you’ll get a JSON response like this one: https://pretalx.com/api/events/democon/submissions/?format=json. pretalx returns 25 results per request, so I’ve created a script called pretalx-get-all.py that automatically pages through all the results and concatenates them.
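
The paging itself is simple: keep following the “next” links until there are none left. A minimal sketch of what pretalx-get-all.py does (error handling and the accepted-only filter omitted for brevity; the real script also concatenates the results into one document):

url='https://pretalx.com/api/events/democon/submissions/?format=json'
while [ "${url}" != "null" ]; do
    page=$(curl --silent "${url}")
    echo "${page}" | jq '.results[]'
    url=$(echo "${page}" | jq --raw-output '.next')
done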

A talk might be associated with multiple speakers, and each speaker should get an email with an upload link. There were also submissions that aren’t really talks in the traditional sense, for which no email should be sent. The jq query looks like this:

[.results[] | select((.submission_type[] | contains("Workshop")) or (.submission_type[] == "Anwendertreffen / BoF") | not) | { code: .code, speaker: .speakers[].code, title: .title, submission_type: .submission_type[]}]

The submissions contain only the speaker IDs and names, but no other details like their email addresses. So we query the speakers API (e.g. https://pretalx.com/api/events/democon/speakers/) and post-process the data again with jq, as we care about the email addresses.
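
Roughly, that second request looks like this (a sketch only; the field names are what I remember from the pretalx API, the email addresses need an API token, and the exact filter is in the script):

curl --silent --header 'Authorization: Token <pretalx-token>' 'https://pretalx.com/api/events/democon/speakers/?format=json' | jq '[.results[] | {code: .code, name: .name, email: .email}]'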

You can find all the requests and filter in the email_upload_links/upload_talks_to_seafile.sh script.

Seafile

Creating an upload link is a two-step process in Seafile: first create the directory, then create a publicly accessible upload link for it. The directories are named after the pretalx ID of the talk (Full script for creating directories).
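
A sketch of the two steps (the endpoint paths are from my memory of the Seafile web API, the linked script has the exact calls):

# 1. create the per-talk directory, named after the pretalx ID
curl --silent -X POST --header 'Authorization: Token <seafile-token>' --data 'operation=mkdir' 'https://seafile.example.org/api2/repos/<repo-id>/dir/?p=/<pretalx-id>'
# 2. create a publicly accessible upload link for that directory
curl --silent -X POST --header 'Authorization: Token <seafile-token>' --data 'repo_id=<repo-id>&path=/<pretalx-id>/' 'https://seafile.example.org/api/v2.1/upload-links/'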

Creating emails

After acquiring the data, the next step is to process it and create the individual emails. Combining the data is done with the combine_talks_speakers_upload_links.py script, where the output is again post-processed with jq. The data_to_email.py script takes that data output and a template file to create the actual emails as files. The template file is used as a Python format string, where the variables are filled with the data provided.

Those email files are then posted to pretalx, so that we can send them through its email system. That step is more complicated, as currently there is no API in pretalx to do that. I logged in through the web interface and manually added a new email while having the developer tools open. I then copied the POST request “as cURL” to have a look at the data it sent. From that I manually extracted the session and cookie information in order to be able to add emails from the command line. The script that takes the pre-generated emails and puts them into pretalx is called email_to_pretalx.sh.

Reviewing the uploaded videos

Once a video is uploaded, it gets reviewed. The idea was that the speakers don’t need to care too much about the start and the end of the video, e.g. when they start the recording and there are a few seconds of silence while switching to the presentation. The reviewer cuts the beginning and end of the video and also converts it to a common format.

We wanted to preserve the original video quality, hence we used LosslessCut and then converted the result to the Matroska format. The reviewers would also check that the video isn’t longer than the planned slot.
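
LosslessCut does the cutting and the container change through its UI; if you only need the lossless container change on the command line, the equivalent remux is a plain stream copy with ffmpeg (a sketch, file names made up):

ffmpeg -i talk_upload.mp4 -c copy talk_upload.mkv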

See the copy_uploads directory for all the scripts that are needed for this step.

pretalx

The reviewers get a file with things to check for each video file. We get the needed metadata again from pretalx and post-process it with jq. As above for the emails, there is again a template file which (this time) generates Markdown files with the information for the reviewers. The full script is called create_info_files.sh.

Seafile

Once videos are uploaded they should be available for the reviewers. The uploaded files are the primary source, hence it makes sense to always make copies of the talks, so that the original uploads are not lost. The sync_files_and_upload_info.sh script copies the talks into a new directory (together with the information files), which is then writeable for the reviewers. They will download the file, review it, cut it if needed, convert it to Matroska and upload it again. Once uploaded, they move the directory into one called fertig (“done” in German) as an indicator that no one else needs to review it.

I ran the script daily as a cron job; it only copies new uploads. Please note that it only checks for existence on a directory level. This means that if a talk was already reviewed and a speaker uploads a new version, it won’t be copied. That case didn’t happen often and speakers actually let us know about it, so it’s mostly a non-issue (also see the miscellaneous scripts section for more).
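
The cron entry itself is nothing special; something along these lines (time and paths made up):

# copy new uploads once a day at 03:00
0 3 * * * /path/to/sync_files_and_upload_info.sh >> /home/conference/sync_uploads.log 2>&1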

The last step is that someone looks through the filled-out Markdown files to check whether everything was alright, makes sure that e.g. the audio volume gets fixed, or asks the speaker for a new upload. The checked videos are then moved to yet another directory, which in the end contains all the talks that are ready to be streamed.

Re-org files for schedule

So far, the video files were organized in directories that are named after the pretalx ID of the talk. For running the conference we used OBS for streaming. The operator would need to play the right video at the right time. Therefore, it makes sense to sort them by the schedule. The cut_to_schedule.sh script does that re-organization, which can be found in the cut_to_schedule directory.

pretalx

To prevent accidental inconsistencies, the root directory is named after the current version of the pretalx schedule. So if you publish a new version of the schedule and run the script again, you’ll get a new directory structure. The video files still have an arbitrary name chosen by the uploader/reviewer; we want a common naming scheme instead. The get_filepath.py script creates such a name that also sorts chronologically and contains all the information the OBS operators need. The current scheme is <room>/day<day-of-the-conference>/day<day-of-the-conference>_<day-of-the-week>_<date>_<time>_<pretalx-id>_<title>.mkv.
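
To give a made-up example of what an OBS operator would see (all values invented):

saal1/day2/day2_mittwoch_2021-06-09_1430_A3KQX9_super_cool_talk.mkv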

Seafile

The directories don’t only contain the single final video, but also the metadata and perhaps the original video or a presentation. The file we actually copy is the *.mkv file which was modified last, which will be the cut video. The get_files_to_copy.sh script creates a list of the files that should be copied; it will only list the files that weren’t copied yet (based on the filename). The copy_files.sh script does the actual copying and is rather generic; it only depends on a file list and Seafile.
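
The “modified last” selection can be expressed with the same Seafile directory listing that is used elsewhere in this post. A sketch (the mtime field is what I recall the listing returning; the real logic lives in get_files_to_copy.sh):

# pick the most recently modified .mkv file within one talk directory
curl --silent -X GET --header 'Authorization: Token <seafile-token>' 'https://seafile.example.org/api2/repos/<repo-id>/dir/?p=/<talk-dir>&t=f' | jq --raw-output '[.[] | select(.name | endswith(".mkv"))] | max_by(.mtime) | .name'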

Easily downloadable files

Seafile has a feature to download a full directory as a zip file, which I originally planned to use. It turns out that the total size can be too large; I got the error message Unable to download directory "day1": size is too large. So I needed to provide another tool, as I didn’t want people to have to click through and download all the individual talks.

Access to the files should be as easy as possible, i.e. the operators that need the files shouldn’t need a Seafile account. As the videos also shouldn’t be public, the compromise was a download link secured with a password. This means that an authentication step is needed, which isn’t trivial. The download_files.sh script does the login and then downloads all the files in that directory. For simplicity it doesn’t work recursively, which means the script needs to be run once for each day.

I also added a checksum check for more robustness. I created those checksums manually by running b2sum * > B2SUMS in each of the directories and then uploaded them to Seafile.
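
Verifying the downloads is then a single command in each directory:

b2sum --check B2SUMS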

List of live/pre-recorded talks

Some talks are pre-recorded and some are live. The list_recorded_talks.py script creates a Markdown file that contains a schedule with that information, including the lengths of the talks if they are pre-recorded. This is useful for the moderators to know how much time there will be for questions. At the FOSSGIS we have 5 minutes for questions, but if the talk runs longer, there will be less time.

You need the schedule and the lengths of the recorded talks. This time I haven’t fully automated the process; it’s a bit more manual than the other steps. All scripts can be found in the list_recorded_talks directory.

Get the schedule:

curl https://pretalx.com/<your-conference>/schedule.json > schedule.json

For getting the lengths of the videos, download them all with the download script from the Easily downloadable files section above. Then run the get_length.sh script in each of the directories and redirect the output into a file. For example:

cd your-talks-day1
/path/to/get_lengths.sh > ../lengths/day1.txt

Then combine the lengths of all days into a single file:

cat ../lengths/*.txt > ../talk_lengths.txt

Now you can create the final schedule:

cd ..
python3 /path/to/list_recorded_talks.py schedule.json talk_lengths.txt

Here’s a sample schedule from the FOSSGIS 2021.

Miscellaneous Scripts

Speaker notification

The speakers didn’t get feedback on whether their video was correctly uploaded/processed (other than seeing a successful upload in Seafile). A short time before the conference, we were sending out the latest information that speakers need to know. We decided to take the chance to also add information on whether their video upload was successful, so that they could contact us in case something with the upload didn’t go as they expected (there weren’t any issues :).

It is very similar to sending out the email with the upload links. You get the information about the speakers and talks in the same way. The only difference is that we now also need the information on whether the talk was pre-recorded or not. We get that from Seafile:

curl --silent -X GET --header 'Authorization: Token <seafile-token>' 'https://seafile.example.org/api2/repos/<repo-id>/?p=/<dir-with-talks>&t=d'|jq --raw-output '.[].name' > prerecorded_talks.txt

The full script to create the emails can be found at email_speaker_final.sh. In order to post them to pretalx, you can use the email_to_pretalx.sh script and follow the description in the creating emails section.

Number of uploads

It could happen that people upload a new version of the talk. The current scripts won’t recognize that if a previous version was already reviewed. Hence, I manually checked the directories for the ones with more than one file in it. This can easily be done with a single curl command to the Seafile HTTP API:

curl --silent -X GET --header 'Authorization: Token <seafile-token>' 'https://seafile.example.org/api2/repos/<repo-id>/dir/?p=/<dir-with-talks>&t=f&recursive=1'|jq --raw-output '.[].parent_dir'|sort|uniq -c|sort

The output is sorted by the number of files in that directory:

  1 /talks_conference/ZVAZQQ
  1 /talks_conference/DXCNKG
  2 /talks_conference/H7TWNG
  2 /talks_conference/M1PR79
  2 /talks_conference/QW9KTH
  3 /talks_conference/VMM8MX

Normalize volume level

If the volume of the talk was too low, it was normalized. I used ffmpeg-normalize for it:

ffmpeg_normalize --audio-codec aac --progress talk.mkv

Conclusion

Doing all this with scripts was a good idea. The less manual work the better. It also enabled me to process talks even during the conference in a semi-automated way. I created lots of small scripts and sometimes used just a subset of them, e.g. the copy_files.sh script, or quickly modified them to deal with a special case. For example, all lightning talks of a single slot (2-4) were merged together into one video file. That file of course then isn’t associated with a single pretalx ID any more.

During the conference, the volume levels of the pre-recorded talks were really different. I think next time I’d like to do some automated audio level normalization after the people have uploaded the file. It should be done before reviewers have a look, so that they can report in case the normalization broke the audio.

Some speakers were confused about whether the upload really worked. Seafile doesn’t have an “upload now” button or the like; it does its JavaScript magic once you’ve selected a file. That’s convenient, but it also confused me when I used it for the first time. And if you reload the page, you won’t see that something was already uploaded. So perhaps sending speakers an automated “we received your upload” email would help.

Overall I’m really happy with how the whole process went; there weren’t any major failures like lost videos. I also haven’t heard any complaints from the people that needed to use the videos at any stage of the pipeline. I’d also like to thank all the speakers that uploaded a pre-recorded video; it really helped a lot in running the FOSSGIS conference as smoothly as it ran.

Categories: en, conference, geo

Joining Protocol Labs

2018-01-24 22:35

I’m pumped to announce that I’m joining Protocol Labs as a software engineer. Those following me on Twitter or looking at my GitHub activity might have already gotten some hints.

Short term

My main focus is currently on IPLD (InterPlanetary Linked Data). I’ll smooth things out and also work on the IPLD specs, mostly on IPLD Selectors. Those IPLD Selectors will be used to make the underlying graph more efficient to traverse (especially for IPFS). That’s a lot of buzzwords; I hope it will get clearer the more I blog about this.

To get started I’ve done the JavaScript IPLD implementations for Bitcoin and Zcash. Those are the basis for making easy traversal through the Bitcoin and Zcash blockchains possible.

Longer term

In the longer term I’ll be responsible for bringing IPLD to Rust. That’s especially exciting with Rust’s new WebAssembly backend: you’ll get a high-performance Rust implementation, but also one that works in browsers.

What about Noise?

Many of you probably know that I’ve been working full-time on Noise for the past 1.5 years. It is shaping up nicely and is already quite usable. Of course I don’t want to see this project vanish, and it won’t. At the moment I only work part-time at Protocol Labs, to also have some time for Noise. In addition to that, there’s also interest within Protocol Labs to use Noise (or parts of it) for better query capabilities. So far these are only rough ideas, which I mentioned briefly at the end of my talk about Noise at the Lisbon IPFS Meetup two weeks ago. But what’s the distributed web without search?

What about geo?

I’m also part of the OSGeo community and FOSS4G movement. So what’s the future there? I see a lot of potential in the Sneakernet. If geo-processing workflows are based around IPFS, you could use the same tools/scripts whether the data is stored somewhere in the cloud or in your local mirror/dump, for when your Internet connection isn’t that fast/reliable.

I expect unreliable connectivity to be a hot topic at the FOSS4G 2018 conference in Dar es Salaam, Tanzania.

Conclusion

I’m super excited. It’s a great team and I’m looking forward to pushing the distributed web a bit forward.

Categories: en, ProtocolLabs, IPLD, IPFS, JavaScript, Rust, geo

Introducing Noise

2017-09-19 22:35

I meant to write this blog post for quite some time. It's my view on the new project I'm working on called Noise. I've been working on it full-time together with Damien Katz for about a year now. Damien has already blogged a bit about the incarnation of Noise.

I can't recall when Damien first told me about the idea, but I surely remember one meeting we had at Couchbase, where plenty of developers were packed into a small room in the Couchbase Mountain View office. Damien was presenting his idea on how flexible JSON indexing should work. It was based on an idea that came up a long time ago at IBM (see Damien's blog post for more information).

Then the years passed without this project actually happening. I heard about it again when I was visiting Damien while in the Bay Area. He told me about his plan to actually do this for real; if I joined early, I would become a founder of the project. It wasn't a light-hearted decision, but I eventually decided to leave Couchbase to work full-time on Noise.

Originally Damien created a prototype in C++. But as I was really convinced that Rust is the future for systems programming and databases, I started to port it to Rust before I visited him in the US. Although Damien was skeptical at first, he at least wanted to give it a try and during my stay I convinced him that Rust is the way to go.

Damien did the hard parts on the core of Noise and the Node.js bindings. I mostly spent my time getting an R-tree working on top of RocksDB. It took several attempts, but I think I finally found a good solution. Currently it's a special-purpose implementation for Noise, but it could easily be made more generic, or adapted to other specific use cases. If you have such needs, please let me know. At this year's global FOSS4G conference I presented Noise and its spatial capabilities to a wider audience. I'm happy with the feedback I got. People especially seem to enjoy the query language we came up with.

So now we have a working version which does indexing and has many query features. You can try out Noise online. There's also basic geospatial bounding box query support, which I'll blog more about once I've cleaned up the coded-in-a-rush-for-a-conference mess and merged it into the master branch.

There are exciting times ahead, as now it's time to get some funding for the project. Damien and I don't want to do the venture capital based startup kind of thing, but rather try to find funding through other channels. This will also define the next steps. Noise is a library, so it can be the basis for a scaled-up distributed system, and/or scale down into a nice small analytics system that you can run on your local hardware when you don't have access to the cloud.

So in case you read this, tried it out and think that this is exactly what you've been looking for, please tell me about your use case, and perhaps you even want to help fund this project.

Categories: en, Noise, RocksDB, Rust, geo

FOSS4G 2017

2017-09-01 22:35

The Global FOSS4G 2017 conference was a great experience as every year. Meeting all those people you’ve known for years, but also those you’ve so far met only virtually.

The talks

The program committee did a great job with the selection, especially since there were so many talks to select from. Here are the most memorable talks:

  • “Optimizing Spatiotemporal Analysis Using Multidimensional Indexing with GeoWave” by Richard Fecher: The talk also touched on the technical details of how they solved building a multidimensional index on top of distributed key-value stores. Currently they support Apache Accumulo, Apache HBase and Google’s Bigtable, but in theory they could also support any distributed key-value store, hence also Apache CouchDB (http://couchdb.apache.org/) or Couchbase. I really enjoyed the technical depth and that it is based on solid research and evaluations.

  • “DIY mapping with drones and open source in a humanitarian context” by Dan Joseph: It was really nice to see that not everyone is using quadcopters for drone mapping, but that there are also fixed-wing drones (they look like planes). It gave good details about failures and successes. I wish them good luck with future models and the mapping itself.

  • “GPUs & Their Role in Geovisualization” by Todd Mostak: GPUs are now so powerful that you can do your multidimensional queries on points by simply doing table scans. That’s quite impressive. It’s also good to see that the core of MapD got open sourced under the Apache License 2.0.

Sadly I missed two talks I wanted to see. One was Steven Ottens (https://twitter.com/stvno_) speaking about “D3.js in postgres with plv8 V8/JavaScript”. It sounds daunting at first, but if you think about the numerous JavaScript libraries for geo processing that are out there, it makes sense for rapid prototyping.

The other one was Steven Feldman’s talk on “Fake Maps”. I always enjoy Steven’s talks, as he digs into maps as much as I’d love to, but sadly don’t take the time for. He said that once the recording of the talk is out, I should grab a beer and enjoy watching it. I’m looking forward to doing so.

My own talk went really well. Originally I thought that being in the last slot on Friday, the last conference day, was bad, as people don’t have much time to approach you after the talk. But in the end it was actually good, as I had several days beforehand to promote it to people who were interested. I loved that I was in the smallest room of the venue, hence it was packed. I’ll write more about the talk once I’ve cleaned up the code and pushed it to the master branch, so that you can all play with the spatial features yourself.

The keynotes

This year there were 5 keynotes, which I think is a good number. You always need to keep in mind that depending on the length, you might kick out 10–20 normal speaker slots. I enjoyed all of them, although in my opinion 30 minutes (instead of 45) would’ve been sufficient for most of them. But I have to admit that I could probably watch Paul Ramsey talk for hours and it would still be great.

Of course one keynote, the one from Richard Stallman, stood out. It surely led to lively discussions within the community, which is really a great thing. I share the opinion of Jeff McKenna: I really respect what Stallman did and is doing and how much he is into it. Though it became clear to me that I am an open source developer who cares about openness and transparency.

The venue

The venue was a typical conference center, which had the benefit that the rooms were close together. This made switching rooms even within slots easily possible.

One thing I didn’t like was the air conditioning. Some rooms were cooled down way too much. Did anyone measure? I know, it’s a cultural thing and not the fault of the organizers. Though I wonder how much energy and money could’ve been saved if the temperature had only been lowered to a reasonable level.

Sometimes there are discussions about the location of the OSGeo booth within the exhibition area. I think this year it was in a good spot. It wasn’t at the most prestigious place (that’s for Diamond sponsors), but at a spot where people actually gather and hang out, which is a way better fit in my opinion.

The social events

The social events were nice and I was happy that I was able to bring a well-known and well-liked former community member to the icebreaker event. The icebreaker reminded me a bit of last year’s one, where it was possible to bring along anyone who wanted to go. I think the attendees had some vouchers, but I can’t really recall the details. Anyway, I think it’s a good idea to have one social event where you can bring in people that are in the area but don’t attend the conference.

The code sprint

The code sprint was hosted at the District Hall, which is an innovation/startup/co-working place. We had the whole space, which was really nice. The different tribes (Java, MapServer, Postgres and Fancy Shit) assembled at different spots and put up signs, so it was easy to find your way to the right group.

JS.geo

I also need to mention that the day before the FOSS4G there was the JS.Geo at the same place as the code sprint. It was a really nice event, and if I ever organize an English single-track geo conference, I’ll get Brian Timoney as a moderator. He was so entertaining and really contributed to the great vibe this conference had.

Miscellaneous

This year there wasn’t a printed program brochure. It was all just available online at the certainly cool https://foss4g.guide/ or as an app. On my FirefoxOS phone I was using the website. I think it could’ve been easier to navigate, but it was OK and I didn’t really miss the brochure. The web-based guide was OK when you were on your phone and on-site, to see which talks were up next. I don’t think it worked well if you tried to do some ahead-of-time planning.

The FOSS4G t-shirts look great, but I’m a bit sad that they were grey (a nice grey though) while the Local Team had t-shirts in my favourite orange colour.

Notes for future years

It might really make sense to not produce a printed program brochure anymore, as probably all attendees have a smartphone anyway (though this needs to be checked by the Local Team depending on the area). If you decide to go web-only, you should make sure it works offline, and perhaps spend the time you would’ve spent on the printed one on the usability of the web one instead.

Categories: en, conference, geo

An R-tree implementation for RocksDB

2017-02-14 22:35

It's long been my plan to implement an R-tree on top of RocksDB. Now there is a first version of it.

Getting started

Check out the source code from my RocksDB rtree-table fork on GitHub, then build RocksDB and the R-tree example.

git clone https://github.com/vmx/rocksdb.git
cd rocksdb
make static_lib
cd examples
make rtree_example

If you run the example it should output augsburg:

$ ./rtree_example
augsburg

For more information about how to use the R-tree, see the Readme file of the project.

Implementation

The nice thing about LSM-trees is that the index data structures can be bulk loaded. For now my R-tree just does a simple bottom-up build with a fixed node size (the default is 4 KiB). The data is pre-sorted by the low value of the first dimension. This means that the data has a total order, hence results are also sorted by the first dimension. The idea is based on the paper On Support of Ordering in Multidimensional Data Structures by Filip Křižka, Michal Krátký and Radim Bača.

The tree is far from optimal, but it is a good starting point. Currently only doubles are supported. In the future I'd like to support integers, fixed size decimals and also strings.

If you have a look at the source code and cringe because of the coding style, feel free to submit pull requests (my current C++ skills are surely limited).

Next steps

Currently it's a fork of RocksDB, which surely isn't ideal. I already mentioned in last year's FOSS4G talk about the R-tree in RocksDB (warning: autoplay) that there are several possibilities:

  • Best (unlikely): Upstream merge
  • Good: Add-on without additional patches
  • Still OK: Be an easy to maintain fork
  • Worst case: Stay a fork

I hope to work together with the RocksDB folks to find a way to make such extensions easily possible with no (or minimal) code changes. Perhaps having stable interfaces or classes that can easily be overloaded.

Categories: en, RocksDB, geo

FOSS4G 2014

2014-09-16 22:35

The FOSS4G 2014 conference was a blast as every year. I really enjoyed meeting old friends as well as people that I’ve known through the web only.

The talks

As I was on the program committee myself, I won’t say much about the selection of the talks (please see the “Things to improve” section, though), but I’ve heard only a few complaints so far. This might be due to us publishing the review process that we used. But if you have any complaints or ideas to improve it in coming years, please get in touch with me.

I haven’t spent all my time in talks, but saw quite a few. As always, you might end up in some decent talk where you expect it the least. Notable ones that I attended:

  • “Gimme some YeSQL! – and a GIS” by Vincent Picavet: It was a good overview of what is hot and new in PostgreSQL. It’s good to see that Josh Berkus is getting closer to his envisioned CouchgreSQL.
  • “Spatial in Lucene and Solr” by David Smiley: For me it’s always interesting to hear from other spatial indexing solutions.
  • “Accurate polygon search in Lucene Spatial (with performance benefits to boot!)” by Jeffrey Gerard: That one was of interest for me as the problems that need to be solved are similar to the ones I have with GeoCouch.
  • “An Open Source Approach to Communicating Weather Risks” by Jonathan Wolfe: A talk about the NWS Enhanced Data Display, which is a huge web portal. A lot is possible through that web interface, which contains a lot of information. Although they use a lot of open source, I’d really love to see the portal itself be open sourced.
  • “OnEarth: NASA’s Boundless Solution to Rapidly Serving Geographic Imagery” by Joe Roberts: They showed WorldView which is another example of a huge web portal, but this time the source code is available as open source on Github: https://github.com/nasa-gibs/worldview
  • “Introduction to the geospatial goodies in Elasticsearch” by Simon Hope and Jerome Anthony: It was a good introductory talk with a great live example at the end.

I certainly had fun with my own talk “GeoCouch: A distributed multidimensional index” and had a good feeling afterwards. I hope the people in the audience enjoyed it as well. I’m still working on getting the terminal output into a PDF.

During my talk I also announced that MapQuery is officially retired. Steven Ottens and I don’t really use it anymore, and there weren’t many users. The JavaScript world has moved on, with OpenLayers 3 as well as new kinds of web frameworks.

The venue

The catering was great. I heard from a few people that they weren’t happy about last year’s catering in Nottingham; I have to say that I was.

What I really enjoyed this year was that after the first day there was even food after the last session. On the second day there was the gala event (with food) and on the last day everyone was heading off anyway.

This year the venue was great, as all sessions were close to each other. It was easily possible to switch rooms (unlike at some of the previous FOSS4Gs).

Everything was well organized, there were plenty of volunteers, and you saw them at every corner. Also, the guidance to the gala event was great. I think everyone who wanted to make it was easily able to get onto the right light rail.

The social events

There was a JS.Geo after-party where I liked meeting some people that I hadn’t seen in a while and that weren’t even attending the FOSS4G. We then moved on to the FOSS4G Welcome Reception hosted by Ecotrust and Point 97.

On the first day there was the LocationTech Meetup which had plenty of free drinks and a lot of the people from the conference that just went over to that bar.

Second day was the gala event at the World Forestry Center. I think it was the best one from any of the FOSS4Gs that I’ve been to (since 2009). What I really enjoyed was that it wasn’t the normal “Gala Dinner Setup” with huge round tables you kind of feel locked onto. Instead there was a wide open space and you grabbed the food at some counter (kind like a food cart). You were able to walk around and chat with people, but if you’d like to be seated you could also sit down (at one of those round tables).

The last event was at the Refuge (hosted by MapBox). It was exciting to have trains running by so closely. After the event some of us headed over to Boke Bowl, which served really great food.

The field trip

I booked a field trip on Saturday to Mount St. Helens. It was really great. Our guide Jill was just as enthusiastic as Darrell Fuhriman described her. It was a fun group, a lot to learn, beautiful views and certainly worth a visit. I was impressed by the scale of the 1980 Mount St. Helens eruption; it looks way smaller in the pictures that I’ve seen in the past.

Miscellaneous

I really liked that you didn’t get huge conference bags with all sorts of things you never need and throw away anyway. Instead you were just pointed to a table with those things, where you could take what you wanted. You would then proceed to pick a t-shirt in your size (if you wanted to).

I also really liked the program brochure; I don’t think I’ve seen one done that well before. It’s small and handy and was insanely well designed. Having the talks split into tracks which fit on one page, and having the time axis horizontally, is a great idea. Also, having the abstracts right behind every day, rather than having the full schedule first and all the abstracts afterwards, keeps things easy to browse. You don’t really care about yesterday’s abstracts, do you? But even if you do, you can easily find them, as the individual days had coloured markers on the side of the page, very much like telephone books have. So it was again easy to browse. Perhaps the local team could upload it as a reference for future conferences.

Things to improve

There wasn’t much that could be done better. Though there’s one thing that I discussed with another member of the program committee (which I was also part of). The conference is about Free and Open Source Software. For me this means that you are not only using it, but also contributing something back, and that the conference talks should create value for the community.

Of course there should also be talks about “How we migrated our department from proprietary software to open source”; I don’t have a problem with that. Though the criteria should be clearer. What I generally don’t want to see is talks about how people use open source software, build upon it, even improve it, but then don’t contribute it back. Such a talk has no real value for the attendees; it’s too much “Look what we’ve done but you can’t use it”. I’m well aware that there are cases where open sourcing things is not possible due to contracts. It bothers me that we might have rejected a talk that would have been in the open source spirit in favour of one like that.

One solution I came up with together with Jacob Greer is that for future FOSS4Gs you would need to include a link to the source code in your abstract submission. This could either be a link to the project itself, or to upstream projects that you’ve contributed to (and not only used).

Conclusion

It was an awesome, well organized event. I’d like to thank the local organizing committee very much for the huge amount of work they’ve put into this. You’ve set the bar really high.

Categories: en, Couchbase, GeoCouch, MapQuery, conference, geo

OSGeo Code Sprint Bolsena 2012

2012-06-16 22:35

The OSGeo Code Sprint in Bolsena/Italy was great: many interesting people sitting in front of their laptops the whole day, surrounded by beautiful scenery. This year I spent most of my time on GeoCouch, CouchDB, Couchbase, deegree, Imposm and GeoNetwork.

Already on the first hacking day we had a result: a Couchbase backend for the deegree blob storage. This means that you can now store your rich features (e.g. from INSPIRE) in Couchbase and serve them up as a WFS with deegree. In case you wonder what rich features are, it's the new, shiny and more descriptive name for complex features.

In the following days I worked together with Oliver Tonnhofer on a CouchDB/GeoCouch backend for Imposm. You are now able to store your OpenStreetMap data in CouchDB and make queries on it through GeoCouch. I've created a small demo that displays some data imported from Andorra directly with MapQuery, without the need for any additional server/service. The CouchDB backend should be easily adaptable to Couchbase; if you want to help, let me know.

I then spent some time on the GeoNetwork project and helped translate the language file to German. I cleaned it up a bit and fixed the worst mistranslations. It's not perfect yet, as I've only spent a little time on it, but at least it should be way more understandable (and sadly less funny) than the original version, which was generated by Google Translate.

When it was time for presentations, I gave a quick overview of the Couch* ecosystem: from CouchDB to GeoCouch, BigCouch, PouchDB, TouchDB (TouchDB-iOS, TouchDB-Android), Couchbase Syncpoint and Couchbase. You can find the slides as a PDF here.

On the last day I spent my time polishing GeoCouch a bit and getting it ready for the Couchhack in Vienna. I backported all changes from Couchbase to the CouchDB 1.2.x branch and also ported the geometry search into an experimental branch. You can now search your GeoCouch with any geometry GEOS supports.

The event was fun as always and I also got to know some new people (hello B3Partners guys). Thanks to Jeroen from GeoCat for organizing it, and thanks to all the other hackers that made it such an awesome event. Hope to see you all next year!

Categories: en, GeoCouch, CouchDB, Couchbase, conference, geo

WhereCamp EU 2012 Amsterdam Part 2

2012-05-06 22:35

I surely enjoyed the WhereCamp EU in Amsterdam, but I didn't realise how much I gained from it until I told friends about it. Hence it's time for another blog post, about one-dimensional mapping, psychogeography and geo yoga.

The sessions

The topics of the sessions at the WhereCamp EU were widespread. I normally enjoy technical, developer-focused talks the most, but this time it was different. It was such a great mixture, from developers to mapping people, which led to a broad variety of talks. Here are my favourite ones.

One-dimensional maps

It started with a historical overview of one-dimensional maps, which was already interesting by itself. I really got the point of why such maps make sense. Sorry for the lack of more information about it; I should probably ask Laurence Penney for a blog post on this topic.

The final goal of his endeavours is a nice app for mobile devices that shows the way to a certain location as a simple list you can scroll through. No panning or zooming would be needed; it's just a simple list that includes everything important you might see, together with simple explanations of where to go. It's not about being super precise, but about being simple: an explanation like "cross the park" is easier than a detailed explanation of all the crossings you might hit while walking through the park.

Psychogeography

The talk about psychogeography from Tim Waters was an eye-opener for me. I'd never really thought about the impact of geography on the psyche. You should really talk with Tim about it, or visit one of his talks if you get the chance. His slides are available on Slideshare.

I've recently read a blog post from Chris McDowall about An exercise in getting lost, which fits nicely into the topic of psychogeography.

Canvas for map visualisations/analysis

I already knew the nice demo created by Steven Ottens with Leaflet and the Canvas element. His talk gave lots of background information on how he did it and what can be done with the Canvas element, for example displaying a heightmap from a line you draw on the map, all client-side.

Earthwatchers

Another nice presentation came from Geodan about saving the orangutan by satellite. The project is called Earthwatchers. There you can take responsibility for a part of the rainforest on Borneo and monitor it for deforestation.

There are plans to have an HTML5-based interface (instead of the current Silverlight one). Given that it is a Geodan project, I hope they'll use MapQuery for it.

Geo yoga

At the end of the WhereCamp there were some lightning talks; one of the most fun ones was by Tim Waters, called geo yoga. You can find pictures at the official geo yoga website. It is all about pantomiming places (e.g. countries).

My session

My session was about MapQuery. I already blogged about it last week, so here's the link.

I planned for another one for Sunday, which was a Q&A about all sorts of Couch things. It would have taken place on the couch in front of one of the rooms. I'm not sure if people didn't get where it was supposed to take place, or were just not interested in the (Geo)Couch topic.

Conclusion

The whole WhereCamp EU was well organized and the crowd was very diverse, all you need for a great unconference. Hope to see you all next year wherever the camp might be.

Categories: en, geoyoga, psychogeography, conference, geo

WhereCampEU 2012 Amsterdam

2012-04-28 22:35

It's still early on the first day of the WhereCampEU 2012, but as my first session (MapQuery and other web mapping clients) already took place, it's time to put up the slides.

It was interesting to see that most people in the audience had already used OpenLayers, but very few of them Leaflet or other mapping libraries. What made me especially happy was that after my session many people wanted to have a closer look at MapQuery.

So here they are: the slides from my quick introduction to MapQuery.

Categories: en, OpenLayers, MapQuery, conference, geo

By Volker Mische
