We just finished our first year over at Paid Mappers - a start-up dedicated to high-def mapping in OpenStreetMap. We absolutely <3 OSM, and we’ve been busy over the last year populating southeastern Virginia with all sorts of great content. And while none of our edits were “paid”, I wanted to generate stats and maps of our first year in business. The final product is the “annual report” section (pic below), which features a static image from the studio project with corresponding stats. I’ve also added an interactive map that lets you explore the region in more detail. Below are the steps I used to generate these products. Enjoy!
QA-Tiles are a great resource for current & historic edits in OSM. Grab a download for your area of interest - in this example, North America.
To reduce processing time, I used MBTiles-Extracts. This module clips the source area (North America) to a smaller extent (Southeastern Virginia), which speeds up every later processing step.
I drew a bounding box in geojson.io and gave the feature a property named “bbox”. Then, at the command line:
npm install mbtiles-extracts
mbtiles-extracts united_states_of_america.mbtiles bbox.geojson bbox
This step reduced the QA-Tile download from 4.25 GB to 445 MB, which is much easier to work with on a MacBook Pro.
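If you’d rather script the bounding box than draw it in geojson.io, here is a minimal sketch of how the bbox.geojson file could be built. The corner coordinates are hypothetical (a rough guess at southeastern Virginia, not the extent I actually used), and the property value is a placeholder. I’m assuming the third argument to mbtiles-extracts names the property key to read, so the key here is “bbox”.

```python
import json

# Hypothetical lon/lat corners roughly covering southeastern Virginia.
# Replace them with the extent you actually want to clip to.
WEST, SOUTH, EAST, NORTH = -77.5, 36.5, -75.8, 37.6


def bbox_feature_collection(west, south, east, north, key="bbox", value="bbox"):
    """Build a one-feature GeoJSON FeatureCollection for a rectangular extent."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # a GeoJSON ring must end where it starts
    ]
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                # Assumption: "bbox" is the property key passed on the command line;
                # the value is a placeholder.
                "properties": {key: value},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }


if __name__ == "__main__":
    collection = bbox_feature_collection(WEST, SOUTH, EAST, NORTH)
    with open("bbox.geojson", "w") as handle:
        json.dump(collection, handle, indent=2)
```

Running the script writes bbox.geojson to the current directory, ready to pass to the clip command.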
To start analyzing the data, I looked into running some tile-reduce processes, which provide super-fast analysis on mbtiles. I originally looked at working with a fork that had the bones of what I was trying to do, but I ran into issues streaming a large number of features to the output file. After that, I stumbled upon OSM-Tag-Stats, which did a decent amount of what I wanted. I try not to re-invent things that already work great, so I decided to go with it.
Clone or download the repo into your working directory and run npm install to get it running. Using the documentation, I created a command line filter to extract the edits from all my OSM accounts for a specific time frame (1 year or so):
osm-tag-stats --geojson=output.geojson --mbtiles='input.mbtiles' --users='lots, of, comma, separated, osm, usernames' --dates='07/01/2015, 08/01/2016'
This took about 4 minutes to process 6,222 tiles.
It’s very important to note that QA-Tiles only contain the latest and greatest of OSM, meaning the current version of each feature. So, for example, if another OSM user has since edited a feature you created or edited, that feature won’t be included in your results. Getting the entire edit history per feature is a whole other process separate from what I was doing.
The resulting geojson was 256mb, down from 445mb. OSM-Tag-Stats is a great resource that makes querying the OSM data really simple by using the Mapbox GL Filter Spec.
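To make the user and date filter a bit more concrete, here is a rough sketch of the same idea in plain Python. This is not how OSM-Tag-Stats is implemented, just an illustration. I’m assuming each feature carries an “@user” property and an “@timestamp” property in Unix epoch seconds, and the function names and usernames are made up for the example.

```python
from datetime import datetime, timezone


def in_window(feature, users, start, end):
    """Return True if the feature's last edit is by one of `users` within [start, end)."""
    props = feature.get("properties") or {}
    if props.get("@user") not in users:
        return False
    stamp = props.get("@timestamp")  # assumption: Unix epoch seconds
    if stamp is None:
        return False
    edited = datetime.fromtimestamp(stamp, tz=timezone.utc)
    return start <= edited < end


def filter_edits(features, users, start, end):
    """Keep only the features last edited by the given users in the date range."""
    user_set = set(users)
    return [f for f in features if in_window(f, user_set, start, end)]


if __name__ == "__main__":
    # Tiny made-up sample; real features come from the QA-Tiles extract.
    sample = [
        {"type": "Feature", "properties": {"@user": "mapper_a", "@timestamp": 1451606400}, "geometry": None},
        {"type": "Feature", "properties": {"@user": "mapper_a", "@timestamp": 1420070400}, "geometry": None},
        {"type": "Feature", "properties": {"@user": "someone_else", "@timestamp": 1451606400}, "geometry": None},
    ]
    start = datetime(2015, 7, 1, tzinfo=timezone.utc)
    end = datetime(2016, 8, 1, tzinfo=timezone.utc)
    print(len(filter_edits(sample, ["mapper_a"], start, end)))
```

Because only the latest version of each feature is present, the check looks at the last editor only, which is exactly why features later touched by someone else drop out.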
Another way to get some info out of the resulting geojson is to run a SQL query at the command line with ogrinfo.
To get a count of new edits vs existing edits, I ran a query against the @version property for each. A feature with a version of 1 is brand new to OSM, and anything above that is an existing feature that was edited.
ogrinfo alledits.geojson -sql " SELECT COUNT(*) FROM OGRGeoJSON WHERE '@version' = 1 "
Edits with a version of 1 = 345,450
ogrinfo alledits.geojson -sql " SELECT COUNT(*) FROM OGRGeoJSON WHERE '@version' != 1 "
Edits with a version >1 = 7,091
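As a cross-check that doesn’t need GDAL, the same new vs existing tally can be sketched in a few lines of Python. The function names are my own, and the sketch assumes the file is a GeoJSON FeatureCollection whose features carry an “@version” property. It also loads the whole file into memory, so expect it to be hungry on a 256 MB file.

```python
import json
from collections import Counter


def count_new_vs_existing(features):
    """Return (new, existing): features at version 1 vs. version greater than 1."""
    counts = Counter()
    for feature in features:
        props = feature.get("properties") or {}
        version = props.get("@version")
        if version is None:
            continue  # skip features without version metadata
        counts["new" if int(version) == 1 else "existing"] += 1
    return counts["new"], counts["existing"]


def count_file(path):
    """Load a GeoJSON FeatureCollection from disk and tally its versions."""
    with open(path) as handle:
        return count_new_vs_existing(json.load(handle)["features"])
```

Calling count_file("alledits.geojson") returns the two counts as a tuple, new first and existing second.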
After filtering & querying for various types of stats, I wanted to map the output file. I used tippecanoe on the Bridges project, and knew it would be my go-to here. With a fresh update (brew upgrade tippecanoe) and a brief look at the docs for command line options:
tippecanoe -y version -ag -pd -o output.mbtiles input.geojson -Z6
It runs like Katie Ledecky swims - fast. The output .mbtiles is about 25mb down from 256mb, down from 445mb, down from 4.2gb. Just crazy.
After loading the mbtiles into studio, I did some basic attribute styling on the ‘version’ field and set up scale-dependent rendering. I really just wanted to keep this simple and focus on the contrast of new vs existing edits, with basic OSM data in the background.
Lots of good stuff to learn in these tools, mostly thanks to all the people involved in creating and maintaining them.