Geocoding is the process of assigning geographical coordinates (normally X and Y) to a record. In most cases, this means transforming a dataset so that it can be visualized on a map or analysed spatially. There are many possible approaches to geocoding a dataset, but they fall into two broad groups:
- manual geocoding - we check one record after another and assign coordinates by hand
- automatic geocoding - a script takes the dataset and, with the help of an external service or gazetteer, finds the most suitable coordinates for each record
There are also tools that combine both approaches. This "semiautomatic" geocoding tries to assist the user: it provides shortcuts to search engines, displays the map and the table next to each other, and so on. Examples of such tools are plugins for QGIS/ArcGIS and various web applications.
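The automatic approach can be illustrated with a minimal sketch. It assumes a tiny in-memory gazetteer instead of an external service; the `GAZETTEER` dictionary, the field names `place`, `x` and `y`, and the helper functions are all hypothetical and only meant to show the idea. Records the lookup cannot resolve are the ones a manual or semiautomatic tool would then have to handle.

```python
# Hypothetical in-memory gazetteer: lower-cased place name -> (x, y),
# i.e. (longitude, latitude). The coordinates are approximate sample data.
GAZETTEER = {
    "prague": (14.42, 50.09),
    "vienna": (16.37, 48.21),
}


def geocode_record(record: dict, name_field: str = "place") -> dict:
    """Return a copy of the record with x and y filled in when the name is known."""
    key = record.get(name_field, "").strip().lower()
    coords = GAZETTEER.get(key)
    result = dict(record)
    # Unknown names keep empty coordinates instead of a guessed location.
    result["x"], result["y"] = coords if coords else (None, None)
    return result


def geocode_dataset(records: list) -> list:
    """Automatic geocoding: look up every record without user interaction."""
    return [geocode_record(r) for r in records]


def needs_manual_check(records: list) -> list:
    """Records the gazetteer could not resolve are left for manual geocoding."""
    return [r for r in records if r["x"] is None]
```

A real script would replace the dictionary lookup with a call to a gazetteer service, but the split between resolved and unresolved records stays the same.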
II Geocoding of a Historical Dataset
Geocoding a historical dataset (as opposed to geocoding modern data) has some specifics. First of all, we often work with place names that no longer exist or are secondary in today's usage. Second, we sometimes reference a place that "moves" over time, for example a town that shifted its centre after a war or flood. A place may also have disappeared entirely, which makes it hard to locate. For these reasons, it is not always enough to work with a modern basemap that shows only today's place names. We also need a gazetteer that contains historical place names and accounts for places that have disappeared or moved.
Last but not least is uncertainty. When geocoding a historical dataset, we would like to quantify how certain each location is. For instance, in one case we reference a building that no longer exists, so we place it somewhere in the neighborhood of a town; in another case we know the exact location of the building even today. For that reason we need to work within a defined granularity (for example, the cadastral level) and define a system for classifying certainty.
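One way to picture a gazetteer that respects renamed and moved places is to store name variants together with the period in which a location applies. The sketch below is only an illustration under that assumption: the `HistoricalPlace` structure, the `lookup` function and the sample town (with its names, coordinates and years) are entirely made up and do not describe any existing gazetteer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoricalPlace:
    names: tuple                    # all known name variants, historical and modern
    x: float
    y: float
    valid_from: int                 # first year in which this location applies
    valid_to: Optional[int] = None  # None means the location still applies


# Made-up sample data: a fictional town whose centre moved after a flood in 1750
# and that later also became known under a new name.
GAZETTEER = [
    HistoricalPlace(("Oldford", "Altfurt"), 15.10, 49.90, 1200, 1750),
    HistoricalPlace(("Oldford", "Altfurt", "Newford"), 15.13, 49.92, 1751),
]


def lookup(name: str, year: int) -> list:
    """Return all entries that carry the given name and are valid in the given year."""
    wanted = name.strip().lower()
    return [
        place
        for place in GAZETTEER
        if wanted in (n.lower() for n in place.names)
        and place.valid_from <= year
        and (place.valid_to is None or year <= place.valid_to)
    ]
```

Querying the same name for two different years returns two different locations, which is exactly the case a modern basemap alone cannot cover.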
III Application "Historical Geocoding Assistant"
We tried to build an application that respects the specifics and needs of geocoding historical datasets described above. The application is still in development (although the current version is in active use); the source code can be found on GitHub, and the application is running online.
To summarize the advantages of this application in a short list:
- custom-made UI that fits the needs of the user and allows specific settings
- various basemaps that can be toggled or combined (by setting opacities for overlaying layers)
- custom overlays in the form of GeoJSON vector layers or WMS services
- integration with Google spreadsheets (the user needs to be signed in with their own Google account)
- "intelligent" assistant that links columns from the imported table to specific functions (coordinate, place name, note columns...)
- integration of a system of certainty levels:
  - 1 - precise
  - 2 - approximate
  - 3 - localized but ambiguous
  - 4 - not found
- switcher between records within the defined Google spreadsheet
- form to edit all columns of the active record
- integration of various gazetteers (currently Wikipedia and GeoNames) that look up a place name and return suggested coordinates (and display them on the map)
- quick buttons for desperate users - links to Google and Google Maps with the place name pre-searched
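The column-linking assistant from the list above could work along the lines of the following sketch. This is not the application's actual logic, only a simple keyword heuristic written for illustration; the `COLUMN_HINTS` table, the role names and the `link_columns` function are assumptions.

```python
# Hypothetical keyword hints for each function (role) a column can serve.
COLUMN_HINTS = {
    "x": ("x", "lon", "lng", "longitude"),
    "y": ("y", "lat", "latitude"),
    "place_name": ("place", "name", "locality"),
    "note": ("note", "comment"),
    "certainty": ("certainty", "confidence"),
}


def link_columns(headers: list) -> dict:
    """Map each role to the first header that matches one of its hints."""
    linked = {}
    for role, hints in COLUMN_HINTS.items():
        for header in headers:
            normalized = header.strip().lower()
            # Short hints such as "x" or "y" must match the whole header;
            # longer hints may appear anywhere inside it.
            if any(normalized == h or (len(h) > 2 and h in normalized) for h in hints):
                linked[role] = header
                break
    return linked
```

Roles without a matching header are simply missing from the result, so the user can still assign them by hand.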
The user has to create a Google spreadsheet in which each row is a record. The columns should include x and y coordinates (empty at this point), along with columns for the place name, a note and the certainty level. The spreadsheet has to be shared (with the "Link sharing on" and "Anyone with the link can edit" options), and its id is entered into the form during the initialisation phase of the application. The user can then set up the working environment and start geocoding. As the user saves geocoded records, the spreadsheet fills in.
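The following sketch shows what the spreadsheet id and a saved row might look like. It relies on the fact that a Google spreadsheet URL contains the id after `/spreadsheets/d/`; everything else is an assumption made for illustration, including the column names, the `Certainty` enum (which mirrors the four levels listed earlier) and the rule that records marked as not found keep empty coordinates.

```python
import re
from enum import IntEnum


class Certainty(IntEnum):
    PRECISE = 1
    APPROXIMATE = 2
    AMBIGUOUS = 3   # localized but ambiguous
    NOT_FOUND = 4


# Spreadsheet URLs look like https://docs.google.com/spreadsheets/d/<id>/edit...
SHEET_URL = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def spreadsheet_id(url: str) -> str:
    """Extract the spreadsheet id from a Google spreadsheet URL."""
    match = SHEET_URL.search(url)
    if match is None:
        raise ValueError("not a Google spreadsheet URL")
    return match.group(1)


def geocoded_row(place_name, x, y, certainty, note=""):
    """Build one row (hypothetical column names) for a geocoded record."""
    level = Certainty(certainty)  # raises ValueError for values outside 1-4
    if level is Certainty.NOT_FOUND:
        x = y = None  # assumption: unresolved records keep empty coordinates
    elif x is None or y is None:
        raise ValueError("coordinates are required unless the place was not found")
    return {
        "place_name": place_name,
        "x": x,
        "y": y,
        "certainty": int(level),
        "note": note,
    }
```

Writing such a row back to the spreadsheet would go through the Google Sheets API with the signed-in user's credentials, which is left out here.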