Archive for the ‘Georeferencing’ Category

Wrigley Field: 1060 W Addison St, Chicago, IL or digits.bucked.talent? (3-Word Addresses)

Monday, June 13th, 2016

Elwood Blues says in The Blues Brothers that he falsified his driver’s license renewal and listed:

“1060 W. Addison”

as his home address. Somehow

digits.bucked.talent

doesn’t carry the same impact. Yes?

Mongolia has places as familiar as Wrigley Field is to Americans, but starting next month all locations in Mongolia are going to have three-word phrase addresses. See Mongolia is changing all its addresses to three-word phrases by Joon Ian Wong.

From the post:

Mongolia will become a global pioneer next month, when its national post office starts referring to locations by a series of three-word phrases instead of house numbers and street names.

The new system is devised by a British startup called What3Words, which has assigned a three-word phrase to every point on the globe. The system is designed to solve an often-ignored problem of 75% of the earth’s population, an estimated 4 billion people, who have no address for mailing purposes, making it difficult to open a bank account, get a delivery, or be reached in an emergency. In What3Words’ system, the idea is that a series of words is easier to remember than the strings of numbers that make up GPS coordinates. Each unique phrase corresponds to a specific 9-square-meter spot on the map.

For example, the White House, at 1600 Pennsylvania Avenue, becomes sulk.held.raves; the Tokyo Tower is located at fans.helpless.collects; and the Stade de France is at reporter.smoked.received.

Mongolians will be the first to use the system for government mail delivery, but organizations including the United Nations, courier companies, and mapping firms like Navmii already use What3Words’ system.

The most remarkable aspect of the system is revealed if you look up:

Gandan Monastery (Gandantegchinlen Khiid), Gandan Monastery District, Ulaanbaatar 16040 (011 36 0354).

Use this URL:


Now try changing languages (upper-right).

Three-word phrase addresses for the Gandan Monastery:

  • picturing.backfired.riverside (English)
  • schneller.juwelen.schaffen (German)
  • aislados.grifo.acuerde (Spanish)
  • nuageux.lémurien.rejouer (French)
  • turbato.fotografate.tinozza (Italian)
  • chinelo.politicar.molhada (Portuguese)
  • matte.skivar.kasta (Swedish)
  • vücudu.ırmak.peşini (Turkish)
  • карьера.слог.шелка (Russian)

I have only had time to spot check the site but did find retraced.loudest.teaspoon for Yap Island in Micronesia.

More obscure places to try?

You can find a wealth of additional information, yes, including an API at:

A great opportunity for topic maps as previous ways of identifying locations are not going to wink out of existence. If 3-word addresses catch on, use of other locators may dwindle but that will be over generations. We are facing a very long transition period.
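A minimal sketch of that idea, assuming a plain in-memory structure rather than any particular topic map engine: one subject (the Gandan Monastery) carries several identifiers, and any of them resolves to the same subject. The `Subject` class and `build_index` helper are hypothetical names used only for illustration; the identifiers are the ones listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    """One location known under many identifiers (hypothetical structure)."""
    name: str
    identifiers: set = field(default_factory=set)


def build_index(subjects):
    """Map every identifier to its subject so any locator finds the place."""
    index = {}
    for subject in subjects:
        for identifier in subject.identifiers:
            index[identifier.casefold()] = subject
    return index


gandan = Subject(
    name="Gandan Monastery",
    identifiers={
        "Gandan Monastery District, Ulaanbaatar 16040",  # conventional address
        "picturing.backfired.riverside",                 # English
        "schneller.juwelen.schaffen",                    # German
        "карьера.слог.шелка",                            # Russian
    },
)

index = build_index([gandan])
print(index["schneller.juwelen.schaffen"].name)
print(index["карьера.слог.шелка"].name)
```

Older locators and 3-word addresses simply sit side by side on the same subject, which is all a long transition period requires.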

Thoughts on weaponizing 3-word addresses. First, using the wrong 3-word addresses to mislead agents of the state. Second, creating new 3-word addresses that can be embedded in prose or song, without the dot separators.

Not to mention a server that, given proper authentication, returns the “correct” map location for a 3-word address; otherwise, you get the standard one.


Dodging the Morality Police

Friday, March 25th, 2016

This location-based app helps young Iranians avoid ‘morality police’ by Aleks Buczkowski.

From the post:

Many young Iranians are pretty liberated guys. They like to party and wear fancy clothes but they happened to live in a country where it’s prohibited. There is special police force dedicated to ensuring Iranians follow strict rules on clothing and conduct, called the Gasht-e-Ershad (or Guidance Patrol, commonly known as the “morality police”). Part of their activities include setting up checkpoints around cities and randomly inspecting vehicles driving by.

Now there is a way to avoid the Ershad controls. An anonymous team of Iranian developers have come up with a crowdsource app that allow users marking risky spots on the city map to help others avoid it. Something like Waze but for a much different purpose.

The Gershad app is pretty simple and easy to use. Users can mark where they encounter the “morality police”. The data is added to a database and visualised on a map. The more reports in one place, the bolder the warning on the map. When the number decreases, the alert will fade gradually from the map. Simple as it is.

Sounds quite adaptable to tracking police, FBI agents, narcs, etc. in modern urban environments.

Over time, with enough reports, patterns for police patrols would emerge from the data.
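The behavior described in the quote (more reports make a bolder warning, and the alert fades as reports stop) can be sketched with time-decayed counts per grid cell. This is only an illustration, not Gershad's actual implementation: the half-life, the grid size, and the sample coordinates are all assumptions.

```python
import math
from collections import defaultdict

HALF_LIFE_HOURS = 2.0  # assumed: a report loses half its weight every 2 hours
CELL_DEGREES = 0.005   # assumed grid size, roughly 500 m of latitude


def cell_for(lat, lon):
    """Snap a coordinate to a coarse grid cell so nearby reports pool together."""
    return (math.floor(lat / CELL_DEGREES), math.floor(lon / CELL_DEGREES))


def alert_strength(reports, now_hours):
    """Sum exponentially decayed report weights per grid cell.

    reports: iterable of (lat, lon, reported_at_hours) tuples.
    """
    strength = defaultdict(float)
    for lat, lon, reported_at in reports:
        age = max(0.0, now_hours - reported_at)
        strength[cell_for(lat, lon)] += 0.5 ** (age / HALF_LIFE_HOURS)
    return dict(strength)


# Made-up sample reports: two fresh ones close together, one older elsewhere.
reports = [
    (35.7012, 51.4012, 10.0),
    (35.7013, 51.4013, 10.0),
    (35.7612, 51.4512, 6.0),
]
for cell, weight in alert_strength(reports, now_hours=10.0).items():
    print(cell, round(weight, 2))
```

Keeping the raw timestamped reports, rather than only the decayed totals, is what would let patrol patterns emerge from the data later.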


NewsStand: A New View on News (+ Underwear Down Under)

Saturday, March 28th, 2015

NewsStand: A New View on News by Benjamin E. Teitler, et al.


News articles contain a wealth of implicit geographic content that if exposed to readers improves understanding of today’s news. However, most articles are not explicitly geotagged with their geographic content, and few news aggregation systems expose this content to users. A new system named NewsStand is presented that collects, analyzes, and displays news stories in a map interface, thus leveraging on their implicit geographic content. NewsStand monitors RSS feeds from thousands of online news sources and retrieves articles within minutes of publication. It then extracts geographic content from articles using a custom-built geotagger, and groups articles into story clusters using a fast online clustering algorithm. By panning and zooming in NewsStand’s map interface, users can retrieve stories based on both topical significance and geographic region, and see substantially different stories depending on position and zoom level.

Of particular interest to topic map fans:

NewsStand’s geotagger must deal with three problematic cases in disambiguating terms that could be interpreted as locations: geo/non-geo ambiguity, where a given phrase might refer to a geographic location, or some other kind of entity; aliasing, where multiple names refer to the same geographic location, such as “Los Angeles” and “LA”; and geographic name ambiguity or polysemy, where a given name might refer to any of several geographic locations. For example, “Springfield” is the name of many cities in the USA, and thus it is a challenge for disambiguation algorithms to associate with the correct location.

Unless you want to hand disambiguate all geographic references in your sources, this paper merits a close read!
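To make the three cases concrete, here is a toy resolver. It is not NewsStand's geotagger: the tiny gazetteer, the alias table, and the "pick the candidate closest to the unambiguous places in the same article" heuristic are all assumptions chosen for illustration, and the coordinates are approximate.

```python
import math

# Tiny illustrative gazetteer: name -> candidate (label, lat, lon) entries.
GAZETTEER = {
    "Springfield": [("Springfield, IL", 39.80, -89.64),
                    ("Springfield, MA", 42.10, -72.59)],
    "Chicago": [("Chicago, IL", 41.88, -87.63)],
    "Boston": [("Boston, MA", 42.36, -71.06)],
    "Los Angeles": [("Los Angeles, CA", 34.05, -118.24)],
}
ALIASES = {"LA": "Los Angeles"}  # aliasing: several names, one place


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dlon = math.radians(b[1] - a[1])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve(names):
    """Resolve place names, using unambiguous names as context for ambiguous ones."""
    canonical = [ALIASES.get(n, n) for n in names]
    anchors = [GAZETTEER[n][0] for n in canonical
               if len(GAZETTEER.get(n, [])) == 1]
    resolved = {}
    for original, name in zip(names, canonical):
        candidates = GAZETTEER.get(name, [])
        if not candidates:
            continue  # geo/non-geo ambiguity: unknown terms are treated as non-places
        best = min(candidates,
                   key=lambda c: sum(haversine_km(c[1:], a[1:]) for a in anchors))
        resolved[original] = best[0]
    return resolved


print(resolve(["Springfield", "Chicago"]))
print(resolve(["Springfield", "Boston", "LA"]))
```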

BTW, the paper dates from 2008 and I saw it in a tweet by Kirk Borne, where Kirk pointed to a recent version of NewsStand. Well, sort of “recent.” The latest story I could find was 490 days ago, a tweet from CBS News about the 50th anniversary of the Kennedy assassination in Dallas.

Undaunted I checked out TwitterStand but it seems to suffer from the same staleness of content, albeit it is difficult to tell because links don’t lead to the tweets.

Finally I did try PhotoStand, which judging from the pop-up information on the images, is quite current.

I noticed for Perth, Australia, “A special section of the exhibition has been dedicated to famous dominatrix Madame Lash.”

Sadly this appears to be one the algorithm got incorrect, so members of Congress should not select purchase on their travel arrangements just yet.

Sarah Carty for Daily Mail Australia reports in From modest bloomers to racy corsets: New exhibition uncovers the secret history of women’s underwear… including a unique collection from dominatrix Madam Lash:

From the modesty of bloomers to the seductiveness of lacy corsets, a new exhibition gives us a rare glimpse into the most intimate and private parts of history.

The Powerhouse Museum in Sydney have unveiled their ‘Undressed: 350 Years of Underwear in Fashion’ collection, which features undergarments from the 17th-century to more modern garments worn by celebrities such as Emma Watson, Cindy Crawford and even Queen Victoria.

Apart from a brief stint in Bendigo and Perth, the collection has never been seen by any members of the public before and lead curator Edwina Ehrman believes people will be both shocked and intrigued by what’s on display.

So the collection was once shown in Perth, but for airline reservations you had best book for Sydney.

And no, I won’t leave you without the necessary details:

Undressed: 350 Years of Underwear in Fashion opens at the Powerhouse Museum on March 28 and runs until 12 July 2015. Tickets can be bought here.

Ticket prices do not include transportation expenses to Sydney.

Spoiler alert: The exhibition page says:

Please note that photography is not permitted in this exhibition.


Twitter User Targeting Data

Sunday, May 11th, 2014

Geotagging One Hundred Million Twitter Accounts with Total Variation Minimization by Ryan Compton, David Jurgens, and David Allen.


Geographically annotated social media is extremely valuable for modern information retrieval. However, when researchers can only access publicly-visible data, one quickly finds that social media users rarely publish location information. In this work, we provide a method which can geolocate the overwhelming majority of active Twitter users, independent of their location sharing preferences, using only publicly-visible Twitter data.

Our method infers an unknown user’s location by examining their friend’s locations. We frame the geotagging problem as an optimization over a social network with a total variation-based objective and provide a scalable and distributed algorithm for its solution. Furthermore, we show how a robust estimate of the geographic dispersion of each user’s ego network can be used as a per-user accuracy measure, allowing us to discard poor location inferences and control the overall error of our approach.

Leave-many-out evaluation shows that our method is able to infer location for 101,846,236 Twitter users at a median error of 6.33 km, allowing us to geotag roughly 89% of public tweets.

If 6.33 km sounds like a lot of error, check out NUKEMAP by Alex Wellerstein.
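The core intuition in the abstract, inferring a user's location from their friends' locations, can be sketched in a few lines. This is a drastically simplified stand-in, not the paper's distributed total variation algorithm: it assumes we simply pick the friend location with the smallest summed great-circle distance to all the other friends, which keeps one far-away friend from dragging the estimate off. The coordinates are made up.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dlon = math.radians(b[1] - a[1])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def estimate_location(friend_locations):
    """Return the friend location minimizing total distance to all friends."""
    if not friend_locations:
        return None
    return min(friend_locations,
               key=lambda c: sum(haversine_km(c, other) for other in friend_locations))


# Three friends around Chicago and one in London.
friends = [(41.88, -87.63), (41.95, -87.65), (41.79, -87.60), (51.51, -0.13)]
print(estimate_location(friends))
```

An average of the coordinates would land somewhere over the Atlantic; a median-style estimate stays with the cluster.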

Neo4j Spatial Part 2

Tuesday, February 11th, 2014

Neo4j Spatial Part 2 by Max De Marzi.

Max finishes up part 1 with sample spatial data for restaurants and by deploying his proof of concept using GrapheneDB on Heroku.

Restaurants are typical cellphone app fare but if I were in Kiev, I’d want an app with geo-locations of ingredients for a proper Molotov cocktail.

A jar filled with gasoline and a burning rag is nearly as dangerous to the thrower as the target.

Of course, substitutions for ingredients, in what quantities, in different languages, could be added features of such an app.

Data management is a weapon within the reach of all sides.

Geospatial (distance) faceting…

Tuesday, January 21st, 2014

Geospatial (distance) faceting using Lucene’s dynamic range facets by Mike McCandless.

From the post:

There have been several recent, quiet improvements to Lucene that, taken together, have made it surprisingly simple to add geospatial distance faceting to any Lucene search application, for example:

  < 1 km (147)
  < 2 km (579)
  < 5 km (2775)

Such distance facets, which allow the user to quickly filter their search results to those that are close to their location, has become especially important lately since most searches are now from mobile smartphones.

In the past, this has been challenging to implement because it’s so dynamic and so costly: the facet counts depend on each user’s location, and so cannot be cached and shared across users, and the underlying math for spatial distance is complex.

But several recent Lucene improvements now make this surprisingly simple!

As always, Mike is right on the edge so wait for Lucene 4.7 to try his code out or download the current source.
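To see what those facet counts are, independent of Lucene, here is a plain Python sketch. It does not use Lucene's dynamic range facets; it just computes the great-circle distance from the user to each hit and counts hits under each limit. The limits mirror the example above; the origin and points are made up.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dlon = math.radians(b[1] - a[1])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def distance_facets(origin, points, limits_km=(1, 2, 5)):
    """Count points closer than each limit; buckets overlap, as in the example."""
    distances = [haversine_km(origin, p) for p in points]
    return {f"< {limit} km": sum(d < limit for d in distances)
            for limit in limits_km}


user = (52.0, 5.0)
hits = [(52.005, 5.0), (52.015, 5.0), (52.03, 5.0), (52.1, 5.0)]
print(distance_facets(user, hits))
```

Because the counts depend on `user`, they have to be recomputed per query, which is exactly the caching problem the post describes.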

Distance might not be the only consideration. What if you wanted the shortest distance that did not intercept a known patrol? Or a known patrol within some window of variation?

Distance is still going to be a factor, but the search required may be more complex than just distance.

Server-side clustering of geo-points…

Sunday, August 4th, 2013

Server-side clustering of geo-points on a map using Elasticsearch by Gianluca Ortelli.

From the post:

Plotting markers on a map is easy using the tooling that is readily available. However, what if you want to add a large number of markers to a map when building a search interface? The problem is that things start to clutter and it’s hard to view the results. The solution is to group results together into one marker. You can do that on the client using client-side scripting, but as the number of results grows, this might not be the best option from a performance perspective.

This blog post describes how to do server-side clustering of those markers, combining them into one marker (preferably with a counter indicating the number of grouped results). It provides a solution to the “too many markers” problem with an Elasticsearch facet.

The Problem

The image below renders quite well the problem we were facing in a project:


The mass of markers is so dense that it replicates the shape of the Netherlands! These items represent monuments and other things of general interest in the Netherlands; for an application we developed for a customer we need to manage about 200,000 of them and they are especially concentrated in the cities, as you can see in this case in Amsterdam: The “draw everything” strategy doesn’t help much here.

Server-side clustering of geo-points will be useful for representing dense geo-points.
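The grouping idea (many markers collapsed into one marker with a counter) can be sketched with a simple grid. This is not the Elasticsearch facet from the post; the fixed cell size and the centroid placement are assumptions, and the sample points are made up.

```python
import math
from collections import defaultdict


def cluster_markers(points, cell_degrees):
    """Group (lat, lon) points by grid cell; return one marker per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        count = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / count,  # centroid of the cell's points
            "lon": sum(p[1] for p in members) / count,
            "count": count,                             # shown as the marker's counter
        })
    return clusters


# Three points near Amsterdam, one near Rotterdam.
points = [(52.37, 4.89), (52.38, 4.88), (52.36, 4.87), (51.92, 4.48)]
for marker in cluster_markers(points, cell_degrees=0.5):
    print(marker)
```

Tying `cell_degrees` to the map's zoom level lets clusters split apart as the user zooms in.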

Such as an Interactive Surveillance Map.

Or if you were building a map of police and security force sightings over multiple days to build up a pattern database.

elasticsearch 0.90.2

Thursday, June 27th, 2013

0.90.2 released by Clinton Gormley.

From the post:

The Elasticsearch dev team are pleased to announce the release of elasticsearch 0.90.2, which is based on Lucene 4.3.1. You can download it here.

We recommend upgrading to 0.90.2 from 0.90.1, especially if you are using the terms-lookup filter, as this release includes some enhancements and bug fixes to terms lookup.

Besides the other enhancements and bug-fixes, which you can read about on the issues list, there is one new feature that is particularly worth mentioning: improved support for geohashes on geopoints:

A geohash is a string representing an area on earth – the longer the string the more precise the geohash. A geohash just one character long refers to an area with a very rough precision: +/- 2500 km. A geohash of length 8 would be accurate to within 20m, etc. Because a geohash is just a string, we can index it in Elasticsearch and take advantage of the inverted index to make blazingly fast geo-location queries.

See Wikipedia on Geohash, which has numerous external links, including one where you enter geo-coordinates or a geohash and it displays a map of the location.
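For a feel of why a longer string means a more precise area, here is a small encoder written from the public geohash scheme (alternately halving the longitude and latitude ranges, five bits per base-32 character). It is a sketch for illustration, not Elasticsearch's implementation.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=3))
print(geohash_encode(57.64911, 10.40744, precision=8))
```

Each shorter geohash is a prefix of the longer one for the same point, which is why prefix matching in an inverted index works as a coarse "nearby" query.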

Are You Near Me?

Saturday, June 8th, 2013

Lucene 4.X is a great tool for analyzing cellphone location data (Did you really think only the NSA has it?).

Chilamakuru Vishnu gets us started with a code heavy post with the promise of:

My Next Blog Post will talk about how to implement advanced spatial queries like

geoInterseting – where one polygon intersects with another polygon/line.

geoWithIn – where one polygon lies completely within another polygon.

Or you could obtain geolocation data by other means.

I first saw this at DZone.

CLAVIN [Geotagging – Some Proofing Required]

Sunday, May 26th, 2013


From the webpage:

CLAVIN (*Cartographic Location And Vicinity INdexer*) is an open source software package for document geotagging and geoparsing that employs context-based geographic entity resolution. It combines a variety of open source tools with natural language processing techniques to extract location names from unstructured text documents and resolve them against gazetteer records. Importantly, CLAVIN does not simply “look up” location names; rather, it uses intelligent heuristics in an attempt to identify precisely which “Springfield” (for example) was intended by the author, based on the context of the document. CLAVIN also employs fuzzy search to handle incorrectly-spelled location names, and it recognizes alternative names (e.g., “Ivory Coast” and “Côte d’Ivoire”) as referring to the same geographic entity. By enriching text documents with structured geo data, CLAVIN enables hierarchical geospatial search and advanced geospatial analytics on unstructured data.

See for an online demo, videos and other materials.

Your mileage may vary.

I used a quote from today’s New York Times (Rockets Hit Hezbollah Stronghold in Lebanon):

An ongoing battle in the Syrian town of Qusair on the Lebanese border has laid bare Hezbollah’s growing role in the Syrian conflict. The Iranian-backed militia and Syrian troops launched an offensive against the town last weekend. After dozens of Hezbollah fighters were killed in Qusair over the past week and buried in large funerals in Lebanon, Hezbollah could no longer play down its involvement.

Col. Abdul-Jabbar al-Aqidi, commander of the Syrian rebels’ Military Council in Aleppo, appeared in a video this week while apparently en route to Qusair, in which he threatened to strike in Beirut’s southern suburbs in retaliation for Hezbollah’s involvement in Syria.

“We used to say before, ‘We are coming Bashar.’ Now we say, ‘We are coming Bashar and we are coming Hassan Nasrallah,'” he said, in reference to Hezbollah’s leader.

“We will strike at your strongholds in Dahiyeh, God willing,” he said, using the Lebanese name for Hezbollah’s power center in southern Beirut. The video was still online on Youtube on Sunday.

Hezbollah lawmaker Ali Ammar said the incident targeted coexistence between the Lebanese and claimed the U.S. and Israel want to return Lebanon to the years of civil war. “They want to throw Lebanon backward into the traps of civil wars that we left behind,” he told reporters. “We will not go backward.”

The results from CLAVIN:

Locations Extracted and Resolved From Text

ID        Name       Lat, Lon             Country Code   #
272103    Lebanon    33.83333, 35.83333   LB             3
6951366   Lebanese   44.49123, 26.0877    RO             3
276781    Beirut     33.88894, 35.49442   LB             2
162037    Dahiyeh    38.19023, 57.00984   TM             1
6252001   U.S.       39.76, -98.5         US             1
103089    Qusair     25.91667, 40.45      SA             1
163843    Syria      35, 38               SY             1
163843    Syrian     35, 38               SY             1
294640    Israel     31.5, 34.75          IL             1
170062    Aleppo     36.25, 37.5          SY             1

(The highlight added to show incorrect resolutions.)


RO = Romania

SA = Saudi Arabia

TM = Turkmenistan

Plus “Qusair” appears twice in the quoted text.

For the ten locations mentioned, that is a seventy percent (70%) accuracy rate.
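A proofing pass like this can be partly mechanized. The sketch below is not part of CLAVIN: it takes the name and country-code columns from the results above and flags any resolution outside the set of countries a proofreader would expect for this story. That expected set is my assumption.

```python
# (name, resolved country code) pairs copied from the CLAVIN results above.
RESOLVED = [
    ("Lebanon", "LB"), ("Lebanese", "RO"), ("Beirut", "LB"), ("Dahiyeh", "TM"),
    ("U.S.", "US"), ("Qusair", "SA"), ("Syria", "SY"), ("Syrian", "SY"),
    ("Israel", "IL"), ("Aleppo", "SY"),
]

# Assumption: countries a proofreader would expect in this particular story.
EXPECTED_COUNTRIES = {"LB", "SY", "IL", "US"}


def flag_suspects(rows, expected):
    """Return the resolutions whose country code is outside the expected set."""
    return [(name, code) for name, code in rows if code not in expected]


suspects = flag_suspects(RESOLVED, EXPECTED_COUNTRIES)
accuracy = (len(RESOLVED) - len(suspects)) / len(RESOLVED)
print(suspects)
print(f"{accuracy:.0%}")
```

A check like this only narrows down what a human needs to look at; a wrong resolution inside an expected country would still slip through.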

Better than the average American but proofing is still an essential step in editorial workflow.

I first saw this in Pete Warden’s Five short links.

Georeferencer: Crowdsourced Georeferencing for Map Library Collections

Monday, November 19th, 2012

Georeferencer: Crowdsourced Georeferencing for Map Library Collections by Christopher Fleet, Kimberly C. Kowal and Petr Přidal.


Georeferencing of historical maps offers a number of important advantages for libraries: improved retrieval and user interfaces, better understanding of maps, and comparison/overlay with other maps and spatial data. Until recently, georeferencing has involved various relatively time-consuming and costly processes using conventional geographic information system software, and has been infrequently employed by map libraries. The Georeferencer application is a collaborative online project allowing crowdsourced georeferencing of map images. It builds upon a number of related technologies that use existing zoomable images from library web servers. Following a brief review of other approaches and georeferencing software, we describe Georeferencer through its five separate implementations to date: the Moravian Library (Brno), the Nationaal Archief (The Hague), the National Library of Scotland (Edinburgh), the British Library (London), and the Institut Cartografic de Catalunya (Barcelona). The key success factors behind crowdsourcing georeferencing are presented. We then describe future developments and improvements to the Georeferencer technology.

If your institution has a map collection or if you are interested in maps at all, you need to read this article.

There is an introduction video if you prefer:

Either way, you will be deeply impressed by this project.

And wondering: Can the same lessons be applied to crowd source the creation of topic maps?