Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 22, 2011

A history of the world in 100 seconds

Filed under: Data Mining,Geographic Data,Visualization — Patrick Durusau @ 3:17 pm

A history of the world in 100 seconds by Gareth Lloyd.

From the post:

Many Wikipedia articles are tagged with geographic coordinates. Many have references to historic events. Cross referencing these two subsets and plotting them year on year adds up to a dynamic visualization of Wikipedia’s view of world history.

The ‘spotlight’ is an overlay on the video that tries to keep about 90% of the datapoints within the bright area. It takes a moving average of all the latitudes and longitudes over the past 50 or so years and centres on the mean coordinate. I love the way it opens up, first resembling medieval maps of “The World” which included only Europe and some of Asia, then encompassing “The New World” and finally resembling a modern map.

This is based on the thing that me and Tom Martin built at Matt Patterson’s History Hackday. To make it, I built a python SAX Parser that sliced and diced an xml dump of all wikipedia articles (30Gb) and pulled out 424,000 articles with coordinates and 35,000 references to events. We managed to pair up 14,238 events with locations, and Tom wrote some Java to fiddle about with the coordinates and output frames. I’ve hacked around some more to add animation, because, you know, why not?
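To make the parsing step in the quoted passage concrete, here is a minimal sketch of streaming a Wikipedia XML dump through Python's `xml.sax` and keeping only the pages that carry coordinates. It is not Gareth's code. The names `CoordHandler`, `extract_coordinates` and `COORD_RE` are hypothetical, the sketch assumes the dump wraps each article in `page`, `title` and `text` elements, and it matches only the simple decimal form `{{coord|lat|lon|...}}`. Real coordinate templates have more variants (degrees/minutes/seconds, for one), and those are ignored here.

```python
import io
import re
import xml.sax

# Assumption: only decimal coordinates such as {{coord|51.5074|-0.1278|...}}
# are recognised; degree/minute/second variants are skipped.
COORD_RE = re.compile(
    r"\{\{\s*coord\s*\|\s*(-?\d+\.\d+)\s*\|\s*(-?\d+\.\d+)\s*[|}]",
    re.IGNORECASE,
)


class CoordHandler(xml.sax.ContentHandler):
    """Collects {title: (lat, lon)} for pages whose text has a coordinate."""

    def __init__(self):
        super().__init__()
        self.located = {}
        self._field = None   # which element's text we are currently inside
        self._title = []
        self._text = []

    def startElement(self, name, attrs):
        if name == "page":
            # New article: reset the buffers so memory use stays flat.
            self._title, self._text = [], []
        elif name in ("title", "text"):
            self._field = name

    def characters(self, content):
        # SAX may deliver one element's text in several chunks.
        if self._field == "title":
            self._title.append(content)
        elif self._field == "text":
            self._text.append(content)

    def endElement(self, name):
        if name in ("title", "text"):
            self._field = None
        elif name == "page":
            match = COORD_RE.search("".join(self._text))
            if match:
                title = "".join(self._title).strip()
                self.located[title] = (float(match.group(1)),
                                       float(match.group(2)))


def extract_coordinates(source):
    """Parse a dump (file name or binary stream) and return located pages."""
    handler = CoordHandler()
    xml.sax.parse(source, handler)
    return handler.located
```

Because SAX is event driven, the parser never holds the whole 30Gb dump in memory, only the article it is currently reading. Calling `extract_coordinates("dump.xml")` (a placeholder file name) or passing an `io.BytesIO` object for a small test document returns a dictionary of titles and coordinate pairs.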

I wanted to point this post out separately for several reasons.

First, it is a good example of reusing existing data in a new and interesting way, which spares you the time of collecting the original data yourself.

Second, Gareth provides both the source code and data so you can verify his results for yourself or decide that some other visualization suits your fancy.

Third, you should read some of the comments about this work. That sort of criticism is going to occur no matter what resource or visualization you make available. If you had a super-Wiki with 10 million articles in the top ten languages of the world, some wag would complain that X language wasn’t represented. Not that they would contribute to making it available, but they would have the time to complain that you didn’t.
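For anyone who wants to experiment with a different visualization, the spotlight described in the quoted post can be approximated in a few lines. This is a sketch under stated assumptions, not the Java that Tom wrote. The function name `spotlight` is hypothetical, latitude and longitude are treated as flat coordinates measured in degrees, and the radius is taken as the smallest distance from the mean that covers 90% of the points in the window. The post says only that the overlay "tries to keep about 90% of the datapoints within the bright area", so that last choice is a guess.

```python
import math


def spotlight(events, year, window=50, coverage=0.9):
    """Return (centre_lat, centre_lon, radius_deg) for `year`.

    `events` is an iterable of (year, lat, lon) tuples. Returns None when
    no event falls inside the trailing window.
    """
    # Keep only the events from the `window` years up to and including `year`.
    recent = [(lat, lon) for (y, lat, lon) in events
              if year - window < y <= year]
    if not recent:
        return None

    # Moving average: the centre is the mean coordinate of the window.
    centre_lat = sum(lat for lat, _ in recent) / len(recent)
    centre_lon = sum(lon for _, lon in recent) / len(recent)

    # Assumption: plain Euclidean distance in degrees, with no handling of
    # the antimeridian or of the convergence of meridians near the poles.
    distances = sorted(math.hypot(lat - centre_lat, lon - centre_lon)
                       for lat, lon in recent)

    # Smallest radius that encloses at least `coverage` of the points.
    index = max(0, math.ceil(coverage * len(distances)) - 1)
    return centre_lat, centre_lon, distances[index]
```

Calling `spotlight` once per year on the paired events gives a centre and radius for each frame. A great-circle distance would be the obvious refinement if the flat approximation distorts the picture at high latitudes.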
