Archive for the ‘Geographic Information Retrieval’ Category

Restricted U.S. Army Geospatial Intelligence Handbook

Friday, August 26th, 2016

Restricted U.S. Army Geospatial Intelligence Handbook

From the webpage:

This training circular provides GEOINT guidance for commanders, staffs, trainers, engineers, and military intelligence personnel at all echelons. It forms the foundation for GEOINT doctrine development. It also serves as a reference for personnel who are developing doctrine; tactics, techniques, and procedures; materiel and force structure; and institutional and unit training for intelligence operations.

1-1. Geospatial intelligence is the exploitation and analysis of imagery and geospatial information to describe, assess, and visually depict physical features and geographically referenced activities on the Earth. Geospatial intelligence consists of imagery, imagery intelligence, and geospatial information (10 USC 467).

Note. TC 2-22.7 further implements that GEOINT consists of any one or any combination of the following components: imagery, IMINT, or GI&S.

1-2. Imagery is the likeness or presentation of any natural or manmade feature or related object or activity, and the positional data acquired at the same time the likeness or representation was acquired, including: products produced by space-based national intelligence reconnaissance systems; and likenesses and presentations produced by satellites, aircraft platforms, unmanned aircraft vehicles, or other similar means (except that such term does not include handheld or clandestine photography taken by or on behalf of human intelligence collection organizations) (10 USC 467).

1-3. Imagery intelligence is the technical, geographic, and intelligence information derived through the interpretation or analysis of imagery and collateral materials (10 USC 467).

1-4. Geospatial information and services refers to information that identifies the geographic location and characteristics of natural or constructed features and boundaries on the Earth, including: statistical data and information derived from, among other things, remote sensing, mapping, and surveying technologies; and mapping, charting, geodetic data, and related products (10 USC 467).

[Image: geospatial-intel-1-460]

You may not have the large fixed-wing assets described in this handbook, but the “value-added layers” are within your reach with open data.

[Image: geospatial-intel-2-460]

In localized environments, your value-added layers may be more current and useful than those produced on longer time scales.

Topic maps can support geospatial collations of information alongside other views of the same data.

A great opportunity to understand how a modern military force understands and uses geospatial intelligence.

Not to mention testing your ability to recreate that geospatial intelligence without dedicated tools.

Planet Platform Beta & Open California:…

Friday, October 16th, 2015

Planet Platform Beta & Open California: Our Data, Your Creativity by Will Marshall.

From the post:

At Planet Labs, we believe that broad coverage frequent imagery of the Earth can be a significant tool to address some of the world’s challenges. But this can only happen if we democratise access to it. Put another way, we have to make data easy to access, use, and buy. That’s why I recently announced at the United Nations that Planet Labs will provide imagery in support of projects to advance the Sustainable Development Goals.

Today I am proud to announce that we’re releasing a beta version of the Planet Platform, along with our imagery of the state of California under an open license.

The Planet Platform Beta will enable a pioneering cohort of developers, image analysts, researchers, and humanitarian organizations to get access to our data, web-based tools and APIs. The goal is to provide a “sandbox” for people to start developing and testing their apps on a stack of openly available imagery, with the goal of jump-starting a developer community; and collecting data feedback on Planet’s data, tools, and platform.

Our Open California release includes two years of archival imagery of the whole state of California from our RapidEye satellites and 2 months of data from the Dove satellite archive; and will include new data collected from both constellations on an ongoing basis, with a two-week delay. The data will be under an open license, specifically CC BY-SA 4.0. The spirit of the license is to encourage R&D and experimentation in an “open data” context. Practically, this means you can do anything you want, but you must “open” your work, just as we are opening ours. It will enable the community to discuss their experiments and applications openly, and thus, we hope, establish the early foundation of a new geospatial ecosystem.

California is our first Open Region, but shall not be the last. We will open more of our data in the future. This initial release will inform how we deliver our data set to a global community of customers.

Resolution is 3-5 meters for the Dove satellites and 5 meters for the RapidEye satellites.

Not quite goldfish bowl or Venice Beach resolution but useful for other purposes.

Now would be a good time to become familiar with managing and annotating satellite imagery. Higher resolutions, public and private, are only a matter of time.

MrGeo (MapReduce Geo)

Wednesday, January 21st, 2015

MrGeo (MapReduce Geo)

From the webpage:

MrGeo was developed at the National Geospatial-Intelligence Agency (NGA) in collaboration with DigitalGlobe. The government has “unlimited rights” and is releasing this software to increase the impact of government investments by providing developers with the opportunity to take things in new directions. The software use, modification, and distribution rights are stipulated within the Apache 2.0 license.

MrGeo (MapReduce Geo) is a geospatial toolkit designed to provide raster-based geospatial capabilities that can be performed at scale. MrGeo is built upon the Hadoop ecosystem to leverage the storage and processing of hundreds of commodity computers. Functionally, MrGeo stores large raster datasets as a collection of individual tiles stored in Hadoop to enable large-scale data and analytic services. The co-location of data and analytics offers the advantage of minimizing the movement of data in favor of bringing the computation to the data; a more favorable compute method for Geospatial Big Data. This framework has enabled the servicing of terabyte scale raster databases and performed terrain analytics on databases exceeding hundreds of gigabytes in size.

The use cases sound interesting:

Exemplar MrGeo Use Cases:

  • Raster Storage and Provisioning: MrGeo has been used to store, index, tile, and pyramid multi-terabyte scale image databases. Once stored, this data is made available through simple Tiled Map Services (TMS) and or Web Mapping Services (WMS).
  • Large Scale Batch Processing and Serving: MrGeo has been used to pre-compute global 1 ArcSecond (nominally 30 meters) elevation data (300+ GB) into derivative raster products : slope, aspect, relative elevation, terrain shaded relief (collectively terabytes in size)
  • Global Computation of Cost Distance: Given all pub locations in OpenStreetMap, compute 2 hour drive times from each location. The full resolution is 1 ArcSecond (30 meters nominally)

I wonder: if you started war gaming attacks on well-known cities and posting maps of how the attacks could develop, would that be covered under free speech, assuming your intent was to educate the general populace about which areas are more dangerous than others in a major incident?
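
As an aside on the “1 ArcSecond (nominally 30 meters)” figure in the use cases above, the arithmetic can be sketched as follows. This is a rough spherical approximation that uses the WGS 84 equatorial radius; the function name `arcsecond_ground_distance` is my own and has nothing to do with MrGeo. It gives roughly 30.9 meters per arcsecond at the equator, and the east-west distance shrinks with the cosine of the latitude.

```python
import math

EQUATORIAL_RADIUS_M = 6_378_137.0      # WGS 84 semi-major axis, in meters
ARCSECONDS_PER_CIRCLE = 360 * 60 * 60  # 1,296,000 arcseconds in a full circle


def arcsecond_ground_distance(latitude_deg: float = 0.0) -> float:
    """Approximate east-west length of one arcsecond of longitude, in meters.

    Assumes a spherical Earth with the equatorial radius, so treat the
    result as a nominal figure rather than a geodetic measurement.
    """
    circumference = 2 * math.pi * EQUATORIAL_RADIUS_M
    at_equator = circumference / ARCSECONDS_PER_CIRCLE
    return at_equator * math.cos(math.radians(latitude_deg))


if __name__ == "__main__":
    for lat in (0, 45, 60):
        print(f"latitude {lat:2d}: {arcsecond_ground_distance(lat):.1f} m per arcsecond")
```

So “30 meters” is a fair label near the equator, while the same grid is noticeably finer east-west at higher latitudes.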

    I first saw this in a tweet by Marin Dimitrov.

    Mapping the open web using GeoJSON

    Sunday, December 8th, 2013

    Mapping the open web using GeoJSON by Sean Gillies.

    From the post:

    GeoJSON is an open format for encoding information about geographic features using JSON. It has much in common with older GIS formats, but also a few new twists: GeoJSON is a text format, has a flexible schema, and is specified in a single HTML page. The specification is informed by standards such as OGC Simple Features and Web Feature Service and streamlines them to suit the way web developers actually build software today.

    Promoted by GitHub and used in the Twitter API, GeoJSON has become a big deal in the open web. We are huge fans of the little format that could. GeoJSON suits the web and suits us very well; it plays a major part in our libraries, services, and products.

    A short but useful review of why GeoJSON is important to MapBox and why it should be important to you.

    A must-read if you are interested in placing data of interest to your users on maps.
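
If you have not seen GeoJSON before, here is a minimal sketch of what “encoding information about geographic features using JSON” looks like in practice. The helper names (`point_feature`, `feature_collection`) and the sample records are my own illustrations, not anything from Sean's post. The one detail worth remembering is that GeoJSON coordinates are ordered longitude first, then latitude.

```python
import json


def point_feature(lon: float, lat: float, **properties) -> dict:
    """Build a GeoJSON Feature with a Point geometry.

    GeoJSON positions are [longitude, latitude], in that order.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


def feature_collection(features: list) -> dict:
    """Wrap a list of Feature objects in a FeatureCollection."""
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Hypothetical records; names and coordinates are illustrative only.
    records = [
        {"name": "Library A", "lon": -122.4194, "lat": 37.7749},
        {"name": "Library B", "lon": -118.2437, "lat": 34.0522},
    ]
    collection = feature_collection(
        [point_feature(r["lon"], r["lat"], name=r["name"]) for r in records]
    )
    print(json.dumps(collection, indent=2))
```

The `properties` member is where the “flexible schema” shows up: any JSON-serializable attributes can ride along with the geometry.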

    Sean mentions that GitHub promotes GeoJSON, but I’m curious whether the NSA uses/promotes it as well. 😉

    ST_Geometry Aggregate Functions for Hive…

    Friday, August 16th, 2013

    ST_Geometry Aggregate Functions for Hive in Spatial Framework for Hadoop by Jonathan Murphy.

    From the post:

    We are pleased to announce that the ST_Geometry aggregate functions are now available for Hive, in the Spatial Framework for Hadoop. The aggregate functions can be used to perform a convex-hull, intersection, or union operation on geometries from multiple records of a dataset.

    While the non-aggregate ST_ConvexHull function returns the convex hull of the geometries passed like a single function call, the ST_Aggr_ConvexHull function accumulates the geometries from the rows selected by a query, and performs a convex hull operation over those geometries. Likewise, ST_Aggr_Intersection and ST_Aggr_Union aggregate the geometries from multiple selected rows, to perform intersection and union operations, respectively.

    The example given covers earthquake data and California-county data.

    I have a weakness for aggregating functions as you know. 😉

    The other point these aggregate functions illustrate is that sometimes you want subjects to be treated as independent of each other and sometimes you want to treat them as a group.

    Depends upon your requirements.

    There really isn’t a one-size-fits-all granularity of subject identity for all situations.
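
To make the independent-versus-group distinction concrete, here is a small sketch in plain Python. It is not Hive and does not use the Spatial Framework for Hadoop; the functions `convex_hull` and `aggregate_convex_hull` are my own stand-ins, and each “row” is simply a list of points. The hull itself is computed with the standard monotone chain algorithm.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Iterable[Point]) -> List[Point]:
    """Convex hull of one set of points (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def aggregate_convex_hull(rows: Iterable[Iterable[Point]]) -> List[Point]:
    """One hull over the points of all rows, treating the rows as a group."""
    return convex_hull(p for geometry in rows for p in geometry)


if __name__ == "__main__":
    # Two hypothetical rows, each holding a small triangle of points.
    rows = [
        [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
        [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)],
    ]
    print("per row:  ", [convex_hull(r) for r in rows])
    print("aggregate:", aggregate_convex_hull(rows))
```

Per row you get two separate triangles; aggregated you get a single polygon in which one of the original points ends up in the interior. Same data, different granularity.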

    Server-side clustering of geo-points…

    Sunday, August 4th, 2013

    Server-side clustering of geo-points on a map using Elasticsearch by Gianluca Ortelli.

    From the post:

    Plotting markers on a map is easy using the tooling that is readily available. However, what if you want to add a large number of markers to a map when building a search interface? The problem is that things start to clutter and it’s hard to view the results. The solution is to group results together into one marker. You can do that on the client using client-side scripting, but as the number of results grows, this might not be the best option from a performance perspective.

    This blog post describes how to do server-side clustering of those markers, combining them into one marker (preferably with a counter indicating the number of grouped results). It provides a solution to the “too many markers” problem with an Elasticsearch facet.

    The Problem

    The image below renders quite well the problem we were facing in a project:

    [Image: clustering]

    The mass of markers is so dense that it replicates the shape of the Netherlands! These items represent monuments and other things of general interest in the Netherlands; for an application we developed for a customer we need to manage about 200,000 of them and they are especially concentrated in the cities, as you can see in this case in Amsterdam: The “draw everything” strategy doesn’t help much here.

    Server-side clustering of geo-points will be useful for representing dense geo-points.

    Such as an Interactive Surveillance Map.

    Or if you were building a map of police and security force sightings over multiple days to build up a pattern database.
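
The post does its clustering with an Elasticsearch facet. The sketch below does not call Elasticsearch at all; it only illustrates the underlying idea of bucketing points into grid cells on the server and returning one marker per cell with a count. The function name `cluster_points` and the `cell_size_deg` parameter are my own assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

LatLon = Tuple[float, float]


def cluster_points(points: Iterable[LatLon], cell_size_deg: float) -> List[Dict]:
    """Group (lat, lon) points into square grid cells.

    Returns one cluster per non-empty cell, placed at the mean position
    of its members and carrying the number of grouped points.
    """
    cells: Dict[Tuple[int, int], List[LatLon]] = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        cells[key].append((lat, lon))

    clusters = []
    for members in cells.values():
        count = len(members)
        clusters.append(
            {
                "lat": sum(p[0] for p in members) / count,
                "lon": sum(p[1] for p in members) / count,
                "count": count,
            }
        )
    return clusters


if __name__ == "__main__":
    # Hypothetical points: three near Amsterdam, one near Rotterdam.
    points = [(52.37, 4.90), (52.38, 4.89), (52.36, 4.91), (51.92, 4.48)]
    for cluster in cluster_points(points, cell_size_deg=0.5):
        print(cluster)
```

In a real map you would tie the cell size to the zoom level, so that clusters split apart as the user zooms in.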

    mapFAST Mobile

    Sunday, June 30th, 2013

    Explore the world: find books or other library materials about places with mapFAST Mobile

    From the post:

    The new mapFAST Mobile lets you search WorldCat.org from your smartphone or mobile browser for materials related to any location and find them in the nearest library.

    Available on the web and now as an Android app in the Google Play store, mapFAST is a Google Maps mashup that allows users to identify a point of interest and see surrounding locations or events using mapFAST’s Google Maps display with nearby FAST geographic headings (including location-based events), then jump to WorldCat.org, the world’s largest library catalog, to find specific items and the nearest holding library. WorldCat.org provides a variety of “facets” allowing users to narrow a search by type of item, year of publication, language and more.

    “Libraries hold and provide access to a wide variety of information resources related to geographic locations,” said Rick Bennett, OCLC Consulting Software Engineer and lead developer on the project. “When looking for information about a particular place, it’s often useful to investigate nearby locations as well. mapFAST’s Google Maps interface allows for easy selection of the location, with a link to enter a search directly into WorldCat.org.”

    With mapFAST Mobile, smartphone and mobile browser users can do a search based on their current location, or an entered search. The user’s location or search provides a center for the map, and nearby FAST subject headings are added as location pins. A “Search WorldCat” link then connects users to a list of records for materials about that location in WorldCat.org.

    This sounds cool enough to almost tempt me into getting a cell phone. 😉

    I haven’t seen the app but if it works as advertised, this could be the first step in a comeback by libraries.

    Very cool!

    gvSIG

    Saturday, March 30th, 2013

    gvSIG

    I encountered the gvSIG site while tracking down the latest release of i3Geo.

    From its mission statement:

    The gvSIG project was born in 2004 within a project that consisted in a full migration of the information technology systems of the Regional Ministry of Infrastructure and Transport of Valencia (Spain), henceforth CIT, to free software. Initially, it was born with some objectives according to CIT needs. These objectives were expanded rapidly because of two reasons principally: on the one hand, the nature of free software, which greatly enables the expansion of technology, knowledge, and lays down the bases on which to establish a community, and, on the other hand, a project vision embodied in some guidelines and a plan appropriate to implement it.

    Some of the software projects you will find at gvSIG are:

    gvSIG Desktop

    gvSIG is a Geographic Information System (GIS), that is, a desktop application designed for capturing, storing, handling, analyzing and deploying any kind of referenced geographic information in order to solve complex management and planning problems. gvSIG is known for having a user-friendly interface, being able to access the most common formats, both vector and raster ones. It features a wide range of tools for working with geographic-like information (query tools, layout creation, geoprocessing, networks, etc.), which turns gvSIG into the ideal tool for users working in the land realm.

    gvSIG Mobile

    gvSIG Mobile is a Geographic Information System (GIS) aimed at mobile devices, ideal for projects that capture and update data in the field. It’s known for having a user-friendly interface, being able to access the most common formats and a wide range of GIS and GPS tools which are ideal for working with geographic information.

    gvSIG Mobile aims at broadening gvSIG Desktop execution platforms to a range of mobile devices, in order to give an answer to the needings of a growing number of mobile solutions users, who wish to use a GIS on different types of devices.

    So far, gvSIG Mobile is a Geographic Information System, as well as a Spatial Data Infrastructures client for mobile devices. Such a client is also the first one licensed under open source.

    i3Geo

    i3Geo is an application for the development of interactive web maps. It integrates several open source applications into a single development platform, mainly Mapserver and OpenLayers. Developed in PHP and Javascript, it has functionalities that allows the user to have better control over the map output, allowing to modify the legend of layers, to apply filters, to perform analysis, etc.

    i3Geo is completely customizable and can be tailored to the different users using the interactive map. Furthermore, the spatial data is organized in a catalogue that offers online access services such as WMS, WFS, KML or the download of files.

    i3Geo was developed by the Ministry of the Environment of Brazil and it is actually part of the Brazilian Public Software Portal.

    gvSIG Educa

    What is gvSIG Educa?

    “If I can’t picture it, I can’t understand it (A. Einstein)”

    gvSIG Educa is a customization of the gvSIG Desktop Open Source GIS, adapted as a tool for the education of issues that have a geographic component.

    The aim of gvSIG Educa is to provide educators with a tool that helps students to analyse and understand space, and which can be adapted to different levels or education systems.

    gvSIG Educa is not only useful for the teaching of geographic material, but can also be used for learning any subject that contains a spatial component such as history, economics, natural science, sociology…

    gvSIG Educa facilitates learning by letting students interact with the information, by adding a spatial component to the study of the material, and by facilitating the assimilation of concepts through visual tools such as thematic maps.

    gvSIG Educa provides analysis tools that help to understand spatial relationships.

    Definitely a site to visit if you are interested in open source GIS software and/or projects.

    i3Geo

    Saturday, March 30th, 2013

    i3Geo

    From the homepage:

    i3Geo is an application for the development of interactive web maps. It integrates several open source applications into a single development platform, mainly Mapserver and OpenLayers. Developed in PHP and Javascript, it has functionalities that allows the user to have better control over the map output, allowing to modify the legend of layers, to apply filters, to perform analysis, etc.

    i3Geo is completely customizable and can be tailored to the different users using the interactive map. Furthermore, the spatial data is organized in a catalogue that offers online access services such as WMS, WFS, KML or the download of files.

    i3Geo was developed by the Ministry of the Environment of Brazil and it is actually part of the Brazilian Public Software Portal.

    I followed an announcement about i3Geo 4.7 being available when the line “…an application for the development of interactive web maps,” caught my eye.

    Features include:

    • Basic display: fix zoom, zoom by rectangle, panning, etc.
    • Advanced display: locator by attribute, zoom to point, zoom by geographical area, zoom by selection, zoom to layer
    • Integrated display: Wikipedia, GoogleMaps, Panoramio and Confluence
    • Integration with the OpenLayers, GoogleMaps and GoogleEarth APIs
    • Loading of WMS, KML, GeoRSS, shapefile, GPX and CSV layers
    • Management of independent databases
    • Layer catalog management system
    • Management of layers in maps: Change of the layers order, opacity change, title change, filters, thematic classification, legend and symbology changing
    • Analysis tools: buffers, regular grids, points distribution analysis, layer intersection, centroid calculation, etc.
    • Digitalization: vector editing that allows to create new geometries or edit existing data.
    • Superposition of existing data at the data of the Google Maps and GoogleEarth catalogs.

    Unless you want to re-invent mapping software, this could be quite useful for location relevant topic map data.

    I first saw this at New final version of i3Geo available: i3Geo 4.7.

    User evaluation of automatically generated keywords and toponyms… [of semantic gaps]

    Tuesday, January 22nd, 2013

    User evaluation of automatically generated keywords and toponyms for geo-referenced images by Frank O. Ostermann, Martin Tomko, Ross Purves. (Ostermann, F. O., Tomko, M. and Purves, R. (2013), User evaluation of automatically generated keywords and toponyms for geo-referenced images. J. Am. Soc. Inf. Sci. doi: 10.1002/asi.22738)

    Abstract:

    This article presents the results of a user evaluation of automatically generated concept keywords and place names (toponyms) for geo-referenced images. Automatically annotating images is becoming indispensable for effective information retrieval, since the number of geo-referenced images available online is growing, yet many images are insufficiently tagged or captioned to be efficiently searchable by standard information retrieval procedures. The Tripod project developed original methods for automatically annotating geo-referenced images by generating representations of the likely visible footprint of a geo-referenced image, and using this footprint to query spatial databases and web resources. These queries return raw lists of potential keywords and toponyms, which are subsequently filtered and ranked. This article reports on user experiments designed to evaluate the quality of the generated annotations. The experiments combined quantitative and qualitative approaches: To retrieve a large number of responses, participants rated the annotations in standardized online questionnaires that showed an image and its corresponding keywords. In addition, several focus groups provided rich qualitative information in open discussions. The results of the evaluation show that currently the annotation method performs better on rural images than on urban ones. Further, for each image at least one suitable keyword could be generated. The integration of heterogeneous data sources resulted in some images having a high level of noise in the form of obviously wrong or spurious keywords. The article discusses the evaluation itself and methods to improve the automatic generation of annotations.

    An echo of Steve Newcomb’s semantic impedance appears at:

    Despite many advances since Smeulders et al.’s (2002) classic paper that set out challenges in content-based image retrieval, the quality of both nonspecialist text-based and content-based image retrieval still appears to lag behind the quality of specialist text retrieval, and the semantic gap, identified by Smeulders et al. as a fundamental issue in content-based image retrieval, remains to be bridged. Smeulders defined the semantic gap as

    the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation. (p. 1353)

    In fact, text-based systems that attempt to index images based on text thought to be relevant to an image, for example, by using image captions, tags, or text found near an image in a document, suffer from an identical problem. Since text is being used as a proxy by an individual in annotating image content, those querying a system may or may not have similar worldviews or conceptualizations as the annotator. (emphasis added)

    That last sentence could have come out of a topic map book.

    Curious what you make of the authors’ claim that spatial locations provide an “external context” that bridges the “semantic gap”?

    If we all use the same map of spatial locations, are you surprised by the lack of a “semantic gap”?

    Sitegeist:…

    Friday, December 14th, 2012

    Sitegeist: A mobile app that tells you about your data surroundings by Nathan Yau.

    Nathan writes:

    From businesses to demographics, there’s data for just about anywhere you are. Sitegeist, a mobile application by the Sunlight Foundation, puts the sources into perspective.

    The app is free, and the Sunlight site lists the following data for a geographic location:

    • Age Distribution
    • Political Contributions
    • Average Rent
    • Popular Local Spots
    • Recommended Restaurants
    • How People Commute
    • Record Temperatures
    • Housing Units Over Time

    If you have an iPhone or Android phone, can you report if other data is available?

    I was thinking along the lines of:

    • # of drug arrests
    • # type of drug arrests
    • # of arrests for soliciting (graphed by day/time)
    • # location of bail bond agencies

    More tourist-type information. 😉

    How would you enhance this data flow with a topic map?

    Towards a Scalable Dynamic Spatial Database System [Watching Watchers]

    Tuesday, November 20th, 2012

    Towards a Scalable Dynamic Spatial Database System by Joaquín Keller, Raluca Diaconu, Mathieu Valero.

    Abstract:

    With the rise of GPS-enabled smartphones and other similar mobile devices, massive amounts of location data are available. However, no scalable solutions for soft real-time spatial queries on large sets of moving objects have yet emerged. In this paper we explore and measure the limits of actual algorithms and implementations regarding different application scenarios. And finally we propose a novel distributed architecture to solve the scalability issues.

    At least in this version, you will find two copies of the same paper, the second copy sans the footnotes. So read the first twenty (20) pages and ignore the second eighteen (18) pages.

    I thought the limitation of location to two dimensions understandable, for the use cases given, but am less convinced that treating a third dimension as an extra attribute is always going to be suitable.

    Still, the results here are impressive as compared to current solutions so an additional dimension can be a future improvement.

    The use case that I see missing is an ad hoc network of users feeding geo-based information back to a collection point.

    While the watchers are certainly watching us, technology may be on the cusp of answering the question: “Who watches the watchers?” (The answer may be us.)

    I first saw this in a tweet by Stefano Bertolo.

    Georeferencer: Crowdsourced Georeferencing for Map Library Collections

    Monday, November 19th, 2012

    Georeferencer: Crowdsourced Georeferencing for Map Library Collections by Christopher Fleet, Kimberly C. Kowal and Petr Přidal.

    Abstract:

    Georeferencing of historical maps offers a number of important advantages for libraries: improved retrieval and user interfaces, better understanding of maps, and comparison/overlay with other maps and spatial data. Until recently, georeferencing has involved various relatively time-consuming and costly processes using conventional geographic information system software, and has been infrequently employed by map libraries. The Georeferencer application is a collaborative online project allowing crowdsourced georeferencing of map images. It builds upon a number of related technologies that use existing zoomable images from library web servers. Following a brief review of other approaches and georeferencing software, we describe Georeferencer through its five separate implementations to date: the Moravian Library (Brno), the Nationaal Archief (The Hague), the National Library of Scotland (Edinburgh), the British Library (London), and the Institut Cartografic de Catalunya (Barcelona). The key success factors behind crowdsourcing georeferencing are presented. We then describe future developments and improvements to the Georeferencer technology.

    If your institution has a map collection or if you are interested in maps at all, you need to read this article.

    There is an introduction video if you prefer: http://www.klokantech.com/georeferencer/.

    Either way, you will be deeply impressed by this project.

    And wondering: Can the same lessons be applied to crowdsource the creation of topic maps?

    zip-code-data-hacking

    Saturday, October 27th, 2012

    zip-code-data-hacking by Neil Kodner.

    From the readme file:

    sourcing publicly available files, generate useful zip code-county data.

    My goal is to be able to map zip codes to county FIPS codes, without paying. So far, I’m able to produce county FIPS codes for 41456 zip codes out of a list of 42523 zip codes.

    I was able to find a zip code database from unitedstateszipcodes.org, each zip code had a county name but not a county FIPS code. I was able to find County FIPS codes on the census.gov site through some google hacking.

    The data files are in the data directory – I’ll eventually add code to make sure the latest data files are retrieved at runtime. I didn’t do this yet because I didn’t want to hammer the sites while I was quickly iterating – a local copy did just fine.

    In case you are wondering why this mapping from zip codes to county FIPS codes is important:

    Federal information processing standards codes (FIPS codes) are a standardized set of numeric or alphabetic codes issued by the National Institute of Standards and Technology (NIST) to ensure uniform identification of geographic entities through all federal government agencies. The entities covered include: states and statistically equivalent entities, counties and statistically equivalent entities, named populated and related location entities (such as, places and county subdivisions), and American Indian and Alaska Native areas. (From: Federal Information Processing Standard (FIPS))

    Using zip code-based data against federal agency data (keyed by FIPS codes) requires this mapping.
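
The join the readme describes can be sketched roughly as below. This is not Neil's code: the column names, the tiny inline CSV samples, and the function names `load_county_fips` and `map_zips_to_fips` are all my assumptions for illustration, and the real source files will be laid out differently. The idea is to match on state plus county name, then concatenate the state and county FIPS parts into the five-digit county code.

```python
import csv
import io
from typing import Dict, List, Tuple

# Illustrative sample rows only; the column layout is an assumption.
ZIP_CSV = """zip,state,county
30303,GA,Fulton County
60601,IL,Cook County
99999,ZZ,Nowhere County
"""

FIPS_CSV = """state,state_fips,county_fips,county
GA,13,121,Fulton County
IL,17,031,Cook County
"""


def load_county_fips(fips_text: str) -> Dict[Tuple[str, str], str]:
    """Map (state, lowercased county name) to a five-digit county FIPS code."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(fips_text)):
        key = (row["state"], row["county"].lower())
        # Keep the codes as strings so leading zeros survive.
        lookup[key] = row["state_fips"] + row["county_fips"]
    return lookup


def map_zips_to_fips(
    zip_text: str, lookup: Dict[Tuple[str, str], str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return ({zip: county FIPS}, [zips with no matching county])."""
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    for row in csv.DictReader(io.StringIO(zip_text)):
        key = (row["state"], row["county"].lower())
        if key in lookup:
            matched[row["zip"]] = lookup[key]
        else:
            unmatched.append(row["zip"])
    return matched, unmatched


if __name__ == "__main__":
    matched, unmatched = map_zips_to_fips(ZIP_CSV, load_county_fips(FIPS_CSV))
    print("matched:  ", matched)
    print("unmatched:", unmatched)
```

The unmatched list is where the real work hides: county names that differ in spelling or suffix between the two sources will fall through a naive name match, which may be part of why the readme reports less than full coverage.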

    I suspect Neil would appreciate your assistance.

    I first saw this at Pete Warden’s Five Short Links.

    …[A] Common Operational Picture with Google Earth (webcast)

    Thursday, October 11th, 2012

    Joint Task Force – Homeland Defense Builds a Common Operational Picture with Google Earth

    October 25, 2012 at 02:00 PM Eastern Daylight Time

    The security for the Asia-Pacific Economic Cooperation summit in 2011 in Honolulu, Hawaii involved many federal, state & local agencies. The complex task of coordinating information sharing among agencies was the responsibility of Joint Task Force – Homeland Defense (JTF-HD). JTF-HD turned to Google Earth technology to build a visualization capability that enabled all agencies to share information easily & ensure a safe and secure summit.

    What you will learn:

    • Best practices for sharing geospatial information among federal, state & local agencies
    • How to incorporate data from many sources into your own Google Earth globe
    • How to get accurate maps with limited bandwidth or no connection at all.

    Speaker: Marie Kennedy, Joint Task Force – Homeland Defense

    Sponsored by Google.

    In addition to the techniques demonstrated, I suspect the main lesson will be leveraging information/services that already exist.

    Or information integration if you prefer a simpler description.

    Information can be integrated by conversion or mapping.

    Which one you choose depends upon your requirements and the information.

    Reusable information integration (RI2), where you leverage your own investment, well, that’s another topic altogether. 😉

    Ask: Are you spending money to be effective or spending money to maintain your budget relative to other departments?

    If the former, consider topic maps. If the latter, carry on.

    Research Data Australia down to Earth

    Friday, June 22nd, 2012

    Research Data Australia down to Earth

    From the post:

    Context: free cloud servers, a workshop and an idea

    In this post I look at some work we’ve been doing at the University of Western Sydney eResearch group on visualizing metadata about research data, in a geographical context. The goal is to build a data discovery service; one interface we’re exploring is the ability to ‘fly’ around Google Earth looking for data, from Research Data Australia (RDA). For example, a researcher could follow a major river and see what data collections there are along its course that might be of (re-)use. True, you can search the RDA site by dragging a marker on a map but this experiment is a more immersive approach to exploring the same data.

    The post is a quick update on a work in progress, with some not very original reflections on the use of cloud servers. I am putting it here on my own blog first, will do a human-readable summary over at UWS soon, any suggestions or questions welcome.

    You can try this out if you have Google Earth by downloading a KML file. This is a demo service only – let us know how you go.

    This work was inspired by a workshop on cloud computing: this week Andrew (Alf) Leahy and I attended a NeCTAR and Australian National Data Service (ANDS) one day event, along with several UWS staff. The unstoppable David Flanders from ANDS asked us to run a ‘dojo’, giving technically proficient researchers and eResearch collaborators a hands-on experience with the NeCTAR research cloud, where all Australian University researchers with access to the Australian Access Federation login system are entitled to run free cloud-hosted virtual servers. Free servers! Not to mention post-workshop beer[i]. So senseis Alf and PT worked with a small group of ‘black belts’ in a workshop loosely focused on geo-spatial data. Our idea was “Visualizing the distribution of data collections in Research Data Australia using Google Earth”[ii]. We’d been working on a demo of how this might be done for a few days, which we more-or-less got running on the train from the Blue Mountains in to Sydney Uni in the morning.

    When you read about “exploring” the data, bear in mind the question of how to record that “exploration.” Explorers used to keep journals, ships’ logs, etc. to record their explorations.

    How do you record (if you do), your explorations of data? How do you share them if you do?

    Given the ease of recording our explorations, no more long hand with a quill pen, is it odd that we don’t record our intellectual explorations?

    Or do we want others to see a result that makes us look more clever than we are?

    Gisgraphy

    Sunday, March 18th, 2012

    Gisgraphy

    From the website:

    Gisgraphy is a free, open source framework that offers the possibility to do geolocalisation and geocoding via Java APIs or REST webservices. Because geocoding is nothing without data, it provides an easy to use importer that will automagically download and import the necessary (free) data to your local database (Geonames and OpenStreetMap: 42 million entries). You can also add your own data with the Web interface or the importer connectors provided. Gisgraphy is production ready, and has been designed to be scalable (load balanced), performant and used in other languages than just Java: results can be output in XML, JSON, PHP, Python, Ruby, YAML, GeoRSS, and Atom. One of the most popular GPS tracking systems (OpenGTS) also includes a Gisgraphy client.

    Free webservices:

    • Geocoding
    • Street Search
    • Fulltext Search
    • Reverse geocoding / street search
    • Find nearby
    • Address parser

    Services that you could use with smart phone apps or in creating topic map based collections of data that involve geographic spaces.

    Integrating Lucene with HBase

    Wednesday, March 7th, 2012

    Integrating Lucene with HBase by Boris Lublinsky and Mike Segel.

    You have to get to the conclusion for the punch line:

    The simple implementation, described in this paper fully supports all of the Lucene functionality as validated by many unit tests from both Lucene core and contrib modules. It can be used as a foundation of building a very scalable search implementation leveraging inherent scalability of HBase and its fully symmetric design, allowing for adding any number of processes serving HBase data. It also avoids the necessity to close an open Lucene Index reader to incorporate newly indexed data, which will be automatically available to user with possible delay controlled by the cache time to live parameter. In the next article we will show how to extend this implementation to incorporate geospatial search support.

    Put why your article is important in the introduction as well.

    The second article does better:

    Implementing Lucene Spatial Support

    In our previous article [1], we discussed how to integrate Lucene with HBase for improved scalability and availability. In this article I will show how to extend this Implementation with the spatial support.

    Lucene spatial contribution package [2, 3, 4, 5] provides powerful support for spatial search, but is limited to finding the closest point. In reality spatial search often has significantly more requirements, for example, which points belong to a given shape (circle, bounding box, polygon), which shapes intersect with a given shape and so on. Solution, presented in this article allows solving all of the above problems.
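
    To make the three containment questions concrete (bounding box, circle, polygon), here is a small Python sketch. It is not the authors' Lucene/HBase implementation; the function names are mine, the circle test uses the haversine great-circle distance, and the polygon test is plain ray casting on a flat latitude/longitude plane, which is only reasonable for small polygons away from the poles and the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def in_bounding_box(point, south, west, north, east):
    """True if (lat, lon) lies inside a box that does not cross the antimeridian."""
    lat, lon = point
    return south <= lat <= north and west <= lon <= east


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def in_circle(point, center, radius_km):
    """True if the point is within radius_km of the center."""
    return haversine_km(point, center) <= radius_km


def in_polygon(point, vertices):
    """Ray casting: count polygon edges crossed by a ray from the point."""
    lat, lon = point
    inside = False
    for i in range(len(vertices)):
        lat1, lon1 = vertices[i]
        lat2, lon2 = vertices[i - 1]  # previous vertex; wraps around at i == 0
        if (lon1 > lon) != (lon2 > lon):  # edge straddles the point's longitude
            crossing_lat = lat1 + (lon - lon1) * (lat2 - lat1) / (lon2 - lon1)
            if lat < crossing_lat:
                inside = not inside
    return inside


square = [(0, 0), (0, 10), (10, 10), (10, 0)]
print(in_polygon((5, 5), square), in_polygon((15, 5), square))
```

    An index would normally use the cheap bounding box test to narrow the candidates before running the more expensive circle or polygon test on what remains.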

    GeoMapApp

    Saturday, February 11th, 2012

    GeoMapApp

    From the webpage:

    GeoMapApp is an earth science exploration and visualization application that is continually being expanded as part of the Marine Geoscience Data System (MGDS) at the Lamont-Doherty Earth Observatory of Columbia University. The application provides direct access to the Global Multi-Resolution Topography (GMRT) compilation that hosts high resolution (~100 m node spacing) bathymetry from multibeam data for ocean areas and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) and NED (National Elevation Dataset) topography datasets for the global land masses.

    See YouTube: GeoMapApp (21 video tutorials)

    More data for your merging pleasure. Not to mention a resource on how others prefer to understand/view their data.

    Designing Google Maps

    Wednesday, January 11th, 2012

    Designing Google Maps by Nathan Yau.

    From the post:

    Google Maps is one of Google’s best applications, but the time, energy, and thought put into designing it often goes unnoticed because of how easy it is to use, for a variety of purposes. Willem Van Lancker, a user experience and visual designer for Google Maps, describes the process of building a map application — color scheme, icons, typography, and “Googley-ness” — that practically everyone can use, worldwide.

    I don’t normally disagree with anything Nathan says, particularly about design but I have to depart from him on why we don’t notice the excellence of Google Maps.

    I think we have become accustomed to its excellence and since we don’t look elsewhere (most of us), then we don’t notice that it isn’t commonplace.

    In fact for most of us it is a universe with one inhabitant, Google Maps.

    That takes a lot of very hard work and skill.

    The question is do you have the chops to make your topic map of one or more infoverses the “only” inhabitant, by user choice?

    All the software a geoscientist needs. For free!

    Sunday, December 4th, 2011

    All the software a geoscientist needs. For free! by John A. Stevenson.

    It is quite an impressive list and what’s more, John has provided a script to install it on a Linux machine.

    If you have any mapping or geoscience type needs, you would do well to consider some of the software listed here.

    A handy set of tools if you are working with geoscience types on topic map applications as well.

    GeoIQ API Overview

    Friday, November 25th, 2011

    GeoIQ API Overview

    From the webpage:

    GeoIQ is the engine that powers the GeoCommons Community. GeoIQ includes a full Application Programming Interface (API) that allows developers to build unique and powerful domain specific applications. The API provides capability for uploading and downloading data, searching for data and maps, building, embedding, and theming maps or charts, as well as general user, group, and permissions management.

    The GeoIQ API consists of a REST API and a JavaScript API. REST means that it uses simple URL’s and HTTP methods to perform all of the actions. For example, a dataset is a specific endpoint that a user can create, read, update or delete (CRUD).

    Another resource for topic mappers who want to link information to “real” locations. 😉
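
    For readers new to the REST idea in the excerpt, here is a minimal Python sketch of how the four CRUD actions line up with HTTP methods. The host, the /datasets paths, and the payload fields are placeholders I made up, not GeoIQ's actual endpoints, and the requests are only built, never sent.

```python
import json
import urllib.request

BASE_URL = "https://geoiq.example.com"  # placeholder host, not a real endpoint

# The conventional pairing of CRUD actions with HTTP methods.
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def build_request(action, path, payload=None):
    """Build (but do not send) the HTTP request for one CRUD action."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data,
                                     method=CRUD_TO_HTTP[action])
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


for action, path, payload in [
    ("create", "/datasets", {"title": "Campus buildings"}),
    ("read", "/datasets/42", None),
    ("update", "/datasets/42", {"title": "Campus buildings (revised)"}),
    ("delete", "/datasets/42", None),
]:
    req = build_request(action, path, payload)
    print(action, "->", req.get_method(), req.full_url)
```

    Sending any of these would be a call to urllib.request.urlopen(req), once the placeholder URL is replaced with a real one and whatever authentication the service requires is added.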

    Leaflet & GeoCommons JSON

    Thursday, November 24th, 2011

    Leaflet & GeoCommons JSON by Tim Waters.

    From the post:

    Hi, in this quick tutorial we will have a look at a new JavaScript mapping library, Leaflet using it to help load JSON features from a GeoCommons dataset. We will add our Acetate tile layer to the map, and use the cool API feature filtering functionalities to get just the features we want from the server, show them on a Leaflet map, add popups to the features, style the features according to what the feature is, and add some further interactivity. This blog follows up from two posts on my personal blog, showing GeoCommons features with OpenLayers and with Polymaps.

    We have all read about tweets being used to plot reports or locations from and about the various “occupy” movements. I suspect that effective civil unrest is going to require greater planning for the distribution of support and resources in particular locales. Conveniently, current authorities have created or allowed to be created, maps and other resources that can be used for such purposes. This is one of those resources.

    I don’t know of any research on such algorithms but occupiers might want to search for clusters of dense and confusing paths in urban areas. Those proved effective at times in struggles in Medieval times for control of walled cities. Once the walls were breached, would-be occupiers were confronted with warrens of narrow and confusing paths. As opposed to broad, open pathways that would enable a concentration of forces.

    Is there an algorithm for longest, densest path?

    However discovered, annotating a cluster of dense and confusing paths with tactical information and location of resources would be a natural use of topic maps. Or what to anticipate in such areas, if one is on the “other” side.

    ASTER Global Digital Elevation Model (ASTER GDEM)

    Thursday, November 24th, 2011

    ASTER Global Digital Elevation Model (ASTER GDEM)

    From the webpage:

    ASTER GDEM is an easy-to-use, highly accurate DEM covering all the land on earth, and available to all users regardless of size or location of their target areas.

    Anyone can easily use the ASTER GDEM to display a bird’s-eye-view map or run a flight simulation, and this should realize visually sophisticated maps. By utilizing the ASTER GDEM as a platform, institutions specialized in disaster monitoring, hydrology, energy, environmental monitoring etc. can perform more advanced analysis.

    In addition to the data, there is a GDEM viewer (freeware) at this site.

    All that is missing is your topic map and you.

    piecemeal geodata

    Sunday, November 6th, 2011

    piecemeal geodata

    Michal Migurski on the difficulties of using OpenStreetMap data:

    Two weeks ago, I attended the 5th annual OpenStreetMap conference in Denver, State of the Map. My second talk was called Piecemeal Geodata, and I hoped to communicate some of the pain (and opportunity) in dealing with OpenStreetMap data as a consumer of the information, downstream from the mappers but hoping to make maps or work with the dataset. Harry Wood took notes that suggested I didn’t entirely miss the mark, but after I was done Tom MacWright congratulated me on my “excellent stealth rage talk”. It wasn’t really supposed to be ragey as such, so here are some of my slides and notes along with some followup to the problems I talked about.

    Topic maps are in use in a number of commercial and governmental venues but aren’t the sort of thing you hear about like Twitter or Blackberries (mostly about outages).

    Anticipating more civil disturbances over the next several years, do topic maps have something to offer when coupled with a technology like Google Maps or OSM?

    It is one thing to indicate your location using an app, but can you report movement of forces in a way that updates the maps of some colleagues? In a secure manner?

    What features would a topic map need for such an environment?

    high road, for better OSM cartography

    Sunday, November 6th, 2011

    high road, for better OSM cartography

    From the post:

    High Road is a framework for normalizing the rendering of highways from OSM data, a critical piece of every OSM-based road map we’ve ever designed at Stamen. Deciding exactly which kinds of roads appear at each zoom level can really be done just once, and ideally shouldn’t be part of a lengthy database query in your stylesheet. In Cascadenik and regular Mapnik’s XML-based layer definitions, long queries balloon the size of a style until it’s impossible to scan quickly. In Carto’s JSON-based layer definitions the multiline-formatting of a complex query is completely out of the question. Further, each system has its own preferred way of helping you handle road casings.

    Good rendering of geographic maps (and the data you attach to them) is likely to be useful in a number of topic map contexts.

    PS: OSM = OpenStreetMap.

    Factual Resolve

    Friday, October 28th, 2011

    Factual Resolve

    Factual has a new API – Resolve:

    From the post:

    The Internet is awash with data. Where ten years ago developers had difficulty finding data to power applications, today’s difficulty lies in making sense of its abundance, identifying signal amidst the noise, and understanding its contextual relevance. To address these problems Factual is today launching Resolve — an entity resolution API that makes partial records complete, matches one entity against another, and assists in de-duping and normalizing datasets.

    The idea behind Resolve is very straightforward: you tell us what you know about an entity, and we, in turn, tell you everything we know about it. Because data is so commonly fractured and heterogeneous, we accept fragments of an entity and return the matching entity in its entirety. Resolve allows you to do a number of things that will make your data engineering tasks easier:

    • enrich records by populating missing attributes, including category, lat/long, and address
    • de-dupe your own place database
    • convert multiple daily deal and coupon feeds into a single normalized, georeferenced feed
    • identify entities unequivocally by their attributes

    For example: you may be integrating data from an app that provides only the name of a place and an imprecise location. Pass what you know to Factual Resolve via a GET request, with the attributes included as JSON-encoded key/value pairs:

    I particularly like the line:

    identify entities unequivocally by their attributes

    I don’t know about the “unequivocally” part but the rest of it rings true. At least in my experience.
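
    The excerpt stops just before its example, so here is my own sketch in Python of what "a GET request, with the attributes included as JSON-encoded key/value pairs" could look like. The endpoint URL, the parameter name "values", and the sample place are all placeholders of mine; check Factual's documentation for the real ones.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; not Factual's real URL.
RESOLVE_URL = "https://api.example.com/places/resolve"


def resolve_url(known_attributes, param="values"):
    """Return a GET URL carrying the known attributes as one JSON-encoded parameter."""
    encoded = json.dumps(known_attributes, separators=(",", ":"), sort_keys=True)
    return RESOLVE_URL + "?" + urlencode({param: encoded})


# A partial record: a name and an imprecise location (made-up values).
fragment = {"name": "Example Cafe", "latitude": 37.8, "longitude": -122.4}
url = resolve_url(fragment)
print(url)

# Round trip: decoding the query string recovers the original fragment.
decoded = json.loads(parse_qs(urlsplit(url).query)["values"][0])
print(decoded == fragment)
```

    The point of the encoding is that the fragment can hold whatever attributes you happen to know, without the URL scheme needing a separate parameter for each.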

    Towards georeferencing archival collections

    Friday, October 21st, 2011

    Towards georeferencing archival collections

    From the post:

    One of the most effective ways to associate objects in archival collections with related objects is with controlled access terms: personal, corporate, and family names; places; subjects. These associations are meaningless if chosen arbitrarily. With respect to machine processing, Thomas Jefferson and Jefferson, Thomas are not seen as the same individual when judging by the textual string alone. While EADitor has incorporated authorized headings from LCSH and local vocabulary (scraped from terms found in EAD files currently in the eXist database) almost since its inception, it has not until recently interacted with other controlled vocabulary services. Interacting with EAC-CPF and geographical services is high on the development priority list.

    geonames.org

    Over the last week, I have been working on incorporating geonames.org queries into the XForms application. Geonames provides stable URIs for more than 7.5 million place names internationally. XML representations of each place are accessible through various REST APIs. These XML datastreams also include the latitude and longitude, which will make it possible to georeference archival collections as a whole or individual items within collections (an item-level indexing strategy will be offered in EADitor as an alternative to traditional, collection-based indexing soon).

    This looks very interesting.
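
    To show what pulling a stable URI plus latitude and longitude out of such an XML datastream involves, here is a short Python sketch. It is not EADitor's XForms code. The sample document is typed in by hand with made-up values, and its element names and the URI pattern reflect my understanding of GeoNames output, so treat both as assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, not a real GeoNames response; the place and ID are invented.
SAMPLE_XML = """<geonames>
  <totalResultsCount>1</totalResultsCount>
  <geoname>
    <name>Exampleville</name>
    <lat>38.03</lat>
    <lng>-78.48</lng>
    <geonameId>1234567</geonameId>
    <countryCode>US</countryCode>
  </geoname>
</geonames>"""

# Assumed URI pattern for a GeoNames place.
URI_TEMPLATE = "http://sws.geonames.org/%s/"


def places(xml_text):
    """Yield (name, uri, lat, lon) for each <geoname> element in the response."""
    for node in ET.fromstring(xml_text).iter("geoname"):
        yield (
            node.findtext("name"),
            URI_TEMPLATE % node.findtext("geonameId"),
            float(node.findtext("lat")),
            float(node.findtext("lng")),
        )


for name, uri, lat, lon in places(SAMPLE_XML):
    print(name, uri, lat, lon)
```

    Storing the URI alongside the place name in a finding aid is what lets two collections that spell a place differently still point at the same location.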

    Details:

    EADitor project site (Google Code): http://code.google.com/p/eaditor/
    Installation instructions (specific for Ubuntu but broadly applies to all Unix-based systems): http://code.google.com/p/eaditor/wiki/UbuntuInstallation
    Google Group: http://groups.google.com/group/eaditor

    First experiences with GeoCouch

    Wednesday, October 19th, 2011

    First experiences with GeoCouch by tbuchwaldt.

    From the post:

    To learn some new stuff about cool databases and geo-aware services we started fiddling with GeoCouch, a CouchDB extension. To have a real scenario we could work on, we designed a small project: A CouchDB database contains documents with descriptions of fastfood restaurants. We agreed on 3 types of restaurants: KFC, Mc Donalds & Burgerking. We gave them some additional information, namely opening and closing times and a boolean called “supersize”.

    It sounds to me like this sort of service, coupled with a topic map of campus locations/services, could prove to be very amusing during “rush” week when directions and locations are not well known.
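
    A minimal Python sketch of the kind of documents and query the post describes might look like the following. Only the three chain names and the “supersize” boolean come from the post; the other field names, the coordinates, and the opening hours are invented, and the filter runs in plain Python rather than as a GeoCouch spatial view.

```python
from datetime import time

# Hypothetical documents; "loc" is (latitude, longitude).
restaurants = [
    {"chain": "KFC", "loc": (52.52, 13.40), "opens": time(10, 0),
     "closes": time(22, 0), "supersize": False},
    {"chain": "Mc Donalds", "loc": (52.50, 13.42), "opens": time(6, 0),
     "closes": time(23, 30), "supersize": True},
    {"chain": "Burgerking", "loc": (48.14, 11.58), "opens": time(9, 0),
     "closes": time(21, 0), "supersize": True},
]


def open_in_bbox(docs, south, west, north, east, now):
    """Documents inside the bounding box whose opening hours include `now`.

    Assumes no restaurant stays open past midnight.
    """
    return [
        doc for doc in docs
        if south <= doc["loc"][0] <= north
        and west <= doc["loc"][1] <= east
        and doc["opens"] <= now < doc["closes"]
    ]


hits = open_in_bbox(restaurants, 52.3, 13.0, 52.7, 13.8, time(22, 30))
print([doc["chain"] for doc in hits])
```

    Swap the restaurants for campus buildings and the opening hours for event times and you have the "rush" week lookup.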

    Geological Survey Austria launches thesaurus project

    Tuesday, October 18th, 2011

    Geological Survey Austria launches thesaurus project by Helmut Nagy.

    From the post:

    Throughout the last year the Semantic Web Company team has supported the Geological Survey of Austria (GBA) in setting up their thesaurus project. It started with a workshop in summer 2010 where we discussed use cases for using semantic web technologies as means to fulfill the INSPIRE directive. Now in fall 2011 GBA published their first thesauri as Linked Data using PoolParty’s new Linked Data front-end.

    The Thesaurus Project of the GBA aims to create controlled vocabularies for the semantic harmonization of map-based geodata. The content-related realization of this project is governed by the Thesaurus Editorial Team, which consists of domain experts from the Geological Survey of Austria. With the development of semantically and technically interoperable geo-data the Geological Survey of Austria implements its legal obligation defined by the EU-Directive 2007/2/EC INSPIRE and the national “Geodateninfrastrukturgesetz” (GeoDIG), respectively.

    I wonder if their “controlled vocabularies” are going to map to the terminology used over the history of Europe, in maps, art, accounts, histories, and other recorded materials?

    If not, I wonder if there would be any support to tie that history into current efforts or do they plan on simply cutting off the historical record and starting with their new thesaurus?