Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

October 18, 2011

Geological Survey Austria launches thesaurus project

Filed under: Geographic Data,Geographic Information Retrieval,Maps,Thesaurus — Patrick Durusau @ 2:41 pm

Geological Survey Austria launches thesaurus project by Helmut Nagy.

From the post:

Throughout the last year the Semantic Web Company team has supported the Geological Survey of Austria (GBA) in setting up their thesaurus project. It started with a workshop in summer 2010 where we discussed use cases for using semantic web technologies as means to fulfill the INSPIRE directive. Now in fall 2011 GBA published their first thesauri as Linked Data using PoolParty’s new Linked Data front-end.

The Thesaurus Project of the GBA aims to create controlled vocabularies for the semantic harmonization of map-based geodata. The content-related realization of this project is governed by the Thesaurus Editorial Team, which consists of domain experts from the Geological Survey of Austria. With the development of semantically and technically interoperable geo-data the Geological Survey of Austria implements its legal obligation defined by the EU-Directive 2007/2/EC INSPIRE and the national “Geodateninfrastrukturgesetz” (GeoDIG), respectively.

I wonder if their “controlled vocabularies” are going to map to the terminology used over the history of Europe, in maps, art, accounts, histories, and other recorded materials?

If not, I wonder if there would be any support to tie that history into current efforts or do they plan on simply cutting off the historical record and starting with their new thesaurus?

September 29, 2011

Indexed Nearest Neighbour Search in PostGIS

Filed under: Geographic Data,Geographic Information Retrieval,PostgreSQL — Patrick Durusau @ 6:36 pm

Indexed Nearest Neighbour Search in PostGIS

From the post:

An always popular question on the PostGIS users mailing list has been “how do I find the N nearest things to this point?”.

To date, the answer has generally been quite convoluted, since PostGIS supports bounding box index searches, and in order to get the N nearest things you need a box large enough to capture at least N things. Which means you need to know how big to make your search box, which is not possible in general.

PostgreSQL has the ability to return ordered information where an index exists, but the ability has been restricted to B-Tree indexes until recently. Thanks to one of our clients, we were able to directly fund PostgreSQL developers Oleg Bartunov and Teodor Sigaev in adding the ability to return sorted results from a GiST index. And since PostGIS indexes use GiST, that means that now we can also return sorted results from our indexes.

This feature (the PostGIS side of it) was funded by Vizzuality, and hopefully it comes in useful in their CartoDB work.

You will need PostgreSQL 9.1 and the PostGIS source code from the repository, but this is what a nearest neighbour search looks like:
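The SQL snippet from the original post did not survive here. As a stand-in, this is roughly the shape of an indexed nearest-neighbour query: the `<->` operator lets the GiST index drive the `ORDER BY`, so no search box is needed. Table and column names below are my own placeholders, not from the post.

```python
def knn_query(table, geom_col, x, y, n):
    """Build an index-assisted nearest-neighbour query string.

    The <-> operator orders results via the GiST index
    (PostgreSQL 9.1+ with a then-current PostGIS build).
    """
    return (
        f"SELECT id, ST_AsText({geom_col}) AS wkt "
        f"FROM {table} "
        f"ORDER BY {geom_col} <-> ST_SetSRID(ST_MakePoint({x}, {y}), 4326) "
        f"LIMIT {n};"
    )

print(knn_query("pois", "geom", -71.06, 48.43, 10))
```

Contrast this with the old approach: guess a bounding box, run `&&`, enlarge the box and retry if fewer than N rows come back.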

PostgreSQL? Isn’t that SQL? 🙂

Indexed nearest neighbour search is a question of results, not ideology.

Better targeting through technology.

September 17, 2011

GRASS: Geographic Resources Analysis Support System

GRASS: Geographic Resources Analysis Support System

The post about satellite imagery analysis for Syria made me curious about tools for automated analysis of satellite images.

From the webpage:

Commonly referred to as GRASS, this is free Geographic Information System (GIS) software used for geospatial data management and analysis, image processing, graphics/maps production, spatial modeling, and visualization. GRASS is currently used in academic and commercial settings around the world, as well as by many governmental agencies and environmental consulting companies. GRASS is an official project of the Open Source Geospatial Foundation.

You may also want to visit the Open Dragon project.

From the Open Dragon site:

Availability of good software for teaching Remote Sensing and GIS has always been a problem. Commercial software, no matter how good a discount is offered, remains expensive for a developing country, cannot be distributed to students, and may not be appropriate for education. Home-grown and university-sourced software lacks long-term support and the needed usability and robustness engineering.

The OpenDragon Project was established in the Department of Computer Engineering of KMUTT in December of 2004. The primary objective of this project is to develop, enhance, and maintain a high-quality, commercial-grade software package for remote sensing and GIS analysis that can be distributed free to educational organizations within Thailand. This package, OpenDragon, is based on the Version 5 of the commercial Dragon/ips® software developed and marketed by Goldin-Rudahl Systems, Inc.

As of 2010, Goldin-Rudahl Systems has agreed that the Open Dragon software, based on Dragon version 5, will be open source for non-commercial use. The software source code should be available on this server by early 2011.

And there is always the commercial side, if you have funding: ArcGIS. Esri, the maker of ArcGIS, supports several open source GIS projects.

The results of using these or other software packages can be tied to other information using topic maps.

September 15, 2011

Spatial Search Plugin (SSP) for Solr

Filed under: Geographic Information Retrieval,Maps,Solr — Patrick Durusau @ 7:51 pm

Spatial Search Plugin (SSP) for Solr

From the webpage:

With the continuous efforts of adjusting search results to focused target audiences, there’s an increasing demand for incorporating geographical location information into the standard search functionality. Spatial Search Plugin (SSP) for Apache Solr is a free, standalone plug-in which enables Geo / Location Based Search, and is built on top of the open source projects Apache Solr and Apache Lucene. Its main goals and characteristics are:

  • Provide a complete, consistent, robust and fast implementation of advanced geospatial algorithms
  • Act as a standalone pluggable extension to Solr
  • Written in 100% Java
  • Compatible with Apache Solr and Apache Lucene
  • Open source under the Apache2 license
  • Well documented and comes with support

Location plus information about the location is a topic mappish sort of thing.
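If you want to experiment with geo-filtered queries against Solr, the sketch below uses the standard Solr spatial parameters (geofilt plus geodist sorting), which differ from SSP's own query syntax; the `store` field name is an assumption borrowed from the Solr examples.

```python
from urllib.parse import urlencode

def solr_geo_query(base_url, term, lat, lon, km):
    # Standard Solr spatial parameters: geofilt restricts results to a
    # radius of d km around pt, and geodist() sorts them by distance.
    params = {
        "q": term,
        "fq": "{!geofilt}",
        "sfield": "store",        # assumed name of the location field
        "pt": f"{lat},{lon}",
        "d": str(km),
        "sort": "geodist() asc",
    }
    return f"{base_url}/select?{urlencode(params)}"

print(solr_geo_query("http://localhost:8983/solr", "pizza", 45.15, -93.85, 5))
```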

September 12, 2011

LinkedGeoData Release 2

LinkedGeoData Release 2

From the webpage:

The aim of the LinkedGeoData (LGD) project is to make the OpenStreetMap (OSM) datasets easily available as RDF. As such the main target audience is the Semantic Web community, however it may turn out to be useful to a much larger audience. Additionally, we are providing interlinking with DBpedia and GeoNames and integration of class labels from translatewiki and icons from the Brian Quinion Icon Collection.

The result is a rich, open, and integrated dataset which we hope to be useful for research and application development. The datasets can be publicly accessed via downloads, Linked Data, and SPARQL-endpoints. We have also launched an experimental “Live-SPARQL-endpoint” that is synchronized with the minutely updates from OSM whereas the changes to our store are republished as RDF.
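Hitting one of those SPARQL endpoints is a one-liner once you have a query; a minimal sketch follows. The `lgdo:` namespace is the LinkedGeoData ontology, but the class name and endpoint path here are assumptions, so check them against the project's documentation.

```python
from urllib.parse import urlencode

# Class name (lgdo:Peak) and endpoint path are illustrative assumptions.
query = """\
PREFIX lgdo: <http://linkedgeodata.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?place ?label WHERE {
  ?place a lgdo:Peak ;
         rdfs:label ?label .
} LIMIT 10
"""

endpoint = "http://linkedgeodata.org/sparql"  # path is an assumption
url = endpoint + "?" + urlencode({"query": query, "format": "json"})
print(url)
```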

More geographic data.

September 5, 2011

How Hard is the Local Search Problem?

Filed under: Geographic Information Retrieval,Local Search,Mapping,Searching — Patrick Durusau @ 7:33 pm

How Hard is the Local Search Problem? by Matthew Hurst.

The “local search” problem that Matthew is addressing is illustrated with Google’s mapping of local restaurants in Matthew’s neighborhood.

The post starts:

The local search problem has two key components: data curation (creating and maintaining a set of high quality statements about what the world looks like) and relevance (returning those statements in a manner that satisfies a user need). The first part of the problem is a key enabler to success, but how hard is it?

There are many problems which involve bringing together various data sources (which might be automatically or manually created) and synthesizing an improved set of statements intended to denote something about the real world. The way in which we judge the results of such a process is to take the final database, sample it, and test it against what the world looks like.

In the local search space, this might mean testing to see if the phone number in a local listing is indeed that associated with a business of the given name and at the given location.

But do we quantify this challenge? We might perform the above evaluation and find out that 98% of the phone numbers are correctly associated. Is that good? Expected? Poor?
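One way to make the "is 98% good?" question concrete is to attach a confidence interval to the sampled accuracy before judging it. A hedged sketch using a Wilson score interval (the sample size of 500 is an assumption for illustration):

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a sample proportion."""
    phat = successes / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# If 490 of 500 sampled listings had the right phone number (98%):
lo, hi = wilson_interval(490, 500)
print(f"95% CI for listing accuracy: {lo:.3f} .. {hi:.3f}")
```

The width of that interval is itself informative: a "98%" measured on a small sample says much less than the same figure on a large one.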

After following Matthew through his discussion of the various factors in “local search,” what are your thoughts on Google’s success with “local search?”

Could you do better?

How? Be specific, a worked example would be even more convincing.

July 13, 2011

GeoCommons Enterprise Features – Free!

Filed under: Geo Analytics,Geographic Data,Geographic Information Retrieval — Patrick Durusau @ 7:30 pm

GeoCommons Enterprise Features – Free!

From the email announcement:

  • Analytics: Easy-to-use, advanced spatial analytics that users and groups can utilize to answer mission-critical questions. Select among numerous analyses such as filtering, buffers, spatial aggregation and predictive analysis.
  • Private Data Support: Keep proprietary data private and unsearchable by others. Now you can upload proprietary data, analyze it with other data and create compelling maps, charts and graphs all within a secure interface.
  • Groups and Permissions: Allow others in your group or organization to access and collaborate with you. Enable permissions at various levels to limit or expand data sharing. See a step-by-step guide of how to create groups and make your data private here from @seangorman.

For groups and private data, see: Private Data and Groups for GeoCommons!!

GeoCommons has 70,000 datasets.

If you look around you might find something you like.

Topic mappers should ask themselves: Why does this work? (more on that anon)

July 12, 2011

Scaling Scala at Twitter by Marius Eriksen

Filed under: Geographic Information Retrieval,Scala — Patrick Durusau @ 7:10 pm

Scaling Scala at Twitter by Marius Eriksen

From the description:

Rockdove is the backend service that powers the geospatial features on Twitter.com and the Twitter API (“Twitter Places”). It provides a datastore for places and a geospatial search engine to find them. To throw out some buzzwords, it is:

  • a distributed system
  • realtime (immediately indexes updates and changes)
  • horizontally scalable
  • fault tolerant

Rockdove is written entirely in Scala and was developed by 2 engineers with no prior Scala experience (nor with Java or the JVM). We think the geospatial search engine provides an interesting case study as it presents a mix of algorithm problems and “classic” scaling and optimization issues. We will report on our experience using Scala, focusing especially on:

  • “functional” systems design
  • concurrency and parallelism
  • using a “research language” in practice
  • when, where and why we turned the “functional dial”
  • avoiding mutable state

Not to mention being a well done presentation!

June 12, 2011

clusterPy: Library of spatially constrained clustering algorithms

Filed under: Clustering,Geo Analytics,Geographic Data,Geographic Information Retrieval — Patrick Durusau @ 4:13 pm

clusterPy: Library of spatially constrained clustering algorithms

From the webpage:

Analytical regionalization (also known as spatially constrained clustering) is a scientific way to decide how to group a large number of geographic areas or points into a smaller number of regions based on similarities in one or more variables (i.e., income, ethnicity, environmental condition, etc.) that the researcher believes are important for the topic at hand. Conventional conceptions of how areas should be grouped into regions may either not be relevant to the information one is trying to illustrate (i.e., using political regions to map air pollution) or may actually be designed in ways to bias aggregated results.
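As a toy illustration of what "spatially constrained" means here: ordinary clustering groups by attribute similarity alone, while regionalization only merges units that are geographically adjacent. The greedy one-dimensional sketch below is my own illustration, not clusterPy's API.

```python
def regionalize(values, k):
    """Greedily merge a chain of adjacent areas into k regions,
    always joining the adjacent pair whose attribute means are closest.
    Only neighbours may merge: that is the spatial constraint."""
    regions = [[i] for i in range(len(values))]
    mean = lambda r: sum(values[i] for i in r) / len(r)
    while len(regions) > k:
        j = min(range(len(regions) - 1),
                key=lambda i: abs(mean(regions[i]) - mean(regions[i + 1])))
        regions[j:j + 2] = [regions[j] + regions[j + 1]]
    return regions

# Five areas along a corridor, grouped into three regions by income:
print(regionalize([1.0, 1.2, 5.0, 5.1, 9.0], 3))
```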

March 25, 2011

Open-source Data Science Toolkit

Filed under: Dataset,Geographic Data,Geographic Information Retrieval,Software — Patrick Durusau @ 4:32 pm

Open-source Data Science Toolkit

From Flowingdata.com:

Pete Warden does the data community a solid and wraps up a collection of open-source tools in the Data Science Toolkit to parse, geocode, and process data.

Mostly geographic material but some other interesting tools, such as extracting the “main” story from a document. (It has never encountered one of my longer email exchanges with Newcomb. 😉 )

It is interesting to me that so many tools and data sets related to geography appear so regularly.

GIS (geographic information systems) can be very hard, but perhaps it is easier than the semantic challenges of, say, medical or legal literature.

That is, it is easier to say “here you are” with respect to a geographic system than to locate a subject in a conceptual space that has been only partially captured by a document.

I suspect the difference in hardness could only be illustrated by example and not by some test. I will have to give that some thought.

March 9, 2011

Neo4j Spatial, Part 1: Finding things close to other things

Filed under: Geographic Data,Geographic Information Retrieval,Neo4j — Patrick Durusau @ 4:25 pm

Neo4j Spatial, Part 1: Finding things close to other things

Start of a great series of posts on geographic information processing.

Topic maps for travel, military, disaster and other applications will face this type of issue.

Not to mention needing to map across different systems with different approaches to resolving these issues.
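"Finding things close to other things" ultimately bottoms out in a distance computation on coordinates. A minimal great-circle (haversine) sketch, purely illustrative and independent of Neo4j Spatial's own API:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (mean Earth radius 6371 km)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi, dlmb = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Paris to London, roughly 340 km as the crow flies:
print(round(haversine_km(48.8566, 2.3522, 51.5074, -0.1278)))
```

A spatial index exists to avoid computing this distance against every node; the formula is what the index's candidate results are finally checked and ranked with.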

February 22, 2011

Quantum GIS

Filed under: Geographic Information Retrieval,Mapping,Maps — Patrick Durusau @ 1:31 pm

Quantum GIS

From the website:

QGIS is a cross-platform (Linux, Windows, Mac) open source application with many common GIS features and functions. The major features include:

1. View and overlay vector and raster data in different formats and projections without conversion to an internal or common format.

Supported formats include:

  • spatially-enabled PostgreSQL tables using PostGIS and SpatiaLite,
  • most vector formats supported by the OGR library*, including ESRI shapefiles, MapInfo, SDTS and GML.
  • raster formats supported by the GDAL library*, such as digital elevation models, aerial photography or landsat imagery,
  • GRASS locations and mapsets,
  • online spatial data served as OGC-compliant WMS , WMS-C (Tile cache), WFS and WFS-T

2. Create maps and interactively explore spatial data with a friendly graphical user interface. The many helpful tools available in the GUI include:

  • on the fly projection,
  • print composer,
  • overview panel,
  • spatial bookmarks,
  • identify/select features,
  • edit/view/search attributes,
  • feature labeling,
  • vector diagram overlay
  • change vector and raster symbology,
  • add a graticule layer,
  • decorate your map with a north arrow, scale bar and copyright label,
  • save and restore projects

3. Create, edit and export spatial data using:

  • digitizing tools for GRASS and shapefile formats,
  • the georeferencer plugin,
  • GPS tools to import and export GPX format, convert other GPS formats to GPX, or down/upload directly to a GPS unit

4. Perform spatial analysis using the fTools plugin for Shapefiles or the integrated GRASS plugin, including:

  • map algebra,
  • terrain analysis,
  • hydrologic modeling,
  • network analysis,
  • and many others

5. Publish your map on the internet using the export to Mapfile capability (requires a webserver with UMN MapServer installed)

6. Adapt Quantum GIS to your special needs through the extensible plugin architecture.

I didn’t find this on my own. 😉 This and the TIGER data source were both mentioned in Paul Smith’s Mapping with Location Data presentation.

Data and manipulations you usually find have no explicit basis in subject identity but that is your opportunity to really shine.

Assuming you can discover some user need that can be met with explicit subject identity or met better with explicit subject identity than not.

Let’s try not to be like some vendors I could mention where a user’s problem has to fit the solution they are offering. I turned down an opportunity like that, some thirty years ago now, and see no reason to re-visit that decision.

At least in my view, any software solution has to fit my problem, not vice versa.

January 23, 2011

geocommons

Filed under: Dataset,Geographic Information Retrieval,Mapping,Maps — Patrick Durusau @ 9:27 pm

geocommons

A very impressive resource for mapping data against a common geographic background.

Works for a lot of reasons, not the least of which is the amount of effort that has gone into the site and its tools.

But, I think having a common frame of reference, that is geographic locations, simplifies the problem addressed by topic maps.

That is the data is seen through the common lens of geographic boundaries and/or locations.

To make it closer to the problem faced by topic maps, what if geographic locations had to be brought into focus, before data could be mapped against them?

That seems to me to be the harder problem.

November 22, 2010

A Fun Application of Compact Data Structures to Indexing Geographic Data

Filed under: Geographic Information Retrieval,Indexing,Spatial Index — Patrick Durusau @ 6:07 am

A Fun Application of Compact Data Structures to Indexing Geographic Data

Author(s): Nieves R. Brisaboa, Miguel R. Luaces, Gonzalo Navarro, Diego Seco

Keywords: geographic data, MBR, range query, wavelet tree

Abstract:

The way memory hierarchy has evolved in recent decades has opened new challenges in the development of indexing structures in general and spatial access methods in particular. In this paper we propose an original approach to represent geographic data based on compact data structures used in other fields such as text or image compression. A wavelet tree-based structure allows us to represent minimum bounding rectangles solving geographic range queries in logarithmic time. A comparison with classical spatial indexes, such as the R-tree, shows that our structure can be considered as a fun, yet seriously competitive, alternative to these classical approaches.

I must confess that after reading this article more than once, I still puzzle over: “Our experiments, featuring GIS-like scenarios, show that our index is a relevant and funnier alternative to classical spatial indexes, such as the R-tree ….”

I admit to being drawn to esoteric and even odd solutions but I would not describe most of them as being “funnier” than an R-tree.

For all that, the article will be useful to anyone developing topic maps for use with spatial indexes and is a good introduction to wavelet trees.
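For readers new to the terminology: the semantics the paper's wavelet-tree structure accelerates are rectangle-intersection range queries over minimum bounding rectangles (MBRs). A naive linear-scan baseline, which structures like the R-tree or the paper's index exist to beat:

```python
def range_query(mbrs, q):
    """Return indexes of MBRs (x1, y1, x2, y2) that intersect
    the query rectangle q.  Two rectangles intersect iff they
    overlap on both axes."""
    qx1, qy1, qx2, qy2 = q
    return [i for i, (x1, y1, x2, y2) in enumerate(mbrs)
            if x1 <= qx2 and qx1 <= x2 and y1 <= qy2 and qy1 <= y2]

boxes = [(0, 0, 2, 2), (3, 3, 5, 5), (1, 1, 4, 4)]
print(range_query(boxes, (1.5, 1.5, 3.5, 3.5)))
```

The linear scan is O(n) per query; the paper's contribution is answering the same question in logarithmic time from a compact representation.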

Questions:

  1. Create an annotated bibliography of spatial indexes. (date limit, last five (5) years)
  2. Create an annotated bibliography of spatial data resources. (date limit, last five (5) years)
  3. How would you use MBRs (Minimum Bounding Rectangles) for merging purposes in a topic map? (3-5 pages, no citations)

November 20, 2010

From Documents To Targets: Geographic References

Filed under: Associations,Geographic Information Retrieval,Ontology,Spatial Index — Patrick Durusau @ 9:18 pm

Exploiting geographic references of documents in a geographical information retrieval system using an ontology-based index

Author(s): Nieves R. Brisaboa, Miguel R. Luaces, Ángeles S. Places and Diego Seco

Keywords: Geographic information retrieval, Spatial index, Textual index, Ontology, System architecture

Abstract:

Both Geographic Information Systems and Information Retrieval have been very active research fields in the last decades. Lately, a new research field called Geographic Information Retrieval has appeared from the intersection of these two fields. The main goal of this field is to define index structures and techniques to efficiently store and retrieve documents using both the text and the geographic references contained within the text. We present in this paper two contributions to this research field. First, we propose a new index structure that combines an inverted index and a spatial index based on an ontology of geographic space. This structure improves the query capabilities of other proposals. Then, we describe the architecture of a system for geographic information retrieval that defines a workflow for the extraction of the geographic references in documents. The architecture also uses the index structure that we propose to solve pure spatial and textual queries as well as hybrid queries that combine both a textual and a spatial component. Furthermore, query expansion can be performed on geographic references because the index structure is based in an ontology.
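As a toy illustration of the hybrid querying the abstract describes (a term combined with a spatial component), the sketch below pairs a naive inverted index with a bounding-box filter; the data structures are plain Python, not the paper's ontology-based index, and the sample documents are my own.

```python
from collections import defaultdict

class GeoTextIndex:
    """Inverted index over tokens plus a per-document coordinate,
    answering hybrid queries: a term restricted to a bounding box."""
    def __init__(self):
        self.inverted = defaultdict(set)
        self.coords = {}

    def add(self, doc_id, text, lat, lon):
        for tok in text.lower().split():
            self.inverted[tok].add(doc_id)
        self.coords[doc_id] = (lat, lon)

    def query(self, term, bbox):
        lat1, lon1, lat2, lon2 = bbox
        return sorted(
            d for d in self.inverted.get(term.lower(), set())
            if lat1 <= self.coords[d][0] <= lat2
            and lon1 <= self.coords[d][1] <= lon2)

idx = GeoTextIndex()
idx.add("d1", "IED attack near Kabul", 34.5, 69.2)
idx.add("d2", "IED found in Kandahar", 31.6, 65.7)
print(idx.query("IED", (33.0, 67.0, 36.0, 71.0)))
```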

Obviously relevant to the Afghan War Diary materials.

The authors observe:

…concepts such as the hierarchical nature of geographic space and the topological relationships between the geographic objects must be considered….

Interesting, but topic maps would also help with “What defensive or offensive assets do I have in a geographic area?”
