Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 31, 2012

Using an RDF Data Pipeline to Implement Cross-Collection Search

Filed under: Heterogeneous Data,Museums,Searching,Solr — Patrick Durusau @ 4:10 pm

Using an RDF Data Pipeline to Implement Cross-Collection Search by David Henry and Eric Brown.

Abstract:

This paper presents an approach to transforming data from many diverse sources in support of a semantic cross-collection search application. It describes the vision and goals for a semantic cross-collection search and examines the challenges of supporting search of that kind using very diverse data sources. The paper makes the case for supporting semantic cross-collection search using semantic web technologies and standards including Resource Description Framework (RDF), SPARQL Protocol and RDF Query Language (SPARQL), and an XML mapping language. The Missouri History Museum has developed a prototype method for transforming diverse data sources into a data repository and search index that can support a semantic cross-collection search. The method presented in this paper is a data pipeline that transforms diverse data into localized RDF; then transforms the localized RDF into more generalized RDF graphs using common vocabularies; and ultimately transforms generalized RDF graphs into a Solr search index to support a semantic cross-collection search. Limitations and challenges of this approach are detailed in the paper.
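
The three stages named in the abstract (source data to localized RDF, localized RDF to generalized RDF, generalized RDF to a search index) can be sketched in a few lines of plain Python. This is only an illustration of the shape of such a pipeline, not the authors' implementation: triples are modeled as plain tuples rather than with an RDF library, the `example.org` namespaces, the sample record, and the predicate mapping are all made up, and the last stage only builds dictionaries shaped like Solr documents without contacting a Solr server. Dublin Core terms stand in for the "common vocabularies" as an assumption.

```python
# Illustrative sketch only: every namespace, field name, and mapping here is
# hypothetical, and triples are plain (subject, predicate, object) tuples.

LOCAL = "http://example.org/local/photos#"  # made-up source-specific vocabulary
DCT = "http://purl.org/dc/terms/"           # Dublin Core terms as the common vocabulary


def to_local_rdf(record, base="http://example.org/item/"):
    """Stage 1: turn one source record into triples in a source-specific vocabulary."""
    subject = base + str(record["id"])
    return [(subject, LOCAL + key, value)
            for key, value in record.items() if key != "id"]


# Stage 2 mapping from local predicates to common ones (assumed for illustration).
PREDICATE_MAP = {
    LOCAL + "caption": DCT + "title",
    LOCAL + "photographer": DCT + "creator",
}


def generalize(triples, mapping=PREDICATE_MAP):
    """Stage 2: rewrite local predicates to common ones; unmapped triples are dropped."""
    return [(s, mapping[p], o) for s, p, o in triples if p in mapping]


def to_solr_docs(triples):
    """Stage 3: group triples by subject into flat, Solr-style documents."""
    docs = {}
    for s, p, o in triples:
        doc = docs.setdefault(s, {"id": s})
        field = p.rsplit("/", 1)[-1]  # ".../terms/title" becomes "title"
        doc.setdefault(field, []).append(o)
    return list(docs.values())


# Invented sample record standing in for one row from one collection.
record = {"id": 42, "caption": "Bridge at dusk",
          "photographer": "Unknown", "negative_no": "N-17"}

docs = to_solr_docs(generalize(to_local_rdf(record)))
print(docs)
```

Note that the `negative_no` field has no entry in the mapping and so never reaches the index. Deciding what to do with source fields that have no counterpart in the common vocabulary is one of the choices any pipeline of this kind has to make.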

A great report on the issues you will face with diverse data resources. (And who doesn’t have those?)

The “practical considerations” section is particularly interesting and I am sure the project participants would appreciate any suggestions you may have.
