Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 20, 2011

Designing and Refining Schema Mappings via Data Examples

Filed under: Database,Mapping,Schema — Patrick Durusau @ 3:34 pm

Designing and Refining Schema Mappings via Data Examples by Bogdan Alexe, Balder ten Cate, Phokion G. Kolaitis, and Wang-Chiew Tan, from SIGMOD ’11.

Abstract:

A schema mapping is a specification of the relationship between a source schema and a target schema. Schema mappings are fundamental building blocks in data integration and data exchange and, as such, obtaining the right schema mapping constitutes a major step towards the integration or exchange of data. Up to now, schema mappings have typically been specified manually or have been derived using mapping-design systems that automatically generate a schema mapping from a visual specification of the relationship between two schemas. We present a novel paradigm and develop a system for the interactive design of schema mappings via data examples. Each data example represents a partial specification of the semantics of the desired schema mapping. At the core of our system lies a sound and complete algorithm that, given a finite set of data examples, decides whether or not there exists a GLAV schema mapping (i.e., a schema mapping specified by Global-and-Local-As-View constraints) that “fits” these data examples. If such a fitting GLAV schema mapping exists, then our system constructs the “most general” one. We give a rigorous computational complexity analysis of the underlying decision problem concerning the existence of a fitting GLAV schema mapping, given a set of data examples. Specifically, we prove that this problem is complete for the second level of the polynomial hierarchy, hence, in a precise sense, harder than NP-complete. This worst-case complexity analysis notwithstanding, we conduct an experimental evaluation of our prototype implementation that demonstrates the feasibility of interactively designing schema mappings using data examples. In particular, our experiments show that our system achieves very good performance in real-life scenarios.

Two observations:

1) The use of data examples may help overcome the difficulty of getting users to articulate “why” a particular mapping should occur.

2) Data examples that support mappings, if preserved, could be used to illustrate for subsequent users “why” particular mappings were made or even should be followed in mappings to additional schemas.

Mapping across revisions of a particular schema or across multiple schemas at a particular time is likely to benefit from this technique.

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress