Semantic-Distance Based Clustering for XML Keyword Search
Author(s): Weidong Yang, Hao Zhu
Keywords: XML, Keyword Search, Clustering
Abstract:
XML Keyword Search is a user-friendly information discovery technique, which is well-suited to schema-free XML documents. We propose a novel scheme for XML keyword search called XKLUSTER, in which a novel semantic-distance model is proposed to specify the set of nodes contained in a result. Based on this model, we use clustering approaches to generate all meaningful results in XML keyword search. A ranking mechanism is also presented to sort the results.
The authors develop an interesting notion of “semantic distance” and then say:
Strictly speaking, the searching intentions of users can never be confirmed accurately; so different than existing researches, we suggest that all keyword nodes are useful more or less and should be included in results. Based on the semantic distance model, we divide the set of keyword nodes X into a group of smaller sets, and each of them is called a “cluster”.
Well… but the goal is to present the user with results relevant to their query, not results relevant to some query.
Still, an interesting paper and one that XML types will enjoy reading.
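For readers who want a feel for what "divide the set of keyword nodes X into clusters" might look like, here is a minimal Python sketch. It is not the paper's method: the XKLUSTER semantic-distance model is not reproduced in this post, so the sketch swaps in plain tree-path distance between nodes as a stand-in, and uses simple single-linkage grouping under a threshold. The Dewey-style node labels, the `tree_distance` and `cluster_keyword_nodes` names, and the `max_distance` parameter are all assumptions made for illustration.

```python
from itertools import combinations


def tree_distance(a, b):
    """Number of edges on the path between two nodes of the same tree.

    Nodes are Dewey-style labels (tuples of child positions from the root).
    This is a stand-in metric, NOT the paper's semantic distance.
    """
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def cluster_keyword_nodes(nodes, max_distance):
    """Group nodes so that any two within max_distance share a cluster.

    Single-linkage grouping via union-find; every keyword node lands in
    exactly one cluster, so none is discarded.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Walk to the representative, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        if tree_distance(a, b) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical keyword nodes matched somewhere in one XML tree.
keyword_nodes = [(0, 0, 1), (0, 0, 2), (0, 3, 0, 1), (0, 3, 0, 2), (0, 5)]
clusters = cluster_keyword_nodes(keyword_nodes, max_distance=2)
for cluster in clusters:
    print(cluster)
```

Note that the isolated node `(0, 5)` still comes back as its own one-element cluster. That mirrors the "all keyword nodes are useful more or less" stance in the quote, and it is also exactly where the relevance question raised above comes in.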