Mining Travel Resources on the Web Using L-Wrappers. Authors: Elvira Popescu, Amelia Bădică, and Costin Bădică
Abstract:
The work described here is part of an ongoing research on the application of general-purpose inductive logic programming, logic representation of wrappers (L-wrappers) and XML technologies (including the XSLT transformation language) to information extraction from the Web. The L-wrappers methodology is based on a sound theoretical approach and has already proved its efficacy on a smaller scale, in the area of collecting product information. This paper proposes the use of L-wrappers for tuple extraction from HTML in the domain of e-tourism. It also describes a method for translating L-wrappers into XSLT and illustrates it with the example of a real-world travel agency Web site.
This is deeply interesting work, partly because it uses XSLT to extract tuples from HTML pages, but also because a labeled ordered tree serves as the interpretive domain for the patterns matched against it.
If the latter sounds familiar, it should: most data mining techniques specify a domain in which results (intermediate or otherwise) are going to be interpreted.
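To make the tree-as-domain idea concrete, here is a minimal sketch in Python. It is not the paper's L-wrappers method or its XSLT translation, only an illustration of treating a page as a labeled ordered tree and matching a fixed pattern against it to pull out tuples. The page fragment, the `dest` and `price` class names, and the `extract_tuples` helper are all my own assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a travel agency page.
# Real-world HTML would usually need cleaning up before it parses as XML.
PAGE = """
<html><body>
  <table>
    <tr><td class="dest">Sinaia</td><td class="price">120</td></tr>
    <tr><td class="dest">Mamaia</td><td class="price">95</td></tr>
  </table>
</body></html>
"""

def extract_tuples(page):
    """Return (destination, price) tuples, one per matching row, in document order."""
    root = ET.fromstring(page)  # the labeled ordered tree
    tuples = []
    for row in root.iter("tr"):  # visit every <tr> node in document order
        # The "pattern": a row with one child labeled dest and one labeled price.
        dest = row.find("td[@class='dest']")
        price = row.find("td[@class='price']")
        if dest is not None and price is not None:
            tuples.append((dest.text, price.text))
    return tuples

RESULT = extract_tuples(PAGE)
print(RESULT)
```

The point of the sketch is that the extracted tuples only mean something relative to the tree: change the tree (or the labels on its nodes) and the same pattern yields different results, or none.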
I will look around for other material on L-wrappers and inductive logic programming.