Aggregate Skyline Join Queries: Skylines with Aggregate Operations over Multiple Relations by Arnab Bhattacharya and B. Palvali Teja.
(Submitted on 28 Jun 2012)
Abstract:
The multi-criteria decision making, which is possible with the advent of skyline queries, has been applied in many areas. Though most of the existing research is concerned with only a single relation, several real world applications require finding the skyline set of records over multiple relations. Consequently, the join operation over skylines where the preferences are local to each relation, has been proposed. In many of those cases, however, the join often involves performing aggregate operations among some of the attributes from the different relations. In this paper, we introduce such queries as “aggregate skyline join queries”. Since the naive algorithm is impractical, we propose three algorithms to efficiently process such queries. The algorithms utilize certain properties of skyline sets, and processes the skylines as much as possible locally before computing the join. Experiments with real and synthetic datasets exhibit the practicality and scalability of the algorithms with respect to the cardinality and dimensionality of the relations.
The authors illustrate a “skyline” query with a search for a hotel that has a good price and it close to the beach. A “skyline” set of hotels excludes hotels that are not as good on those points as hotels in the set. They then observe:
In real applications, however, there often exists a scenario when a single relation is not sufficient for the application, and the skyline needs to be computed over multiple relations [16]. For example, consider a flight database. A person traveling from city A to city B may use stopovers, but may still be interested in flights that are cheaper, have a less overall journey time, better ratings and more amenities. In this case, a single relation specifying all direct flights from A to B may not suffice or may not even exist. The join of multiple relations consisting of flights starting from A and those ending at B needs to be processed before computing the preferences.
The above problem becomes even more complex if the person is interested in the travel plan that optimizes both on the total cost as well as the total journey time for the two flights (other than the ratings and amenities of each
airline). In essence, the skyline now needs to be computed on attributes that have been aggregated from multiple relations in addition to attributes whose preferences are local within each relation. The common aggregate operations are sum, average, minimum, maximum, etc.
No doubt the travel industry thinks it has conquered semantic diversity in travel arrangements. If they have, it has since I stopped traveling several years ago.
Even simple tasks such as coordination of air and train schedules was unnecessarily difficult.
I suspect that is still the case and so mention “skyline” queries as a topic to be aware of and if necessary, to include in a topic map application that brings sanity to travel arrangements.
True, you can get a travel service that handles all the details, but only for a price and only if you are that trusting.