Skyline query processing for incomplete data in cloud environment
Main Authors: | , , , |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | Universiti Utara Malaysia (UUM), 2017 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/57271/ http://irep.iium.edu.my/57271/7/57271%20SKYLINE%20QUERY%20PROCESSING.pdf |
Summary: | Many research works have focused on processing skyline queries on databases. Recently, some approaches have been proposed to address skyline queries over a partially complete database in which some data item values might be missing. However, these approaches are tailored to centralized databases and access only one table to identify the skylines. In many contemporary database applications this might not be the case, particularly for a database with incomplete data and many tables spread over various remote locations, such as a cloud environment. Applying skyline approaches designed for centralized databases directly to cloud databases is undesirable because of the prohibitive cost of transferring data from one datacenter to another during skyline processing. An approach is needed that takes into consideration the unique features of the cloud environment when processing skyline queries on a database with incomplete data. This paper proposes an approach that evaluates skyline queries in a database with partially incomplete data over the cloud. The approach aims to reduce the number of pairwise comparisons that need to be conducted between data items and the amount of data transferred in identifying skylines. Several experiments over synthetic and real datasets have been conducted to evaluate the performance of our approach. The results show that our approach outperforms the previous approach in terms of the number of pairwise comparisons and the amount of data transferred. |
---|
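
For readers unfamiliar with the terms used in the summary, the following is a minimal sketch of a skyline query over incomplete data and of what a "pairwise comparison" is. It is not the approach proposed in the paper. It assumes a commonly used convention in which two items are compared only on the dimensions where both have a value, and in which smaller values are preferred. The names `dominates` and `naive_skyline` and the sample data are hypothetical, chosen for illustration only.

```python
from typing import List, Optional, Sequence, Tuple

# An item is a tuple of numeric values; None marks a missing value (assumption).
Item = Sequence[Optional[float]]


def dominates(p: Item, q: Item) -> bool:
    """Return True if p dominates q on the dimensions known in both items.

    Assumes smaller values are preferred. Items that share no known
    dimension are treated as incomparable.
    """
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # skip dimensions missing in either item
        if a > b:
            return False  # p is worse on a shared dimension
        if a < b:
            strictly_better = True
    return strictly_better


def naive_skyline(items: Sequence[Item]) -> Tuple[List[Item], int]:
    """Exhaustive baseline: return (skyline, number of pairwise comparisons).

    Each item is checked against every other item because dominance over
    incomplete data is not transitive, so dominated items cannot be
    discarded early without care.
    """
    comparisons = 0
    skyline: List[Item] = []
    for i, q in enumerate(items):
        dominated = False
        for j, p in enumerate(items):
            if i == j:
                continue
            comparisons += 1
            if dominates(p, q):
                dominated = True
                break
        if not dominated:
            skyline.append(q)
    return skyline, comparisons


if __name__ == "__main__":
    sample = [(1, None, 3), (2, 2, None), (None, 1, 4), (3, 3, 5)]
    result, count = naive_skyline(sample)
    print(result, count)
```

The comparison counter in this baseline corresponds to one of the two costs the summary says the proposed approach aims to reduce; the other, the amount of data transferred between datacenters, is not modelled in this single-machine sketch.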