Computing Reviews
Decomposing federated queries in presence of replicated fragments
Montoya G., Skaf-Molli H., Molli P., Vidal M.  Journal of Web Semantics 42  1-18, 2017. Type: Article
Date Reviewed: Oct 25 2017

Linked data allows data stored in heterogeneous and autonomous information sources to be integrated, which makes the information more valuable than what could be obtained from isolated sources: aggregating related data can yield knowledge that is not available in any individual source. Linked data sources offer Resource Description Framework (RDF) data by means of SPARQL endpoints, which can be queried with the SPARQL query language.

Decomposing queries in distributed environments is an instance of the information integration problem, so it is not new. However, the increasing relevance of the linked open data (LOD) cloud poses new challenges to the information integration community: the data model is different, and federation reaches its fullest expression because data sources are completely autonomous from the agent in charge of query distribution.

This paper tackles query distribution over LOD sources when there are replicated data fragments. What distinguishes the replication problem in the LOD context is that data fragmentation and replication cannot be designed in advance to improve query performance. Moreover, the availability of sources is unpredictable.

The paper addresses the query decomposition problem with fragment replication (QDP-FR) and offers a solution called LILAC (SPARQL query decomposition against federations of replicated data sources). Its main components are four algorithms: a decompose algorithm, a reduceunions algorithm, a reducebgps algorithm, and an increaseselectivity algorithm. Together, they locate the relevant sources (that is, select nonredundant sets of fragments and candidate endpoints) and join the relevant fragments obtained from the different sources.
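
To give a feel for what selecting nonredundant fragments and candidate endpoints involves, the following is a minimal sketch of my own, not LILAC or any of its four algorithms. It assumes a greedy set-cover heuristic, and the function name `select_endpoints`, the fragment labels, and the endpoint labels are all hypothetical. Each needed fragment is assigned to exactly one endpoint that replicates it, so that no fragment is fetched twice.

```python
def select_endpoints(needed, replicas):
    """Assign each needed fragment to exactly one endpoint that holds a replica.

    needed:   set of fragment names the query requires (hypothetical labels)
    replicas: dict mapping endpoint name -> set of fragment names it replicates

    Greedy set-cover heuristic (an illustrative assumption, not the paper's
    method): repeatedly pick the endpoint covering the most unassigned
    fragments. Ties are broken alphabetically to keep the result deterministic.
    """
    remaining = set(needed)
    assignment = {}
    while remaining:
        # sorted() fixes the iteration order, so max() returns the
        # alphabetically first endpoint among those with the best coverage.
        best = max(sorted(replicas), key=lambda e: len(replicas[e] & remaining))
        covered = replicas[best] & remaining
        if not covered:
            raise ValueError(f"no endpoint offers fragments: {sorted(remaining)}")
        for fragment in covered:
            assignment[fragment] = best
        remaining -= covered
    return assignment


if __name__ == "__main__":
    replicas = {
        "ep1": {"f1", "f2"},
        "ep2": {"f2", "f3"},
    }
    # f2 is replicated at both endpoints but is requested from only one.
    print(select_endpoints({"f1", "f2", "f3"}, replicas))
```

In this toy setting, fragment f2 is available from both endpoints but is requested only from ep1, which is the kind of redundancy a replication-aware decomposition tries to avoid. The paper's algorithms additionally have to decide how the selected fragments are joined, which this sketch ignores.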

The paper presents and formalizes the problem, and then proposes the LILAC solution. The algorithms that constitute LILAC are formalized, their complexity is analyzed, and proofs of theorems are presented. A validation with experiments on four real datasets and one synthetic dataset is included. In these experiments, the performance of LILAC with two query engines, FedX and ANAPSID, is compared with that of competing approaches. Performance is measured in terms of execution time, answer completeness, and number of transferred tuples.

This is a sound paper, of interest to the community of researchers working on information integration in linked data environments. I particularly appreciated the “Related Work” section, which makes an admirable effort to compare the problem with other known problems (and their solutions) in areas such as distributed databases, data fragmentation, and data replication.

Reviewer: Mercedes Martínez González
Review #: CR145616 (1712-0817)
Query Processing (H.2.4 ... )
World Wide Web (WWW) (H.3.4 ... )