Computing Reviews
RDF in the clouds: a survey
Kaoudi Z., Manolescu I. The VLDB Journal: The International Journal on Very Large Data Bases 24(1): 67-91, 2015. Type: Article
Date Reviewed: Apr 8 2016

With the ongoing explosion of cloud computing, graph knowledge bases, and the Resource Description Framework (RDF) data encoding they require, it is apt to ask about the typologies of these knowledge bases. The main thrust of the paper is a good survey of RDF data management in cloud environments, especially with regard to storage, query processing, and reasoning. It starts with a cogent description of RDF, RDF Schema (RDFS), SPARQL, and NoSQL stores, setting down a set of RDF triples that is used and reused to expound on particular points.

Many cloud RDF storage systems, such as Rya, H2RDF, AMADA, MAPSIN, Stratustore, and CumulusRDF, rely significantly on NoSQL and MapReduce architectures. The authors state the two questions they deem important in large-scale distributed RDF storage platforms: How should data be partitioned across nodes? How should the corresponding partition be stored on each node? There is also a third question, which they overlook in their classification: How do these systems deal with cache misses? Their two questions form the basis of the four categories used to classify distributed/cloud RDF storage systems: “systems relying on a distributed file system, such as HDFS; systems that use existing ‘NoSQL’ key–value stores as back-ends; systems warehousing RDF data in a ‘federation’ of single-site (centralized) RDF stores, ... [and] hybrid systems using a combination of the [previous three].”
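To make the first question concrete, here is a minimal sketch of one common partitioning scheme: hashing each triple on its subject so that all triples about the same resource land on the same node. It is an illustration only, not the method of any system named above; the function names, the sample triples, and the choice of MD5 as a stable hash are all assumptions.

```python
import hashlib
from collections import defaultdict

def node_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index with a hash that is stable across runs."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

def partition_by_subject(triples, num_nodes):
    """Group (subject, predicate, object) triples by the node their subject hashes to."""
    partitions = defaultdict(list)
    for s, p, o in triples:
        partitions[node_for(s, num_nodes)].append((s, p, o))
    return dict(partitions)

# Hypothetical sample data, not taken from the survey.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "rdf:type", "ex:Professor"),
]

if __name__ == "__main__":
    for node, part in sorted(partition_by_subject(TRIPLES, num_nodes=4).items()):
        print(node, part)
```

Hashing on the subject keeps "star" queries about one resource local to a single node, at the cost of shipping data between nodes when a query follows an edge from one subject to another.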

For cloud-based SPARQL query processing, the systems are classified into two types: (a) “systems using graph exploration techniques based on the graph structure of the data,” and (b) “systems following a relational-style query processing strategy.” The latter is further divided in two: how systems access data when evaluating a SPARQL query, and how they perform the join operation “merging together the different pieces of the data.”
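The relational-style strategy can be pictured with a small single-machine sketch: each triple pattern is answered by a data access step (here a plain scan), and the resulting variable bindings are merged by a join on shared variables. This assumes an in-memory list of triples and a naive nested-loop join; the helper names and sample data are hypothetical, and real cloud systems would distribute both steps.

```python
def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable repeated inside one pattern must bind consistently.
            if binding.get(term, value) != value:
                return None
            binding[term] = value
        elif term != value:
            return None
    return binding

def scan(pattern, triples):
    """Data access: collect the bindings of every triple matching one pattern."""
    return [b for t in triples if (b := match(pattern, t)) is not None]

def join(left, right):
    """Nested-loop join: merge bindings that agree on all shared variables."""
    out = []
    for l in left:
        for r in right:
            if all(l[v] == r[v] for v in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

# Hypothetical sample data, not taken from the survey.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "rdf:type", "ex:Professor"),
    ("ex:carol", "ex:knows", "ex:alice"),
]

if __name__ == "__main__":
    # Roughly: SELECT ?x ?y WHERE { ?x rdf:type ex:Student . ?x ex:knows ?y }
    students = scan(("?x", "rdf:type", "ex:Student"), TRIPLES)
    knows = scan(("?x", "ex:knows", "?y"), TRIPLES)
    print(join(students, knows))
```

In a distributed setting the interesting design choice is where the join runs: whether matching bindings are already co-located by the partitioning scheme or must be shuffled between nodes.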

The authors use three categories to classify the cloud RDFS reasoning used by these systems: (a) “closure computation [in the cloud]: compute and materialize all entailed triples”; (b) “query reformulation: reformulate a given query to take into account entailed triples [using RDF schema and a set of RDFS entailment rules]”; and (c) “hybrid: some mix of the two [previous] approaches,” for example, precomputing the RDFS closure of the RDF schema so that query reformulation can happen faster.
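The difference between the first two categories can be sketched on a toy example. The code below assumes a single entailment rule (an instance of a class is also an instance of its superclasses, following rdfs:subClassOf transitively) and an in-memory triple set; the schema, data, and function names are hypothetical and are not drawn from the paper. Closure computation adds the entailed triples to the data up front, while query reformulation leaves the data alone and widens the query to cover subclasses.

```python
# Hypothetical schema as (subclass, superclass) pairs, and hypothetical data.
SCHEMA = {("ex:Student", "ex:Person"), ("ex:Person", "ex:Agent")}
DATA = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:bob", "rdf:type", "ex:Agent"),
}

def reachable(start, edges):
    """All nodes reachable from start by following directed (src, dst) edges."""
    found, stack = set(), [start]
    while stack:
        current = stack.pop()
        for src, dst in edges:
            if src == current and dst not in found:
                found.add(dst)
                stack.append(dst)
    return found

def superclasses(cls, schema):
    return reachable(cls, schema)

def subclasses(cls, schema):
    return reachable(cls, {(sup, sub) for sub, sup in schema})

def closure(data, schema):
    """Closure computation: materialize every entailed rdf:type triple."""
    entailed = set(data)
    for s, p, o in data:
        if p == "rdf:type":
            entailed |= {(s, "rdf:type", sup) for sup in superclasses(o, schema)}
    return entailed

def instances_via_closure(cls, data, schema):
    """Ask the original query against the materialized closure."""
    return {s for s, p, o in closure(data, schema) if p == "rdf:type" and o == cls}

def instances_via_reformulation(cls, data, schema):
    """Rewrite the query to also match subclasses, then ask it of the raw data."""
    targets = {cls} | subclasses(cls, schema)
    return {s for s, p, o in data if p == "rdf:type" and o in targets}

if __name__ == "__main__":
    print(instances_via_closure("ex:Person", DATA, SCHEMA))
    print(instances_via_reformulation("ex:Person", DATA, SCHEMA))
```

Both functions return the same answers; the tradeoff is that the closure costs storage and must be maintained when data changes, whereas reformulation pays at query time. The hybrid approach mentioned above corresponds to precomputing only the schema-level part (here, the transitive subclass relation) so that the rewriting step is cheap.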

The paper ends with comparisons of the tradeoffs that need to be taken into account when choosing a system to fulfill particular functionality. This survey paper is well written, starting from the first principles of RDF and SPARQL, and gradually progressing to more arcane topics of cloud-based RDF storage, query processing, and reasoning. Linked data/semantic web systems experts will be well advised to adopt the design ideas, as exemplified in the systems described, as part of their heuristics.

Reviewer:  Tope Omitola Review #: CR144308 (1606-0409)
 
Cloud Computing (C.2.4 ... )
 
