Effective Query Processing for Web-scale RDF Data using Hadoop Components

Authors

  • C. Lakshmi, K. Usha Rani

Abstract

The Semantic Web is an extension of the World Wide Web whose main objective is to make internet data machine-readable. Resource Description Framework (RDF) is one of the technologies used to encode and represent semantic data in the form of metadata. The volume of semantic data is growing day by day, and it is becoming difficult to process and store using traditional database systems. Hadoop and Spark are popular open-source tools for processing (MapReduce) and storing (HDFS) large amounts of data. These big data tools can analyze terabytes of data in a distributed, parallel manner. In this paper, benchmark queries are executed over RDF data in both Hive and Spark; Spark's in-memory computation, based on Resilient Distributed Datasets (RDDs), can give faster results. Based on practical evaluation and analysis, a scalable and faster framework can be identified. In experiments with the proposed system, Spark delivered better performance than Hive in processing Semantic Web data.
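As a rough illustration of the kind of query such benchmarks contain, the sketch below evaluates a two-pattern query over RDF triples as a join on a shared variable. It is plain Python rather than Hive or Spark code, and it is not the authors' implementation; the triples, the `ex:` names, and the helper functions `match` and `students_and_courses` are all hypothetical. In a distributed engine, each pattern would typically be a scan over a partitioned triple table, with the join executed in parallel.

```python
# Toy RDF data as (subject, predicate, object) triples; all values are made up.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:bob", "rdf:type", "ex:Student"),
    ("ex:carol", "rdf:type", "ex:Professor"),
    ("ex:alice", "ex:takesCourse", "ex:course1"),
    ("ex:bob", "ex:takesCourse", "ex:course2"),
    ("ex:carol", "ex:teacherOf", "ex:course1"),
]


def match(triples, predicate, obj=None):
    """Return (subject, object) pairs for triples matching one pattern."""
    return [(s, o) for s, p, o in triples
            if p == predicate and (obj is None or o == obj)]


def students_and_courses(triples):
    """Evaluate the pattern: ?x rdf:type ex:Student . ?x ex:takesCourse ?c"""
    students = {s for s, _ in match(triples, "rdf:type", "ex:Student")}
    enrolled = match(triples, "ex:takesCourse")
    # Join the two pattern results on the shared variable ?x.
    return sorted((s, c) for s, c in enrolled if s in students)


if __name__ == "__main__":
    for student, course in students_and_courses(TRIPLES):
        print(student, course)
```

Each additional triple pattern in a query adds another join of this kind, which is one reason keeping intermediate results in memory can matter for multi-pattern queries.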

Published

2020-05-17

Section

Articles