BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: A Systematic Review of SQL-on-Hadoop by Using Compact Data Formats

Citation Type: Journal Article

Publication Year: 2017

Abstract: There are huge volumes of raw data generated every day. The question is how to store these data in order to provide faster data access. The research direction in Big Data projects using Hadoop Technology, MapReduce kind of framework and compact data formats shows that two data formats (Avro and Parquet) support schema evolution and compression in order to utilize less storage space. In this paper, a systematic review of SQL-on-Hadoop by using Avro and Parquet has been performed over the past seven years (2010–2016) using publications of conference proceedings and journals of IEEE Xplore, ACM Digital Library, and ScienceDirect. With the help of the search strategy followed, 152 research papers have been identified, out of which 27 (from years 2013–2016) have been analyzed deeply as relevant papers. At the end, the conclusion has been made that a direct comparison by compactness and fastness between Avro and Parquet does not exist in the analyzed scientific articles.

Url: https://www.bjmc.lu.lv/fileadmin/user_upload/lu_portal/projekti/bjmc/Contents/5_2_06_Plase.pdf

User Submitted?: No

Authors: Plase, Daiga

Periodical (Full): Baltic Journal of Modern Computing

Issue: 2

Volume: 5

Pages: 233-250

Data Collections: IPUMS Terra

Topics: Methodology and Data Collection, Population Data Science

Countries:
