The NoSQL Column Store Implementation Uses Apache Accumulo to Build a Big Data Infrastructure Model
Abstract
The growth of the digital world and its technology has made data volumes increase very quickly, giving rise to the phenomenon known as Big Data. The database technology in common use, the RDBMS, has limitations in managing Big Data. In this final project, a Big Data infrastructure model is therefore built by implementing a NoSQL column store with Apache Accumulo, deployed both as a single node and as a multi-node cluster of up to seven nodes. The performance of the infrastructure is tested with YCSB and evaluated by runtime, throughput, and latency. Tests vary the data size (500 MB, 1 GB, 1.5 GB, and 2 GB), the number of nodes (1, 4, 5, 6, and 7), and the time of day (morning, afternoon, and night). They use the YCSB core workloads A, B, C, D, E, and F, each of which consists of two phases, load and run. The test results and analysis show that runtime is influenced by the resulting throughput and latency. The optimal infrastructure model is a multi-node cluster of 4 nodes, and the optimal testing time is at night.
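How runtime, throughput, and latency relate to one another can be illustrated with a small sketch. It assumes the comma-separated summary lines that YCSB prints at the end of a load or run phase, such as `[OVERALL], RunTime(ms), <value>`. The sample values and the helper names (`parse_ycsb_summary`, `total_operations`, `derived_throughput`) are hypothetical and are not results from this study.

```python
# Hypothetical YCSB-style summary; the numbers are made up for illustration.
SAMPLE = """\
[OVERALL], RunTime(ms), 20000
[OVERALL], Throughput(ops/sec), 500.0
[READ], Operations, 5000
[READ], AverageLatency(us), 1800.0
[UPDATE], Operations, 5000
[UPDATE], AverageLatency(us), 2100.0
"""


def parse_ycsb_summary(text):
    """Map (section, metric) -> numeric value for each summary line."""
    metrics = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Keep only lines shaped like "[SECTION], Metric, value".
        if len(parts) != 3 or not parts[0].startswith("["):
            continue
        section, metric, value = parts
        try:
            metrics[(section.strip("[]"), metric)] = float(value)
        except ValueError:
            continue  # skip non-numeric values
    return metrics


def total_operations(metrics):
    """Sum the operation counts reported by every operation type."""
    return sum(v for (_, name), v in metrics.items() if name == "Operations")


def derived_throughput(metrics):
    """Throughput in ops/sec = total operations / runtime in seconds."""
    runtime_s = metrics[("OVERALL", "RunTime(ms)")] / 1000.0
    return total_operations(metrics) / runtime_s


if __name__ == "__main__":
    m = parse_ycsb_summary(SAMPLE)
    print("operations:", total_operations(m))
    print("derived throughput (ops/sec):", derived_throughput(m))
    print("reported throughput (ops/sec):", m[("OVERALL", "Throughput(ops/sec)")])
```

For a fixed number of operations, a shorter runtime means higher throughput. Higher per-operation latency tends to lengthen the runtime, which is consistent with the relationship stated in the abstract.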
License
Copyright info for authors
1. Authors hold the copyright in any process, procedure, or article described in the work and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
2. Authors retain publishing rights to re-use all or a portion of the work in other works but cannot grant third-party requests for reprinting and republishing the work.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) as it can lead to productive exchanges, as well as earlier and greater citation of published work.