International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 12 - Number 31
Year of Publication: 2020
Authors: Terungwa Simon Yange, Ishaya Peni Gambo, Rhoda Ikono, Hettie A. Soriyan
DOI: 10.5120/ijais2020451868
Terungwa Simon Yange, Ishaya Peni Gambo, Rhoda Ikono, Hettie A. Soriyan. A Multi-Nodal Implementation of Apriori Algorithm for Big Data Analytics using MapReduce Framework. International Journal of Applied Information Systems. 12, 31 (July 2020), 8-28. DOI=10.5120/ijais2020451868
This paper developed a distributed algorithm for Big Data Analytics to address the delay in the processing of big data. To achieve this aim, data were gathered through inspection of organizational documents, direct observation, and collection of existing data from the National Health Insurance Scheme (NHIS) in Nigeria. The algorithm was formulated using Apriori Association Rule Mining and was specified using the enterprise application diagram. The prototype of the algorithm used MongoDB as the big data storage mechanism for the input, Comma Separated Values (CSV) files as the storage facility for the intermediate results generated during processing, and MySQL as the storage mechanism for the final output. Apache MapReduce served as the big data multi-nodal processing platform, and the Java programming language as the implementation technology. The prototype was able to analyze data in different formats (i.e., PDF, Excel, CSV and images) with high volume and velocity. The results showed a response time of 0.25 seconds and a throughput of 8865.29 records per second. The stability of the prototype was also evaluated using the confidence of the rules generated. In conclusion, this research has shown that unnecessary delays in the processing of big data were due to the lack of an appropriate data analytics tool. This study eliminated these delays, which paved the way for quicker disbursement of funds to providers and other stakeholders, as well as quicker responses to requests on enrollment, update and referral.
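To illustrate how Apriori support counting fits the map and reduce phases described above, the following is a minimal sketch in Java. It is not the authors' implementation: it simulates the two phases in memory rather than running on a MapReduce cluster, and the class name, method names, and toy transactions (`enrollment`, `referral`, `claim`) are hypothetical, chosen only for illustration. The confidence of a rule A => B is computed as support(A and B together) divided by support(A).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative in-memory sketch only; all names and data are hypothetical.
public class AprioriMapReduceSketch {

    // Map phase: for one transaction, emit (candidate, 1) for every
    // candidate itemset that the transaction fully contains.
    static List<Map.Entry<Set<String>, Integer>> map(
            Set<String> transaction, List<Set<String>> candidates) {
        List<Map.Entry<Set<String>, Integer>> emitted = new ArrayList<>();
        for (Set<String> candidate : candidates) {
            if (transaction.containsAll(candidate)) {
                emitted.add(Map.entry(candidate, 1));
            }
        }
        return emitted;
    }

    // Reduce phase: sum the emitted counts for each candidate itemset.
    static Map<Set<String>, Integer> reduce(
            List<Map.Entry<Set<String>, Integer>> pairs) {
        Map<Set<String>, Integer> counts = new HashMap<>();
        for (Map.Entry<Set<String>, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map phase over every transaction, then a single reduce.
    static Map<Set<String>, Integer> countSupport(
            List<Set<String>> transactions, List<Set<String>> candidates) {
        List<Map.Entry<Set<String>, Integer>> emitted = new ArrayList<>();
        for (Set<String> transaction : transactions) {
            emitted.addAll(map(transaction, candidates));
        }
        return reduce(emitted);
    }

    // confidence(A => B) = support(A and B together) / support(A)
    static double confidence(int supportAB, int supportA) {
        return (double) supportAB / supportA;
    }

    // Toy transactions, invented for this sketch (not NHIS data).
    static List<Set<String>> sampleTransactions() {
        return List.of(
                Set.of("enrollment", "referral", "claim"),
                Set.of("enrollment", "claim"),
                Set.of("enrollment", "referral"),
                Set.of("referral", "claim"));
    }

    public static void main(String[] args) {
        List<Set<String>> candidates = List.of(
                Set.of("enrollment"),
                Set.of("enrollment", "claim"));
        Map<Set<String>, Integer> counts =
                countSupport(sampleTransactions(), candidates);
        int supportA = counts.get(Set.of("enrollment"));
        int supportAB = counts.get(Set.of("enrollment", "claim"));
        System.out.println("confidence(enrollment => claim) = "
                + confidence(supportAB, supportA));
    }
}
```

In a multi-nodal setting, each node would run the map step over its own partition of the transactions, and the reduce step would merge the partial counts; the sketch collapses both into a single process to stay self-contained.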