@article{ART002218283,
author={Byungho Jung and Dong Hoon Lim},
title={Performance Comparison of Logistic Regression Algorithms on RHadoop},
journal={Journal of The Korea Society of Computer and Information},
issn={1598-849X},
year={2017},
volume={22},
number={4},
pages={9-16},
doi={10.9708/jksci.2017.22.04.009}
}
TY - JOUR
AU - Byungho Jung
AU - Dong Hoon Lim
TI - Performance Comparison of Logistic Regression Algorithms on RHadoop
JO - Journal of The Korea Society of Computer and Information
PY - 2017
VL - 22
IS - 4
PB - The Korean Society Of Computer And Information
SP - 9
EP - 16
SN - 1598-849X
AB - Machine learning is widely implemented and applied across many domains of daily life. Logistic regression is a classification method in machine learning and is widely used in many fields, including medicine, economics, marketing, and the social sciences.
In this paper, we present MapReduce implementations of three existing algorithms for logistic regression, namely the Gradient Descent, Cost Minimization, and Newton-Raphson algorithms, on RHadoop, which integrates R with the Hadoop environment and is applicable to large-scale data.
We compare the performance of these algorithms in estimating logistic regression coefficients on real and simulated data sets. We also compare the performance of our RHadoop and RHIPE platforms.
The experiments showed that our Newton-Raphson algorithm outperformed the Gradient Descent and Cost Minimization algorithms on all data tested, and that our RHadoop outperformed RHIPE on the real data, while the opposite held on the simulated data.
KW - Big data;Hadoop;Logistic regression;R;RHadoop
DO - 10.9708/jksci.2017.22.04.009
ER -
Byungho Jung and Dong Hoon Lim. (2017). Performance Comparison of Logistic Regression Algorithms on RHadoop. Journal of The Korea Society of Computer and Information, 22(4), 9-16.
Byungho Jung and Dong Hoon Lim. 2017, "Performance Comparison of Logistic Regression Algorithms on RHadoop", Journal of The Korea Society of Computer and Information, vol.22, no.4 pp.9-16. Available from: doi:10.9708/jksci.2017.22.04.009
Byungho Jung, Dong Hoon Lim "Performance Comparison of Logistic Regression Algorithms on RHadoop" Journal of The Korea Society of Computer and Information 22.4 pp.9-16 (2017) : 9.
Byungho Jung, Dong Hoon Lim. Performance Comparison of Logistic Regression Algorithms on RHadoop. 2017; 22(4), 9-16. Available from: doi:10.9708/jksci.2017.22.04.009
Byungho Jung and Dong Hoon Lim. "Performance Comparison of Logistic Regression Algorithms on RHadoop" Journal of The Korea Society of Computer and Information 22, no.4 (2017) : 9-16. doi: 10.9708/jksci.2017.22.04.009
Byungho Jung; Dong Hoon Lim. Performance Comparison of Logistic Regression Algorithms on RHadoop. Journal of The Korea Society of Computer and Information, 22(4), 9-16. doi: 10.9708/jksci.2017.22.04.009
Byungho Jung; Dong Hoon Lim. Performance Comparison of Logistic Regression Algorithms on RHadoop. Journal of The Korea Society of Computer and Information. 2017; 22(4) 9-16. doi: 10.9708/jksci.2017.22.04.009