Studies on detecting and classifying Android malware from API call sequences have recently been actively carried out. However, API call sequence-based malware classification has serious limitations: the vast amount of data and the high dimensionality of the features make both malware analysis and learning-model construction excessively costly in time and resources. In this study, we evaluated several classification models, including LightGBM, Random Forest, and k-Nearest Neighbors, after significantly reducing the feature dimension using PCA (Principal Component Analysis) on the CICAndMal2020 dataset, which contains a vast amount of API call information. The experimental results show that PCA significantly reduces the feature dimension while maintaining the characteristics of the original data and achieves efficient malware classification performance. Both binary and multi-class classification achieve higher accuracy than previous studies, even when the data features were reduced to less than 1% of their total size.
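The pipeline summarized in the abstract (reduce the feature dimension with PCA, then classify in the reduced space) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than CICAndMal2020, PCA is computed with a plain NumPy SVD, only a simple k-Nearest Neighbors classifier is shown (LightGBM and Random Forest are omitted), and every function name and parameter value here (`fit_pca`, `project`, `knn_predict`, `make_synthetic`, `run_demo`, 8 components out of 1000 features) is hypothetical.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the training mean and the top principal axes of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Map rows of X into the reduced PCA space."""
    return (X - mean) @ components.T


def knn_predict(X_train, y_train, X_test, k=5):
    """Binary k-NN by majority vote (labels 0/1, odd k assumed)."""
    # Squared Euclidean distance between every test row and every training row.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    return (votes.mean(axis=1) > 0.5).astype(int)


def make_synthetic(n_samples=400, n_features=1000, seed=0):
    """Synthetic stand-in for high-dimensional API-call features (assumption)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    X = rng.normal(size=(n_samples, n_features))
    # Shift a small subset of features for class 1 so the classes differ.
    X[:, :50] += 1.5 * y[:, None]
    return X, y


def run_demo(n_components=8):
    """Fit PCA on the training split only, then classify the held-out split."""
    X, y = make_synthetic()
    split = 300
    X_train, X_test = X[:split], X[split:]
    y_train, y_test = y[:split], y[split:]

    mean, components = fit_pca(X_train, n_components)
    Z_train = project(X_train, mean, components)
    Z_test = project(X_test, mean, components)

    pred = knn_predict(Z_train, y_train, Z_test, k=5)
    accuracy = float((pred == y_test).mean())
    return Z_train.shape[1], accuracy


if __name__ == "__main__":
    dims, acc = run_demo()
    print(f"reduced dimension: {dims}, held-out accuracy: {acc:.3f}")
```

Fitting the projection on the training split alone keeps information from the held-out samples out of the reduced features. The choice of 8 components out of 1000 features only mirrors the "less than 1%" scale mentioned in the abstract and is not a value taken from the paper.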
@article{ART002899738,
  author={Dong-Ha Jeon and Soojin Lee},
  title={Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA},
  journal={Journal of The Korea Society of Computer and Information},
  issn={1598-849X},
  year={2022},
  volume={27},
  number={11},
  pages={123-130},
  doi={10.9708/jksci.2022.27.11.123}
}
TY  - JOUR
AU  - Dong-Ha Jeon
AU  - Soojin Lee
TI  - Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA
JO  - Journal of The Korea Society of Computer and Information
PY  - 2022
VL  - 27
IS  - 11
PB  - The Korean Society Of Computer And Information
SP  - 123
EP  - 130
SN  - 1598-849X
AB  - Studies on detecting and classifying Android malware from API call sequences have recently been actively carried out. However, API call sequence-based malware classification has serious limitations: the vast amount of data and the high dimensionality of the features make both malware analysis and learning-model construction excessively costly in time and resources. In this study, we evaluated several classification models, including LightGBM, Random Forest, and k-Nearest Neighbors, after significantly reducing the feature dimension using PCA (Principal Component Analysis) on the CICAndMal2020 dataset, which contains a vast amount of API call information. The experimental results show that PCA significantly reduces the feature dimension while maintaining the characteristics of the original data and achieves efficient malware classification performance. Both binary and multi-class classification achieve higher accuracy than previous studies, even when the data features were reduced to less than 1% of their total size.
KW  - API-Call
KW  - PCA
KW  - Dimensional Reduction
KW  - LGBM
KW  - RF
KW  - KNN
DO  - 10.9708/jksci.2022.27.11.123
ER  -
Dong-Ha Jeon and Soojin Lee. (2022). Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA. Journal of The Korea Society of Computer and Information, 27(11), 123-130.
Dong-Ha Jeon and Soojin Lee. 2022, "Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA", Journal of The Korea Society of Computer and Information, vol.27, no.11 pp.123-130. Available from: doi:10.9708/jksci.2022.27.11.123
Dong-Ha Jeon, Soojin Lee "Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA" Journal of The Korea Society of Computer and Information 27.11 pp.123-130 (2022) : 123.
Dong-Ha Jeon, Soojin Lee. Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA. Journal of The Korea Society of Computer and Information. 2022; 27(11), 123-130. Available from: doi:10.9708/jksci.2022.27.11.123
Dong-Ha Jeon and Soojin Lee. "Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA" Journal of The Korea Society of Computer and Information 27, no.11 (2022) : 123-130. doi: 10.9708/jksci.2022.27.11.123
Dong-Ha Jeon; Soojin Lee. Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA. Journal of The Korea Society of Computer and Information, 27(11), 123-130. doi: 10.9708/jksci.2022.27.11.123
Dong-Ha Jeon; Soojin Lee. Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA. Journal of The Korea Society of Computer and Information. 2022; 27(11) 123-130. doi: 10.9708/jksci.2022.27.11.123