Dimensionality Reduction of Feature Set for API Call based Android Malware Classification

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2021, 26(11), pp.41-49
  • DOI : 10.9708/jksci.2021.26.11.041
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : October 25, 2021
  • Accepted : November 12, 2021
  • Published : November 30, 2021

Hee-Jin Hwang 1 Soojin Lee 1

1 Korea National Defense University

Accredited

ABSTRACT

All application programs, including malware, call Application Programming Interfaces (APIs) when they execute. Exploiting this characteristic, recent studies have actively attempted to detect and classify malware based on API call information. However, processing datasets that contain API call information requires substantial computation and processing time. In addition, information that contributes little to malware classification may still affect the classification accuracy of the learning model. Therefore, in this paper, we propose a method of extracting an essential feature set by reducing the dimensionality of API call information through various feature selection methods. We used CICAndMal2020, a recently released Android malware dataset, for the experiments. After extracting the essential feature set with each feature selection method, we classified Android malware using a CNN (Convolutional Neural Network) and analyzed the results. The results showed that the selected feature set and the weight priority of its features vary with the feature selection method. In binary classification, malware was classified with 97% accuracy even when the feature set was reduced to 15% of its original size. In multiclass classification, an average accuracy of 83% was achieved while reducing the feature set to 8% of its original size.
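
As a rough illustration of the kind of dimensionality reduction described above, the following sketch ranks binary API-call features by a chi-square score and keeps only a fixed fraction of them. The abstract does not name the feature selection methods that were used, so the chi-square filter here is an assumption chosen for illustration, not the authors' method. The function names (`chi2_score`, `select_top_features`, `reduce_features`) and the toy data are hypothetical, and the CNN classification step is not shown.

```python
from typing import List, Sequence


def chi2_score(column: Sequence[int], labels: Sequence[int]) -> float:
    """Chi-square statistic of the 2x2 table (API called or not vs. class label)."""
    n = len(labels)
    observed = [[0, 0], [0, 0]]  # observed[feature_value][label]
    for f, y in zip(column, labels):
        observed[f][y] += 1
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row_total = observed[f][0] + observed[f][1]
            col_total = observed[0][y] + observed[1][y]
            expected = row_total * col_total / n
            if expected > 0:  # skip empty cells (e.g. a constant feature)
                score += (observed[f][y] - expected) ** 2 / expected
    return score


def select_top_features(X: List[List[int]], labels: Sequence[int],
                        keep_ratio: float) -> List[int]:
    """Return the sorted column indices of the highest-scoring features."""
    n_features = len(X[0])
    k = max(1, round(n_features * keep_ratio))
    scores = [chi2_score([row[j] for row in X], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def reduce_features(X: List[List[int]], kept: Sequence[int]) -> List[List[int]]:
    """Project every sample onto the kept feature columns."""
    return [[row[j] for j in kept] for row in X]


# Toy data (hypothetical): 6 apps x 4 API-call indicators, label 1 = malware.
X_toy = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 1, 0, 1],
]
y_toy = [1, 1, 1, 0, 0, 0]

kept_toy = select_top_features(X_toy, y_toy, keep_ratio=0.5)
X_reduced = reduce_features(X_toy, kept_toy)
```

In this toy data the second column is constant and so carries no class information, which is why a filter of this kind discards it. In practice the `keep_ratio` would be tuned against classification accuracy, in the spirit of the 15% and 8% reductions reported above.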
