Detection of Malicious PDF based on Document Structure Features and Stream Object

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2018, 23(11), pp.85-93
  • DOI : 10.9708/jksci.2018.23.11.085
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : October 1, 2018
  • Accepted : November 1, 2018
  • Published : November 30, 2018

Ah Reum Kang 1, Young-Seob Jeong 1, Se Lyeong Kim 2, Jonghyun Kim 3, Jiyoung Woo 1, Sunoh Choi 3

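1 Soonchunhyang University
2 Korea Internet & Security Agency (KISA)
3 Electronics and Telecommunications Research Institute (ETRI)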

Accredited

ABSTRACT

In recent years, attackers have increasingly distributed malicious code through document files by exploiting vulnerabilities in them. Because document-based malware is not itself an executable file, it easily bypasses existing security programs, so research on models to detect it is necessary. In this study, we extract main features from the document structure and from the JavaScript contained in stream objects. In addition, when JavaScript is inserted, we extract keywords that occur frequently in malicious code, such as function names, reserved words, and readable strings in the script. We then generate a machine learning model that distinguishes normal documents from malicious ones. To make the model difficult to bypass, we aim for good performance with a black-box algorithm. For the experiment, we analyze a larger number of documents than previous studies did. Experimental results show a 98.9% detection rate from three different types of algorithms. SVM, a black-box algorithm that makes obfuscation difficult, shows much higher performance than in previous studies.
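
The two feature groups described in the abstract (document structure and JavaScript keywords in stream objects) can be illustrated with a minimal sketch. The name list `STRUCTURE_NAMES`, the keyword list `SCRIPT_KEYWORDS`, and all function names below are hypothetical choices made for illustration; the abstract does not give the paper's actual feature set. The sketch also assumes uncompressed streams, whereas real PDFs usually need their stream filters decoded first.

```python
import re
from collections import Counter

# Hypothetical feature lists, chosen for illustration only.
STRUCTURE_NAMES = [b"/Page", b"/JS", b"/JavaScript", b"/OpenAction", b"/EmbeddedFile"]
SCRIPT_KEYWORDS = ["eval", "unescape", "fromCharCode"]

def structure_features(pdf_bytes: bytes) -> dict:
    """Count PDF name objects, matching whole names only (/Page, not /Pages)."""
    feats = {}
    for name in STRUCTURE_NAMES:
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        feats[name.decode()] = len(re.findall(pattern, pdf_bytes))
    return feats

def stream_bodies(pdf_bytes: bytes) -> list:
    """Return the raw bodies of stream objects (assumes no compression)."""
    pattern = rb"(?<!end)stream\r?\n(.*?)\r?\nendstream"
    return re.findall(pattern, pdf_bytes, flags=re.DOTALL)

def script_features(script: str) -> dict:
    """Count occurrences of selected identifiers in a script."""
    tokens = Counter(re.findall(r"[A-Za-z_$][A-Za-z0-9_$]*", script))
    return {"js:" + kw: tokens[kw] for kw in SCRIPT_KEYWORDS}

def extract_features(pdf_bytes: bytes) -> dict:
    """Combine structure counts and script keyword counts into one record."""
    feats = structure_features(pdf_bytes)
    script = "\n".join(b.decode("latin-1") for b in stream_bodies(pdf_bytes))
    feats.update(script_features(script))
    return feats

# A tiny hand-made document fragment, not a real malware sample.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 3 0 R >>\nendobj\n"
    b"3 0 obj\n<< /S /JavaScript /JS 4 0 R >>\nendobj\n"
    b"4 0 obj\n<< /Length 36 >>\nstream\n"
    b"var s = unescape('%u9090'); eval(s);\n"
    b"endstream\nendobj\n"
)
features = extract_features(sample)
```

Each resulting dictionary can be flattened into a fixed-order numeric vector and passed to a classifier such as the SVM mentioned in the abstract.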
