A Study on Patent Literature Classification Using Distributed Representation of Technical Terms

  • Journal of the Korean Society for Library and Information Science
  • 2019, 53(2), pp.179-199
  • DOI : 10.4275/KSLIS.2019.53.2.179
  • Publisher : Korean Society for Library and Information Science
  • Research Area : Interdisciplinary Studies > Library and Information Science
  • Received : March 28, 2019
  • Accepted : May 20, 2019
  • Published : May 31, 2019

Yunsoo Choi 1, Sung-Pil Choi 2

1 Department of Library and Information Science, Graduate School, Kyonggi University
2 Kyonggi University

Excellent Accredited

ABSTRACT

In this paper, we propose the best-performing methodology for classifying patent literature by examining various feature extraction methods, machine learning models, and deep learning models, and we report the performance achieved in our experiments. For feature extraction, we compared the traditional bag-of-words (BoW) method with a distributed representation method (word embedding vectors). For constructing the document collection, we compared morphological analysis and multi-grams. Classification performance was then verified using both a traditional machine learning model and a deep learning model. The experimental results show that the best performance is achieved when the deep learning model is applied with feature extraction based on distributed representation and morphological analysis. In the Section, Class, and Subclass classification experiments, performance improved by 5.71%, 18.84%, and 21.53%, respectively, compared with traditional classification methods.
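
To make the contrast between the two feature extraction approaches concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which embedding model was used or how term vectors were combined into a document vector, so the toy vocabulary, the two-dimensional embeddings, and the simple averaging step are all assumptions chosen for illustration.

```python
from collections import Counter


def bow_vector(tokens, vocabulary):
    """Bag-of-words: one count per vocabulary term, ignoring word order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def mean_embedding(tokens, embeddings, dim):
    """Distributed representation (assumed pooling): average the vectors
    of the tokens that have an embedding; return zeros if none do."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy data; real embeddings would be learned from a patent corpus.
vocabulary = ["battery", "cell", "electrode", "engine"]
toy_embeddings = {
    "battery": [1.0, 0.0],
    "cell": [0.8, 0.2],
    "electrode": [0.6, 0.4],
    "engine": [0.0, 1.0],
}
doc = ["battery", "electrode", "battery"]

bow = bow_vector(doc, vocabulary)                    # sparse, vocabulary-sized
dense = mean_embedding(doc, toy_embeddings, dim=2)   # dense, fixed-size
```

In this sketch the BoW vector grows with the vocabulary and treats "battery" and "cell" as unrelated dimensions, while the averaged embedding stays at a fixed size and places related terms near each other. That difference is the general motivation for comparing the two representations.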
