
Deep Learning-based Target Masking Scheme for Understanding Meaning of Newly Coined Words

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2021, 26(10), pp.157-165
  • DOI : 10.9708/jksci.2021.26.10.157
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : September 13, 2021
  • Accepted : October 6, 2021
  • Published : October 29, 2021

Gun-Min Nam 1 Namgyu Kim 1

1 Kookmin University

Accredited

ABSTRACT

Recently, studies that use deep learning to analyze large amounts of text have been actively conducted. In particular, pre-trained language models, which apply what is learned from a large text corpus to the analysis of text in a specific domain, are attracting attention. Among the various pre-trained language models, BERT (Bidirectional Encoder Representations from Transformers)-based models are the most widely used. Recent research has sought to improve analysis performance through further pre-training with BERT's MLM (Masked Language Model). However, traditional MLM has difficulty clearly capturing the meaning of sentences that contain new words, such as newly coined words. Therefore, in this study, we propose NTM (Newly coined words Target Masking), which performs masking only on new words. Applying the proposed methodology to about 700,000 movie reviews from portal 'N', we confirmed that the proposed NTM outperforms the existing random masking in terms of sentiment analysis accuracy.
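The contrast between conventional random masking and the targeted masking the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the token list, the coined-word lexicon, and the function names are hypothetical, and real BERT pre-training operates on subword IDs rather than word strings.

```python
import random

MASK = "[MASK]"

def random_masking(tokens, ratio=0.15, seed=0):
    """Baseline MLM: mask a random ~15% of tokens (BERT's default ratio)."""
    rng = random.Random(seed)
    out = list(tokens)
    n = max(1, round(len(tokens) * ratio))
    for i in rng.sample(range(len(tokens)), n):
        out[i] = MASK
    return out

def ntm_masking(tokens, coined_words):
    """NTM-style masking: mask only tokens found in a newly-coined-word lexicon."""
    return [MASK if t in coined_words else t for t in tokens]

# Hypothetical movie-review tokens and neologism lexicon.
review = ["this", "movie", "is", "kkul-jaem", "really"]
coined = {"kkul-jaem"}

ntm_masking(review, coined)
# → ['this', 'movie', 'is', '[MASK]', 'really']
```

The point of the contrast: random masking may spend its masking budget on common words the model already represents well, whereas targeting only the new words forces the model to predict exactly the vocabulary it has not yet learned.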
