
Automatic Construction of Korean Unknown Word Dictionary using Occurrence Frequency in Web Documents

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2008, 13(3), pp.27-34
  • Publisher : The Korea Society of Computer and Information
  • Research Area : Engineering > Computer Science

박소영 1

1 Sangmyung University

Accredited

ABSTRACT

In this paper, we propose a method for automatically constructing a dictionary by extracting unknown words from given eojeols, in order to improve the performance of a Korean morphological analyzer. The proposed method consists of two dictionary construction phases: one based on full-text analysis and one based on web document frequency. The first phase recognizes unknown words from strings that occur repeatedly in a given full text, while the second phase recognizes unknown words among strings that occur only once in the text, based on how frequently each such string is retrieved from web documents. Experimental results show that, by utilizing web document frequency, the proposed method improves recall by 32.39% compared with a previous method.
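
The two phases described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how candidate strings are generated or which thresholds are used, so the prefix-based candidate generation, the names `candidate_strings` and `build_dictionary`, the parameters `min_text_count` and `min_web_count`, and the caller-supplied `web_frequency` lookup are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable


def candidate_strings(eojeol: str, min_len: int = 2) -> set[str]:
    """Return every prefix of the eojeol with at least min_len characters.

    Assumption: an unknown word sits at the front of an eojeol, followed by
    particles or endings. The paper may generate candidates differently.
    """
    return {eojeol[:i] for i in range(min_len, len(eojeol) + 1)}


def build_dictionary(
    eojeols: Iterable[str],
    web_frequency: Callable[[str], int],
    min_text_count: int = 2,    # hypothetical threshold, not from the paper
    min_web_count: int = 1000,  # hypothetical threshold, not from the paper
) -> tuple[set[str], set[str]]:
    """Return (words found by text repetition, words found by web frequency)."""
    counts: Counter[str] = Counter()
    for eojeol in eojeols:
        counts.update(candidate_strings(eojeol))

    # Phase 1: strings that occur repeatedly in the full text.
    from_text = {s for s, c in counts.items() if c >= min_text_count}

    # Phase 2: strings that occur only once in the text, accepted when the
    # (hypothetical) web lookup reports a high enough document frequency.
    from_web = {
        s for s, c in counts.items()
        if c == 1 and web_frequency(s) >= min_web_count
    }
    return from_text, from_web
```

In this sketch, `web_frequency` stands in for whatever search service reports how many web documents contain a string; passing it as a function keeps the example independent of any particular search API. Note that plain prefix counting also accepts fragments of repeated words, so a real system would need additional filtering.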
