
A Corpus-based Hybrid Model for Morphological Analysis and Part-of-Speech Tagging

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2008, 13(7), pp.11-18
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science

Seung-Wook Lee 1, Do-Gil Lee 2, Hae-Chang Rim 2

1 Graduate School of Information and Communication, Korea University
2 Korea University

Accredited

ABSTRACT

Korean morphological analyzers generally generate multiple candidate analyses and then select the most likely one. As the number of candidates increases, so does the chance that the correct analysis is among them. However, a larger candidate list also increases ambiguity, which degrades performance. In this paper, we propose a new rule-based model that produces a single best analysis. The analysis rules are extracted automatically from a large part-of-speech tagged corpus, so the model requires no manual rule construction, and it has shown a high analysis success rate. Furthermore, because the model produces exactly one analysis whenever it can successfully analyze the given word, it reduces the ambiguity and computational complexity of the candidate selection phase. When the rule-based model fails to produce an analysis, combining it with a conventional probability-based model also improves analysis performance.
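The hybrid flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the rule criterion (a word becomes a rule only if the corpus shows exactly one analysis for it), the most-frequent-analysis fallback, and the toy tagged words are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def extract_rules(tagged_corpus):
    """Collect word -> analysis rules from (word, analysis) pairs.

    Assumption: a word yields a rule only when the corpus contains
    exactly one distinct analysis for it.
    """
    seen = defaultdict(Counter)
    for word, analysis in tagged_corpus:
        seen[word][analysis] += 1
    rules = {w: next(iter(c)) for w, c in seen.items() if len(c) == 1}
    return rules, seen


def most_frequent_analysis(word, seen):
    """Hypothetical stand-in for a probability-based model:
    pick the analysis observed most often for this word, if any."""
    counts = seen.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def analyze(word, rules, seen, fallback=most_frequent_analysis):
    """Return (analysis, source): one rule-based analysis when a rule
    applies, otherwise the fallback model's choice."""
    if word in rules:
        return rules[word], "rule"
    return fallback(word, seen), "fallback"


# Toy corpus with illustrative morpheme/tag strings (not from the paper).
corpus = [
    ("학교에", "학교/NNG+에/JKB"),
    ("나는", "나/NP+는/JX"),
    ("나는", "나/NP+는/JX"),
    ("나는", "날/VV+는/ETM"),
]
rules, seen = extract_rules(corpus)
```

In this sketch, `analyze("학교에", rules, seen)` is answered by a rule because the toy corpus shows a single analysis for that word, while `analyze("나는", rules, seen)` is ambiguous in the corpus and is passed to the fallback selector.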


This paper was written with support from the National Research Foundation of Korea.