A Study on Statistical Feature Selection with Supervised Learning for Word Sense Disambiguation

  • Journal of the Korean Biblia Society for Library and Information Science
  • 2011, 22(2), pp.5-25
  • Publisher : Journal Of The Korean Biblia Society For Library And Information Science
  • Research Area : Interdisciplinary Studies > Library and Information Science

Lee Yong-Gu 1

1 Keimyung University

Accredited

ABSTRACT

This study aims to identify the most effective statistical feature selection method and context window size for supervised word sense disambiguation. Features were selected by four methods: information gain, document frequency, chi-square, and relevancy. A comparison of the resulting feature weights showed that identifying the most appropriate features can improve word sense disambiguation performance, and information gain performed best among the four methods. The SVM classifier was not affected by feature selection and performed better with a larger feature set and a larger context size. The Naive Bayes classifier performed best at 10 percent of the feature set size, and the kNN classifier performed best at under 10 percent of the feature set size. When feature selection methods are applied to word sense disambiguation, combining a small feature set with a larger context window, or a large feature set with a small context window, yields the best performance improvements.
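The abstract does not give formulas or data, so the following is only a minimal sketch of how information-gain feature selection over context-window features might look, assuming the standard entropy-based definition of information gain. The function names, the toy sentences for the ambiguous word "bank", and the window size are all hypothetical illustrations, not the author's implementation.

```python
import math
from collections import Counter


def context_features(tokens, target_index, window):
    """Return the set of words within `window` positions of the target word."""
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    return {tokens[i] for i in range(lo, hi) if i != target_index}


def entropy(counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(instances, feature):
    """Reduction in sense entropy from knowing whether `feature` is present.

    `instances` is a list of (feature_set, sense_label) pairs.
    """
    all_labels = Counter(label for _, label in instances)
    present = Counter(label for feats, label in instances if feature in feats)
    absent = Counter(label for feats, label in instances if feature not in feats)
    gain = entropy(all_labels.values())
    for part in (present, absent):
        size = sum(part.values())
        if size:
            gain -= (size / len(instances)) * entropy(part.values())
    return gain


def select_features(instances, fraction):
    """Keep the top `fraction` of the vocabulary, ranked by information gain."""
    vocab = set().union(*(feats for feats, _ in instances))
    ranked = sorted(vocab, key=lambda f: (-information_gain(instances, f), f))
    keep = max(1, round(len(vocab) * fraction))
    return ranked[:keep]


# Hypothetical toy corpus: each sentence contains the ambiguous word "bank".
corpus = [
    ("he deposited money at the bank yesterday", "finance"),
    ("the bank approved the loan", "finance"),
    ("we walked along the river bank", "river"),
    ("fish swam near the bank of the river", "river"),
]

WINDOW = 3  # assumed context window size (words on each side of the target)
instances = []
for sentence, sense in corpus:
    tokens = sentence.split()
    instances.append((context_features(tokens, tokens.index("bank"), WINDOW), sense))

selected = select_features(instances, 0.10)  # keep 10 percent of the features
```

In this sketch, `fraction` and `WINDOW` correspond to the two quantities the study varies (feature set size and context window size); the selected features would then be passed to a classifier such as SVM, Naive Bayes, or kNN.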
