A Study on Feature Selection for kNN Classifier using Document Frequency and Collection Frequency

  • Journal of Korean Library and Information Science Society
  • Abbr : JKLISS
  • 2013, 44(1), pp.27-47
  • DOI : 10.16981/kliss.44.1.201303.27
  • Publisher : Korean Library And Information Science Society
  • Research Area : Interdisciplinary Studies > Library and Information Science

Lee Yong-Gu 1

1 Keimyung University

Accredited

ABSTRACT

This study investigated the classification performance of a kNN classifier when features were selected by document frequency (DF) and collection frequency (CF). Experiments on the HKIB-20000 data set gave the following results. First, when high-frequency terms were kept and low-frequency terms removed, selection by the CF criterion achieved better classification performance than selection by the DF criterion. Second, neither the DF nor the CF method performed well when low-frequency terms were selected first. Last, combining the CF and DF criteria did not yield better classification performance than using either criterion alone.
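The difference between the two criteria can be illustrated with a minimal sketch. It assumes the usual definitions, which the abstract does not spell out: DF counts the documents that contain a term, and CF counts the term's total occurrences across the collection. The function names `term_frequencies` and `select_features` and the toy documents are hypothetical, chosen only for illustration; this is not the paper's implementation, and it covers only the feature selection step, not the kNN classifier.

```python
from collections import Counter


def term_frequencies(docs):
    """Return (df, cf) counters for a list of tokenized documents.

    Assumed definitions (not stated in the abstract):
      DF = number of documents that contain the term
      CF = total number of occurrences of the term in the collection
    """
    df, cf = Counter(), Counter()
    for tokens in docs:
        cf.update(tokens)       # every occurrence counts toward CF
        df.update(set(tokens))  # each document counts at most once toward DF
    return df, cf


def select_features(freq, k):
    """Keep the k highest-frequency terms; ties are broken alphabetically."""
    ranked = sorted(freq, key=lambda term: (-freq[term], term))
    return set(ranked[:k])


# Hypothetical toy collection: "library" is frequent overall but appears
# in fewer documents than "search".
docs = [
    ["library", "library", "library", "search"],
    ["search", "index"],
    ["search", "index", "library"],
]

df, cf = term_frequencies(docs)
by_df = select_features(df, 1)  # top term by document frequency
by_cf = select_features(cf, 1)  # top term by collection frequency
```

On this toy collection the two criteria pick different terms: "search" ranks first by DF because it occurs in every document, while "library" ranks first by CF because it is repeated within a document. Repetition inside a document is what CF sees and DF ignores.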
