@article{ART001103549,
author={Youngkee Kim},
title={Comparative Study of Feature Selection Methods for Korean Web Documents Clustering},
journal={Journal of the Korean Society for Library and Information Science},
issn={1225-598X},
year={2005},
volume={39},
number={1},
pages={45--58}
}
TY - JOUR
AU - Youngkee Kim
TI - Comparative Study of Feature Selection Methods for Korean Web Documents Clustering
JO - Journal of the Korean Society for Library and Information Science
PY - 2005
VL - 39
IS - 1
PB - 한국문헌정보학회
SP - 45
EP - 58
SN - 1225-598X
AB - This paper compares feature selection methods for clustering Korean web documents. First, we
examined how term features and co-links between web documents affect clustering performance. We clustered web
documents by native term features, by co-links, and by both, and compared the results with the originally assigned
categories. Next, we selected term features for each category from training documents using the chi-square statistic (χ²),
Information Gain (IG), and Mutual Information (MI), and applied these features to other experimental documents. In addition, we propose
a new method named Max Feature Selection, which selects the terms that have the maximum count for a category in
each experimental document, assigns each term its χ² (or MI or IG) value instead of its term frequency in the
document, and then clusters the documents. In the results, χ² performs better than IG or MI, but the difference
appears to be slight. When we applied the Max Feature Selection method, however, clustering performance improved
notably. Max Feature Selection is a simple but effective means of feature space reduction and performs
strongly for Korean web document clustering.
KW - Clustering;Feature Selection Methods;Korean Web Documents;Max Feature Selection
ER -
Youngkee Kim. (2005). Comparative Study of Feature Selection Methods for Korean Web Documents Clustering. Journal of the Korean Society for Library and Information Science, 39(1), 45-58.
Youngkee Kim. 2005, "Comparative Study of Feature Selection Methods for Korean Web Documents Clustering", Journal of the Korean Society for Library and Information Science, vol.39, no.1 pp.45-58.
Youngkee Kim "Comparative Study of Feature Selection Methods for Korean Web Documents Clustering" Journal of the Korean Society for Library and Information Science 39.1 pp.45-58 (2005) : 45.
Youngkee Kim. Comparative Study of Feature Selection Methods for Korean Web Documents Clustering. 2005; 39(1), 45-58.
Youngkee Kim. "Comparative Study of Feature Selection Methods for Korean Web Documents Clustering" Journal of the Korean Society for Library and Information Science 39, no.1 (2005) : 45-58.
Youngkee Kim. Comparative Study of Feature Selection Methods for Korean Web Documents Clustering. Journal of the Korean Society for Library and Information Science, 39(1), 45-58.
Youngkee Kim. Comparative Study of Feature Selection Methods for Korean Web Documents Clustering. Journal of the Korean Society for Library and Information Science. 2005; 39(1) 45-58.