@article{ART003307492,
author={KANG WOOJIN and SANGO NA and Jongwook Lee},
title={A Study on Automatic DDC Classification of Documents in Technology},
journal={Journal of the Korean Society for Library and Information Science},
issn={1225-598X},
year={2026},
volume={60},
number={1},
pages={173--194}
}
TY - JOUR
AU - KANG WOOJIN
AU - SANGO NA
AU - Jongwook Lee
TI - A Study on Automatic DDC Classification of Documents in Technology
JO - Journal of the Korean Society for Library and Information Science
PY - 2026
VL - 60
IS - 1
PB - Korean Society for Library and Information Science
SP - 173
EP - 194
SN - 1225-598X
AB - This study investigates the automatic classification of documents in the Dewey Decimal Classification (DDC) Technology class (600) using machine learning models, with the aim of overcoming the limitations of title-based classification approaches. To enhance classification performance, descriptive document information, such as summaries and introductions, was incorporated as additional classification features. Three machine learning models—Omikuji, FastText, and BERT—were employed, and classification performance was evaluated at both the main class and division levels. Accuracy and F1-score were used as evaluation metrics. The results demonstrate that BERT consistently outperformed FastText and Omikuji across most experimental conditions. With the exception of the division-level F1-score of the Omikuji model, all models showed improved performance when descriptive information was added. In particular, the BERT-based model achieved an accuracy of 79.52% at the division level, representing an improvement of approximately 8.62 percentage points compared to previous studies. The findings also indicate that classification performance generally improves as the volume of documents used in model training increases, underscoring the importance of data scale in addition to feature selection. These results suggest that competitive automatic classification performance can be achieved through appropriate model selection and enriched classification features, even within single-model approaches. Future research should expand the scope to all DDC classes and examine the applicability of the proposed approach to the Korean Decimal Classification (KDC), as well as explore additional features and alternative machine learning models.
KW - Automatic Classification;Dewey Decimal Classification (DDC);Technology;Machine Learning;Classification Features
DO -
UR -
ER -
KANG WOOJIN, SANGO NA and Jongwook Lee. (2026). A Study on Automatic DDC Classification of Documents in Technology. Journal of the Korean Society for Library and Information Science, 60(1), 173-194.
KANG WOOJIN, SANGO NA and Jongwook Lee. 2026, "A Study on Automatic DDC Classification of Documents in Technology", Journal of the Korean Society for Library and Information Science, vol.60, no.1 pp.173-194.
KANG WOOJIN, SANGO NA, Jongwook Lee "A Study on Automatic DDC Classification of Documents in Technology" Journal of the Korean Society for Library and Information Science 60.1 pp.173-194 (2026) : 173.
KANG WOOJIN, SANGO NA, Jongwook Lee. A Study on Automatic DDC Classification of Documents in Technology. Journal of the Korean Society for Library and Information Science. 2026; 60(1), 173-194.
KANG WOOJIN, SANGO NA and Jongwook Lee. "A Study on Automatic DDC Classification of Documents in Technology" Journal of the Korean Society for Library and Information Science 60, no.1 (2026) : 173-194.
KANG WOOJIN; SANGO NA; Jongwook Lee. A Study on Automatic DDC Classification of Documents in Technology. Journal of the Korean Society for Library and Information Science, 60(1), 173-194.
KANG WOOJIN; SANGO NA; Jongwook Lee. A Study on Automatic DDC Classification of Documents in Technology. Journal of the Korean Society for Library and Information Science. 2026; 60(1) 173-194.