A Study on Automatic Classification of Subject Headings Using BERT Model

  • Journal of the Korean Society for Library and Information Science
  • 2023, 57(2), pp.435-452
  • DOI : 10.4275/KSLIS.2023.57.2.435
  • Publisher : Korean Society for Library and Information Science
  • Research Area : Interdisciplinary Studies > Library and Information Science
  • Received : April 21, 2023
  • Accepted : May 19, 2023
  • Published : May 31, 2023

Lee Yong-Gu 1

1 Kyungpook National University

Excellent Accredited

ABSTRACT

This study applied a BERT-based transfer learning model to the automatic classification of subject headings and analyzed its performance, both by KDC main class and by the category type of the subject headings. Six datasets were constructed from Korean national bibliographies according to how frequently each subject heading was assigned, and titles were used as the classification features. On the dataset containing 3,506 subject headings (1,539,076 records), the model achieved a micro F1 score of 0.6059 and a macro F1 score of 0.5626. By KDC main class, performance was high for General works, Natural science, Technology, and Language, and low for Religion and Arts. By category type, the plant, legal name, and product name categories showed high performance, whereas the national treasure/treasure category showed low performance. In larger datasets, the proportion of subject headings that cannot be assigned increases, which lowers final performance; classification of low-frequency subject headings therefore needs to be improved.
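
The gap between the micro F1 and macro F1 scores reported above can be illustrated with a small sketch. It is not the study's evaluation code. It assumes one predicted heading per record, and the labels and predictions are hypothetical toy values chosen only to show how a missed low-frequency heading pulls macro F1 below micro F1.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted heading gains a false positive
            fn[t] += 1  # true heading gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all headings, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per heading, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: one frequent heading, two rare ones.
y_true = ["plants", "plants", "plants", "plants", "law", "treasure"]
y_pred = ["plants", "plants", "plants", "plants", "law", "law"]

micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

Because macro F1 weights every heading equally, the single missed rare heading in this toy example lowers it much more than micro F1, which is dominated by the frequent heading.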
