A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs) (서지 메타데이터 자동 생성 성능 비교 연구 - 국내외 대규모 언어모델(Large Language Model)을 중심으로 -)

Kim SeonWook (김선욱); LeeHyekyung (이혜경)

doi:10.14699/kbiblia.2025.36.4.303

@article{ART003279192,
author={Kim SeonWook and LeeHyekyung},
title={A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs)},
journal={Journal of the Korean Biblia Society for Library and Information Science},
issn={1229-2435},
year={2025},
volume={36},
number={4},
pages={303-331},
doi={10.14699/kbiblia.2025.36.4.303}
}

TY - JOUR
AU - Kim SeonWook
AU - LeeHyekyung
TI - A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs)
JO - Journal of the Korean Biblia Society for Library and Information Science
PY - 2025
VL - 36
IS - 4
PB - Journal Of The Korean Biblia Society For Library And Information Science
SP - 303
EP - 331
SN - 1229-2435
AB - This study aims to examine the feasibility of using domestic sovereign AI models and global large language models (LLMs) for automated creation of library metadata by comparing their performance in MARC record generation. To this end, six generative AI models (GPT, Gemini, Grok, HyperCLOVA, EXAONE, and A.X) were used to generate MARC records for 40 domestic and foreign monographs, and their field-level performance was evaluated using three criteria: completeness, correctness, and rule compliance. The analysis showed, first, that the three global LLMs (GPT, Gemini, Grok) generally outperformed domestic sovereign AI models, with fewer missing fields and more stable handling of formal elements such as indicators and codes. However, their performance tended to decline when the cataloguing target shifted from English-language to Korean books, as errors increased in field configuration and statement of responsibility. Second, the domestic sovereign AI models (HyperCLOVA, EXAONE, A.X) exhibited relatively low overall performance in both MARC21 and KORMARC, and did not show clear performance gains even for Korean books. Third, at the field level, most models generated relatively stable results for title and statement of responsibility (245), whereas rule-dependent fields such as series statements (490/830) and the choice of main entry showed large performance gaps between models and revealed structural misunderstandings of cataloguing rules, for example by mechanically transferring MARC21 practices for series treatment to KORMARC. These findings suggest that, at present, generative AI should be introduced into library metadata workflows primarily as an assistive tool for generating draft records and supporting error detection and correction, rather than as a fully automated cataloguing system.
The results also indicate that, in order to ensure stable performance of domestic sovereign AI models, systematic training on Korean bibliographic data, including KORMARC records, is required. Furthermore, the careful selection and curation of training data emerges as a key task in building sovereign AI systems for library applications.
KW - Generative AI;Sovereign AI;Automatic Metadata Generation;Korean Machine Readable Cataloging Format;KORMARC;MARC21
DO - 10.14699/kbiblia.2025.36.4.303
ER -

Kim SeonWook and LeeHyekyung. (2025). A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs). Journal of the Korean Biblia Society for Library and Information Science, 36(4), 303-331.

Kim SeonWook and LeeHyekyung. 2025, "A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs)", Journal of the Korean Biblia Society for Library and Information Science, vol.36, no.4, pp.303-331. Available from: doi:10.14699/kbiblia.2025.36.4.303

Kim SeonWook, LeeHyekyung "A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs)" Journal of the Korean Biblia Society for Library and Information Science 36.4 pp.303-331 (2025) : 303.

Kim SeonWook, LeeHyekyung. A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs). 2025; 36(4), 303-331. Available from: doi:10.14699/kbiblia.2025.36.4.303

Kim SeonWook and LeeHyekyung. "A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs)" Journal of the Korean Biblia Society for Library and Information Science 36, no.4 (2025) : 303-331. doi: 10.14699/kbiblia.2025.36.4.303

Kim SeonWook; LeeHyekyung. A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs). Journal of the Korean Biblia Society for Library and Information Science, 36(4), 303-331. doi: 10.14699/kbiblia.2025.36.4.303

Kim SeonWook; LeeHyekyung. A Comparative Study of Automatic Bibliographic Metadata Generation Performance: Focusing on Domestic and International Large Language Models (LLMs). Journal of the Korean Biblia Society for Library and Information Science. 2025; 36(4): 303-331. doi: 10.14699/kbiblia.2025.36.4.303

Journal of the Korean Biblia Society for Library and Information Science 2024 KCI Impact Factor : 1.0
