Named entity normalization for traditional herbal formula mentions

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2024, 29(10), pp.105-111
  • DOI : 10.9708/jksci.2024.29.10.105
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : September 3, 2024
  • Accepted : October 4, 2024
  • Published : October 31, 2024

Jang Ho 1

1 Korea Institute of Oriental Medicine

Accredited

ABSTRACT

In this paper, we propose methods for named entity normalization of traditional herbal formula mentions found in medical texts. Specifically, we developed methods to determine whether two mentions, such as the full name of an herbal formula and its abbreviation, refer to the same concept. We tried two approaches. First, we built a supervised classification model that uses BERT-based contextual vectors and character similarity features of herbal formula mentions to decide whether two mentions are identical. Second, we applied prompt-based querying with GPT-4o mini and GPT-4o to the same task. Both approaches achieved Precision, Recall, and F1-score above 0.9, and the GPT-4o-based approach achieved the highest Precision and F1-score. These results demonstrate that machine learning-based approaches are effective for named entity normalization in traditional medicine texts, and they suggest that the GPT-4o-based method is a promising foundation for intelligent information extraction systems in the traditional medicine domain.
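As a rough illustration of the character similarity features and the evaluation metrics mentioned in the abstract, the following is a minimal sketch using only the Python standard library. The abstract does not list the exact features, so the three features here (character Jaccard overlap, `difflib` ratio, and an in-order subsequence check) are assumptions, and the function names `char_similarity_features` and `precision_recall_f1` are hypothetical. The BERT-based contextual vectors and the classifier itself are not shown.

```python
from difflib import SequenceMatcher


def char_similarity_features(a: str, b: str) -> dict:
    """Character-level features for a pair of mentions (assumed feature set)."""
    a, b = a.lower(), b.lower()
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Jaccard overlap of the character sets of the two mentions
    jaccard = len(set_a & set_b) / len(union) if union else 1.0
    # difflib's similarity ratio in [0, 1]
    ratio = SequenceMatcher(None, a, b).ratio()
    # True when the shorter mention's characters appear in order in the longer one,
    # which is a common pattern for abbreviations
    short, long_ = sorted((a, b), key=len)
    remaining = iter(long_)
    is_subsequence = all(ch in remaining for ch in short)
    return {"jaccard": jaccard, "ratio": ratio, "is_subsequence": is_subsequence}


def precision_recall_f1(gold, pred):
    """Precision, Recall, and F1 for binary same-concept decisions."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative mention pair only; not taken from the paper's data
    print(char_similarity_features("Bojungikgi-tang", "BJIGT"))
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a supervised setup like the one described, features of this kind would be concatenated with the contextual vectors of the two mentions before classification; how the paper combines them is not stated in the abstract.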

This paper was written with support from the National Research Foundation of Korea.