Improving Deep Learning Performance on Imbalanced Medical Data Using Natural Language Data Augmentation Technique

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2025, 30(6), pp.11~20
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : April 3, 2025
  • Accepted : May 28, 2025
  • Published : June 30, 2025

Tae-Hyeong Kwon 1 Dae-Ho Kim 1 Se Young Kim 2 Ok-Ran Jeong 1

1 Gachon University
2 Changwon National University

Accredited

ABSTRACT

In this study, we developed a model to support nurses' decision-making using Korean nursing record data and explored data augmentation techniques to improve its performance. Previous research has focused primarily on English medical data, leaving Korean medical data understudied. To address this gap, we used electronic medical record (EMR) data from abdominal surgery patients to develop a KoBERT-based model for predicting nursing actions. To mitigate data imbalance, we applied techniques such as up/down sampling, few-shot augmentation, back-translation, and synonym replacement and compared their performance. Experimental results show that few-shot augmentation achieved the highest performance, confirming that data augmentation is effective in increasing the diversity of EMR data.
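
As a rough illustration of the simplest balancing technique named above, up/down sampling, the following sketch randomly resamples each class to a common size. It is not the implementation used in the study: the `resample` helper, the `target_size` parameter, and the toy records and labels (`action_1`, `action_2`) are all hypothetical and serve only to show the idea.

```python
import random
from collections import Counter, defaultdict


def resample(records, target_size, seed=0):
    """Randomly up- or down-sample every class to `target_size` examples.

    `records` is a list of (text, label) pairs. This is a hypothetical
    helper for illustration, not the procedure used in the paper.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_label = defaultdict(list)
    for text, label in records:
        by_label[label].append(text)

    balanced = []
    for label, texts in by_label.items():
        if len(texts) >= target_size:
            # Down-sampling: keep a random subset, without replacement.
            chosen = rng.sample(texts, target_size)
        else:
            # Up-sampling: keep all originals, then duplicate random ones.
            extra = rng.choices(texts, k=target_size - len(texts))
            chosen = texts + extra
        balanced.extend((text, label) for text in chosen)

    rng.shuffle(balanced)
    return balanced


# Toy data (made up): six records of one class, two of another.
records = [("note a", "action_1")] * 6 + [("note b", "action_2")] * 2
balanced = resample(records, target_size=4)
print(Counter(label for _, label in balanced))
```

Up-sampling by duplication only repeats existing records, whereas augmentation methods such as back-translation or synonym replacement generate new surface forms of minority-class text.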

This paper was written with support from the National Research Foundation of Korea.