Classification of Infant Crying Audio based on 3D Feature-Vector through Audio Data Augmentation

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2023, 28(9), pp.47-54
  • DOI : 10.9708/jksci.2023.28.09.047
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : August 25, 2023
  • Accepted : September 15, 2023
  • Published : September 27, 2023

JeongHyeon Park 1, JunHyeok Go 1, SiUng Kim 1, Nammee Moon 1

1 Hoseo University

Accredited

ABSTRACT

Infants use crying as a non-verbal means of communication [1]. Deciphering what a cry means, however, is difficult, and extensive research has been conducted on interpreting infant cry audio [2,3]. This paper proposes classifying infant cries using 3D feature vectors, with the training data expanded through several audio data augmentation techniques. The study dataset contains 5 classes (belly pain, burping, discomfort, hungry, tired). The data is augmented using 5 techniques (Pitch, Tempo, Shift, Mixup-noise, CutMix). Of these, the Tempo, Shift, and CutMix techniques improved performance. Applying the effective augmentation techniques together yielded a 17.75% performance improvement over models that use a single feature vector and the original data.
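Three of the augmentation techniques named in the abstract (Shift, Mixup-noise, CutMix) can be sketched on raw waveforms as follows. This is a minimal illustration with NumPy, not the authors' implementation: the function names, parameters, the choice to operate on 1-D waveforms rather than spectrograms, and the label-weight convention for CutMix are all assumptions. Pitch and Tempo are omitted because they normally require a resampling or phase-vocoder routine from an audio library.

```python
import numpy as np


def time_shift(wave, shift):
    """Shift augmentation (assumed form): circularly shift by `shift` samples."""
    return np.roll(wave, shift)


def mixup_noise(wave, noise, alpha):
    """Mixup-noise augmentation (assumed form): alpha * wave + (1 - alpha) * noise.

    `noise` must be at least as long as `wave`; extra samples are ignored.
    """
    return alpha * wave + (1.0 - alpha) * noise[: len(wave)]


def cutmix(wave_a, wave_b, start, length):
    """CutMix augmentation (assumed form): paste a segment of wave_b into wave_a.

    Returns the mixed waveform and lam, the fraction of samples that still
    come from wave_a (usable as a label weight for wave_a's class).
    """
    out = wave_a.copy()
    out[start:start + length] = wave_b[start:start + length]
    lam = 1.0 - length / len(wave_a)
    return out, lam


# Toy usage on short synthetic signals (hypothetical data, not cry recordings).
ramp = np.arange(8, dtype=float)
shifted = time_shift(ramp, 2)
noisy = mixup_noise(np.ones(8), np.zeros(10), alpha=0.8)
mixed, lam = cutmix(np.zeros(8), np.ones(8), start=2, length=3)
```

In a training pipeline, each function would typically be applied with randomly drawn parameters (shift amount, mixing weight, segment position) before feature extraction.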

This paper was written with support from the National Research Foundation of Korea.