Classification of Infant Crying Audio based on 3D Feature-Vector through Audio Data Augmentation

JeongHyeon Park (박정현); JunHyeok Go (고준혁); SiUng Kim (김시웅); Nammee Moon (문남미)

doi:10.9708/jksci.2023.28.09.047

Journal of The Korea Society of Computer and Information
Abbr : JKSCI
2023, 28(9), pp.47~54
DOI : 10.9708/jksci.2023.28.09.047
Publisher : The Korean Society of Computer and Information
Research Area : Engineering > Computer Science
Received : August 25, 2023
Accepted : September 15, 2023
Published : September 27, 2023

JeongHyeon Park ¹, JunHyeok Go ¹, SiUng Kim ¹, Nammee Moon ¹

¹Hoseo University

Accredited

ABSTRACT

Infants use crying as a non-verbal means of communication [1]. However, interpreting infant cries is difficult, and extensive research has been conducted on analyzing infant cry audio [2,3]. This paper proposes a method for classifying infant cries using 3D feature vectors together with several audio data augmentation techniques. The study dataset contains 5 classes (belly pain, burping, discomfort, hungry, tired). The data is augmented using 5 techniques (Pitch, Tempo, Shift, Mixup-noise, CutMix). Of these, the Tempo, Shift, and CutMix augmentation techniques improved performance. Applying the effective augmentation techniques simultaneously yielded a 17.75% performance improvement over models using single feature vectors and the original data.
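As a rough illustration of three of the augmentations named in the abstract (Shift, Mixup-noise, CutMix), the following is a minimal sketch on raw waveforms held as plain Python lists. The abstract does not give the authors' implementation, so the function names, parameters, and the circular form of the shift are assumptions made for illustration only. Treating Mixup-noise as a linear blend of a cry with a noise clip is likewise an assumption. Pitch and Tempo are left out because they usually rely on a dedicated audio library.

```python
def time_shift(wave, offset):
    """Shift: circularly rotate the samples by `offset` positions (assumed variant)."""
    wave = list(wave)
    if not wave:
        return []
    k = offset % len(wave)
    return wave[-k:] + wave[:-k] if k else wave


def mixup_noise(wave, noise, lam):
    """Mixup-noise: blend a cry with a noise clip as lam * wave + (1 - lam) * noise."""
    n = min(len(wave), len(noise))
    return [lam * wave[i] + (1.0 - lam) * noise[i] for i in range(n)]


def cutmix(wave_a, wave_b, start, length):
    """CutMix: replace wave_a[start:start+length] with the same span of wave_b.

    Returns the mixed signal and the fraction of samples that still come
    from wave_a, which can serve as the label weight for wave_a's class.
    """
    n = min(len(wave_a), len(wave_b))
    if n == 0:
        return [], 1.0
    start = max(0, min(start, n))
    end = min(n, start + max(0, length))
    mixed = list(wave_a[:start]) + list(wave_b[start:end]) + list(wave_a[end:n])
    keep_ratio = 1.0 - (end - start) / n
    return mixed, keep_ratio


# Toy signals standing in for two short audio clips (hypothetical data).
cry = [1.0, 2.0, 3.0, 4.0, 5.0]
other = [10.0, 20.0, 30.0, 40.0, 50.0]

shifted = time_shift(cry, 2)
blended = mixup_noise(cry, other, 0.5)
mixed, keep_ratio = cutmix(cry, other, 1, 2)
```

In practice each augmented waveform would be converted to features (for example MFCC, listed in the keywords) before training. That step is omitted here because the abstract does not describe how the 3D feature vector is built.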

KEYWORDS

3D Feature Vector, Data Augmentation, Infant, MFCC, Nonverbal sound

REFERENCES

[thesis] H. R. Jang / 2012 / Acoustic characteristic of crying infants related to communicating intent / M.S / Yonsei University

[journal] Lichuan Liu / 2019 / Infant cry language analysis and recognition: an experimental approach / IEEE/CAA Journal of Automatica Sinica / Institute of Electrical and Electronics Engineers (IEEE) 6(3) : 778~788

[journal] Chunyan Ji / 2021 / A review of infant cry analysis and classification / EURASIP Journal on Audio, Speech, and Music Processing / Springer Science and Business Media LLC 2021(1)

[journal] 박경자 / 2007 / Children's Cortisol Patterning at ChildCare Centers / 아동학회지 / 한국아동학회 28(6) : 201~216

[journal] P. S. Zeskind / 1978 / Acoustic features and auditory perceptions of the cries of newborns with prenatal and perinatal complications / Child Dev 49(3) : 580~589

[journal] T. Murry / 1977 / Acoustical characteristics of infant cries: fundamental frequency / Child Lang 4(3) : 321~328

[web] / 2019 / WhyCry Technology / http://www.why-cry.com

[web] Priscilla Dunstan / dunstan baby language / https://dunstan-babies.com/

[confproc] C. A. Bratan / 2021 / Dunstan Baby Language Classification with CNN / 2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) / IEEE : 2021~

[journal] 강민정 / 2020 / Development of the Deep Learning System for Bird Classification Using Birdsong / 한국지식정보기술학회 논문지 / 한국지식정보기술학회 15(2) : 195~203

[journal] 임신철 / 2011 / Multiple octave-band based genre classification algorithm for music recommendation / 한국정보통신학회논문지 / 한국정보통신학회 15(7) : 1487~1494

[web] H. Zhang / 2018 / mixup: Beyond Empirical Risk Minimization / arXiv:1710.09412v2

[web] S. D. Yun / 2019 / CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features / arXiv:1905.04899v2

[journal] 최효정 / 2021 / Data augmentation in voice spoofing problem / 응용통계연구 / 한국통계학회 34(3) : 449~460

[journal] 이승관 / 2019 / Data Augmentation for DNN-based Speech Enhancement / 멀티미디어학회논문지 / 한국멀티미디어학회 22(7) : 749~758

[confproc] Yuan Yuan Wang / 2010 / Speech synthesis based on PSOLA algorithm and modified pitch parameters / International Conference on Computational Problem-Solving / IEEE : 296~299

[journal] 임재덕 / 2009 / Audio Signal Feature Extraction Techniques for Content Classification / 전자통신동향분석 / 한국전자통신연구원 24(6) : 121~

[journal] Junhee Park / 2022 / Design and Implementation of Attention Depression Detection Model Based on Multimodal Analysis / Sustainability / MDPI AG 14(6) : 3569~

[web] / Donate a cry-corpus / https://github.com/gveres/donateacry-corpus

This paper was written with support from the National Research Foundation of Korea.

Journal of The Korea Society of Computer and Information 2024 KCI Impact Factor : 0.81