Sound Classification Performance of Deep Neural Networks in the Presence of Disturbing Sounds (딥러닝을 이용한 소리 분류 시 방해음의 영향 분석)

Wongeun Oh (오원근); Lim, Dong Kyun (임동균)

doi:10.34163/jkits.2020.15.6.006

Journal of Knowledge Information Technology and Systems
Abbr : JKITS
2020, 15(6), pp.973~981
Publisher : Korea Knowledge Information Technology Society
Research Area : Interdisciplinary Studies > Interdisciplinary Research
Received : November 27, 2020
Accepted : December 11, 2020
Published : December 31, 2020

Wongeun Oh ¹, Lim, Dong Kyun ²

¹ Sunchon National University
² Hanyang Cyber University

Accredited

ABSTRACT

Environmental sound classification is the task of automatically classifying the sounds in our surroundings. It can be applied to home automation, security, and surveillance. Recently, deep learning approaches have been adopted as classifiers to increase performance. In this approach, a deep neural network is trained on a large amount of sound data, and once training is complete, sound picked up by a microphone is fed to the network for classification. During this stage, however, ambient noise can enter the microphone along with the sound to be identified, and the sound may not be classified properly because of this disturbing noise. The recognition rate of a deep neural network decreases as the loudness of the disturbing sound increases, but analysis of the effect of noise on classification has been limited. In this paper, we present the effect of disturbing noise on the classification rate. For this purpose, UrbanSound8K, which is composed of 10 types of urban environmental sounds, is used as the training and test data, and a VGG16-based CNN, which shows good performance in image classification, is adopted as the baseline model. For the disturbing noise, we use three types of sounds: daily noises (hairdryer, vacuum cleaner, faucet water, and hammer), voices (male, female, and synthesized sound), and music (cello, piano, and trumpet). In the experiment, these disturbing noises are mixed with the clean sounds so that the signal-to-noise ratio (SNR) ranges from -50 dB to 50 dB. The mixed sound is then applied to the deep neural network to obtain a recognition rate relative to the clean cases. The results show that the relative recognition rate is more than 90% of the clean-sound rate when the SNR is between 10 and 15 dB, and 95% or more when the SNR is greater than 20 dB, regardless of the type of disturbing sound.
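
The abstract does not give the mixing procedure in detail, so the following is only a minimal sketch of how a disturbing sound might be mixed with a clean sound at a target SNR, and how a recognition rate relative to the clean case could be computed. It assumes the SNR is defined as the ratio of mean signal power to mean noise power, and that the noise is looped or trimmed to the clip length; the function names `mix_at_snr` and `relative_recognition_rate` are hypothetical and are not taken from the paper.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` after scaling it to a target SNR in dB.

    Assumption: SNR = 10 * log10(mean clean power / mean noise power).
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Loop (or trim) the noise so it has the same length as the clean clip.
    noise = np.resize(noise, clean.shape)

    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("clean and noise signals must have non-zero power")

    # Gain that makes p_clean / (gain**2 * p_noise) equal 10**(snr_db / 10).
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def relative_recognition_rate(acc_mixed, acc_clean):
    """Accuracy on mixed sounds as a percentage of accuracy on clean sounds."""
    if acc_clean <= 0:
        raise ValueError("clean accuracy must be positive")
    return 100.0 * acc_mixed / acc_clean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2 * np.pi * 440.0 * t)   # stand-in for a clean clip
    noise = rng.standard_normal(5000)       # stand-in for a disturbing sound
    for snr_db in range(-50, 51, 25):       # sweep over the range in the abstract
        mixed = mix_at_snr(clean, noise, snr_db)
        print(snr_db, float(np.max(np.abs(mixed))))
```

In practice the mixed waveform may need peak normalization before feature extraction to avoid clipping at very low SNRs; whether the authors did this is not stated.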

KEYWORDS

Environmental sound classification, Noise, CNN, VGG16, UrbanSound8K, Signal to noise ratio

This paper was written with support from the National Research Foundation of Korea.

Journal of Knowledge Information Technology and Systems KCI Impact Factor : 0.0
