Error Correction for Korean Speech Recognition using a LSTM-based Sequence-to-Sequence Model

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2021, 26(10), pp.1-7
  • DOI : 10.9708/jksci.2021.26.10.001
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : June 23, 2021
  • Accepted : August 31, 2021
  • Published : October 29, 2021

Hye-won Jin 1 A-Hyeon Lee 1 Ye-Jin Chae 1 Su-Hyun Park 1 Yu-Jin Kang 1 Soowon Lee 1

1 Soongsil University

Accredited

ABSTRACT

Most research on correcting speech recognition errors has targeted English, and comparatively little has addressed Korean. Compared with English, however, Korean speech recognition produces many errors that stem from linguistic characteristics of the Korean language, such as Korean fortis and Korean liaison, so dedicated research on Korean is needed. Furthermore, earlier work relied primarily on edit distance algorithms and syllable restoration rules, which makes fortis and liaison errors difficult to correct. In this paper, we propose a context-sensitive post-processing model for speech recognition that uses an LSTM-based sequence-to-sequence model with the Bahdanau attention mechanism to correct Korean speech recognition errors caused by pronunciation. In experiments, the model improved speech recognition performance from 64% to 77% for fortis, from 74% to 90% for liaison, and from 69% to 84% on average. These results suggest that the proposed model could be applied to real-world applications based on speech recognition.
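
For readers unfamiliar with the Bahdanau attention mechanism named in the abstract, the following is a minimal sketch of the additive attention step in plain Python. It is not the authors' implementation: the function names, the parameter names (W_s, W_h, v), and the toy values are assumptions chosen for illustration, and a real model would learn these weights and take its states from LSTM layers.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Additive (Bahdanau-style) attention over encoder states.

    score_i = v . tanh(W_s @ decoder_state + W_h @ encoder_states[i])
    weights = softmax(scores)
    context = sum_i weights[i] * encoder_states[i]

    W_s, W_h, and v are illustrative parameters (assumptions); in a
    trained model they are learned.
    """
    projected_state = matvec(W_s, decoder_state)
    scores = []
    for h in encoder_states:
        projected_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over the alignment scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy values only; they do not come from the paper.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    weights, context = additive_attention(
        decoder_state=[0.5, -0.5],
        encoder_states=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
        W_s=identity,
        W_h=identity,
        v=[1.0, 1.0],
    )
    print(weights, context)
```

At each decoding step the weights indicate which input positions the decoder attends to, which is what lets a post-processing model of this kind use surrounding context when choosing a correction.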

This paper was written with support from the National Research Foundation of Korea.