MLM-based Misrecognized Word Correction for Speech Recognition

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2025, 30(11), pp.79~89
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : September 22, 2025
  • Accepted : October 24, 2025
  • Published : November 28, 2025

Yonghun Jang 1, Jung Min Lim 2, Seong-Guk Nam 1, Minhyung Ryu 1, Eunjin Yoo 1, Myung-Sub Lee 3, Jong Wook Kwak 2

1 Near Networks (니어네트웍스)
2 Yeungnam University
3 Yeungnam University College

Accredited

ABSTRACT

In this study, we propose an integrated approach that improves the accuracy of Korean speech recognition by correcting misrecognitions caused by phonetic similarity. The proposed system combines three components: (1) frequency-domain noise reduction, using Minimum Mean Square Error (MMSE)-based log-spectral estimation and a high-pass emphasis filter, to raise the signal-to-noise ratio; (2) detection of contextually inappropriate words using KoBERT-based Masked Language Modeling (MLM); and (3) selection of the final correction word using a Jamo-level Levenshtein distance, which reflects the phonetic characteristics of the Korean language. In an experiment on 1,000 Korean sentences containing misrecognized words, the proposed method reduced the Word Error Rate (WER) from the baseline's 9.2% to 4.7% and achieved a maximum detection accuracy of 96.4% for misrecognized words. These results indicate that the proposed method can significantly improve the performance of real-world speech recognition systems.
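
As an illustration of component (3), the following is a minimal Python sketch of a Jamo-level Levenshtein distance. It is not the authors' implementation: the function names (`to_jamo`, `levenshtein`, `jamo_distance`, `pick_closest`) are hypothetical, the decomposition relies on Unicode NFD normalization as one possible way to split syllables into Jamo, and the candidate list is hard-coded here, whereas in the described system it would presumably come from the KoBERT-based MLM step.

```python
import unicodedata


def to_jamo(text: str) -> str:
    # Assumption: NFD normalization is used to split each precomposed
    # Hangul syllable into its conjoining Jamo (initial, vowel, final).
    return unicodedata.normalize("NFD", text)


def levenshtein(a: str, b: str) -> int:
    # Standard dynamic-programming edit distance with unit costs for
    # insertion, deletion, and substitution.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def jamo_distance(w1: str, w2: str) -> int:
    # Edit distance measured over Jamo rather than whole syllables.
    return levenshtein(to_jamo(w1), to_jamo(w2))


def pick_closest(recognized: str, candidates: list[str]) -> str:
    # Hypothetical selection rule: choose the candidate whose Jamo
    # sequence is closest to the recognized (possibly wrong) word.
    return min(candidates, key=lambda c: jamo_distance(recognized, c))


if __name__ == "__main__":
    print(jamo_distance("발", "밤"))        # 1: only the final consonant differs
    print(jamo_distance("발", "곰"))        # 3: all three Jamo differ
    print(pick_closest("발", ["곰", "밤"]))  # 밤
```

At the syllable level, both 밤 and 곰 are one substitution away from 발, so a syllable-based distance cannot rank them; the Jamo-level distance separates them (1 versus 3). Note that under NFD an initial consonant and the same consonant in final position are distinct code points, so this sketch treats them as different symbols; whether the paper does the same is not stated in the abstract.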
