UA Tree-based Reduction of Speech DB in a Large Corpus-based Korean TTS

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2010, 15(7), pp.91-98
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science

Jungchul Lee 1

1 University of Ulsan

Accredited

ABSTRACT

Large corpus-based concatenative text-to-speech (TTS) systems can generate natural synthetic speech without additional signal processing. However, improving the naturalness, personality, speaking style, and emotional expressiveness of synthetic speech requires a larger speech DB, so it becomes necessary to prune redundant speech segments from a large segmental speech DB. In this paper, we propose a new method, based on a clustering algorithm, for constructing a downsized segmental speech DB for a Korean TTS system. For the performance test, synthetic speech was generated with a Korean TTS system consisting of a language processing module, a prosody processing module, a segment selection module, a speech concatenation module, and a segmental speech DB. A mean opinion score (MOS) test was conducted on sets of synthetic speech generated with four different segmental speech DBs, which were constructed by combining a tree clustering method (CM1 or CM2) with either the full DB or the reduced DB. Experimental results show that the proposed method reduces the size of the speech DB by 23% while still achieving a high MOS in the perception test. The proposed method can therefore be applied to build a small-sized TTS system.
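The abstract does not spell out how redundant segments are chosen for removal, so the following is only a minimal sketch of one plausible cluster-based pruning rule, not the author's method. It assumes each speech segment has already been assigned to a leaf of a clustering tree and is described by a numeric feature vector; within each leaf it keeps the `keep_per_leaf` segments closest to the leaf centroid and drops the rest. The function name `prune_segments`, the parameter `keep_per_leaf`, the tuple layout, and the toy values are all hypothetical.

```python
from collections import defaultdict
from math import dist


def prune_segments(segments, keep_per_leaf):
    """Keep at most `keep_per_leaf` segments per tree leaf.

    `segments` is a list of (segment_id, leaf_id, feature_vector) tuples.
    This layout is an assumption for illustration, not taken from the paper.
    Returns the sorted list of segment ids that survive pruning.
    """
    # Group segments by the leaf they were clustered into.
    by_leaf = defaultdict(list)
    for seg_id, leaf_id, features in segments:
        by_leaf[leaf_id].append((seg_id, features))

    kept = []
    for members in by_leaf.values():
        dim = len(members[0][1])
        # Centroid of the leaf: per-dimension mean of its feature vectors.
        centroid = [
            sum(features[i] for _, features in members) / len(members)
            for i in range(dim)
        ]
        # Rank by Euclidean distance to the centroid, closest first.
        ranked = sorted(members, key=lambda m: dist(m[1], centroid))
        kept.extend(seg_id for seg_id, _ in ranked[:keep_per_leaf])
    return sorted(kept)


# Toy input with made-up values, only to show the call shape.
toy_segments = [
    ("a1", "leaf_a", (0.0, 0.0)),
    ("a2", "leaf_a", (0.1, 0.0)),
    ("a3", "leaf_a", (5.0, 5.0)),
    ("b1", "leaf_b", (1.0, 1.0)),
]
kept_ids = prune_segments(toy_segments, keep_per_leaf=2)
```

In this toy input the outlying segment `a3` is dropped from `leaf_a`, while `leaf_b` keeps its single member. A real system would also have to verify, as the MOS test in the abstract does, that the reduced DB does not degrade perceived quality.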
