Prosody Improvement for HMM-based Mandarin Speech Synthesis Using the Tone Nucleus Model
Abstract
The HMM-based text-to-speech system has attracted great interest owing to its compact and flexible modeling of spectral, F0, and duration parameters. The quality of the synthesized speech depends heavily on the context model. However, complex F0 variations make it difficult to define the tone type in continuous Mandarin speech. Moreover, the F0 and duration trajectories generated by HMM-based speech synthesis are often excessively smoothed and lack prosodic variance. The tone nucleus of a syllable is assumed to carry the target F0 of the associated lexical tone, and it usually conforms more closely to the standard tone pattern. In this paper, we model F0 variations at different levels, ranging from segmental factors to tone co-articulation, and apply the tone nucleus model to HMM-based Mandarin speech synthesis.
- 2011-07-14
Authors
-
Keikichi Hirose
Department of Information and Communication Engineering, The University of Tokyo
-
Nobuaki Minematsu
Department of Information and Communication Engineering, The University of Tokyo
-
Miaomiao Wang
Department of Electrical Engineering and Information Systems, The University of Tokyo
-
Miaomiao Wen
Department of Electrical Engineering and Information Systems, The University of Tokyo
Related papers
- An Investigation of Hidden Structure Model
- Prosody Conversion for Emotional Mandarin Speech Synthesis Using the Tone Nucleus Model
- Prosody Improvement for HMM-based Mandarin Speech Synthesis Using the Tone Nucleus Model
- A Preliminary Perceptual Analysis on the Relationship of Phoneme Duration and Speaking Rate
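- Waveform Overlap-Add Speech Synthesis Using Linguistic Clustering and Automatic Unit Selection (original title: 言語クラスタリングと自動単位選抜による波形重畳音声合成)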