A context clustering technique for improvement of tone intelligibility of average-voice-based Thai speech synthesis (Speech) -- (International Workshop "Asian workshop on speech science and technology")
Abstract
This paper describes a novel approach to context clustering in speaker-independent HMM-based Thai speech synthesis, aimed at improving the tone intelligibility of both the average voice and the speaker-adapted voice. In our previous work, phrase intonation features extracted from a generative model were proposed to improve tone intelligibility. In the present work, we propose a set of tonal features, including tone-geometrical features and phrase intonation features, to be exploited in the context clustering process of the HMM training stage. In the experiments, subjective evaluations of tone intelligibility are conducted for both the average voice and the adapted voice. The effects of the extracted features on the decision trees are also evaluated. Two core experiments were conducted, taking the gender of the training speech into account. The first experiment shows that the proposed tonal features can improve the tone intelligibility of the female speech model beyond that of the male speech model, while the second shows that they yield a greater improvement in tone intelligibility for the gender-dependent model than for the gender-independent model. Both sets of results confirm that the tone correctness of the synthesized speech is significantly improved when most of the extracted features are used.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-03-13
Authors
- Kobayashi Takao, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
- Chomphan Suphattharachai, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Related papers
- A Style Control Technique for HMM-Based Expressive Speech Synthesis(Speech and Hearing)
- A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features(Speech Synthesis, Statistical Modeling for Speech Processing)
- Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing(Life-like Agent and its Communication)
- Acoustic Modeling of Speaking Styles and Emotional Expressions in HMM-Based Speech Synthesis(Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- A Hidden Semi-Markov Model-Based Speech Synthesis System(Speech and Hearing)
- State Duration Modeling for HMM-Based Speech Synthesis(Speech and Hearing)
- A Training Method of Average Voice Model for HMM-Based Speech Synthesis(Digital Signal Processing)
- A Context Clustering Technique for Average Voice Models (Special Issue on Speech Information Processing)
- Speaker Adaptation of Pitch and Spectrum for HMM-Based Speech Synthesis
- Multi-Space Probability Distribution HMM(Special Issue on the 2000 IEICE Excellent Paper Award)
- Robust F0 Estimation of Speech Signal Using Harmonicity Measure Based on Instantaneous Frequency(Speech and Hearing)
- A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM
- HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation
- Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training(Speech and Hearing)
- A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM
- HMM-Based Voice Conversion Using Quantized F0 Context
- Human Walking Motion Synthesis with Desired Pace and Stride Length Based on HSMM(Life-like Agent and its Communication)
- FOREWORD