Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes (Speech Analysis, <Special Section> Statistical Modeling for Speech Processing)
Abstract
This paper describes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of the F0 units are basically held invariant by eliminating any averaging operations in the analysis phase and by minimizing modification operations in the synthesis phase. The use of natural F0 shapes has great potential to cover a wide variety of speaking styles with the same framework, including not only read-aloud speech, but also dialogues and emotional speech. A linear-regression statistical model is used to "manipulate" the stored raw F0 shapes to build them up into a sentential F0 contour. Through experimental evaluations, the proposed model is shown to provide stable and robust F0 contour prediction for various speakers. By using this model, linguistically derived information about a sentence can be directly mapped, in a purely data-driven manner, to acoustic F0 values of the sentential intonation contour for a given target speaker.
- Paper from the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2006-03-01
Authors
- SAITO Takashi, IBM Research, Tokyo Research Laboratory, IBM Japan Ltd.
Related papers
- A VoiceFont Creation Framework for Generating Personalized Voices (Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes (Speech Analysis, Statistical Modeling for Speech Processing)