A VoiceFont Creation Framework for Generating Personalized Voices (Speech Synthesis and Prosody, <Special Section> Corpus-Based Speech Technologies)
Abstract
This paper presents a new framework for effectively creating VoiceFonts for speech synthesis. A VoiceFont in this paper is a voice inventory aimed at generating personalized voices. Creating well-formed voice inventories is a time-consuming and laborious task, which has become a critical issue for speech synthesis systems that attempt to synthesize many high-quality voice personalities. The framework we propose aims to drastically reduce this burden with a twofold approach. First, to substantially enhance the accuracy and robustness of automatic speech segmentation, we introduce a multi-layered speech segmentation algorithm with a new measure of segmental reliability. Second, to minimize the amount of human intervention in VoiceFont creation, we provide easy-to-use functions in a data viewer and compiler that facilitate checking and validation of the automatically extracted data. We conducted experiments to investigate the accuracy of the automatic speech segmentation and its robustness to speaker and style variations. The results on six speech corpora covering a fairly wide range of speaking styles show that the segmentation algorithm is quite accurate and robust in extracting both phoneme and accentual-phrase segments. In addition, to subjectively evaluate VoiceFonts created with the framework, we conducted a listening test for speaker recognizability. The results show that the voice personalities of synthesized speech generated by the VoiceFont-based speech synthesizer are fairly close to those of the donor speakers.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2005-03-01
Authors
- Sakamoto Masaharu, IBM Research, Tokyo Research Laboratory, IBM Japan Ltd.
- Saito Takashi, IBM Research, Tokyo Research Laboratory, IBM Japan Ltd.
Related papers
- Wavelets and their Applications (Fundamental Technologies in Numerical Computation)
- A VoiceFont Creation Framework for Generating Personalized Voices (Speech Synthesis and Prosody, Corpus-Based Speech Technologies)
- Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes (Speech Analysis, Statistical Modeling for Speech Processing)