A System for the Synthesis of High-Quality Speech from Texts on General Weather Conditions (Special Section on Speech Synthesis: Current Technologies and Equipment)
Abstract
A text-to-speech conversion system for Japanese has been developed for the purpose of producing high-quality speech output. This system consists of four processing stages: 1) linguistic processing, 2) phonological processing, 3) control parameter generation, and 4) speech waveform generation. Although the processing at the first stage is restricted to texts on general weather conditions, the other three stages can also cope with texts of news and narrations on other topics. Since the prosodic features of speech are largely related to linguistic information, such as word accent, syntactic structure and discourse structure, linguistic processing over a wider span than before, covering at least a whole sentence, is indispensable to obtain good quality speech with respect to prosody. From this point of view, input text was restricted to weather forecast sentences, and a method for linguistic processing was developed to conduct morpheme, syntactic and semantic analyses simultaneously. A quantitative model for generating fundamental frequency contours was adopted so that the linguistic information is well reflected in the prosody of synthetic speech. A set of prosodic rules was constructed to generate prosodic symbols representing the prosodic structures of the text from the linguistic information obtained at the first stage. A new speech synthesizer based on the terminal analog method was also developed to improve the segmental quality of synthetic speech. It consists of four paths of cascade-connected pole/zero filters and three waveform generators. The four paths are used, respectively, for the synthesis of vowels and vowel-like sounds, nasal murmur and buzz bar, friction, and plosion, while the three generators produce a voicing source waveform approximated by polynomials, a white Gaussian noise source for fricatives, and an impulse source for plosives.
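The abstract gives no filter equations, so the following is only a rough sketch of what one cascade path driven by an impulse source might look like. It assumes a standard second-order digital resonator of the kind widely used in formant synthesizers. The function names, formant frequencies, bandwidths, and sampling rate are illustrative assumptions rather than values from the paper, and zero (anti-resonance) sections are omitted.

```python
import math

def resonator_coeffs(freq_hz, bw_hz, fs_hz):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2] (unity gain at DC)."""
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c

def resonate(signal, freq_hz, bw_hz, fs_hz):
    """Pass a list of samples through one second-order pole section."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs_hz)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def cascade(signal, formants, fs_hz):
    """Connect several resonators in series (a cascade path)."""
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs_hz)
    return signal

FS = 10000  # assumed sampling rate in Hz
# Assumed excitation: an impulse every 100 samples (100 Hz at FS).
impulse_train = [1.0 if n % 100 == 0 else 0.0 for n in range(1000)]
# Assumed (frequency, bandwidth) pairs in Hz for a vowel-like sound.
vowel_like = cascade(impulse_train, [(700, 60), (1200, 90), (2500, 150)], FS)
```

In a cascade arrangement the relative formant amplitudes follow from the pole positions alone, which is one common reason to prefer it for vowel-like sounds.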
The validity of the above approach has been confirmed by listening tests using speech synthesized by the developed system. Improvements were realized in both the prosodic and the segmental quality of the synthetic speech.
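The abstract does not name the quantitative model used for fundamental frequency contours. Assuming it is the command-response formulation commonly associated with the authors, in which impulse-like phrase commands and stepwise accent commands are superposed on a baseline in the log-frequency domain, a minimal sketch could look as follows. All parameter values (`alpha`, `beta`, `gamma`, the baseline `fb`, and the command timings and amplitudes) are illustrative assumptions, not values from the paper.

```python
import math

def phrase_response(t, alpha=3.0):
    """Phrase control response: alpha^2 * t * exp(-alpha*t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control response: min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0_contour(times, fb, phrase_cmds, accent_cmds,
               alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at each time.

    phrase_cmds: list of (onset_time, amplitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0, alpha)
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1, beta, gamma)
                            - accent_response(t - t2, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour

# Hypothetical utterance: one phrase command at 0 s, one accent from 0.3 s to 0.7 s.
times = [i * 0.01 for i in range(200)]
f0 = f0_contour(times, fb=110.0,
                phrase_cmds=[(0.0, 0.5)],
                accent_cmds=[(0.3, 0.7, 0.4)])
```

In such a scheme, the prosodic symbols produced by the prosodic rules would presumably be mapped to the timing and amplitude of these commands; that mapping is not described in the abstract and is not sketched here.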
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1993-11-25
Authors
-
Hirose Keikichi
The Faculty of Engineering, The University of Tokyo
-
Fujisaki Hiroya
The Faculty of Fundamental Engineering, Science University
Related papers
- A System for the Synthesis of High-Quality Speech from Texts on General Weather Conditions (Special Section on Speech Synthesis: Current Technologies and Equipment)
- Manifestation of Linguistic Information in the Voice Fundamental Frequency Contours of Spoken Japanese (Special Section on Speech Synthesis: Current Technologies and Equipment)