Fractal Modeling of Fluctuations in the Steady Part of Sustained Vowels for High Quality Speech Synthesis (Special Section on Nonlinear Theory and Its Applications)
Abstract
The naturalness of normal sustained vowels is considered attributable to the fluctuations observed in the steady part, where the speech signal appears almost periodic. Two kinds of involuntary fluctuation are always present in the steady part of sustained vowels, even when the vowels are phonated as steadily as possible: pitch period fluctuation and waveform fluctuation. In this study, frequency analyses of these fluctuations were conducted to investigate their general characteristics. The results suggested that the frequency characteristics of the fluctuations can be approximated as 1/f^β-like, which is regarded as a characteristic feature of random fractals. A procedure based on random fractal generation methods was therefore proposed to produce these fluctuations and so improve the voice quality of synthesized sustained vowels. A series of psychoacoustic experiments was also conducted to evaluate the proposed technique. The experimental results indicated that the technique was effective in making synthesized sustained vowels sound human-like. Unlike sustained vowels synthesized with neither pitch period fluctuation nor waveform fluctuation, the synthesized sustained vowels that contained the fluctuations were not perceived as buzzer-like, which is the major voice-quality problem of synthesized sustained vowels. However, it was also found that the fluctuations did not always serve as acoustic cues for the naturalness of normal sustained vowels. Synthesized sustained vowels containing fluctuations whose frequency characteristics were the same as those of white noise were perceived as noise-like, which is not at all the voice quality of normal sustained vowels.
The results of the psychoacoustic experiments indicated that the frequency characteristics of the fluctuations, which can be modeled as 1/f^β-like, were significant factors in the naturalness of normal sustained vowels.
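As an illustration of what a 1/f^β-like fluctuation looks like in practice, the following is a minimal sketch of one common random fractal generation method, spectral synthesis with random phases. It is not taken from the paper and may differ from the authors' procedure. The function names, the helper `perturb_pitch_periods`, and the numeric values in the usage note are hypothetical, chosen only for illustration.

```python
import math
import random


def fractal_fluctuation(n, beta, seed=None):
    """Return n samples whose power spectrum falls off as 1/f^beta.

    Spectral synthesis: sum cosines with amplitude k^(-beta/2) and a
    uniformly random phase. beta = 0 gives a flat (white) spectrum.
    """
    rng = random.Random(seed)
    n_harmonics = (n - 1) // 2  # stay strictly below the Nyquist bin
    amps = [k ** (-beta / 2.0) for k in range(1, n_harmonics + 1)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in amps]
    samples = []
    for t in range(n):
        value = 0.0
        for k, (a, p) in enumerate(zip(amps, phases), start=1):
            value += a * math.cos(2.0 * math.pi * k * t / n + p)
        samples.append(value)
    # Normalize to unit peak so the caller can scale the depth freely.
    peak = max(abs(v) for v in samples)
    return [v / peak for v in samples]


def perturb_pitch_periods(mean_period, depth, n_periods, beta, seed=None):
    """Hypothetical helper: pitch periods jittered by a 1/f^beta sequence."""
    fluct = fractal_fluctuation(n_periods, beta, seed)
    return [mean_period * (1.0 + depth * f) for f in fluct]
```

For example, `perturb_pitch_periods(8.0, 0.01, 64, beta=1.0)` would yield 64 pitch periods within 1% of an assumed 8 ms mean, while `beta=0.0` would give the white-noise-like fluctuation that the abstract reports as being perceived as noise-like.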
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1998-09-25
Authors
- Aoki N, Hokkaido Univ. Sapporo‐shi Jpn
- AOKI Naofumi, Research Institute for Electronic Science, Hokkaido University
- IFUKUBE Tohru, Research Institute for Electronic Science, Hokkaido University
- Ifukube T, Univ. Tokyo Tokyo Jpn
- Ifukube Tohru, Research Center For Advanced Science And Technology The University Of Tokyo
Related papers
- Effects of an Eyeglass-free 3-D Display on the Human Visual System
- Comparison of obstacle sense ability between the blind and the sighted: A basic psychophysical study for designs of acoustic assistive devices
- A Localizing Technique of Alteration Using Fragile Digital Watermarking Based on Number Theoretic Transform
- A Fragile Digital Watermarking Technique by Number Theoretic Transform(Special Section on Digital Signal Processing)
- Fractal Modeling of Fluctuations in the Steady Part of Sustained Vowels for High Quality Speech Synthesis (Special Section on Nonlinear Theory and Its Applications)
- A basic study for a robotic transfer aid system based on human motion analysis
- Dependency of the Skin Temperature on the Threshold and the Quality of the Tactile Sensation - A Basic Study for Tactile Aids-
- Comparative Study of Adaptation Phenomena Between Vibratory Stimulation and Braille like Stimulation for Tactile Communication
- Tactile Sense and Pressure of Toe Contribution to Standing in the Healthy Elderly
- Tone Enhancement in Mandarin Speech for Listeners with Hearing Impairment
- A Method for Determining the Timing of Displaying the Speaker's Face and Captions for a Real-Time Speech-to-Caption System