Simultaneous Estimation of Vocal Tract and Voice Source Parameters Based on an ARX Model
Abstract
A novel adaptive pitch-synchronous analysis method is proposed to simultaneously estimate vocal tract (formant/antiformant) and voice source parameters from speech waveforms. We use the parametric Rosenberg-Klatt (RK) model to generate a glottal waveform and an autoregressive-exogenous (ARX) model to represent the voiced speech production process. The Kalman filter algorithm is used to estimate the formant/antiformant parameters from the coefficients of the ARX model, and simulated annealing is employed as a nonlinear optimization approach to estimate the voice source parameters. The two approaches work together in a system identification procedure to find the best set of parameters for both models. The new method has been compared with several other approaches on synthetic speech in terms of the accuracy of the estimated parameter values and was shown to be superior. We also show that the proposed method can accurately estimate the parameters from natural speech sounds. A major application of the analysis method is a concatenative formant synthesizer, which allows flexible control of the voice quality of synthetic speech.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1995-06-25
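The estimation idea described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a second-order ARX model with a single formant and no antiformant, takes the source parameters as known (the simulated annealing search over them is omitted), and uses the commonly cited polynomial form of the RK glottal flow derivative, which this abstract does not spell out. All function names, the sampling rate, and the noise settings are hypothetical choices made for illustration.

```python
import math
import random

def rk_pulse_train(n_samples, t0, oq, av):
    """Glottal flow derivative u(n) = 2*a*n - 3*b*n^2 during the open phase.
    Assumed RK-style polynomial form; t0 is the pitch period in samples,
    oq the open quotient, av the amplitude parameter."""
    a = 27.0 * av / (4.0 * oq ** 2 * t0)
    b = 27.0 * av / (4.0 * oq ** 3 * t0 ** 2)
    n_open = int(oq * t0)
    u = []
    for n in range(n_samples):
        k = n % t0  # position inside the current pitch period
        u.append(2.0 * a * k - 3.0 * b * k * k if k <= n_open else 0.0)
    return u

def synthesize_arx(u, a1, a2, b0, noise_std, rng):
    """ARX model: y(n) + a1*y(n-1) + a2*y(n-2) = b0*u(n) + e(n)."""
    y = []
    for n, un in enumerate(u):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(-a1 * y1 - a2 * y2 + b0 * un + rng.gauss(0.0, noise_std))
    return y

def kalman_arx(u, y, q=1e-10, r=1e-4, p0=1e3):
    """Kalman filter with a random-walk state theta = [a1, a2, b0].
    q: process noise variance, r: measurement noise variance (both assumed)."""
    theta = [0.0, 0.0, 0.0]
    P = [[p0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for n in range(2, len(y)):
        phi = [-y[n - 1], -y[n - 2], u[n]]      # regressor vector
        for i in range(3):
            P[i][i] += q                         # time update (random walk)
        Pphi = [sum(P[i][j] * phi[j] for j in range(3)) for i in range(3)]
        s = sum(phi[i] * Pphi[i] for i in range(3)) + r   # innovation variance
        err = y[n] - sum(phi[i] * theta[i] for i in range(3))
        for i in range(3):
            theta[i] += Pphi[i] / s * err        # gain K = Pphi / s
            for j in range(3):
                P[i][j] -= Pphi[i] * Pphi[j] / s
    return theta

def formant_from_pole_pair(a1, a2, fs):
    """Formant frequency and bandwidth (Hz) of a complex-conjugate pole pair."""
    radius = math.sqrt(a2)
    freq = fs / (2.0 * math.pi) * math.acos(-a1 / (2.0 * radius))
    bandwidth = -fs / math.pi * math.log(radius)
    return freq, bandwidth

FS = 8000                                        # assumed sampling rate in Hz
rng = random.Random(0)
pole_radius = math.exp(-math.pi * 100.0 / FS)    # 100 Hz bandwidth
true_theta = (-2.0 * pole_radius * math.cos(2.0 * math.pi * 500.0 / FS),
              pole_radius ** 2, 1.0)             # one formant at 500 Hz
u = rk_pulse_train(800, t0=80, oq=0.6, av=1.0)
y = synthesize_arx(u, *true_theta, noise_std=0.01, rng=rng)
est = kalman_arx(u, y)
freq, bw = formant_from_pole_pair(est[0], est[1], FS)
print("estimated [a1, a2, b0]:", est)
print("formant frequency and bandwidth (Hz):", freq, bw)
```

In a joint scheme of the kind the abstract describes, an outer nonlinear search over the source parameters (here `t0`, `oq`, `av`) would presumably regenerate `u` and rerun the filter for each candidate, keeping the candidate with the smallest prediction error.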
Authors
-
Kasuya Hideki
Faculty of Engineering, Utsunomiya University
-
Adachi Shuichi
Faculty of Engineering, Utsunomiya University
-
Ding Wen
Faculty of Engineering, Utsunomiya University
Related papers
- Uniform and Non-uniform Normalization of Vocal Tracts Measured by MRI Across Male, Female and Child Subjects
- Generative model of F_0 change field for Mandarin trisyllabic words
- Statistical examination of invariance of relative F_0 change field for Chinese disyllabic words
- Evaluation of fundamental frequency (F_0) characteristics of speech in dysarthrias : A comparative study
- Prosodic variations in disyllabic meaningful words focused with different stress patterns in Mandarin Chinese
- Some considerations for designing spoken dialogue database from the viewpoint of paralinguistic information
- A System Identification Method for Linear Regression Models Based on Support Vector Machine
- Feedback active noise control system based on H_∞ control
- Simultaneous Estimation of Vocal Tract and Voice Source Parameters Based on an ARX Model
- Effects of speaking rate on mora timing organization in producing Japanese contrastive geminate/single consonants and long/short vowels by native and Chinese speakers
- F_0 Dynamics in Singing : Evidence from the Data of a Baritone Singer (Speech Dynamics by Ear, Eye, Mouth and Machine)
- Study of Bandgap Energies of Cu(In,Ga)Se Thin Films Grown by a Sequential Evaporation Method Using Piezoelectric Photothermal Spectroscopy (Special Issue : Ultrasonic Electronics)