A Model-Based Learning Process for Modeling Coarticulation of Human Speech (<Special Section> Knowledge, Information and Creativity Support System)
Abstract
Machine learning techniques have long been applied in many fields with considerable success. A learning process generally aims to obtain a set of parameters from a given data set by minimizing an objective function, so that the parameters explain the data in a maximum-likelihood or minimum-estimation-error sense. However, most learned parameters are highly data dependent and rarely reflect the true physical mechanism underlying the observed data. To obtain the knowledge inherent in the observed data, it is necessary to combine physical models with the learning process rather than merely fitting the observations with a black-box model. To reveal underlying properties of human speech production, we propose a learning process based on a physiological articulatory model and a coarticulation model, both of which are derived from human mechanisms. A two-layer learning framework was designed: the physiological articulatory model is used to learn the parameters at the physiological level, and the coarticulation model is used to learn the parameters at the motor-planning level. The learning process was carried out on an articulatory database of human speech production. The learned parameters were evaluated by numerical experiments and listening tests. The phonetic targets obtained in the planning stage provide evidence for understanding the virtual targets of human speech production. As a result, the model-based learning process reveals the inherent mechanism of human speech through learned parameters that carry physical meaning.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2007-10-01
Authors
-
Lu Xugang
ATR Spoken Language Communication Research Laboratories
-
Lu Xugang
School Of Information Science Japan Advanced Institute Of Science And Technology
-
Lu Xugang
Japan Advanced Institute Of Science And Technology
-
Dang Jianwu
Japan Advanced Inst. Of Sci. And Technol. Ishikawa Jpn
-
Lu Xugang
Information School Japan Advanced Institute Of Science And Technology
-
Dang Jianwu
Information School Japan Advanced Institute Of Science And Technology
-
WEI Jianguo
Japan Advanced Institute of Science and Technology
Related papers
- Robust voice activity detection based on noise eigenspace
- A model-based investigation of activations of the tongue muscles in vowel production
- Speech Enhancement based on Noise Eigenspace Projection
- A speech enhancement framework based on noise eigenspace projection (Speech)
- Comparative evaluation of modulation-transfer-function-based blind restoration of sub-band power envelopes of speech as a front-end processor for automatic speech recognition systems
- Sub-Band Temporal Envelope Restoration for ASR in Reverberation Environment (International Workshop: Frontiers in Speech and Hearing Research)
- Robust speech feature extraction based on auditory neuronal adaptation mechanism
- A Model-Based Learning Process for Modeling Coarticulation of Human Speech (Knowledge, Information and Creativity Support System)
- Normalization of vocal tract shape using radial basis function (Speech)
- Normalization of vocal tract shape using radial basis function
- Optimization and Evaluation of a Coarticulation Model based on Observation and Simulation
- Parameter Optimization for a Coarticulation Model Based on Observation and Simulation (International Workshop: Frontiers in Speech and Hearing Research)
- Extraction of Low Dimensional Representation of Vowels in Articulatory Space (International Workshop: Frontiers in Speech and Hearing Research)
- Comparison of Emotion Perception among Different Cultures
- Investigation of coarticulation in continuous speech of Japanese
- Investigation of coarticulation effects on vocal tract shapes of vowels based on similarity