An Acoustically Oriented Vocal-Tract Model
スポンサーリンク
概要
- 論文の詳細を見る
The objective of this paper is to find a parametric representation for the vocal-tract log-area function that is directly and simply related to basic acoustic characteristics of the human vocal-tract. The importance of this representation is associated with the solution of the articulatory-to-acoustic inverse problem, where a simple mapping from the articulatory space onto the acoustic space can be very useful. The method is as follows: Firstly, given a corpus of log-area functions, a parametric model is derived following a factor analysis technique. After that, the articulatory space, defined by the parametric model, is filled with approximately uniformly distributed points, and the corresponding first three formant frequencies are calculated. These formants define an acoustic space onto which the articulatory space maps. In the next step, an independent component analysis technique is used to determine acoustic and articulatory coordinate systems whose components are as independent as possible. Finally, using singular value decomposition, acoustic and articulatory coordinate systems are rotated so that each of the first three components of the articulatory space has major influence on one, and only one, component of the acoustic space. An example showing how the proposed model can be applied to the solution of the articulatory-to-acoustic inverse problem is given at the end of the paper.
- 社団法人電子情報通信学会の論文
- 1996-08-25
著者
-
TAKEDA Kazuya
Nagoya University
-
Takeda Kazuya
Nagoya Univ.
-
Takeda K
Nagoya Univ. Nagoya Jpn
-
Itakura Fumitada
Nagoya University
-
YEHIA Hani
Nagoya University
-
Yehia Hani
Nagoya University:atr Human Information Processing Res. Labs
関連論文
- Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria
- Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
- CENSREC-1-C : An evaluation framework for voice activity detection under noisy environments
- Driver Identification Using Driving Behavior Signals(Human-computer Interaction)
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments(Speech and Hearing)
- Evaluation of HRTFs estimated using physical features
- Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones(Feature Extraction and Acoustic Medelings, Corpus-Based Speech Technologies)
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
- Gamma Modeling of Speech Power and Its On-Line Estimation for Statistical Speech Enhancement(Speech Enhancement, Statistical Modeling for Speech Processing)
- Multichannel Speech Enhancement Based on Generalized Gamma Prior Distribution with Its Online Adaptive Estimation
- SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements (第6回音声言語シンポジウム)
- SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements (第6回音声言語シンポジウム)
- SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements (第6回音声言語シンポジウム)
- Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria
- Driver's irritation detection using speech recognition results (音声・第10回音声言語シンポジウム)
- Driver's irritation detection using speech recognition results (音声言語情報処理)
- Driver's irritation detection using speech recognition results (言語理解とコミュニケーション・第10回音声言語シンポジウム)
- Predicting the Degradation of Speech Recognition Performance from Sub-band Dynamic Ranges (特集 音声言語情報処理とその応用)
- A model of perceptual distance for group delays based on ellipsoidal mapping
- The effect of group delay spectrum on timbre
- Direction of Arrival Estimation Using Nonlinear Microphone Array
- Speech Enhancement Using Nonlinear Microphone Array Based on Noise Adaptive Complementary Beamforming
- Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming (Special Section on Digital Signal Processing)
- Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis
- An Acoustically Oriented Vocal-Tract Model
- Estimation of speaker and listener positions in a car using binaural signals
- Sound localization under conditions of covered ears on the horizontal plane
- Single-Channel Multiple Regression for In-Car Speech Enhancement
- Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition(Speech Enhancement, Multi-channel Acoustic Signal Processing)
- Speech Recognition Using Finger Tapping Timings(Speech and Hearing)
- CIAIR In-Car Speech Corpus : Influence of Driving Status(Corpus-Based Speech Technologies)
- Construction and Evaluation of a Large In-Car Speech Corpus(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Method for determining sound localization by auditory masking
- Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition
- CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments
- A Graph-Based Spoken Dialog Strategy Utilizing Multiple Understanding Hypotheses
- Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition