Elderly Acoustic Models for Large Vocabulary Continuous Speech Recognition
Abstract
In this paper, we evaluate elderly-speaker acoustic models for LVCSR, trained on a database of 301 elderly speakers aged 60 to 90, each of whom utters 200 sentences. The elderly-speaker PTM (Phonetic Tied Mixture) acoustic model attains a word recognition rate of 86.7%, which is better than the 82.1% obtained with the usual adult (average age 28.6) PTM acoustic model. To achieve higher recognition rates, we use two speaker adaptation methods: supervised MLLR and an unsupervised adaptation method based on the sufficient HMM statistics. In our experimental results, the elderly acoustic model is a better baseline HMM for adaptation to elderly speakers than the usual adult model.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2002-03-01
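The two recognition rates quoted in the abstract can be compared as a relative error reduction. The sketch below is only an illustration: it assumes the error rate is 100 minus the word recognition rate, which the abstract does not state, and the function name `relative_error_reduction` is hypothetical.

```python
def relative_error_reduction(baseline_rate: float, improved_rate: float) -> float:
    """Return the relative reduction in error, in percent.

    Assumption (not stated in the abstract): the error rate is
    100 minus the word recognition rate.
    """
    baseline_error = 100.0 - baseline_rate
    improved_error = 100.0 - improved_rate
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Word recognition rates quoted in the abstract (percent).
adult_rate = 82.1    # usual adult PTM acoustic model
elderly_rate = 86.7  # elderly-speaker PTM acoustic model

absolute_gain = elderly_rate - adult_rate
reduction = relative_error_reduction(adult_rate, elderly_rate)

print(f"absolute gain: {absolute_gain:.1f} points")
print(f"relative error reduction: {reduction:.1f}%")
```

Under that assumption, the 4.6-point gain corresponds to removing roughly a quarter of the errors made by the adult model on elderly speech.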
Authors
-
SHIKANO Kiyohiro
Graduate School of Information Science, Nara Institute of Science and Technology
-
Yoshizawa Shinichi
Matsushita Electric Industrial Co., Ltd.: Laboratories of Image Information Science and Technology
-
Baba Akira
Laboratories of Image Information Science and Technology: Matsushita Electric Works, Ltd.
-
Lee Akinobu
Graduate School Of Information Science Nara Institute Of Science
-
YAMADA Miichi
Graduate School of Information Science, Nara Institute of Science and Technology
Related papers
- Building an Effective Speech Corpus by Utilizing Statistical Multidimensional Scaling Method
- Cost Reduction of Acoustic Modeling for Real-Environment Applications Using Unsupervised and Selective Training
- Cross-language Voice Conversion Evaluation Using Bilingual Databases (Special Feature: Spoken Language Information Processing and Its Applications)
- Effect of Central Limit Theorem non-compliance on blind separation of speech by negentropy maximization
- Robots that can hear, understand and talk
- Probability Distribution of Time-Series of Speech Spectral Components(Audio/Speech Coding)(Applications and Implementations of Digital Signal Processing)
- A design of adaptive beamformer based on average speech spectrum for noisy speech recognition
- A Microphone Array-Based 3-D N-Best Search Method for Recognizing Multiple Sound Sources
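- Integration of Multiple Sound Source Localization and Speech Recognition Based on the 3-D N-best Search Method
- Evaluation of the 3-D N-best Search Method Using the Distance Between Sound Source Direction Paths in Multi-Speaker Speech Recognition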
- Non-Audible Murmur (NAM) Recognition(2004 IEICE Excellent Paper Award)
- Non-Audible Murmur (NAM) Recognition Exploiting Adaptation Techniques
- Objective sound quality evaluation for combination method of beamforming and spectral subtraction (Applied Acoustics)
- Fast Convergence Blind Source Separation Using Frequency Subband Interpolation by Null Beamforming
- Rapid Compensation of Temperature Fluctuation Effect for Multichannel Sound Field Reproduction System
- Development, Long-Term Operation and Portability of a Real-Environment Speech-Oriented Guidance System
- On-Line Relaxation Algorithm Applicable to Acoustic Fluctuation for Inverse Filter in Multichannel Sound Reproduction System(Sound Field Reproduction, Multi-channel Acoustic Signal Processing)
- Iterative Inverse Filter Relaxation Algorithm for Adaptation to Acoustic Fluctuation in Sound Reproduction System
- Sound Reproduction System Including Adaptive Compensation of Temperature Fluctuation Effect for Broad-Band Sound Control(Special Section on Digital Signal Processing)
- Elderly Acoustic Models for Large Vocabulary Continuous Speech Recognition
- Unsupervised Phoneme Model Training Based on the Sufficient HMM Statistics from Selected Speakers
- Interface for Barge-in Free Spoken Dialogue System Combining Adaptive Sound Field Control and Microphone Array(Speech and Hearing)
- Lecture Speech Recognition Using Large Corpus of Spontaneous Japanese
- A Self-Generator Method for Initial Filters of SIMO-ICA Applied to Blind Separation of Binaural Sound Mixtures(Blind Source Separation, Multi-channel Acoustic Signal Processing)
- Multistage SIMO-Model-Based Blind Source Separation Combining Frequency-Domain ICA and Time-Domain ICA(Adaptive Signal Processing and Its Applications)
- Direction of Arrival Estimation Using Nonlinear Microphone Array
- Evaluation of Extremely Small Sound Source Signals Used in Speaking-Aid System with Statistical Voice Conversion
- Improvements of the One-to-Many Eigenvoice Conversion System
- Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
- Adaptive Training for Voice Conversion Based on Eigenvoices
- Blind Separation and Deconvolution for Convolutive Mixture of Speech Combining SIMO-Model-Based ICA and Multichannel Inverse Filtering(Engineering Acoustics)
- Overdetermined Blind Separation for Real Convolutive Mixtures of Speech Based on Multistage ICA Using Subarray Processing(Speech/Acoustic Signal Processing)(Digital Signal Processing)
- Stable Learning Algorithm for Blind Separation of Temporally Correlated Acoustic Signals Combining Multistage ICA and Linear Prediction(Digital Signal Processing)
- Blind Source Separation of Acoustic Signals Based on Multistage ICA Combining Frequency-Domain ICA and Time-Domain ICA
- Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing
- An Iterative Inverse Filter Design Method for the Multichannel Sound Field Reproduction System(Special Section on Acoustic Signal Processing)
- Semi-Blind Optimization Scheme of Joint Suppression of Background Noise and Late Reverberation