Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
Abstract
Acoustic feature transformation based on discriminant analysis is widely used to improve speech recognition performance, as is discriminative training of HMMs. In this letter, we investigate the effectiveness of these two techniques and of their combination. We also investigate their robustness under matched and mismatched noise conditions between the training and evaluation environments.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2010-02-01
Authors
- SAKAI Makoto, DENSO CORPORATION
- KITAOKA Norihide, Nagoya University
- TAKEDA Kazuya, Nagoya University
- Nakagawa Seiichi, Toyohashi Univ. Technol. Toyohashi‐shi Jpn
- Nakagawa Seiichi, Department Of Information And Computer Sciences Toyohashi University
- Nakagawa Seiichi, Toyohashi Univ. Of Technol. Toyohashi‐shi Jpn
- Takeda Kazuya, Nagoya Univ.
- Takeda Kazuya, Graduate School Of Information Science Nagoya University
- Takeda Kazuya, Nagoya Univ. Nagoya‐shi Jpn
- Kitaoka Norihide, Toyohashi University Of Technology
- HATTORI Yuya, DENSO CORPORATION
- Kitaoka Norihide, Nagoya Univ.
- NAKAGAWA Seiichi, Nagoya University
- TAKEDA Kazuya, DENSO CORPORATION
- SAKAI Makoto, Nagoya University
Related papers
- Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria
- Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition
- Topic dependent language model based on on-line voting (Language Understanding and Communication)
- A transitive translation for Indonesian-Japanese CLQA (Natural Language Processing)
- A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System(Natural Language Processing)
- Indonesian-Japanese Transitive Translation using English for CLIR
- AN INTEGRATED AUDIO-VISUAL VIEWER FOR A LARGE SCALE MULTIPOINT CAMERAS AND MICROPHONES(International Workshop on Advanced Image Technology 2007)
- CENSREC-1-C : An evaluation framework for voice activity detection under noisy environments
- Driver Identification Using Driving Behavior Signals(Human-computer Interaction)
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- Topic dependent language model based on on-line voting (Speech)
- Topic dependent language model based on clustering of noun word history
- Word and class dependency of N-gram language model (Spoken Language Information Processing)
- Word and class dependency of N-gram language model (Language Understanding and Communication, 9th Spoken Language Symposium)
- Word and class dependency of N-gram language model (Speech, 9th Spoken Language Symposium)
- AN INTEGRATED AUDIO-VISUAL VIEWER FOR A LARGE SCALE MULTIPOINT CAMERAS AND MICROPHONES
- G_007 Arbitrary Listening-point Generation Using Acoustic Transfer Function Interpolation in A Large Microphone Array
- THE SUB-BAND SOUND WAVE RAY-SPACE REPRESENTATION(International Workshop on Advanced Image Technology 2006)
- A-16-24 3D Sound Wave Field Representation Based on Ray-Space Method(A-16. Multimedia and Virtual Environment Fundamentals, Fundamentals and Boundaries)
- TEXT-INDEPENDENT SPEAKER IDENTIFICATION ON TIMIT DATABASE
- Text-Independent/Text-Prompted Speaker Recognition by Combining Speaker-Specific GMM with Speaker Adapted Syllable-Based HMM(Speaker Recognition, Statistical Modeling for Speech Processing)
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Selective Listening Point Audio Based on Blind Signal Separation and Stereophonic Technology
- CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments(Speech and Hearing)
- Evaluation of HRTFs estimated using physical features
- Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- LVCSR based on context-dependent syllable acoustic models (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition
- Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
- LVCSR based on context-dependent syllable acoustic models
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN
- Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task(Spoken Language Systems, Corpus-Based Speech Technologies)
- An Unsupervised Speaker Adaptation Method for Lecture-Style Spontaneous Speech Recognition Using Multiple Recognition Systems(Spoken Language Systems, Corpus-Based Speech Technologies)
- Gamma Modeling of Speech Power and Its On-Line Estimation for Statistical Speech Enhancement(Speech Enhancement, Statistical Modeling for Speech Processing)
- Speaker Change Detection and Speaker Clustering Using VQ Distortion Measure
- Noisy Speech Recognition Based on Integration/Selection of Multiple Noise Suppression Methods Using Noise GMMs
- Multichannel Speech Enhancement Based on Generalized Gamma Prior Distribution with Its Online Adaptive Estimation
- SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements (6th Spoken Language Symposium)
- Succeeding Word Prediction for Speech Recognition Based on Stochastic Language Model
- Driver's irritation detection using speech recognition results (Speech, 10th Spoken Language Symposium)
- Driver's irritation detection using speech recognition results (Spoken Language Information Processing)
- Driver's irritation detection using speech recognition results (Language Understanding and Communication, 10th Spoken Language Symposium)
- A Survey on Automatic Speech Recognition(Special Issue on the 2000 IEICE Excellent Paper Award)
- Estimation of Frequency Components Contained in Sub-bands Based on Instantaneous Frequency
- Relationship among Recognition Rate, Rejection Rate and False Alarm Rate in a Spoken Word Recognition System
- Predicting the Degradation of Speech Recognition Performance from Sub-band Dynamic Ranges (Special Issue: Spoken Language Information Processing and Its Applications)
- An Acoustically Oriented Vocal-Tract Model
- Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions
- Distant Speech Recognition Using a Microphone Array Network
- Continuous Speech Recognition Using an On-Line Speaker Adaptation Method Based on Automatic Speaker Clustering (Special Issue on Speech Information Processing)
- Comparison of acoustic measures for evaluating speech recognition performance in an automobile
- Estimation of speaker and listener positions in a car using binaural signals
- Sound localization under conditions of covered ears on the horizontal plane
- Single-Channel Multiple Regression for In-Car Speech Enhancement
- Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition(Speech Enhancement, Multi-channel Acoustic Signal Processing)
- Speech Recognition Using Finger Tapping Timings(Speech and Hearing)
- CIAIR In-Car Speech Corpus : Influence of Driving Status(Corpus-Based Speech Technologies)
- Construction and Evaluation of a Large In-Car Speech Corpus(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
- Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems
- A Spoken Dialog System for Spontaneous Conversations Considering Response Timing and Response Type
- INVESTIGATIONS ON TEXT-INDEPENDENT SPEAKER IDENTIFICATION
- Method for determining sound localization by auditory masking
- Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition
- Selective Gammatone Envelope Feature for Robust Sound Event Recognition
- CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments
- Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
- A Graph-Based Spoken Dialog Strategy Utilizing Multiple Understanding Hypotheses