Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation (Speech and Hearing)
Abstract
This paper proposes applying tree-structured clustering to noisy speech collected under various SNR conditions, within the framework of piecewise-linear transformation (PLT)-based HMM adaptation. Three clustering methods are described: a one-step method that clusters noise and SNR conditions jointly, and two two-step methods that construct a tree for each SNR condition. Following the clustering results, a noisy-speech HMM is built for each node of the tree. The HMM that best matches the input speech is selected under a likelihood-maximization criterion by tracing the tree from top to bottom, and the selected HMM is then further adapted by linear transformation. The proposed methods are evaluated on a Japanese dialogue recognition system. The results confirm that they are effective in recognizing both speech with digitally added noise and actual noisy speech uttered by a wide range of speakers under various noise conditions. The results also indicate that the one-step clustering method performs better than the two-step clustering methods.
- 2005-09-01
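The top-down model selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: each node's noisy-speech HMM is replaced here by a single diagonal Gaussian, the search is assumed to be a greedy descent that keeps the best-scoring node on the path, and the names `Node`, `log_likelihood`, `select_model`, and `build_demo_tree` are hypothetical. The subsequent linear-transformation adaptation step is not shown.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Node:
    """One tree node; a diagonal Gaussian stands in for the node's HMM (assumption)."""
    name: str
    mean: Sequence[float]
    var: Sequence[float]
    children: List["Node"] = field(default_factory=list)


def log_likelihood(node: Node, frames: Sequence[Sequence[float]]) -> float:
    """Sum of per-frame log-likelihoods under the node's diagonal Gaussian."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, node.mean, node.var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def select_model(root: Node, frames: Sequence[Sequence[float]]) -> Node:
    """Greedy top-down search (assumed): at each level follow the child with the
    highest likelihood, and return the best-scoring node seen along that path."""
    best, best_score = root, log_likelihood(root, frames)
    current = root
    while current.children:
        current = max(current.children, key=lambda c: log_likelihood(c, frames))
        score = log_likelihood(current, frames)
        if score > best_score:
            best, best_score = current, score
    return best


def build_demo_tree() -> Node:
    """Toy one-dimensional tree with made-up parameters, for illustration only."""
    leaf_a = Node("leaf_a", [2.0], [0.5])
    leaf_b = Node("leaf_b", [0.5], [0.5])
    high = Node("snr_high", [1.0], [1.0], [leaf_a, leaf_b])
    low = Node("snr_low", [-1.0], [1.0])
    return Node("root", [0.0], [4.0], [high, low])


if __name__ == "__main__":
    chosen = select_model(build_demo_tree(), [[1.9], [2.1]])
    print(chosen.name)
```

Keeping the best node on the path, rather than always returning a leaf, lets a broader cluster model win when no specific cluster fits the input well; whether the paper does exactly this is an assumption of the sketch.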
Authors
-
Furui Sadaoki
Department Of Computer Science Graduate School Of Information Science And Engineering Tokyo Institut
-
Furui Sadaoki
Tokyo Inst. Of Technol. Tokyo Jpn
-
ZHANG Zhipeng
Multimedia Laboratories, NTT DoCoMo, Inc.
-
SUGIMURA Toshiaki
Multimedia Laboratories, NTT DoCoMo, Inc.
Related papers
- Gait-based Person Identification Robust against Speed Variation using CHLAC features and HMMs
- Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation (Speech and Hearing)
- Robust Scene Extraction Using Multi-Stream HMMs for Baseball Broadcast (Image Processing and Video Processing)
- Automatic recognition of Indonesian declarative questions and statements using polynomial coefficients of the pitch contours
- Initial evaluation of the drivers' Japanese speech corpus in a car environment (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Accent analysis for Mandarin large vocabulary continuous speech recognition (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Evaluation of a Noise-Robust Multi-Stream Speaker Verification Method Using F_0 Information
- Topic Extraction Based on Continuous Speech Recognition in Broadcast News Speech
- Noise Robust Speech Recognition Using F_0 Contour Information (Speech Dynamics by Ear, Eye, Mouth and Machine)
- Recent Progress in Corpus-Based Spontaneous Speech Recognition (Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Phonology and Morphology Modeling in a Very Large Vocabulary Hungarian Dictation System (Speech and Hearing)
- THE USE OF FINITE-STATE TRANSDUCERS FOR MODELING PHONOLOGICAL AND MORPHOLOGICAL CONSTRAINTS IN AUTOMATIC SPEECH RECOGNITION
- Adaptation to Pronunciation Variations in Indonesian Spoken Query-Based Information Retrieval
- Language Models for Continuous Speech Recognition
- Dynamic Bayesian Network-Based Acoustic Models Incorporating Speaking Rate Effects (Speech and Hearing)
- Neural-network-based HMM adaptation for noisy speech recognition
- Speaker Verification Using MMAP Adaptation (Natural Language Understanding and Communication)
- Speaker Verification Using MMAP Adaptation (Speech)
- Subject Adaptation and Adaptive Training for Gait-based Person Identification
- Two-pass Approach for Recognizing Code-Switching Speech
- Committee-Based Active Learning for Speech Recognition
- Robust Gait-Based Person Identification against Walking Speed Variations
- Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech
- Active Learning Using Phone-Error Distribution for Speech Modeling
- Distance-based Factor Graph Linearization and Sampled Max-sum Algorithm for Efficient 3D Potential Decoding of Macromolecules