Orthogonalized Distinctive Phonetic Feature Extraction for Noise-Robust Automatic Speech Recognition (Special Section: Speech Dynamics by Ear, Eye, Mouth and Machine)
Abstract
In this paper, we propose a noise-robust automatic speech recognition system that uses orthogonalized distinctive phonetic features (DPFs) as input to an HMM with diagonal covariance. In the orthogonalized DPF extraction stage, a speech signal is first converted to acoustic features composed of local features (LFs) and ΔP. A multilayer neural network (MLN) then maps the LFs to DPFs; its 15 × 3 output units represent context-dependent DPFs, namely a preceding-context DPF vector, a current DPF vector, and a following-context DPF vector. The Karhunen-Loeve transform (KLT) is then applied to orthogonalize each DPF vector in the context-dependent DPFs, using orthogonal bases calculated from a DPF vector that represents 38 Japanese phonemes. Finally, the orthogonalized DPF vectors are decorrelated from one another using the Gram-Schmidt orthogonalization procedure. In the experiments, after evaluating the parameters of the MLN input and output units in the DPF extractor, we compare the orthogonalized DPFs with the original DPFs. The orthogonalized DPFs are then compared with a standard parameter set of MFCCs and dynamic features. Next, noise robustness is tested using four types of additive noise. The experimental results show that the proposed orthogonalized DPFs significantly reduce the error rate in an isolated spoken-word recognition task, both with clean speech and with speech contaminated by additive noise. Furthermore, we achieved significant improvements when combining the orthogonalized DPFs with conventional static MFCCs and ΔP.
- 2004-05-01
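The abstract names the Gram-Schmidt procedure as the final step but does not say how it is applied to the three context vectors. As a rough illustration only, the following is a minimal sketch of the textbook Gram-Schmidt procedure in plain Python. The function names `dot` and `gram_schmidt`, the tolerance `eps`, and the 4-dimensional toy vectors are all assumptions made for this example; the DPF vectors described above have 15 dimensions, and the authors' actual formulation may differ.

```python
from math import sqrt


def dot(u, v):
    """Inner product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, eps=1e-12):
    """Return orthonormal vectors spanning the same space as `vectors`.

    Each input has its projections onto the already accepted basis
    vectors subtracted, then is normalized. Inputs that are (numerically)
    linearly dependent on earlier ones are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = dot(w, b)
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = sqrt(dot(w, w))
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical toy stand-ins for the preceding-context, current, and
# following-context vectors (not data from the paper).
preceding = [1.0, 0.0, 1.0, 0.0]
current = [1.0, 1.0, 0.0, 0.0]
following = [0.0, 1.0, 1.0, 1.0]

ortho = gram_schmidt([preceding, current, following])
```

After the call, the vectors in `ortho` are mutually orthogonal and have unit length, which is the property the decorrelation step relies on.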
Authors
- Nitta Tsuneo, Graduate School of Engineering, Toyohashi University of Technology
- Fukuda Takashi, Graduate School of Engineering, Toyohashi University of Technology