Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network
Abstract
This paper describes a low-computation-cost method for extracting distinctive phonetic features (DPFs) for use in a phoneme recognition system. The method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLN_LF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPFs, and MLN_Dyn, which constrains the DPF context at phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities that determine whether the dynamic patterns of the DPF trajectories are convex or concave; convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, achieved a higher phoneme correct rate with fewer mixture components in the HMMs.
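As a rough illustration of the third stage, the following is a minimal sketch of Gram-Schmidt orthogonalization in plain Python. The abstract does not say which vectors are orthogonalized or how the result is applied to the DPF stream, so the function name `gram_schmidt`, the tolerance `eps`, and the toy input are all hypothetical and are not taken from the paper.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Uses the modified Gram-Schmidt update (projections are removed
    one basis vector at a time). Vectors that are numerically
    dependent on earlier ones are skipped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the component of w that lies along b.
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical toy input: three correlated 3-dimensional vectors
# standing in for DPF-like feature directions (not real data).
toy = [[1.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]
basis = gram_schmidt(toy)
```

After the call, the rows of `basis` are mutually orthogonal and have unit length, which is the decorrelating property the abstract relies on before the HMM-based classifier.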
Authors
- HUDA Mohammad, Graduate School of Engineering, Toyohashi University of Technology
- KAWASHIMA Hiroaki, Graduate School of Engineering, Toyohashi University of Technology
- NITTA Tsuneo, Graduate School of Engineering, Toyohashi University of Technology
Related papers
- Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network
- Canonicalization of Feature Parameters for Robust Speech Recognition Based on Distinctive Phonetic Feature (DPF) Vectors
- PS-ZCPA Based Feature Extraction with Auditory Masking, Modulation Enhancement and Noise Reduction for Robust ASR(Speech Recognition, Statistical Modeling for Speech Processing)
- Confidence Scoring for Accurate HMM-Based Speech Recognition by Using Monophone-Level Normalization Based on Subspace Method (Special Issue on Speech Information Processing)
- Pitch-Synchronous Peak-Amplitude (PS-PA)-Based Feature Extraction Method for Noise-Robust ASR(Speech and Hearing)
- Changes of Bacterial Population in Frozen Soil