A Hybrid HMM/BN Acoustic Model for Automatic Speech Recognition (Special Issue on Speech Information Processing)
Abstract
In current HMM-based speech recognition systems, it is difficult to supplement acoustic spectrum features with additional information such as pitch, gender, or articulator positions. Bayesian Networks (BNs), on the other hand, allow easy combination of continuous and discrete features by exploiting the conditional dependencies between them. However, the lack of efficient algorithms has limited their application in continuous speech recognition. In this paper we propose a new acoustic model in which HMMs model the temporal characteristics of speech and the state probability model is represented by a BN. In our experimental system based on the HMM/BN model, the state BN has, in addition to the speech observation variable, two (hidden) variables representing noise type and SNR value. Evaluation results on the AURORA2 database showed a 36.4% word error rate reduction for the closed noise test, which is comparable with other, much more complex systems that use effective adaptation and noise-robust methods.
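The idea of a BN state model with hidden noise-type and SNR variables can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that both hidden variables are discrete, that their priors are independent of each other and of the HMM state, and that the observation density given (state, noise type, SNR) is a single diagonal-covariance Gaussian. All names (`state_log_likelihood`, `p_noise`, `p_snr`, `gaussians`) and all numbers are hypothetical. Under those assumptions the state likelihood is obtained by summing the hidden variables out:

```python
import math
from itertools import product


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def state_log_likelihood(x, state, p_noise, p_snr, gaussians):
    """log p(x | state) with hidden noise type and SNR marginalized out.

    Assumed (hypothetical) factorization:
        p(x | q) = sum_n sum_s P(n) P(s) p(x | q, n, s)
    """
    terms = []
    for n, s in product(p_noise, p_snr):
        mean, var = gaussians[(state, n, s)]
        terms.append(
            math.log(p_noise[n]) + math.log(p_snr[s]) + log_gauss_diag(x, mean, var)
        )
    return logsumexp(terms)


# Toy parameters, invented purely for illustration.
p_noise = {"noise_a": 0.6, "noise_b": 0.4}
p_snr = {"snr_low": 0.5, "snr_high": 0.5}
gaussians = {
    ("s1", "noise_a", "snr_low"): ([0.0, 0.0], [1.0, 1.0]),
    ("s1", "noise_a", "snr_high"): ([0.5, 0.0], [0.5, 0.5]),
    ("s1", "noise_b", "snr_low"): ([0.0, 1.0], [2.0, 2.0]),
    ("s1", "noise_b", "snr_high"): ([1.0, 1.0], [0.5, 1.0]),
}

x = [0.2, 0.4]
print(state_log_likelihood(x, "s1", p_noise, p_snr, gaussians))
```

In a decoder, a value computed this way would take the place of the usual per-state emission score, so the HMM part (transitions, Viterbi search) stays unchanged. This matches the division of labor described in the abstract only at a high level; the paper's actual BN topology and parameterization may differ.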
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2003-03-01
Authors
-
MARKOV Konstantin
ATR Spoken Language Translation Research Labs.
-
NAKAMURA Satoshi
ATR Spoken Language Translation Research Labs.
-
Markov Konstantin
ATR Spoken Language Communication Research Laboratories
-
Nakamura S
Laboratory Of Integrative Aquatic Biology Field Science Center Graduate School Of Agricultural Scien
-
Nakamura Satoshi
ATR Spoken Language Translation Res. Lab. Kyoto Jpn
-
Nakamura Satoshi
ATR Spoken Language Communication Res. Lab. Kyoto-fu Jpn
Related papers
- Combination Therapy with Vascular Endothelial Growth Factor Neutralizing Antibody and Mitomycin C on Human Gastric Cancer Xenograft
- Noise and Channel Distortion Robust ASR System for DARPA SPINE2 Task (Special Issue on Speech Information Processing)
- A Study on Acoustic Modeling of Pauses for Recognizing Noisy Conversational Speech (Special Issue on Speech Information Processing)
- Quantitative analysis of pattern of gonial proliferation during sexual maturation in Japanese scallop Patinopecten yessoensis
- GnRH-PROMOTED SPERMATOGONIAL PROLIFERATION OF SCALLOP MEDIATES THROUGH STEROIDOGENESIS (Endocrinology, Abstracts of papers presented at the 76th Annual Meeting of the Zoological Society of Japan)
- MOLECULAR CLONING OF A PUTATIVE SEROTONIN RECEPTOR EXPRESSED IN THE OVARY OF SCALLOP, PATINOPECTEN YESSOENSIS (Developmental Biology, Abstracts of papers presented at the 76th Annual Meeting of the Zoological Society of Japan)
- REGULATION OF GONIAL MULTIPLICATION BY A GnRH-LIKE FACTOR IN THE CENTRAL NERVOUS SYSTEM OF THE PATINOPECTEN YESSOENSIS(Endocrinology)(Proceedings of the Seventy-Third Annual Meeting of the Zoological Society of Japan)
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition(Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Missing Feature Theory Applied to Robust Speech Recognition over IP Network(Speech Dynamics by Ear, Eye, Mouth and Machine)
- Effects of Proteolytic Digestion on the Control Mechanism of Ciliary Orientation in Ciliated Sheets from Paramecium
- CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments(Speech and Hearing)
- A Design for a Collaborative Steering System of Microphone Array and Video Camera Toward Multi-Lingual Tele-Conference (Special Issue: Innovation and Practical Application of Interaction Technology)
- A design of adaptive beamformer based on average speech spectrum for noisy speech recognition
- A Microphone Array-Based 3-D N-Best Search Method for Recognizing Multiple Sound Sources
- The present status, progress, and usage of speech databases in Japan
- IMPROVING ACCURACY IN PARAMETER ESTIMATION IN AN EXTENDED KALMAN PARTICLE FILTERS FOR NOISY SPEECH RECOGNITION
- ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles(Speech Recognition, Statistical Modeling for Speech Processing)
- Construction of Audio-Visual Speech Corpus Using Motion-Capture System and Corpus Based Facial Animation(Life-like Agent and its Communication)
- Passive hybrid subtractive beamformer for near-field sound sources
- An Acoustic Modeling Method Robust against Changes of Speaking Style in Error Recovery
- A Hybrid HMM/BN Acoustic Model Utilizing Pentaphone-Context Dependency(Speech Recognition, Statistical Modeling for Speech Processing)
- Improving Acoustic Model Precision by Incorporating a Wide Phonetic Context Based on a Bayesian Framework(Speech Recognition, Statistical Modeling for Speech Processing)
- A Hybrid HMM/BN Acoustic Model for Automatic Speech Recognition (Special Issue on Speech Information Processing)
- MIXTURE OF FACTOR ANALYZED HMM
- Iterative Estimation and Compensation of Signal Direction for Moving Sound Source by Mobile Microphone Array(Engineering Acoustics)
- TIME-VARYING NOISE COMPENSATION BY SEQUENTIAL MONTE CARLO METHOD
- Burst Error Recovery for Huffman Coding(Algorithm Theory)
- Audio-Visual Speech Recognition Based on Optimized Product HMMs and GMM Based-MCE-GPD Stream Weight Estimation (Special Issue on Speech Information Processing)