A Design for a Collaborative Steering System of Microphone Array and Video Camera Toward Multi-Lingual Tele-Conference (Special Issue: Innovation and Practical Application of Interaction Technology)
Abstract
Capturing distant-talking speech with high quality is very important for multi-lingual tele-conferencing through speech-to-speech translation. In addition, the speaker's image is needed to realize natural communication in such a conference. A microphone array is an ideal candidate for capturing distant-talking speech. Uttered speech can be enhanced and speaker images can be captured by steering a microphone array and a video camera in the speaker direction. However, to realize automatic steering, it is necessary to localize the talker. To overcome this problem, we propose collaborative steering of the microphone array and the video camera in real time for a multi-lingual tele-conference through speech-to-speech translation. We conducted experiments in a real room environment. With the speaker located 2.0 meters from the microphone array, the speaker localization rate (i.e., the speaker image capturing rate) was 97.7%, the speech recognition rate was 90.0%, and the TOEIC score was 530 to 540 points.
- Paper published by the Information Processing Society of Japan
- 2002-12-15
Authors
- GRUHN Rainer, ATR Spoken Language Translation Research Labs.
- NAKAMURA Satoshi, ATR Spoken Language Translation Research Labs.
- NISHIURA Takanobu, Ritsumeikan University
- Gruhn R, ATR Spoken Language Translation Res. Lab. Kyoto Jpn
- Nakamura S, National Institute of Information and Communications Technology
- NISHIURA TAKANOBU, ATR Spoken Language Translation Research Laboratories
- Nakamura Satoshi, ATR Spoken Language Translation Res. Lab. Kyoto Jpn
- Nakamura Satoshi, ATR Spoken Language Communication Res. Lab. Kyoto-fu Jpn
- Nakamura Satoshi, ATR Spoken Language Translation Research Laboratories
- Gruhn Rainer, ATR Spoken Language Translation Research Laboratories
Related papers
- Combination Therapy with Vascular Endothelial Growth Factor Neutralizing Antibody and Mitomycin C on Human Gastric Cancer Xenograft
- CENSREC-1-C : An evaluation framework for voice activity detection under noisy environments
- Noise and Channel Distortion Robust ASR System for DARPA SPINE2 Task (Special Issue on Speech Information Processing)
- A Study on Acoustic Modeling of Pauses for Recognizing Noisy Conversational Speech (Special Issue on Speech Information Processing)
- AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition (Speech Corpora and Related Topics, Corpus-Based Speech Technologies)
- Missing Feature Theory Applied to Robust Speech Recognition over IP Network (Speech Dynamics by Ear, Eye, Mouth and Machine)
- CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments (Speech and Hearing)
- A Design for a Collaborative Steering System of Microphone Array and Video Camera Toward Multi-Lingual Tele-Conference (Special Issue: Innovation and Practical Application of Interaction Technology)
- A design of adaptive beamformer based on average speech spectrum for noisy speech recognition
- A Microphone Array-Based 3-D N-Best Search Method for Recognizing Multiple Sound Sources
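- Integration of Multiple Sound Source Localization and Speech Recognition Based on the 3D N-best Search Method
- Evaluation of the 3-D N-best Search Method Using the Distance Between Sound Source Direction Paths in Speech Recognition of Multiple Speakers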
- The present status, progress, and usage of speech databases in Japan
- IMPROVING ACCURACY IN PARAMETER ESTIMATION IN AN EXTENDED KALMAN PARTICLE FILTERS FOR NOISY SPEECH RECOGNITION
- ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles (Speech Recognition, Statistical Modeling for Speech Processing)
- Construction of Audio-Visual Speech Corpus Using Motion-Capture System and Corpus Based Facial Animation (Life-like Agent and its Communication)
- Passive hybrid subtractive beamformer for near-field sound sources
- An Acoustic Modeling Method Robust against Changes of Speaking Style in Error Recovery
- A Hybrid HMM/BN Acoustic Model Utilizing Pentaphone-Context Dependency (Speech Recognition, Statistical Modeling for Speech Processing)
- Improving Acoustic Model Precision by Incorporating a Wide Phonetic Context Based on a Bayesian Framework (Speech Recognition, Statistical Modeling for Speech Processing)
- A Hybrid HMM/BN Acoustic Model for Automatic Speech Recognition (Special Issue on Speech Information Processing)
- MIXTURE OF FACTOR ANALYZED HMM
- Iterative Estimation and Compensation of Signal Direction for Moving Sound Source by Mobile Microphone Array (Engineering Acoustics)
- TIME-VARYING NOISE COMPENSATION BY SEQUENTIAL MONTE CARLO METHOD
- Burst Error Recovery for Huffman Coding (Algorithm Theory)
- Audio-Visual Speech Recognition Based on Optimized Product HMMs and GMM Based-MCE-GPD Stream Weight Estimation (Special Issue on Speech Information Processing)
- CENSREC-4: An evaluation framework for distant-talking speech recognition in reverberant environments