Effects of single-channel speech enhancement algorithms on Mandarin speech intelligibility (Applied Acoustics)
Abstract
Recent studies have shown that state-of-the-art single-channel speech enhancement algorithms do not improve, and usually decrease, English speech intelligibility in most listening conditions. This study investigates the effects of single-channel speech enhancement algorithms on Mandarin speech intelligibility. Noisy speech signals were first processed by five different single-channel speech enhancement algorithms in various noise conditions and then presented to ten normal-hearing listeners for syllable identification. The listeners' recognition results for Mandarin syllables show that the tested speech enhancement algorithms do not improve Mandarin speech intelligibility in the tested listening conditions, and can even cause large deteriorations in speech recognition. The recognition errors were further analyzed in terms of "tone error" and "phoneme error" to identify the factors that may lead to the deterioration in recognition. The analysis showed that (i) tone is a relatively robust cue under speech enhancement; (ii) the main errors come from syllable and consonant confusions.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2009-06-18
Authors
- Akagi Masato, School of Information Science, Japan Advanced Institute of Science and Technology
- Yan Yonghong, Institute of Acoustics, Chinese Academy of Sciences
- Li Junfeng, School of Information Science, Japan Advanced Institute of Science and Technology
- Li Junfeng, Japan Advanced Inst. Sci. and Technol.
- Li Junfeng, School of Aerospace, Tsinghua University
- Akagi Masato, School of Information Sci., Japan Advanced Inst. of Sci. and Technol. (JAIST), 1-1 Asahidai Nomi Ishik
- Yang Lin, Institute of Acoustics, Chinese Academy of Sciences
- Zhang Jianping, Institute of Acoustics, Chinese Academy of Sciences
- Yan Yonghong, Thinkit Speech Lab.
- Yan Yonghong, Thinkit Speech Laboratory, Institute of Acoustics, Chinese Academy of Sciences, Beijing
- Akagi Masato, Japan Advanced Inst. Sci. and Technol., Ishikawa, Jpn
- Akagi Masato, School of Information Sci., Japan Advanced Inst. of Sci. and Technol.