Sub-Band Temporal Envelope Restoration for ASR in Reverberation Environment (International Workshop: Frontiers in Speech and Hearing Research)
Abstract
Dereverberation algorithms usually assume that the room acoustics are known. The room impulse response is estimated before dereverberation is performed. However, it is difficult to estimate the characteristics of the room acoustics from observed reverberant signals alone. Our proposed method is motivated by speech intelligibility experiments, which show the importance of speech temporal envelopes for speech perception. We propose a sub-band power envelope estimation algorithm for reverberant speech based on the modulation transfer function (MTF). In our algorithm, the room impulse response is modeled as white noise modulated by an exponential decay, and speech in each frequency sub-band is modeled as a white-noise carrier with a temporal modulation. The reverberant speech is the convolution of the room impulse response with the speech signal. Based on a theoretical analysis of these stochastic signals, the temporal power envelope of speech in each sub-band can be restored by inverse filtering of the power envelope. The algorithm is designed as a front-end processor for ASR and is tested on a Japanese digit string recognition task. Reverberant speech is generated artificially by simple convolution of a room impulse response with speech. Recognition results show that the proposed dereverberation algorithm yields an average improvement of 16.11% for reverberation times from 0.5 s to 1.5 s compared with the auditory power spectrum based method (AFCC).
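The power envelope inverse filtering described in the abstract can be illustrated with a minimal sketch for a single sub-band. The abstract gives no formulas, so everything specific here is an assumption: the room decay is taken as a^2 * exp(-13.8 t / T60) in the power domain (a 60 dB drop at t = T60), the reverberation time T60 is treated as known rather than estimated, and the function names and the 100 Hz envelope rate are hypothetical. Under the white-noise carrier model, the reverberant power envelope is the speech power envelope convolved with that decay, which makes the inverse a first-order difference.

```python
import numpy as np


def decay_power_envelope(t60, fs, n_samples, a=1.0):
    """Assumed room power envelope: a^2 * exp(-13.8 * t / t60)."""
    t = np.arange(n_samples) / fs
    return a**2 * np.exp(-13.8 * t / t60)


def reverberate_power_envelope(env_x, t60, fs, a=1.0):
    """Forward model: convolve the clean power envelope with the decay."""
    env_h = decay_power_envelope(t60, fs, len(env_x), a)
    return np.convolve(env_x, env_h)[: len(env_x)]


def restore_power_envelope(env_y, t60, fs, a=1.0):
    """Inverse filter: env_x[n] = (env_y[n] - b * env_y[n-1]) / a^2.

    The decay is a geometric sequence a^2 * b^n with
    b = exp(-13.8 / (t60 * fs)), so its inverse is a two-tap FIR filter.
    """
    b = np.exp(-13.8 / (t60 * fs))
    previous = np.concatenate(([0.0], env_y[:-1]))
    env_x = (env_y - b * previous) / a**2
    # A power envelope cannot be negative; clip small numerical undershoots.
    return np.maximum(env_x, 0.0)


# Hypothetical example: a 4 Hz modulated power envelope sampled at 100 Hz.
fs = 100.0
t = np.arange(200) / fs
env_x = 0.5 * (1.0 + 0.8 * np.sin(2.0 * np.pi * 4.0 * t))

env_y = reverberate_power_envelope(env_x, t60=0.5, fs=fs)
env_hat = restore_power_envelope(env_y, t60=0.5, fs=fs)
```

In this noise-free setting the restored envelope matches the clean one up to rounding error. With envelopes extracted from real reverberant speech, the restoration would only be approximate, and a mismatched T60 would leave residual smearing or over-sharpening.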
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2006-03-20
Authors
-
Unoki Masashi
School of Information Science, Japan Advanced Institute of Science and Technology
-
Akagi Masato
School of Information Science, Japan Advanced Institute of Science and Technology
-
Lu Xugang
School of Information Science, Japan Advanced Institute of Science and Technology
-
Akagi Masato
School Of Information Sci. Japan Advanced Inst. Of Sci. And Technol. (jaist) 1-1 Asahidai Nomi Ishik
-
Lu Xugang
Atr Spoken Language Communication Res. Laboratories
-
Unoki Masashi
Information School Japan Advanced Institute Of Science And Technology
-
Unoki Masashi
Japan Advanced Inst. Sci. And Technol. Ishikawa Jpn
-
Akagi Masato
School Of Information Sci. Japan Advanced Inst. Of Sci. And Technol.
Related papers
- A DOA estimation algorithm based on equalization-cancellation theory (Applied Acoustics)
- Study on a method of suppressing noise based on the MTF concept
- An MTF-based method of blind restoration for improving intelligibility of bone-conducted speech
- A study on the LP-based blind model in restoring bone-conducted speech (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- An LP-based blind restoration method for improving intelligibility of bone-conducted speech (Speech)
- Robust voice activity detection based on noise eigenspace
- A flexible spectral modification method based on temporal decomposition and Gaussian mixture model
- A speech dereverberation method based on the MTF concept in power envelope restoration
- An improved method based on the MTF concept for restoring the power envelope from a reverberant signal
- Effects of single-channel speech enhancement algorithms on Mandarin speech intelligibility (Applied Acoustics)
- Improvement of robustness using selective sound segregation for automatic speech recognition systems in noisy environments (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Improvement of robustness using selective sound segregation for automatic speech recognition systems in noisy environments
- LP-based method of blind restoration to improve intelligibility of bone-conducted speech
- A model-based investigation of activations of the tongue muscles in vowel production
- A Noise Reduction System in Localized and Non-Localized Noise Environments
- Noise reduction method based on generalized subtractive beamformer
- A study on audio watermarking method based on the cochlear delay characteristics
- Fundamental frequency estimation for noisy speech based on instantaneous amplitude and frequency
- Estimation of fundamental frequency of reverberant speech by utilizing complex cepstrum analysis
- Speech Enhancement based on Noise Eigenspace Projection
- A speech enhancement framework based on noise eigenspace projection (Speech)
- Estimate of auditory filter shape using notched-noise masking for various signal frequencies
- Comparative evaluation of modulation-transfer-function-based blind restoration of sub-band power envelopes of speech as a front-end processor for automatic speech recognition systems
- Sub-Band Temporal Envelope Restoration for ASR in Reverberation Environment (International Workshop: Frontiers in Speech and Hearing Research)
- A study on expressive speech and perception of semantic primitives: comparison between Taiwanese and Japanese (Speech)
- A flexible temporal decomposition-based spectral modification method using asymmetric Gaussian mixture model (Speech)
- A Study on Restoration of Bone-Conducted Speech with LPC-Based Model (International Workshop: Frontiers in Speech and Hearing Research)
- A computational model of co-modulation masking release
- A method of signal extraction from noisy signal based on auditory scene analysis
- Modified Restricted Temporal Decomposition and Its Application to Low Rate Speech Coding
- Foreword to the special issue on "Applied Systems"
- A Model-Based Learning Process for Modeling Coarticulation of Human Speech (Knowledge, Information and Creativity Support System)
- Normalization of vocal tract shape using radial basis function (Speech)
- Normalization of vocal tract shape using radial basis function
- Optimization and Evaluation of a Coarticulation Model based on Observation and Simulation
- Parameter Optimization for a Coarticulation Model Based on Observation and Simulation (International Workshop: Frontiers in Speech and Hearing Research)
- Extraction of Low Dimensional Representation of Vowels in Articulatory Space (International Workshop: Frontiers in Speech and Hearing Research)
- Evaluations of TS-BASE for speech enhancement and binaural benefits preservation (Applied Acoustics)
- Adaptive β-order Generalized Spectral Subtraction for Speech Enhancement
- A Two-Microphone Noise Reduction Method in Highly Non-stationary Multiple-Noise-Source Environments
- A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features
- Adaptive equalization-cancellation model and its application to sound localization in noisy reverberant environments
- Study on Speech Watermarking Based on Modifications to LSFs for Tampering Detection
- Study on Blind Method of Estimating Speech Transmission Index from Noisy Reverberant Amplitude-Modulated-Signals
- Study on Semi-scramble Method for Speech Signals Based on Phonemic Restoration