Predicting the Degradation of Speech Recognition Performance from Sub-band Dynamic Ranges (Special Issue: Spoken Language Information Processing and Its Applications)

Abstract

An acoustic measure is developed for predicting the degradation of speech recognition performance caused by noise contamination. Compared with conventional SNR, the proposed measure has three merits: 1) it does not require the original clean signal as a reference, 2) it takes the spectral shape of the noise into account, and 3) it can be used to predict recognition performance directly. The basic idea is to use the dynamic range of each sub-band signal as an estimate of the SNR, and to predict the degradation of recognition performance by taking the product of the recognition accuracies of the sub-bands. The measure is evaluated experimentally using white Gaussian noise and human-speech-like noise (HSN). In the experiment, the correlations between the predicted and actual recognition accuracies are 0.96 for white noise and 0.99 for HSN. The results confirm the effectiveness of the proposed measure.
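
The abstract gives no formulas, so the following Python sketch only illustrates the general idea: estimate a dynamic range per sub-band without a clean reference, map each range to a per-band accuracy, and multiply the results. The equal-width band layout, the percentile-based definition of dynamic range, the logistic mapping, and all parameter values (`n_bands`, `frame_len`, `midpoint_db`, `slope`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def subband_dynamic_ranges(signal, n_bands=4, frame_len=256,
                           lo_pct=5.0, hi_pct=95.0):
    """Dynamic range (dB) of frame energies in each sub-band.

    Assumption: dynamic range is the spread between a high and a low
    percentile of frame energy; the paper may define it differently.
    """
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    ranges = []
    # Hypothetical band layout: equal-width groups of FFT bins.
    for band in np.array_split(power, n_bands, axis=1):
        energy_db = 10.0 * np.log10(band.sum(axis=1) + 1e-12)
        ranges.append(np.percentile(energy_db, hi_pct)
                      - np.percentile(energy_db, lo_pct))
    return np.array(ranges)

def band_accuracy(dynamic_range_db, midpoint_db=15.0, slope=0.3):
    """Hypothetical logistic map from dynamic range to accuracy in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-slope * (dynamic_range_db - midpoint_db)))

def predict_accuracy(signal, **kwargs):
    """Predicted accuracy as the product of the per-band accuracies."""
    return float(np.prod(band_accuracy(subband_dynamic_ranges(signal, **kwargs))))

# Synthetic stand-in for speech: noise bursts alternating with near-silence.
rng = np.random.default_rng(0)
n = 16000
envelope = np.where((np.arange(n) // 2048) % 2 == 0, 1.0, 0.01)
clean = rng.standard_normal(n) * envelope
noisy = clean + 0.5 * rng.standard_normal(n)  # added white Gaussian noise

print("dynamic ranges (clean):", subband_dynamic_ranges(clean))
print("dynamic ranges (noisy):", subband_dynamic_ranges(noisy))
print("predicted accuracy (clean):", predict_accuracy(clean))
print("predicted accuracy (noisy):", predict_accuracy(noisy))
```

In this toy setup, added noise fills the quiet frames, so the per-band dynamic range shrinks and the product of the per-band accuracies falls. In practice the mapping from dynamic range to accuracy would have to be fitted to measured recognition results rather than fixed by hand as it is here.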
A paper of the Information Processing Society of Japan (IPSJ)
2002-07-15