Enhancement of esophageal speech using formant synthesis

Abstract

The feasibility of using a formant analysis-synthesis approach to replace the voicing source of esophageal speech was explored. Inverse-filtered signals extracted from normal speakers served as the voicing sources. Several pitch extraction methods were tested, and a computationally simple, band-limited autocorrelation method was chosen. To achieve stable and practical speech enhancement, the input signal was divided into low- and high-frequency channels, and only the low-frequency channel was processed by the formant analysis-synthesis method. A special-purpose DSP hardware unit was designed to perform the proposed analysis-synthesis process in real time. Subjective evaluation tests (rating scale method) were conducted with seven well-trained esophageal speakers and three speech therapists. The results showed that the synthesized speech was significantly improved, especially for the "loudness", "sonority", "strained", "stoma noise", "choppy", "stability", "intelligibility", "recognizability", and "duration" features.
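
The two signal-processing steps named in the abstract, the low/high channel split and the band-limited autocorrelation pitch extraction, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no filter design, cutoff, sample rate, or pitch search range, so the one-pole low-pass filter, the 900 Hz cutoff, the 8 kHz sample rate, the 60 to 400 Hz search range, and all function names below are assumptions chosen only for illustration.

```python
import math


def one_pole_lowpass(x, cutoff_hz, fs):
    """First-order IIR low-pass (assumed stand-in for the band-limiting filter)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for s in x:
        state = (1.0 - a) * s + a * state
        y.append(state)
    return y


def split_bands(x, cutoff_hz, fs):
    """Split x into complementary low and high channels, so low + high == x."""
    low = one_pole_lowpass(x, cutoff_hz, fs)
    high = [s - l for s, l in zip(x, low)]
    return low, high


def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch in Hz from the autocorrelation peak; None for a silent frame."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    if sum(s * s for s in x) == 0.0:
        return None
    # Search only lags that correspond to the assumed pitch range.
    lag_min = max(1, int(fs / fmax))
    lag_max = min(int(fs / fmin), n - 1)
    best_lag, best_r = None, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag is None:
        return None
    return fs / best_lag


def synth_voiced(f0, fs, n):
    """Synthetic test tone: three harmonics of f0 with 1/k amplitudes."""
    return [
        sum(math.sin(2.0 * math.pi * k * f0 * t / fs) / k for k in (1, 2, 3))
        for t in range(n)
    ]


fs = 8000                                   # assumed sample rate
signal = synth_voiced(125.0, fs, 512)       # 64 ms synthetic frame
low, high = split_bands(signal, 900.0, fs)  # assumed 900 Hz split point
print("estimated pitch (Hz):", autocorr_pitch(low, fs))
```

In this sketch the pitch estimator sees only the low channel, and the high channel is kept aside unchanged so that it could be added back after the low channel has been processed. A real-time system would more likely use a sharper filter and a normalized or FFT-based autocorrelation; the direct summation here is meant only to keep the example readable.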
Paper of the Acoustical Society of Japan