Cross-language Voice Conversion Evaluation Using Bilingual Databases (Special Issue: Spoken Language Information Processing and Its Applications)

Abstract

This paper describes experiments that extend techniques for converting one speaker's voice to sound like another's to cross-language utterances, as would be required for spoken language translation or language training applications. In particular, it addresses the evaluation of system performance, comparing objective tests based on a perceptually motivated acoustic measure with perceptual tests of voice quality and speaker resemblance. The proposed method uses Japanese and English speech databases from two female and two male bilingual speakers to train a system based on a Gaussian mixture model (GMM) and a high-quality vocoder. Results indicate that training with cross-language models also produces close acoustic matches between source and target speakers' voices. Perceptual tests revealed little significant difference in performance between mapping functions trained on single-language and cross-language data pairs.
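
The abstract does not give the form of the GMM mapping function. As an illustration only, the sketch below shows the regression commonly used in GMM-based voice conversion, reduced to a single scalar feature for brevity: each mixture component contributes a linear map from source to target, weighted by the posterior probability of that component given the source value. The function name `convert_scalar` and all parameter names are hypothetical, and the parameters are assumed to have been estimated beforehand from time-aligned source/target training pairs. It is not claimed to be the system evaluated in the paper.

```python
import math


def convert_scalar(x, weights, mu_x, mu_y, var_x, cov_yx):
    """Map a source feature x to an estimated target feature.

    Assumed (hypothetical) parameters, one entry per mixture component:
      weights : mixture weights, summing to 1
      mu_x    : source means
      mu_y    : target means
      var_x   : source variances
      cov_yx  : target/source cross-covariances
    """
    # Log of (weight * Gaussian likelihood of x) for each component.
    log_likes = [
        math.log(w) - 0.5 * (math.log(2.0 * math.pi * vx) + (x - mx) ** 2 / vx)
        for w, mx, vx in zip(weights, mu_x, var_x)
    ]
    # Normalise in the log domain so distant components cannot cause 0/0.
    peak = max(log_likes)
    likes = [math.exp(l - peak) for l in log_likes]
    total = sum(likes)
    posteriors = [l / total for l in likes]

    # Posterior-weighted sum of per-component linear regressions.
    return sum(
        p * (my + (c / vx) * (x - mx))
        for p, mx, my, vx, c in zip(posteriors, mu_x, mu_y, var_x, cov_yx)
    )
```

In a practical system the same formula would be applied to multidimensional spectral feature vectors frame by frame, with covariance matrices in place of the scalar variances, before resynthesis by the vocoder.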
Paper published by the Information Processing Society of Japan (IPSJ)
2002-07-15