Paraphrasing Predicates from Written-Language-Specific Vocabulary into Spoken-Language Vocabulary Using the WWW

Abstract

The vocabulary used in written language differs greatly from the vocabulary used in spoken language. As a result, speech synthesized from written text sounds unnatural. To make natural speech synthesis possible even from written text, this paper proposes a method for learning paraphrases from written-language-specific vocabulary into spoken-language vocabulary. Whether an expression belongs to written-language-specific vocabulary or to spoken-language vocabulary is judged from its occurrence probability in a written-language corpus and its occurrence probability in a spoken-language corpus. Both corpora are collected automatically from the WWW. In the experiments, the precision of the collected written-language and spoken-language corpora was 94%, and the accuracy of paraphrase learning was 79%, which demonstrates the effectiveness of the proposed method.
2004-10-10
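
The abstract states only that an expression is judged from its occurrence probabilities in the two corpora; it does not give the exact decision rule. The following is a minimal sketch of one way such a judgment could work, not the authors' implementation. The add-one smoothing, the probability-ratio test, the `ratio_threshold` value, and the function names `make_probability` and `classify` are all assumptions introduced for illustration, and the token lists are toy data rather than the corpora used in the paper.

```python
from collections import Counter


def make_probability(tokens, vocab):
    """Return a function giving the add-one smoothed relative frequency
    of an expression in the given token list (smoothing is an assumption)."""
    counts = Counter(tokens)
    total = len(tokens)
    size = len(vocab)

    def prob(expr):
        # Add-one smoothing keeps the probability non-zero for unseen expressions.
        return (counts[expr] + 1) / (total + size)

    return prob


def classify(expr, p_written, p_spoken, ratio_threshold=2.0):
    """Label an expression by comparing its two occurrence probabilities.

    The ratio test and the threshold are hypothetical; the abstract does
    not specify how the two probabilities are combined.
    """
    ratio = p_written(expr) / p_spoken(expr)
    if ratio >= ratio_threshold:
        return "written-specific"
    return "spoken"


# Toy corpora (illustrative only): "購入する" is a formal verb for "to purchase",
# "買う" is the everyday verb for "to buy".
written_tokens = ["購入する", "購入する", "購入する", "買う"]
spoken_tokens = ["買う", "買う", "買う", "買う"]
vocab = set(written_tokens) | set(spoken_tokens)

p_written = make_probability(written_tokens, vocab)
p_spoken = make_probability(spoken_tokens, vocab)

labels = {expr: classify(expr, p_written, p_spoken) for expr in sorted(vocab)}
```

In this sketch, an expression labeled written-specific would be a candidate for replacement, and an expression labeled spoken would be an acceptable paraphrase target. How candidate paraphrase pairs are obtained is outside what the abstract describes.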
