Asynchronous Articulatory Feature Recognition Using Dynamic Bayesian Networks
Abstract
This paper builds on previous work where dynamic Bayesian networks (DBNs) were proposed as a model for articulatory feature recognition. Using DBNs makes it possible to model the dependencies between features, an addition to previous approaches which was found to improve feature recognition performance. The DBN results were promising, giving close to the accuracy of artificial neural nets (ANNs). However, the system was trained on canonical labels, leading to an overly strong set of constraints on feature co-occurrence. In this study, we describe an embedded training scheme which learns a set of data-driven asynchronous feature changes where supported in the data. Using a subset of the OGI Numbers corpus, we describe articulatory feature recognition experiments using both canonically-trained and asynchronous-feature DBNs. Performance using DBNs is found to exceed that of ANNs trained on an identical task, giving a higher recognition accuracy. Furthermore, inter-feature dependencies result in a more structured model, giving rise to fewer feature combinations in the recognition output. In addition to an empirical evaluation of this modeling approach, we give a qualitative analysis, investigating the asynchrony found through our data-driven method and interpreting it using linguistic knowledge.
- Paper published by the Information Processing Society of Japan (IPSJ)
- 2004-12-20
Authors
- King Simon, Centre For Speech Technology Research University Of Edinburgh
- Wester Mirjam, Centre For Speech Technology Research University Of Edinburgh
- Frankel Joe, Centre For Speech Technology Research University Of Edinburgh
Related papers
- Asynchronous Articulatory Feature Recognition Using Dynamic Bayesian Networks
- Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noise