Melody Track Selection Using Discriminative Language Model

Abstract

In this letter, we focus on the task of selecting the melody track from a polyphonic MIDI file. Based on the intuition that music and language are similar in many respects, we address the selection problem by introducing an n-gram language model that learns melody co-occurrence patterns statistically and determines the melodic degree of a given MIDI track. Furthermore, we propose using a background model and a posterior probability criterion to make the modeling more discriminative. In the evaluation, the approach achieves a correct rate of 81.6%, which indicates its feasibility.
Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
2008-06-01
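
The scoring idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the letter's actual method: tracks are represented as clipped pitch-interval sequences, the n-gram order is fixed at 2 with add-one smoothing, and a length-normalised log-likelihood ratio against the background model stands in for the posterior probability criterion. All names (`BigramModel`, `melody_score`, `select_melody_track`) and the toy training data are hypothetical.

```python
import math
from collections import Counter

MAX_LEAP = 12  # assumption: clip intervals to +/- one octave for a finite vocabulary
VOCAB_SIZE = 2 * MAX_LEAP + 1


def intervals(pitches):
    """Turn a MIDI pitch sequence into clipped semitone intervals."""
    return [max(-MAX_LEAP, min(MAX_LEAP, b - a))
            for a, b in zip(pitches, pitches[1:])]


class BigramModel:
    """Add-one smoothed bigram model over interval symbols."""

    def __init__(self, pitch_sequences):
        self.bigrams = Counter()
        self.contexts = Counter()
        for pitches in pitch_sequences:
            seq = intervals(pitches)
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def logprob(self, seq):
        """Sum of smoothed log P(cur | prev) over the interval sequence."""
        return sum(
            math.log((self.bigrams[(prev, cur)] + 1)
                     / (self.contexts[prev] + VOCAB_SIZE))
            for prev, cur in zip(seq, seq[1:])
        )


def melody_score(pitches, melody_lm, background_lm):
    """Per-bigram log-likelihood ratio: melody model vs. background model."""
    seq = intervals(pitches)
    n = max(len(seq) - 1, 1)
    return (melody_lm.logprob(seq) - background_lm.logprob(seq)) / n


def select_melody_track(tracks, melody_lm, background_lm):
    """Return the name of the highest-scoring track and all scores."""
    scores = {name: melody_score(p, melody_lm, background_lm)
              for name, p in tracks.items()}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): stepwise lines as "melody", bass-like lines as "background".
melody_lm = BigramModel([
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [67, 65, 64, 62, 60, 62, 64, 65, 67],
])
background_lm = BigramModel([
    [36, 36, 36, 36, 43, 43, 43, 43, 36],
    [48, 60, 48, 60, 48, 60, 48, 60],
])

tracks = {
    "lead": [62, 64, 65, 67, 65, 64],
    "bass": [36, 36, 36, 43, 43, 43],
}
best, scores = select_melody_track(tracks, melody_lm, background_lm)
print(best, scores)
```

A positive score means the track looks more like the melody training data than like the background data. Normalising by length keeps long tracks from dominating purely because they contain more notes.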

Authors

Yan Yonghong
Thinkit Speech Lab. Institute Of Acoustics Chinese Academy Of Sciences
Yan Yonghong
Institute Of Acoustics Chinese Academy Of Sciences
SUO Hongbin
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences
Yan Yonghong
Thinkit Speech Lab Institute Of Acoustics Chinese Academy Of Sciences
Yan Yonghong
Thinkit Speech Lab.
Yan Yonghong
Thinkit Speech Laboratory Institute Of Acoustics Chinese Academy Of Sciences Beijing
LI Ming
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences
WU Xiao
ThinkIT Speech Lab., Institute of Acoustics, Chinese Academy of Sciences
Li Ming
Thinkit Speech Lab. Institute Of Acoustics Chinese Academy Of Sciences
Wu Xiao
Thinkit Speech Lab. Institute Of Acoustics Chinese Academy Of Sciences
Suo Hongbin
Thinkit Speech Lab Institute Of Acoustics Chinese Academy Of Sciences

Related papers

Effects of single-channel speech enhancement algorithms on Mandarin speech intelligibility (Applied Acoustics)
Approximate Decision Function and Optimization for GMM-UBM Based Speaker Verification
Using a Kind of Novel Phonotactic Information for SVM Based Speaker Recognition
Robust Speaker Clustering Using Affinity Propagation
An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns
Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech
A One-Pass Real-Time Decoder Using Memory-Efficient State Network
Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval
Automatic Singing Performance Evaluation for Untrained Singers
Melody Track Selection Using Discriminative Language Model
Automatic Language Identification with Discriminative Language Characterization Based on SVM
A two-element-microphone-array-based speech recognition system in vehicle environment(Commemoration of the Japan-China Joint Conference on Acoustics 2007 (JCA2007))
Speech Enhancement Using Improved Adaptive Null-Forming in Frequency Domain with Postfilter
Effects of the Temporal Fine Structure in Different Frequency Bands on Mandarin Tone Perception
Acoustic Feature Optimization Based on F-Ratio for Robust Speech Recognition
A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features
Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Speech Recognition
Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation
A bayesian logistic regression approach to spoken language identification
