Low complexity speaker identification in AAC domain (Media Engineering)
Abstract
This paper presents an implementation of a low-complexity speaker identification algorithm that works in the compressed audio domain. The goal is to perform speaker modeling and identification without decoding the AAC bitstream to extract speaker-dependent features, thus saving significant system resources. Silence detection and the MFCC parameters are calculated from the MDCT coefficients rather than from the FFT spectrum. Each speaker is modeled by a GMM, which is trained with the EM algorithm to refine the weight and the parameters of each component. The recognition accuracy of our algorithm reaches 97% on the ARCTIC database, with a 16% CPU load compared to algorithms based on analysis of the decoded PCM signals.
- Paper published by the Institute of Image Information and Television Engineers (ITE)
- 2008-10-23
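The feature step described in the abstract (silence detection and MFCC parameters taken from MDCT coefficients instead of an FFT spectrum) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names, the 20-filter mel bank, the 13 cepstral coefficients, the silence threshold, the 16 kHz sample rate, and the 1024-coefficient frame size are assumptions chosen for illustration. The MDCT coefficients are assumed to be already available from a partial AAC decode.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel filters over n_bins MDCT bins covering 0..sample_rate/2."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edge points, equally spaced in mel, as fractional bin positions
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) / nyquist * n_bins
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            c = k + 0.5  # center of MDCT bin k
            if lo < c <= mid:
                row.append((c - lo) / (mid - lo))
            elif mid < c < hi:
                row.append((hi - c) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def is_silence(mdct_frame, threshold=1e-6):
    """Flag a frame as silent when its mean MDCT energy is below a threshold (assumed value)."""
    return sum(x * x for x in mdct_frame) / len(mdct_frame) < threshold

def mdct_mfcc(mdct_frame, bank, n_ceps=13, floor=1e-10):
    """MFCC-like features: squared MDCT coefficients -> mel energies -> log -> DCT-II."""
    power = [x * x for x in mdct_frame]
    log_e = [math.log(max(floor, sum(w * p for w, p in zip(row, power))))
             for row in bank]
    n = len(log_e)
    return [sum(e * math.cos(math.pi * q * (j + 0.5) / n) for j, e in enumerate(log_e))
            for q in range(n_ceps)]

# Usage with a synthetic frame (real input would come from the AAC bitstream).
bank = mel_filterbank(n_filters=20, n_bins=1024, sample_rate=16000)
frame = [math.sin(0.05 * k) for k in range(1024)]
if not is_silence(frame):
    features = mdct_mfcc(frame, bank)
```

In a full system, feature vectors like these from non-silent frames would be used to train one GMM per speaker with EM, and an unknown utterance would be assigned to the speaker whose model gives the highest likelihood.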
Authors
-
Haseyama Miki
Hokkaido Univ. Sapporo Jpn
-
Haseyama Miki
Graduate School Of Information Science And Technology Hokkaido University
-
Ai Haojun
National Engineering Research Center for Multimedia Software, Wuhan University
-
Ai Haojun
National Engineering Research Center For Multimedia Software Wuhan University:graduate School Of Inf
-
Haseyama Miki
Graduate School Of Engineering Hokkaido University
Related papers
- An ER Algorithm-Based Method for Removal of Adherent Water Drops from Images Obtained by a Rear View Camera Mounted on a Vehicle in Rainy Conditions
- A Kalman Filter-Based Method for Restoration of Images Obtained by an In-Vehicle Camera in Foggy Conditions
- POCS-Based Texture Reconstruction Method Using Clustering Scheme by Kernel PCA(Papers Selected from the 21st Symposium on Signal Processing)
- A Study on Adaptive Spatial-Temporal Error Concealment method for Wavelet based Video Coding in Wireless Networks
- Kalman Filter-Based Error Concealment for Video Transmission
- A New Fitness Function of a Genetic Algorithm for Routing Applications
- A Kalman Filter Using Texture for Noise Reduction in SAR Images
- An Algorithm for Extracting Straight Lines Directly from the Input Image to Increase Speed
- A New Conic Section Extraction Approach and Its Applications(Pattern Recognition)
- Estimating Number of People Using Calibrated Monocular Camera Based on Geometrical Analysis of Surface Area
- Performance of Reed-Solomon Coded MC-DS-CDMA with Bi-orthogonal Modulation
- Performance of Adaptive Trellis Coded Modulation Applied to MC-CDMA with Bi-orthogonal Keying
- Steady-State Properties of a CORDIC-Based Adaptive ARMA Lattice Filter(Digital Signal Processing)
- Convergence Properties of a CORDIC-Based Adaptive ARMA Lattice Filter(Digital Signal Processing)
- A Cost-Effective CORDIC-Based Architecture for Adaptive Lattice Filters(Audio/Speech Coding)(Applications and Implementations of Digital Signal Processing)
- A Transformation Method of a CORDIC ARMA Lattice Filter for Signal Synthesis (Special Section on VLSI for Digital Signal Processing)
- Error-Resilient 3-D Wavelet Video Coding with Duplicated Lowest Sub-Band Coefficients and Two-Step Error Concealment Method
- Fast Image Classification for Aerial Photographs
- A MODEL-BASED APPROACH FOR SOCCER TEAM ADVANTAGE MEASUREMENT(International Workshop on Advanced Image Technology 2006)
- A Significant Property of Mapping Parameters for Signal Interpolation Using Fractal Interpolation Functions(Digital Signal Processing)
- A Novel Contour Description with Expansion Ability Using Extended Fractal Interpolation Functions(Image Processing, Image Pattern Recognition)
- A Simplification Method for Line Drawings which Retains the Shape by Using the Fractal Dimension
- Fractal Image Coding Using GA and SA (Image Processing)
- Video Frame Interpolation by Image Morphing Including Fully Automatic Correspondence Setting
- A SIMILAR IMAGE CLUSTERING METHOD INCLUDING AUTOMATIC SELECTION OF NUMBER OF CLUSTERS(International Workshop on Advanced Image Technology 2006)
- AN IMAGE ENLARGEMENT METHOD USING ITERATED FUNCTION SYSTEM(International Workshop on Advanced Image Technology 2006)
- A Novel Video Retrieval Method Based on Web Community Extraction Using Features of Video Materials
- A SIMPLE WORD SPOTTING METHOD BASED ON TEMPLATE MATCHING FOR SPEECH RETRIEVAL(International Workshop on Advanced Image Technology 2005)
- A study on adaptive spatial-temporal error concealment method for wavelet based video coding in wireless networks (Image Engineering)
- Low complexity speaker identification in AAC domain (Media Engineering)
- Adaptive Missing Texture Reconstruction Method Based on Kernel Canonical Correlation Analysis with a New Clustering Scheme
- POCS-Based Annotation Method Using Kernel PCA for Semantic Image Retrieval
- A REGION MERGING METHOD FOR IMAGE SEGMENTATION(International Workshop on Advanced Image Technology 2005)
- Players Clustering Based on Graph Theory for Tactics Analysis Purpose in Soccer Videos(Papers Selected from the 21st Symposium on Signal Processing)
- An Accurate Scene Segmentation Method Based on Graph Analysis Using Object Matching and Audio Feature
- Audio-Based Shot Classification for Audiovisual Indexing Using PCA, MGD and Fuzzy Algorithm(Papers Selected from the 21st Symposium on Signal Processing)
- Cross Low-Dimension Pursuit for Sparse Signal Recovery from Incomplete Measurements Based on Permuted Block Diagonal Matrix
- A Novel Framework for Extracting Visual Feature-Based Keyword Relationships from an Image Database
- Erratum: Error-Resilient 3-D Wavelet Video Coding with Duplicated Lowest Sub-Band Coefficients and Two-Step Error Concealment Method [IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E93.A (2010) , No. 11 pp.2173-218