A Constrained Gaussian Mixture Model for Correlation-Based Cluster Analysis of Gene Expression Data
Abstract
Clustering is a practical data analysis step in gene expression-based studies. Model-based clustering methods, which rely on probabilistic generative models, have two advantages: the number of clusters can be determined by statistical criteria, and the clusters are robust against observation noise in the data. Many existing approaches assume multivariate Gaussian mixtures as generative models, which is analogous to using a Euclidean or Mahalanobis-type distance as the similarity measure. However, these similarity measures often fail to detect co-expressed gene groups. We propose a novel probabilistic model for cluster analysis based on the correlation between gene expression patterns. We also propose a “meta” cluster analysis method that eliminates the dependence of the clustering result on the initial values of the clustering algorithm. In empirical studies with a time-course gene expression dataset of <i>Bacillus subtilis</i> during sporulation, our method yields more stable and informative results than ordinary Gaussian mixture model-based clustering, <i>k</i>-means clustering, and hierarchical clustering, which are widely used in this field. In addition, with the meta-cluster analysis, biologically meaningful expression patterns are extracted from a set of clustering results. The constraints in our model worked more efficiently than those in previous studies, and in our experiment they contributed to the stability of the clustering results. Moreover, clustering based on Bayesian inference was found to be more stable than clustering based on conventional maximum likelihood estimation.
- Paper published by the Information Processing Society of Japan (一般社団法人情報処理学会)
- 2009-05-25
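The abstract's contrast between distance-based and correlation-based similarity can be illustrated with a minimal sketch. It is not the constrained Gaussian mixture model proposed in the paper; it only compares plain Euclidean distance with the Pearson correlation coefficient. The three expression profiles and the helper names `euclidean` and `pearson` are hypothetical, made up for illustration.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical time-course profiles (illustrative numbers, not from the paper).
gene_a = [1.0, 2.0, 4.0, 2.0, 1.0]
gene_b = [3.0, 6.0, 12.0, 6.0, 3.0]  # same shape as gene_a, three times the amplitude
gene_c = [2.0, 2.0, 2.0, 1.5, 2.5]   # similar level to gene_a, different shape

for name, other in (("gene_b", gene_b), ("gene_c", gene_c)):
    print(name,
          "distance to gene_a:", round(euclidean(gene_a, other), 3),
          "correlation with gene_a:", round(pearson(gene_a, other), 3))
```

In this toy case the Euclidean distance places `gene_c` closer to `gene_a`, while the correlation identifies `gene_b` as the profile that rises and falls together with `gene_a`, which is the kind of co-expression a distance-based measure can miss.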
Authors
-
Naoto Yukinawa (行縄 直人)
Graduate School of Informatics, Kyoto University
-
Taku Yoshioka
ATR Computational Neuroscience Laboratories
-
Kazuo Kobayashi
Graduate School of Information Science, Nara Institute of Science and Technology
-
Naotake Ogasawara
Graduate School of Information Science, Nara Institute of Science and Technology
-
Shin Ishii
Graduate School of Informatics, Kyoto University
Related papers
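- Gene analysis of hepatocellular carcinoma using cDNA microarrays: an attempt at clinical application including prognosis prediction (105th Annual Congress of the Japan Surgical Society)
- Recognition of overlapping string-like objects from real images (Machine Learning, General)
- A method for identifying cancer subclasses from gene expression profiles using an ensemble of binary classifiers (Bioinformatics (1))
- System identification of gene expression time series by variational Bayesian estimation of a linear dynamical system model
- Multi-class pattern classification based on a probabilistic model for combining binary classifiers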
- A simulation model of perceptual learning that accounts for the effects of attention (Neurocomputing)
- Report on the Systems Neurobiology Spring School 2012