Cluster Inference Methods and Graphical Models Evaluated on NCI60 Microarray Gene Expression Data
Abstract
At present, there is no sound methodology for inferring causal gene expression relationships on a genome-wide basis. We address this first by examining the behaviour of some of the latest and fastest algorithms for tree and cluster analysis, particularly hierarchical methods popular in phylogenetics. Combined with these are two novel distances based on partial, rather than full, correlations. Theoretically, partial correlations should provide better evidence for regulatory genetic links than standard correlations. To compare the clusters obtained by the many alternative methods we use tree consensus methods. To compare the methods of analysis themselves we use tree partition metrics followed by another level of clustering. These, and a tree fit metric, all suggest that the new distances give trees quite different from those usually obtained. In the second part we consider graphical modeling of the interactions of important genes of the cell cycle. Although the models sometimes seem to fit well, and the experimental error structure seems close to multivariate normal, there are considerable problems to overcome. Latent variables, in this case important genes missing from the analysis, are inferred to have a strong effect on the partial correlations. Also, the data show clear evidence of sampling distributions conditional on the status of important cancer-related genes, including TP53. Without full information on which genes are wild type, the appropriate models cannot be fitted. These findings point to the need to include and distinguish not only all relevant genes but also all splice variants in the design phase of a microarray analysis. Failure to do so will induce problems similar to those caused by both latent variables and conditional distributions.
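The contrast the abstract draws between full and partial correlations can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not define its two distances, so the conversion `1 - |r|` used here is an assumption, as are the function names `partial_correlations` and `to_distance` and the simulated three-variable chain. Partial correlations are computed in the standard way from the inverse covariance (precision) matrix.

```python
import numpy as np

def partial_correlations(X):
    """Partial correlation of each pair of columns given all other columns.

    X has samples in rows and variables (e.g. genes) in columns.
    """
    # Pseudo-inverse keeps the sketch usable if the covariance is near-singular.
    precision = np.linalg.pinv(np.cov(X, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def to_distance(r):
    # Assumed conversion (not from the paper): strong association -> small distance.
    return 1.0 - np.abs(r)

# Simulated chain a -> b -> c: a and c are linked only through b.
rng = np.random.default_rng(0)
n = 2000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
X = np.column_stack([a, b, c])

cor = np.corrcoef(X, rowvar=False)   # full correlations
pcor = partial_correlations(X)       # partial correlations
d_full = to_distance(cor)
d_partial = to_distance(pcor)

print("full correlation a-c:   ", round(float(cor[0, 2]), 3))
print("partial correlation a-c:", round(float(pcor[0, 2]), 3))
```

In this toy chain the full correlation between `a` and `c` is large while their partial correlation given `b` is close to zero, so a partial-correlation distance separates an indirect link from a direct one. This also hints at the latent-variable problem described above: if `b` were left out of the analysis, the partial correlation between `a` and `c` would no longer vanish.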
Papers from the Japanese Society for Bioinformatics
- Performance Improvement in Protein N-Myristoyl Classification by BONSAI with Insignificant Indexing Symbol
- A combined pathway to simulate CDK-dependent phosphorylation and ARF-dependent stabilization for p53 transcriptional activity
- A versatile petri net based architecture for modeling and simulation of complex biological processes
- XML documentation of biopathways and their simulations in Genomic Object Net
- Prediction of debacle points for robustness of biological pathways by using recurrent neural networks