Survival Analysis by Penalized Regression and Matrix Factorization
Abstract
Because every disease has its own survival pattern, a suitable model is needed to simulate patient follow-up. DNA microarrays measure the expression of thousands of genes at once and are commonly used to classify different types of cancer. In this technical report, we propose methods that combine penalized regression models with nonnegative matrix factorization (NMF) to predict survival. We applied L1 (lasso), L2 (ridge), and combined L1-L2 (elastic net) penalized regression to microarray data from diffuse large B-cell lymphoma (DLBCL) patients and found that the combined L1-L2 method predicts survival best, giving the smallest logrank p-value. Moreover, 80% of the selected genes have been reported to correlate with carcinogenesis or lymphoma. Using NMF, we found that DLBCL patients can be clearly divided into four groups, which implies that DLBCL may have four subtypes with slightly different survival patterns. Next, we excluded the patients that NMF indicated were hard to classify and ran the three penalized regression models again. Survival prediction improved, with lower logrank p-values. We therefore conclude that, after pre-selection of patients by NMF, penalized regression models can successfully predict the survival of DLBCL patients.
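As a rough illustration of how the three penalties named in the abstract relate to one another, the following minimal sketch computes a penalty term with a mixing parameter. The function name `elastic_net_penalty`, the parameters `lam` and `alpha`, the 1/2 factor on the L2 term, and the coefficient values are all assumptions made for illustration; the report's actual formulation and software are not given on this page.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Penalty term added to a regression loss (hypothetical helper).

    alpha = 1 gives the L1 (lasso) penalty, alpha = 0 gives the L2 (ridge)
    penalty, and 0 < alpha < 1 mixes the two (elastic net).
    The 1/2 factor on the L2 term is one common convention, assumed here.
    """
    l1 = sum(abs(b) for b in beta)   # sum of absolute coefficients
    l2 = sum(b * b for b in beta)    # sum of squared coefficients
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


beta = [1.0, -2.0, 0.0]  # made-up coefficients, for illustration only
lasso = elastic_net_penalty(beta, lam=1.0, alpha=1.0)
ridge = elastic_net_penalty(beta, lam=1.0, alpha=0.0)
mixed = elastic_net_penalty(beta, lam=1.0, alpha=0.5)
```

Under this convention a single function covers all three models, so comparing them amounts to varying `alpha` while tuning `lam`.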
- 2013-09-19
Authors
-
Tatsuya Akutsu
Bioinformatics Center, Institute for Chemical Research, Kyoto University
-
Morihiro Hayashida
Bioinformatics Center, Institute for Chemical Research, Kyoto University
-
Yeuntyng Lai
Bioinformatics Center, Institute for Chemical Research, Kyoto University
Related papers
- Prediction of Protein Folding Rates from Structural Topology and Complex Network Properties
- RNA-RNA Interaction Prediction Using Integer Programming with Threshold Cut
- A Quadsection Algorithm for Grammar-Based Image Compression
- Analyses and Algorithms for Predecessor and Control Problems for Boolean Networks of Bounded Indegree
- Conditional Random Field Approach to Prediction of Protein-protein Interactions Using Domain Information
- Integer Programming and Dynamic Programming-based Methods of Optimizing Control Policy in Probabilistic Boolean Networks with Hard Constraints
- An Improved Clique-Based Method for Computing Edit Distance between Rooted Unordered Trees
- Prediction of protein residue contacts using discriminative random field
- Efficient Computation of Impact Degrees for Multiple Reactions in Metabolic Networks with Cycles
- Prediction of RNA Secondary Structures with Binding Sites Using Dynamic Programming Algorithm
- Message from the Editor-in-Chief
- Protein complex prediction via improved verification methods using constrained domain-domain matching
- Predicting Protein-RNA Residue-base Contacts Using Two-dimensional Conditional Random Field
- Finding Conserved Regions in Protein Structures Using Support Vector Machines and Structure Alignment
- Inferring Strengths of Protein-Protein Interactions Using Support Vector Regression
- Evaluating Effectiveness of Accessibility to Infer RNA-RNA Interactions
- A Dominating Set Approach to Structural Controllability of Unidirectional Bipartite Networks
- Prediction of Heterodimeric Protein Complexes from Weighted Protein-Protein Interaction Networks Using Novel Features and Kernel Functions
- Breadth-first Search Approach to Enumeration of Tree-like Chemical Compounds
- Algorithms for Finding a Largest Common Subtree of Bounded Degree
- Parallelization of Enumerating Tree-like Chemical Compounds by Breadth-first Search Order
- Prediction of Heterotrimeric Protein Complexes by Two-phase Learning Using Neighboring Kernels
- Grammar-based Compression for Multiple Trees Using Integer Programming