A Method for Isoform Prediction from RNA-Seq Data by Iterative Mapping
Abstract
Alternative splicing plays an important role in eukaryotic gene expression by producing diverse proteins from a single gene. Predicting how genes are transcribed is of great biological interest. To this end, massively parallel whole-transcriptome sequencing, often referred to as RNA-Seq, is becoming widely used and is revolutionizing the cataloging of isoforms using a vast number of short mRNA fragments called reads. Conventional RNA-Seq analysis methods typically align reads onto a reference genome (mapping) to determine, from an RNA-Seq dataset, which isoforms each gene yields and how strongly each isoform is expressed. However, a considerable number of reads cannot be mapped uniquely. These so-called multireads, which map to multiple locations because of short read length and similar sequences, increase the uncertainty as to how genes are transcribed. This causes inaccurate gene expression estimates and leads to incorrect isoform prediction. To cope with this problem, we propose a method for isoform prediction by iterative mapping. The positions from which multireads originate can be estimated from expression levels, whereas quantifying isoform-level expression requires accurate mapping. These two procedures are mutually dependent, and therefore remapping reads is essential. By iterating this cycle, our method estimates gene expression levels more precisely and hence improves predictions of alternative splicing. At the same time, our method estimates isoform-level expression by computing, within each gene, how many reads originate from each candidate isoform using an EM algorithm. To validate the effectiveness of the proposed method, we compared its performance with conventional methods using an RNA-Seq dataset derived from a human brain. The proposed method had a precision of 66.7% and outperformed conventional methods in terms of the isoform detection rate.
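The abstract does not give the details of the EM step, so the following is only a minimal sketch of the general idea of sharing ambiguous reads among candidate isoforms in proportion to their current abundance estimates. It is not the authors' implementation: the function name `em_isoform_abundance` and the input layout are hypothetical, isoform lengths are ignored, and the remapping cycle described above is not modeled.

```python
def em_isoform_abundance(compat, n_iter=200, tol=1e-9):
    """Estimate relative isoform abundances within one gene (illustrative only).

    compat[r] lists the indices of the candidate isoforms that read r is
    compatible with. This input format is an assumption made for the sketch.
    """
    n_iso = 1 + max(i for c in compat for i in c)
    theta = [1.0 / n_iso] * n_iso  # start from uniform abundances
    for _ in range(n_iter):
        counts = [0.0] * n_iso
        # E-step: split each read among its compatible isoforms in
        # proportion to the current abundance estimates.
        for c in compat:
            total = sum(theta[i] for i in c)
            for i in c:
                counts[i] += theta[i] / total
        # M-step: new abundances are the expected fractions of reads.
        new_theta = [x / len(compat) for x in counts]
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta
```

For example, with six reads unique to isoform 0, two unique to isoform 1, and four reads compatible with both, `em_isoform_abundance([[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4)` converges toward abundances of 0.75 and 0.25, so the ambiguous reads are shared 3:1 rather than evenly.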
- 2012-06-21
Authors
-
Tomoshige Ohno
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University
-
Shigeto Seno
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University
-
Yoichi Takenaka
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University
-
Hideo Matsuda
Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University
Related papers
- Retrieving Functionally Similar Bioinformatics Workflows Using TF-IDF Filtering
- A Distributed-Processing System for Accelerating Biological Research Using Data-Staging
- A Combination Method of the Tanimoto Coefficient and Proximity Measure of Random Forest for Compound Activity Prediction
- Improved Prediction Method for Protein Interactions Using Both Structural and Functional Characteristics of Proteins
- A Method for Isoform Prediction from RNA-Seq Data by Iterative Mapping