Text Mining and Protein Annotations:The Construction and Use of Protein Description Sentences
スポンサーリンク
概要
- 論文の詳細を見る
Existing biological knowledge stored as structured database records has been extracted manually by database curators analyzing the scientific literature. Most of this information was derived from sentences which describe biologically relevant aspects of genes and gene products. We introduce the Protein description sentence (Prodisen) corpus, a useful resource for the automatic identification and construction of text-based protein and gene description records using information extraction and text classification techniques. Basic guidelines and criteria relevant for the construction of a text corpus of functional descriptions of genes and proteins are proposed. The steps used for the corpus construction and its features are presented. Moreover, some of the potential applications of the Prodisen corpus for biomedical text mining purposes are explored and the obtained results are presented.
- 日本バイオインフォマティクス学会の論文
日本バイオインフォマティクス学会 | 論文
- Performance Improvement in Protein N-Myristoyl Classification by BONSAI with Insignificant Indexing Symbol
- A combined pathway to simulate CDK-dependent phosphorylation and ARF-dependent stabilization for p53 transcriptional activity
- A versatile petri net based architecture for modeling and simulation of complex biological processes
- XML documentation of biopathways and their simulations in Genomic Object Net
- Prediction of debacle points for robustness of biological pathways by using recurrent neural networks