Systematic Detection of Statistically Overrepresented DNA Motif Association Rules
Abstract
DNA motifs, or cis-elements, are short nucleotide sequence patterns recognized by transcription factors (TFs). In promoters, TFs bind in complex combinations to regulate the expression of a downstream gene. The combinatorial space is large and difficult to manage, since vertebrates have thousands of transcription factors and more than 20,000 genes. We introduce a computer program called CAYCE (Combinatorial AnalYsis of Cis-Elements) that systematically detects statistically overrepresented DNA motif association rules without relying on microarray information. CAYCE is an adaptation of the apriori algorithm traditionally used for association rule mining, but offers three significant advancements: (1) it analyzes multiple occurrences of an item, corresponding to multiple TF binding sites; (2) it compares results with a biologically relevant background; and (3) it provides p-values for straightforward statistical interpretation. CAYCE can be applied to any item-set data where the investigator is interested in multiple occurrences of a single item, in overrepresentation of association rules compared with a background, or in both. Applying CAYCE to human promoters in 1% of the human genome, we find that motif clusters containing five repetitions of SP1 are the most statistically significant.
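The three ideas named in the abstract (repeated items, a background comparison, and a p-value per rule) can be illustrated with a small Python sketch. This is not CAYCE's implementation, which the abstract does not describe in detail. It assumes that each promoter is reduced to a count of binding sites per motif, that a rule is a multiset of motifs, and that overrepresentation is scored with a one-sided binomial test against a background rate smoothed by a pseudocount. The function names, the toy promoters, and the choice of test are all hypothetical.

```python
from collections import Counter
from math import comb


def contains(promoter: Counter, rule: Counter) -> bool:
    """True if the promoter has at least rule[m] sites for every motif m."""
    return all(promoter[m] >= k for m, k in rule.items())


def support(promoters, rule: Counter) -> int:
    """Number of promoters that contain the rule (with multiplicity)."""
    return sum(contains(p, rule) for p in promoters)


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def mine_rules(foreground, background, min_support=2, max_size=3):
    """Level-wise (apriori-style) search over motif multisets.

    A rule is a sorted tuple such as ("SP1", "SP1"), meaning two SP1 sites.
    Only rules that met min_support are extended, because a larger multiset
    can never occur in more promoters than any of its sub-multisets.
    """
    motifs = sorted({m for p in foreground for m in p})
    frontier = [()]
    results = []
    for _ in range(max_size):
        candidates = {tuple(sorted(rule + (m,))) for rule in frontier for m in motifs}
        frontier = []
        for cand in sorted(candidates):
            rule = Counter(cand)
            fg = support(foreground, rule)
            if fg < min_support:
                continue  # pruned: no extension of this rule can be frequent
            # Assumption: background rate with a +1/+2 pseudocount to avoid zero.
            bg_rate = (support(background, rule) + 1) / (len(background) + 2)
            p_value = binomial_tail(fg, len(foreground), bg_rate)
            results.append((cand, fg, p_value))
            frontier.append(cand)
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical): motif site counts per promoter.
foreground = [
    Counter({"SP1": 3, "NFY": 1}),
    Counter({"SP1": 2}),
    Counter({"SP1": 2, "NFY": 1}),
    Counter({"E2F": 1}),
]
background = [Counter({"SP1": 1}), Counter({"NFY": 1}), Counter({"E2F": 1}), Counter()]

if __name__ == "__main__":
    for rule, fg, p_value in mine_rules(foreground, background, max_size=2):
        print(rule, fg, f"{p_value:.4f}")
```

Representing a rule as a multiset is what lets a rule such as two SP1 sites be counted separately from a single SP1 site. A real analysis would also need a multiple-testing correction, which this sketch omits.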
- Papers from the Japanese Society for Bioinformatics
Japanese Society for Bioinformatics | Papers
- Performance Improvement in Protein N-Myristoyl Classification by BONSAI with Insignificant Indexing Symbol
- A combined pathway to simulate CDK-dependent phosphorylation and ARF-dependent stabilization for p53 transcriptional activity
- A versatile petri net based architecture for modeling and simulation of complex biological processes
- XML documentation of biopathways and their simulations in Genomic Object Net
- Prediction of debacle points for robustness of biological pathways by using recurrent neural networks