A Distributed-Processing System for Accelerating Biological Research Using Data-Staging
Abstract
The number of biological databases has been increasing rapidly as a result of progress in biotechnology. As the amount and heterogeneity of biological data increase, it becomes more difficult to manage the data in a few centralized databases. Moreover, the number of sites storing these databases is growing, and their geographic distribution is widening. In addition, biological research tends to require a large amount of computational resources, i.e., a large number of computing nodes, and this computational demand has been increasing with the rapid progress of biological research. Methods that enable computing nodes to use such widely distributed database sites effectively are therefore desired. In this paper, we propose a method for providing data from the database sites to computing nodes. Since it is difficult to decide in advance which program will run on a node and which data it will request as input, we have introduced the notion of "data-staging" in the proposed method. Data-staging dynamically searches the database sites for the input data and transfers the data to the node where the program runs. We have developed a prototype system with data-staging using grid middleware. The effectiveness of the prototype system is demonstrated by measuring the execution time of a similarity search of several hundred gene sequences against 527 prokaryotic genomes.
- Paper of the Information Processing Society of Japan (IPSJ)
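The data-staging idea in the abstract (search the database sites for a program's input at run time, then transfer it to the node where the program runs) can be illustrated with a minimal sketch. This is not the authors' grid-middleware prototype: local directories stand in for remote database sites, a plain file copy stands in for the network transfer, and every name here (`find_site`, `stage`, `run_with_staging`, `demo`, the file `genome1.fasta`) is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def find_site(dataset, sites):
    """Return the first site directory that holds `dataset`, or None."""
    for site in sites:
        if (site / dataset).is_file():
            return site
    return None


def stage(dataset, sites, node_dir):
    """Search the sites for `dataset` and copy it into the node's directory."""
    site = find_site(dataset, sites)
    if site is None:
        raise FileNotFoundError(f"{dataset} not found at any site")
    node_dir.mkdir(parents=True, exist_ok=True)
    # A local copy stands in for the transfer from a remote site (assumption).
    return Path(shutil.copy(site / dataset, node_dir / dataset))


def run_with_staging(program, inputs, sites, node_dir):
    """Stage each input at run time, then call `program` on the local copies."""
    local_paths = [stage(name, sites, node_dir) for name in inputs]
    return program(local_paths)


def demo():
    """Stage one toy sequence file from two mock sites and count its records."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        site_a, site_b = root / "site_a", root / "site_b"
        site_a.mkdir()
        site_b.mkdir()
        # Only site_b holds the requested data set.
        (site_b / "genome1.fasta").write_text(">g1\nACGT\n")
        node = root / "node"
        count = run_with_staging(
            lambda paths: sum(p.read_text().count(">") for p in paths),
            ["genome1.fasta"],
            [site_a, site_b],
            node,
        )
        return count, (node / "genome1.fasta").exists()
```

The program never names a site: it only lists its inputs, and the lookup happens when the job starts, which is the property the abstract attributes to data-staging.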
Authors
- Matsuda Hideo, Graduate School of Engineering Science, Osaka University
- Date Susumu, Graduate School of Information Science and Technology, Osaka University
- Kido Yoshiyuki, Graduate School of Information Science and Technology, Osaka University
- Seno Shigeto, Graduate School of Information Science and Technology, Osaka University
- Takenaka Yoichi, Graduate School of Information Science and Technology, Osaka University
Related papers
- GXML: A Novel Method for Exchanging and Querying Complete Genomes by Representing them as Structured Documents
- Introduction of Aggregate Functions to a Language for Querying Structured Genome Documents (Summer Database Workshop 1999 (DBWS'99), Okinawa: In the seventh month of 1999, massive data will fall from the sky!) -- (4C: Semi-structured Data Retrieval (2))
- Introduction of Aggregate Functions to a Language for Querying Structured Genome Documents
- Querying Molecular Biology Databases by Integration Using Multiagents (Special Issue on New Generation Database Technologies)
- Implementation of a Parallel Prolog System on a Distributed Memory Parallel Computer (Special Issue on Parallel and Distributed Supercomputing)
- Analytic Space Management for Drug Design Application
- Conformational Search and Analysis of β-hairpin Formation by High-Speed Exhaustive Tree Search
- Retrieving Functionally Similar Bioinformatics Workflows Using TF-IDF Filtering
- A Distributed-Processing System for Accelerating Biological Research Using Data-Staging
- A Combination Method of the Tanimoto Coefficient and Proximity Measure of Random Forest for Compound Activity Prediction
- Improved Prediction Method for Protein Interactions Using Both Structural and Functional Characteristics of Proteins