Fault Tolerance Design for Hadoop MapReduce on Gfarm Distributed Filesystem
Abstract
Many distributed file systems designed for MapReduce, such as the Google File System and HDFS (Hadoop Distributed File System), relax some POSIX requirements to enable high-throughput streaming access. Due to this lack of POSIX compatibility, it is difficult for programs other than MapReduce to access these file systems. It is often necessary to import files into these file systems, process the data, and then export the output to a POSIX-compatible file system, which results in a large number of redundant file operations. To solve this problem, we have proposed [9] the Hadoop-Gfarm plugin, which makes it possible to execute MapReduce jobs directly on top of Gfarm, a globally distributed file system. In this paper we analyse redundancy and reliability in a fault tolerance design for the Hadoop-Gfarm plugin. Our evaluation shows that the Hadoop-Gfarm plugin offers a reliable solution and performs just as well as Hadoop's native HDFS, allowing users to use a POSIX-compliant API and to reduce redundant copies without sacrificing performance.
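Third-party file system plugins like Hadoop-Gfarm are typically wired into Hadoop through its `core-site.xml` configuration, which maps a URI scheme to a `FileSystem` implementation class. A minimal sketch is shown below; the `gfarm` scheme and the implementation class name are illustrative assumptions, not taken from the paper:

```xml
<!-- core-site.xml: register a hypothetical Gfarm FileSystem implementation -->
<configuration>
  <property>
    <name>fs.gfarm.impl</name>
    <!-- hypothetical class name for the Hadoop-Gfarm plugin -->
    <value>org.apache.hadoop.fs.gfarmfs.GfarmFileSystem</value>
  </property>
  <!-- optionally make Gfarm the default file system for MapReduce jobs -->
  <property>
    <name>fs.defaultFS</name>
    <value>gfarm:///</value>
  </property>
</configuration>
```

With a mapping like this in place, MapReduce jobs can read and write `gfarm://` paths directly, while other POSIX programs access the same files through Gfarm's regular mount point, avoiding the import/export copies described in the abstract.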
- 2013-07-24
Authors
-
Osamu Tatebe
Graduate School of Systems and Information Engineering, University of Tsukuba
-
Marilia Melo
Graduate School of Systems and Information Engineering, University of Tsukuba