E3: an Elastic Execution Engine for Scalable Data Processing
Abstract
With the unprecedented growth in the volume of data being generated, it has become critical to develop efficient techniques for processing these massive data sets. To meet this challenge, analytical data processing systems must be efficient, scalable, flexible, and cost-effective. Recently, Hadoop, an open-source implementation of MapReduce, has gained interest as a promising big data processing system. Although Hadoop offers the desired flexibility and scalability, its performance has been noted to be suboptimal for complex analytical tasks. This paper presents E3, an elastic and efficient execution engine for scalable data processing. E3 takes a middle path between MapReduce and Dryad: it has a simpler communication model than Dryad, yet supports multi-stage jobs better than MapReduce. E3 avoids reprocessing intermediate results by adopting a stage-based evaluation strategy and by collocating data and user-defined (map or reduce) functions in independent processing units for parallel execution. Furthermore, E3 supports block-level indexes and built-in functions for specifying and optimizing data processing flows. Benchmarking on an in-house cluster shows that E3 achieves significantly better performance than Hadoop; in other words, building an elastically scalable and efficient data processing system is possible.
Authors
- Shi Lei (School of Computing, National University of Singapore)
- Vo Hoang (School of Computing, National University of Singapore)
- Ooi Beng (School of Computing, National University of Singapore)
- Jiang Dawei (School of Computing, National University of Singapore)
- Chen Ke (College of Computer Science, Zhejiang University)
- Wu Sai (School of Computing, National University of Singapore)
- Chen Gang (College of Computer Science, Zhejiang University)
Related papers
- Energy-saving Diagnosis of Ground Water-source Heat Pump System Based on Artificial Neural Network
- Unusual Magnetism Due to a Random Distribution of Cations in α-LiFeO2
- E3: an Elastic Execution Engine for Scalable Data Processing
- Global pattern of carbon stable isotopes of suspended particulate organic matter in lakes