Understanding the Impact of BPRAM on Incremental Checkpoint
Abstract
Large-scale systems suffer from a variety of hardware and software failures, which motivates research on fault-tolerance techniques. Checkpoint-restart is a widely used fault-tolerance approach, especially in scientific computing systems. However, checkpointing overhead strongly affects overall system performance. Emerging byte-addressable, persistent memory technologies, such as phase change memory (PCM), make it possible to checkpoint at arbitrary data granularity, yet the impact of data granularity on checkpointing cost has not been fully addressed. In this paper, we investigate how data granularity influences the performance of a checkpoint system. We further design and implement a high-performance checkpoint system named AG-ckpt, a hybrid-granularity incremental checkpointing scheme built on (1) low-cost modified-memory detection and (2) fine-grained memory duplication. We also formulate the relationship between performance and granularity in checkpointing systems as a mathematical model and derive its optimal solutions. Experiments on several typical benchmarks verify the performance gain of our design: compared with conventional incremental checkpointing, AG-ckpt reduces the amount of checkpoint data by up to 50% and provides a speedup of 1.2x-1.3x in checkpoint efficiency.
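To illustrate why granularity matters for incremental checkpointing, the following is a minimal sketch, not the AG-ckpt implementation. It compares two in-memory snapshots block by block and estimates how many bytes an incremental checkpoint would write at a given block size. The function names, the snapshot-comparison approach to detecting modified memory, and the 8-byte per-block metadata cost are all assumptions made for illustration.

```python
def dirty_blocks(old: bytes, new: bytes, granularity: int) -> list[int]:
    """Return the indices of fixed-size blocks whose contents differ.

    Assumption: modified memory is found by comparing two snapshots;
    a real system would typically use a cheaper detection mechanism.
    """
    if len(old) != len(new):
        raise ValueError("snapshots must have equal length")
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    return [
        start // granularity
        for start in range(0, len(old), granularity)
        if old[start:start + granularity] != new[start:start + granularity]
    ]


def checkpoint_bytes(old: bytes, new: bytes, granularity: int,
                     meta_per_block: int = 8) -> int:
    """Estimate bytes written by one incremental checkpoint.

    The cost is the payload of every dirty block plus a hypothetical
    fixed metadata cost per block (meta_per_block is an assumption).
    """
    blocks = dirty_blocks(old, new, granularity)
    # The final block may be shorter than the granularity.
    payload = sum(min(granularity, len(new) - b * granularity) for b in blocks)
    return payload + meta_per_block * len(blocks)


if __name__ == "__main__":
    size = 16 * 1024
    old = bytes(size)
    buf = bytearray(old)
    # Touch a few scattered bytes, as a sparse-write workload might.
    for offset in (10, 5000, 5010, 12000):
        buf[offset] = 0xFF
    new = bytes(buf)
    for g in (4096, 256, 64):
        print(f"granularity={g:5d}  checkpoint bytes={checkpoint_bytes(old, new, g)}")
```

In this toy workload, page-sized blocks copy far more data than the few bytes actually modified, while very small blocks shrink the payload but pay the per-block metadata cost more often. That tension is the kind of trade-off a performance-granularity model would need to capture.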
Authors
- Zhou Xu, National Astronomical Observatories, Chinese Academy of Sciences
- LU Kai, National University of Defence Technology
- LI Xu, National University of Defence Technology
- WANG Xiaoping, National University of Defence Technology
- DAI Bin, National University of Defence Technology
- ZHOU Xu, National University of Defence Technology
Related Papers
- Sixteen-Color Photometry of Galaxy Cluster Abell 566
- Multicolor Photometry of the Galaxies in Abell 1775 : Substructures, Luminosity Functions, and Star-Formation Properties
- Understanding the Impact of BPRAM on Incremental Checkpoint