Multi-level Temporal Blocking for Stencil Computation for Memory Hierarchy on TSUBAME2.5
Abstract
The domain size of a stencil computation on a GPU cluster is limited by the memory capacity of the GPUs. As domains grow to meet higher accuracy requirements, more GPUs must be employed to extend the memory capacity. In this paper, we propose new methods that apply temporal blocking to the device memory and registers of a set of GPUs, allowing computation on domains larger than the GPUs' memory capacity while maintaining high performance on TSUBAME2.5. We also analyze the parameters and performance differences between TSUBAME2.0 and TSUBAME2.5 so that our methods can be applied to a wide range of GPU clusters.
- 2014-02-24
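The abstract does not include the authors' implementation. As a minimal sketch of the general temporal blocking idea only, assuming a 1D 3-point averaging stencil with fixed end cells, the following compares a naive sweep with a blocked version that advances each block several time steps before moving on. All function names and parameters (`step`, `naive`, `temporal_blocked`, `block`, `bt`) are hypothetical, and a Python list slice stands in for the faster memory level (device memory or registers on a GPU).

```python
def step(u):
    """One Jacobi step of a 3-point averaging stencil; the two end cells stay fixed."""
    v = u[:]
    for i in range(1, len(u) - 1):
        v[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return v


def naive(u, steps):
    """Reference: sweep the whole domain once per time step."""
    for _ in range(steps):
        u = step(u)
    return u


def temporal_blocked(u, steps, block, bt):
    """Advance each spatial block by up to `bt` time steps per pass.

    Each block is loaded with a halo of k cells on each side (clipped at the
    domain ends). Every local step invalidates one more halo cell per side,
    so after k steps exactly the block's own cells are still correct.
    """
    n = len(u)
    done = 0
    while done < steps:
        k = min(bt, steps - done)       # time steps in this pass
        out = u[:]
        for s in range(0, n, block):
            e = min(s + block, n)
            lo = max(0, s - k)          # halo start, clipped to the domain
            hi = min(n, e + k)          # halo end, clipped to the domain
            local = u[lo:hi]            # "load" block plus halo into fast memory
            for _ in range(k):
                local = step(local)
            out[s:e] = local[s - lo:e - lo]  # "store" only the valid cells
        u = out
        done += k
    return u


if __name__ == "__main__":
    u0 = [float((i * 7) % 11) for i in range(64)]
    a = naive(u0, 10)
    b = temporal_blocked(u0, 10, block=16, bt=4)
    print(max(abs(x - y) for x, y in zip(a, b)))
```

The trade-off this sketch illustrates is that each pass reads and writes the slow memory once per `bt` time steps instead of once per step, at the cost of redundant computation in the halo cells.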
Authors
-
ENDO Toshio
Tokyo Institute of Technology
-
Satoshi Matsuoka
National Institute of Informatics
-
Satoshi Matsuoka
Tokyo Institute of Technology|JST-CREST|National Institute of Informatics
-
Guanghao Jin
Tokyo Institute of Technology|JST-CREST
-
Toshio Endo
Tokyo Institute of Technology|JST-CREST
Related papers
- MPI-CUDA Applications Checkpointing
- Efficient PageRank on GPU Clusters
- Low-overhead checkpoint for large-scale GPU-accelerated systems
- Web-site-based partitioning techniques for efficient parallelization of the PageRank computation (High Performance Computing)
- CG on GPU-enhanced Clusters
- Fast GPU Read Alignment with Burrows Wheeler Transform Based Index
- GPU-based approach for elastic-plastic deformation simulations
- Data Ownership Assurance in the Inter-Cloud supporting data dynamics
- Towards an Asynchronous Checkpointing System
- Towards Fast PGAS Implementation of Multithreaded Asynchronous Large-Scale Graph Traversal for Supercomputers with Local Semi-External Memory
- Towards a Dataflow FMM using the OmpSs Programming Model
- Avoiding silent data corruption in checkpoint files
- Burst SSD Buffer: Checkpoint Strategy at Extreme Scale
- Multi-level Temporal Blocking for Stencil Computation for Memory Hierarchy on TSUBAME2.5