Aggregate Router: An Efficient Inter-Cluster MPI Communication Facility
Abstract
In a cluster of clusters used for parallel computing, it is important to fully utilize the inter-cluster network. Existing MPI implementations for clusters of clusters have two issues: 1) A single point-to-point communication cannot utilize the full bandwidth of the inter-cluster network, because each node uses a Gigabit Ethernet interface for inter-cluster communication while more bandwidth is available between clusters. 2) Heavy packet loss and performance degradation occur with TCP/IP when many nodes generate short-term burst traffic. To overcome these issues, this paper proposes a novel method called the aggregate router method. In this method, multiple router nodes are set up in each cluster, and inter-cluster communication is performed via these router nodes. By striping a single message across multiple routers, the bottleneck caused by network interfaces is reduced. The packet congestion issue is also avoided by using the high-speed interconnect within a cluster instead of TCP/IP. The aggregate router method is evaluated using the HPC Challenge Benchmarks and the NAS Parallel Benchmarks. The results show that the proposed method outperforms the existing method by 24% in the best case.
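The striping idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function names `stripe` and `reassemble`, the fixed `chunk_size`, and the round-robin assignment are all assumptions made for illustration, and no real MPI or network calls are involved. It only shows how one message might be cut into sequence-numbered chunks, spread over several router nodes, and put back in order on the receiving side.

```python
from typing import List, Tuple

Chunk = Tuple[int, bytes]  # (sequence number, payload); hypothetical layout


def stripe(message: bytes, num_routers: int, chunk_size: int) -> List[List[Chunk]]:
    """Split a message into chunks and assign them round-robin to routers.

    Returns one list of (sequence number, payload) pairs per router.
    """
    if num_routers < 1 or chunk_size < 1:
        raise ValueError("num_routers and chunk_size must be positive")
    lanes: List[List[Chunk]] = [[] for _ in range(num_routers)]
    for seq, offset in enumerate(range(0, len(message), chunk_size)):
        payload = message[offset:offset + chunk_size]
        lanes[seq % num_routers].append((seq, payload))
    return lanes


def reassemble(lanes: List[List[Chunk]]) -> bytes:
    """Merge chunks received from all routers back into the original message."""
    chunks = [chunk for lane in lanes for chunk in lane]
    chunks.sort(key=lambda chunk: chunk[0])  # restore sender order
    return b"".join(payload for _, payload in chunks)


if __name__ == "__main__":
    msg = bytes(range(10))
    lanes = stripe(msg, num_routers=3, chunk_size=4)
    assert reassemble(lanes) == msg
```

With several routers, each router's interface carries only its share of the chunks, which is the intuition behind reducing the per-interface bottleneck; how the actual system chooses chunk sizes and routers is not stated in the abstract.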
Authors
- Matsuba Hiroya (Information Technology Center, The University of Tokyo)
- Ishikawa Yutaka (Graduate School of Information Science and Technology, The University of Tokyo)
Related papers
- Aggregate Router: An Efficient Inter-Cluster MPI Communication Facility
- Inter-kernel Communication between Multiple Kernels on Multicore Machines