Limits of Thread-Level Parallelism in Non-numerical Programs
Abstract
Chip multiprocessors (CMPs), which recently became available with the advance of LSI technology, can outperform current superscalar processors by exploiting thread-level parallelism (TLP). However, the effectiveness of CMPs unfortunately depends greatly on their applications. In particular, they have so far not brought any significant benefit to non-numerical programs. This study explores what techniques are required to extract large amounts of TLP in non-numerical programs. We focus particularly on three techniques: thread partitioning with various control structure levels, speculative thread execution, and speculative register communication. We evaluate these techniques by examining the upper bound of the TLP, using trace-driven simulations. Our results are as follows. First, little TLP can be extracted without both of the speculations in any of the partitioning levels. Second, with the speculations, available TLP is still limited in conventional function-level and loop-level partitioning. However, it increases considerably with basic block-level partitioning. Finally, in basic block-level partitioning, focusing on control-equivalence instead of post-domination can significantly reduce the compile time, with a modest degradation of TLP.
- Paper published by the Information Processing Society of Japan (IPSJ)
Authors
- Ando Hideki: Department Of Biology Faculty Of Science Okayama University
- Nakajima Akio: Department Of Applied Chemistry Osaka Institute Of Technology
- Kobayashi Ryotaro: Department Of Electrical Engineering And Computer Science Nagoya University
- Shimada Toshio: Department of Biology, Faculty of Science and High Technology Research Center, Konan University
- Shimada Toshio: Department of Electrical Engineering and Computer Science, Nagoya University
Related papers
- PJ-249 Clinical Implication of Delayed Contrast Enhancement by Gd-DTPA MRI and Elevated Brain Natriuretic Peptide Hormone in Aortic Stenosis(MRI/MRA-4 (I) PJ42,Poster Session (Japanese),The 70th Anniversary Annual Scientific Meeting of the Japanese Circul
- The Cutting Balloon Blades and Calcified Lesions : Are the Blades Cutting into the Calcification? : An Intravascular Ultrasound Investigation
- Cutting Balloon Angioplasty for the Treatment of Calcified Coronary Lesions : An Intravascular Ultrasound Study
- Intravascular large B cell lymphoma : proposed of the strategy for early diagnosis and treatment of patients with rapid deteriorating condition
- Detection of Genes Encoding Cholera Toxin (CT), Zonula Occludens Toxin (ZOT), Accessory Cholera Enterotoxin (ACE) and Heat-Stable Enterotoxin (ST) in Vibrio mimicus Clinical Strains
- Hydrodynamic Evolution of Highly Energetic Matter Produced by Cylindrically Symmetric Heavy Ions Collisions
- A Priority Forwarding Scheme for Real-Time Multistage Interconnection Networks and Its Evaluation (Special Issue on Real-Time Processing Systems and Their Applications)
- Diesel Exhaust Particle-Induced Cell Death of Cultured Normal Human Bronchial Epithelial Cells
- Diesel Exhaust Particle-Induced Cell Death of Human Leukemic Promyelocytic Cells HL-60 and Their Variant Cells HL-NR6
- Effect of Polyethylene Glycol on the Synthesis of Oligopeptide by Papain in an Organic Medium(Organic Chemistry)
- PJ-177 Enhanced expression of V-1, a novel catecholamine biosynthesis regulatory protein, in atrial myocytes of hypertrophic heart of Dahl hypertensive rats(Hypertension, Basic 2 (H) : PJ30)(Poster Session (Japanese))
- Energy-Efficient Pre-Execution Techniques in Two-Step Physical Register Deallocation
- Limits of Thread-Level Parallelism in Non-numerical Programs(System Evaluation)
- Register File Size Reduction through Instruction Pre-Execution Incorporating Value Prediction
- Backward Flow of Mesonic Fluid in Heavy Ions Collision at Ultra High Energy : Particles and Fields
- High-dose methotrexate with R-CHOP therapy for the treatment of patients with primary central nervous system lymphoma
- kra-1, A GENE REQUIRED FOR KETAMINE RESPONSE IN THE NEMATODE Caenorhabditis elegans
- PI-36 Immunohistochemical Localization of PAF-receptor and Cross-talk between PAF- and ACTH-induced aldosterone secretion in Guinea Pig Adrenals
- Delay Evaluation of Issue Queue in Superscalar Processors with Banking Tag RAM and Correct Critical Path Identification
- Two-Step Physical Register Deallocation for Data Prefetching and Address Pre-Calculation
- Limits of Thread-Level Parallelism in Non-numerical Programs