Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models
スポンサーリンク
概要
- 論文の詳細を見る
The main contribution of this paper is to show optimal parallel algorithms to compute the sum, the prefix-sums, and the summed area table on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters, the number p of threads, and the width w of the memory, and the memory access latency l. We first show that the sum of n numbers can be computed in $O({n\over w}+{nl\over p}+l\log n)$ time units on the DMM and the UMM. We then go on to show that $\Omega({n\over w}+{nl\over p}+l\log n)$ time units are necessary to compute the sum. We also present a parallel algorithm that computes the prefix-sums of n numbers in $O({n\over w}+{nl\over p}+l\log n)$ time units on the DMM and the UMM. Finally, we show that the summed area table of size $\sqrt{n}\times\sqrt{n}$ can be computed in $O({n\over w}+{nl\over p}+l\log n)$ time units on the DMM and the UMM. Since the computation of the prefix-sums and the summed area table is at least as hard as the sum computation, these parallel algorithms are also optimal.
著者
-
Nakano Koji
Department Of Applied Chemistry Faculty Of Engineering Kyushu University
-
NAKANO Koji
Department of Information Engineering, Hiroshima University
関連論文
- PJ-293 Reduced Progressive Common Carotid Artery Intimal-medial Thickness (IMT) Correlates with Restroration of Insulin Resistance in Coronary Heart Disease (CHD) Patients(Atherosclerosis, Clinical 4 (IHD) : PJ49)(Poster Session (Japanese))
- Energy-Efficient Initialization Protocols for Ad-Hoc Radio Networks
- Evaluation of Vulnerable Coronary Plaques and Non-Alcoholic Fatty Liver Disease (NAFLD) by 64-Detector Multislice Computed Tomography (MSCT)
- Characterization of Transendothelial Migratory Lymphokine-activated Killer Cells
- Functional and T Cell Receptor Gene Usage Analysis of Cytotoxic T Lymphocytes in Fresh Tumor-infiltrating Lymphocytes from Human Head and Neck Cancer
- Reversed-Phase High-Performance Liquid-Chromatographic Determination Systems Specific to Ultratrace Hard Metal Ions with Tridentate Schiff Bases and Pyridylhydrazones
- An Atomic Force Microscopy Assay of Intercalation Binding, Unwinding, and Elongation of DNA, Using a Water-Soluble Psoralen Derivative as a Covalent Binding Probe Molecule
- Flow Immunoassay for Nonioinic Surfactants Based on Surface Plasmon Resonance Sensors
- Synthesis of Circular Double-Stranded DNA Having Single-Stranded Recognition Sequence as Molecular-Physical Probe for Nucleic Acid Hybridization Detection Based on Atomic Force Microscopy Imaging
- Phospholipid-linked Coumarin : A Fluorescent Probe for Sensing Hydroxyl Radicals in Lipid Membranes
- Surface Plasmon Resonance Immunosensor for IgE Analysis Using Two Types of Anti-IgE Antibodies with Different Active Recognition Sites
- Electrochemical Immunoassay for Vitellogenin Based on Sequential Injection Using Antigen-immobilized Magnetic Microbeads
- Deafness Resilient MAC Protocol for Directional Communications
- Bioaffinity Sensor to Anti-DNA Antibodies Using DNA Modified Au Electrode
- Clipping-Free Halftoning and Multitoning Using the Direct Binary Search
- New Class of Catalysts for Alternating Copolymerization of Alkylene Oxide and Carbon Dioxide
- Effect of pioglitazone on various parameters of insulin resistance including lipoprotein subclass according to particle size by a gel-permeation high-performance liquid chromatography in newly diagnosed patients with type 2 diabetes
- Sequential Injection Immunoassay for Environmental Measurements
- Calix[n]arenes Provided with Thiols for Modified Electrode Applications;Ring-size Dependent Voltammetric Behavior toward Ferrocene Derivatives
- Eosinophil count is positively correlated with coronary artery calcification
- A Graph Rewriting Approach for Converting Asynchronous ROMs into Synchronous Ones
- AFM-Imaging Diagnosis Method for Single Nucieotide Polymorphism Using Molecular Beacon DNA as an Intramolecular Ligation Template of Target DNA and a Viewable Indicator
- Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models
- Offline Permutation Algorithms on the Discrete Memory Machine with Performance Evaluation on the GPU
- A Pivot-Hinge-Style DNA Immobilization Method with Adaptable Surface Concentration Based on Oligodeoxynucleotide-Phosphorothioate Chemisorption on Gold Surfaces
- Low insulin level is associated with aortic stiffness