Authors

Samuel McCauley

Type

Text | Dissertation

Advisor

Bender, Michael A. | Chen, Jing | Johnson, Rob | Fineman, Jeremy

Date

2016-12-01

Keywords

algorithms, external-memory, locality, packed memory array, scheduling | Computer science

Department

Department of Computer Science

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for the completion of a degree.

Identifier

http://hdl.handle.net/11401/77241

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

As computers become faster and more parallel, the number of computations a process executes is becoming too simplistic a predictor of performance. In many circumstances, modern algorithmic research can no longer rely on the assumption that each computation roughly corresponds to a set amount of time. For example, large-scale computers may spend far more time moving data around than performing computation, in which case minimizing data transfers is the best way to improve wall-clock performance. Similarly, manufacturing plants may need to focus on minimizing maintenance costs. While they must still guarantee good throughput, the maintenance costs are what most impact their profits. Under such circumstances, algorithms must be analyzed using models that incorporate these new costs.

This work focuses on using locality as a tool to improve performance across different models and objectives. We discuss three specific areas: scheduling, external memory, and high-performance computing. Our scheduling research deals with machines that need to be calibrated at large cost. Since jobs can only be scheduled on calibrated machines, we want to group jobs as much as possible—while retaining good throughput—to minimize the number of calibrations. We give algorithms and results for the offline setting (where jobs are known ahead of time) and the online setting (where jobs must be handled as they arrive).

The external-memory model measures the number of hard disk accesses made by an algorithm. This model is applicable for I/O-bound algorithms (such as large-scale sorting) whose performance is limited by hard drive access time rather than computation time. In this work, we focus on the packed memory array, a data structure used as a building block for external-memory algorithms. We show that we can retain optimal packed memory array performance while also providing history independence, a security guarantee.

Finally, we discuss a new type of memory, High-Bandwidth Memory, which has been proposed for the next generation of supercomputers. We modify the external-memory model to take this new kind of memory into account, and give theoretical analysis. We also give experimental results showing that this memory has the potential to significantly improve sorting performance.

79 pages
