Type

Text

Type

Dissertation

Advisor

Mitchell, Joseph | Bender, Michael A. | Johnson, Robert | Farach-Colton, Martin.

Date

2014-12-01

Keywords

Algorithms, Batched, External Memory, Lower Bounds, Searching, Sorting | Computer science

Department

Department of Computer Science.

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.

Identifier

http://hdl.handle.net/11401/77297

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

This dissertation presents variants on sorting and searching in external memory. In the first part of the dissertation, we derive lower and upper bounds on sorting with different-sized records. We show that the record size substantially affects the sorting complexity, and so does the final interleaving of the smaller and larger records in the final sorted sequence: sorting costs more when large and small records are segregated than when they are interleaved in the final sorted order. In the second part of the dissertation, we study the batched predecessor problem in external memory. Given the underlying sorted set S of size n, and a sorted query Q of size x <= n^c, 0 <= c < 1, we study tradeoffs between the searching cost, and the cost to preprocess S. We give lower bounds in three external memory models: the I/O comparison model, I/O pointer-machine model, and the indexability model. Our results show that in the I/O comparison model, the batched predecessor problem needs \Omega(\log_{B}n + 1/B) I/Os per element if the preprocessing is polynomial; with exponential preprocessing, the problem can be solved faster, in \Theta((\log_{2}n + 1)/B). In the third part of the dissertation, we introduce alternatives to the well-known Bloom filter data structure. The quotient filter is designed for RAM, but with better data locality than the Bloom filter. The buffered quotient filter and the cascade filter are SSD-optimized alternatives to the Bloom filter. In experiments, the cascade filter and buffered quotient filter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster. | 83 pages

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.