Carnegie Mellon University
Browse
file.pdf (584.55 kB)

Practical Batch-Updatable External Hashing with Sorting

Download (584.55 kB)
journal contribution
posted on 1988-01-01, 00:00 authored by Hyeontaek Lim, David G. Andersen, Michael Kaminsky

This paper presents a practical external hashing scheme that supports fast lookup (7 microseconds) for large datasets (millions to billions of items) with a small memory footprint (2.5 bits/item) and fast index construction (151 K items/s for 1-KiB key-value pairs). Our scheme combines three key techniques: (1) a new index data structure (Entropy-Coded Tries); (2) the use of sorting as the main data manipulation method; and (3) support for incremental index construction for dynamic datasets. We evaluate our scheme by building an external dictionary on flash-based drives and demonstrate our scheme's high performance, compactness, and practicality.

History

Publisher Statement

All Rights Reserved

Date

1988-01-01

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC