File(s) stored somewhere else

Please note: Linked content is NOT stored on Carnegie Mellon University and we can't guarantee its availability, quality, security or accept any liability.

Synthetic Workload Performance Analysis of Incremental Updates

journal contribution
posted on 01.01.1994 by Kurt Shoens, Anthony Tomasic, Hector Garcia-Molina

Declining disk and CPU costs have kindled a renewed interest in efficient document indexing techniques. In this paper, the problem of incremental updates of inverted lists is addressed using a dual-structure index data structure that dynamically separates long and short inverted lists and optimizes the retrieval, update, and storage of each type of list. The behavior of this index is studied with the use of a synthetically-generated document collection and a simulation model of the algorithm. The index structure is shown to support rapid insertion of documents, fast queries, and to scale well to large document collections and many disks.