Carnegie Mellon University
Browse

On Efficient Sketching Algorithms

Download (2.21 MB)
thesis
posted on 2024-07-26, 20:58 authored by Praneeth KachamPraneeth Kacham

Sketching refers to a wide variety of techniques to compress large datasets into much smaller forms that can be efficiently processed to answer questions about the original dataset. Over the past few decades, sketching has emerged as a key tool to efficiently handle large datasets in majorly three settings: (i) the Classic setting, in which  the dataset is given to us and we want to solve a problem as quickly as possible, (ii)  the Streaming setting, in which the underlying dataset is defined by a large stream of  updates and we want to compute interesting properties of the dataset using a small  amountofspace, and(iii) the Distributed setting, in which the dataset of interest is split among multiple servers and we want protocols that use a small amount of communication among servers to solve problems of interest on the underlying dataset. 

Each of the above settings presents a different challenge with regard to the measure of efficiency we are interested in. In this thesis, we study sketching algorithms in these three settings for a variety of problems. While the techniques required to obtain our  algorithms differ across problems and settings, the underlying idea of data compression  to convert the original large dataset into a much smaller form is a key ingredient behind all the results in this thesis.  

Funding

Collaborative Research: AF: Small: Exploring the Frontiers of Adversarial Robustness

Directorate for Computer & Information Science & Engineering

Find out more...

History

Date

2024-05-31

Degree Type

  • Dissertation

Department

  • Computer Science

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

David P. Woodruff

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC