On Efficient Sketching Algorithms
Sketching refers to a wide variety of techniques to compress large datasets into much smaller forms that can be efficiently processed to answer questions about the original dataset. Over the past few decades, sketching has emerged as a key tool to efficiently handle large datasets in majorly three settings: (i) the Classic setting, in which the dataset is given to us and we want to solve a problem as quickly as possible, (ii) the Streaming setting, in which the underlying dataset is defined by a large stream of updates and we want to compute interesting properties of the dataset using a small amount of space, and (iii) the Distributed setting, in which the dataset of interest is split among multiple servers and we want protocols that use a small amount of communication among servers to solve problems of interest on the underlying dataset.
Each of the above settings presents a different challenge with regard to the measure of efficiency we are interested in. In this thesis, we study sketching algorithms in these three settings for a variety of problems. While the techniques required to obtain our algorithms differ across problems and settings, the underlying idea of data compression to convert the original large dataset into a much smaller form is a key ingredient behind all the results in this thesis.
History
Date
2024-05-31Degree Type
- Dissertation
Department
- Computer Science
Degree Name
- Doctor of Philosophy (PhD)