Algorithms from diverse domains often have tunable parameters that have a significant impact on performance metrics such as solution quality, runtime, and memory usage. Typically, the optimal setting depends intimately on the application domain at hand. Hand-tuning parameters is often tedious, time-consuming, and error-prone, so a burgeoning line of research has studied automated algorithm configuration via machine learning, where a training set of typical problem instances from the application at hand is used to find high-performing parameter settings.
In this thesis, we help develop the theory and practice of automated algorithm configuration. We investigate both statistical and algorithmic questions. For example,
how large should the training set be to ensure that an algorithm’s average performance over the training set is indicative of its future performance on unseen instances? How can we algorithmically find provably high-performing configurations? As we answer these questions, we analyze parameterized algorithms from a variety of domains, including:
1. Integer programming. We study branch-and-bound algorithms for integer linear programming, the most widely-used tools for solving combinatorial and nonconvex
problems. Beyond answering the algorithmic and statistical questions above, we provide experiments demonstrating that no one parameter setting is optimal across all application domains, and that tuning parameters using our
approach can have a significant impact on algorithmic performance. We also analyze integer quadratic programming approximation algorithms, which can
be used to find nearly-optimal solutions to a variety of combinatorial problems.
2. Mechanism design. A mechanism is a special type of algorithm that plays a crucial role in economics and political science. A mechanism’s purpose is to help a set of agents come to a collective decision. In economics, for example, a mechanism might dictate how a set of items should be split among the agents, given their values for those items. Mechanisms often have tunable parameters that impact, for example, their revenue. No one setting is optimal across all mechanism design scenarios. In this thesis, we analyze sales mechanisms, where the goal is to maximize revenue, and voting mechanisms, where the goal is to maximize the agents’ total value for the outcome the mechanism selects.
3. Computational biology. Algorithms from computational biology are often highly parameterized, and understanding which parameter settings are optimal in which scenarios is an active area of research. We analyze parameterized algorithms for several fundamental problems from computational biology, including sequence alignment, RNA folding, and predicting topologically associating domains in DNA sequences. The key challenge from both an algorithmic and statistical perspective is that across these three diverse domains, an algorithm’s performance is an erratic function of its parameters. This is because a small tweak to the parameters can cause a cascade of changes in the algorithm’s performance. We develop tools for analyzing and
optimizing these volatile performance functions, which we use to provide parameter tuning procedures and strong statistical guarantees.