Graphical Models and Overlay Networks for Reasoning about Large Distributed Systems
This thesis examines reasoning under uncertainty in distributed systems. Unlike in centralized
systems, where the observations reside in a single location, the observations in distributed
systems are often scattered across the network. To reason accurately, a networked
device often needs to incorporate observations from other nodes and must do so with limited
computation and communication even for large problems. The reasoning is further complicated
by unstable network conditions, characteristic of many real-world networks: nodes
may fail, communication links may become unreliable, and the entire network may fragment
into several components that cannot communicate with each other. These aspects
make distributed inference very challenging.
We consider one general problem, distributed filtering for estimating the state of a dynamical
system, and three independent applications: simultaneous localization and tracking,
where a camera network localizes itself by observing a moving object, internal localization of
large-scale modular robots, where a robot determines the relative poses of its internal parts,
and collaborative filtering for providing recommendations in a peer-to-peer network. These
problems share a common theme: each can be described by a graphical
model that permits a compact representation of the problem and efficient reasoning about it. Using
graphical models, we design algorithms that address challenges such as inconsistency of
node beliefs in fragmented networks and difficult local optima in modular robot localization.
Due to the complexity of the reasoning tasks, it is not sufficient to coordinate the nodes locally
within each node’s immediate physical neighborhood. Instead, our algorithms employ
overlay networks—distributed data structures built on top of the physical networks—to coordinate
among distant nodes. The resulting algorithms obey the communication constraints
imposed by the network, while solving the problems robustly.
We evaluate our algorithms on data from real sensor networks and on a realistic deployment
on the PlanetLab network. We demonstrate robustness to network fluctuations and show that, in
some cases, our distributed algorithms improve upon state-of-the-art centralized approaches.