Respawn: A Distributed Multi-Resolution Time-Series Datastore
As sensor networks gain traction and begin to scale, we will increasingly face challenges associated with managing large-scale time-series data. In this paper, we present a cloud-to-edge partitioned architecture called Respawn that is capable of serving large amounts of time-series data from a continuously updating datastore with access latencies low enough to support interactive real-time visualization. Respawn targets sensing systems where resource-constrained edge node devices may have only limited or intermittent network connections linking them to a cloud-backend. The cloud-backend provides aggregate storage and transparent dispatching of data queries to edge node devices. Data is downsampled as it enters the system, creating a multi-resolution representation capable of low-latency range-based queries. Lower-resolution aggregate data is automatically migrated from edge nodes to the cloud-backend, both for improved consistency and for caching. To further mask latency from users, edge nodes automatically identify and migrate blocks of data that contain statistically interesting features. We show through simulation and micro-benchmarking that Respawn is able to run on ARM-based edge node devices connected to a cloud-backend, with the ability to serve thousands of clients and terabytes of data with sub-second latencies.
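To illustrate the idea of downsampling on ingest into a multi-resolution representation, the following is a minimal Python sketch, not Respawn's actual implementation. The class name `MultiResolutionStore`, the choice of min/max/mean aggregates, the bucket widths, and the rule for selecting a resolution at query time are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Bucket:
    """Aggregate for one time window (the aggregate set is an assumption)."""
    start: int        # window start timestamp, inclusive
    minimum: float
    maximum: float
    total: float
    count: int

    @property
    def mean(self) -> float:
        return self.total / self.count


class MultiResolutionStore:
    """Hypothetical store keeping one aggregate table per resolution level."""

    def __init__(self, base_width: int = 1, factor: int = 4, levels: int = 3):
        # Window widths from finest to coarsest, e.g. 1, 4, 16 time units.
        self.widths: List[int] = [base_width * factor ** i for i in range(levels)]
        self.levels: List[Dict[int, Bucket]] = [dict() for _ in self.widths]

    def insert(self, timestamp: int, value: float) -> None:
        # Downsample on ingest: update the matching window at every level.
        for width, level in zip(self.widths, self.levels):
            start = timestamp - timestamp % width
            bucket = level.get(start)
            if bucket is None:
                level[start] = Bucket(start, value, value, value, 1)
            else:
                bucket.minimum = min(bucket.minimum, value)
                bucket.maximum = max(bucket.maximum, value)
                bucket.total += value
                bucket.count += 1

    def query(self, t0: int, t1: int, max_points: int) -> Tuple[int, List[Bucket]]:
        # Choose the finest level that covers [t0, t1) in at most max_points
        # windows; if none does, the loop ends on the coarsest level.
        for width, level in zip(self.widths, self.levels):
            if (t1 - t0) / width <= max_points:
                break
        first = t0 - t0 % width
        buckets = [level[s] for s in range(first, t1, width) if s in level]
        return width, buckets
```

Under these assumptions, a client rendering a plot a few hundred pixels wide would pass that pixel count as `max_points`, so the amount of data returned stays bounded regardless of how long a time range is requested.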