Posted on 1999-09-01, 00:00. Authored by Erik Reidel, Garth Gibson, Christos Faloutsos.
The increasing performance and decreasing cost of processors and memory are causing system
intelligence to move into peripherals from the CPU. Storage system designers are using this trend
toward “excess” compute power to perform more complex processing and optimizations inside
storage devices. To date, such optimizations have been at relatively low levels of the storage protocol.
At the same time, trends in storage density, mechanics, and electronics are eliminating the bottleneck
in moving data off the media and putting pressure on interconnects and host processors to
move data more efficiently. We propose a system called Active Disks that takes advantage of processing
power on individual disk drives to run application-level code. Moving portions of an application’s
processing to execute directly at disk drives can dramatically reduce data traffic and take
advantage of the storage parallelism already present in large systems today. We discuss several
types of applications that would benefit from this capability with a focus on the areas of database,
data mining, and multimedia. We develop an analytical model of the speedups possible for scan-intensive
applications in an Active Disk system. We also experiment with a prototype Active Disk
system using relatively low-powered processors in comparison to a database server system with a
single, fast processor. Our experiments validate the intuition in our model and demonstrate speedups
of 2x on 10 disks across four scan-based applications. The model promises linear speedups in
disk arrays of hundreds of disks, provided the application data is large enough.
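
The abstract does not give the analytical model, so the following is only a minimal sketch of a generic bottleneck-style throughput model for a scan, written to illustrate why on-drive processing could scale with the number of disks. It is not the authors' model. All function names, parameters, and numbers are hypothetical assumptions chosen for illustration, and rates are in arbitrary units of input data per second.

```python
# Illustrative bottleneck model for a scan workload (NOT the paper's model).
# Every name and number here is a hypothetical assumption.

def server_throughput(n_disks, disk_rate, link_rate, host_cpu_rate):
    """Scan rate when all data is shipped to one host for processing.

    The slowest of three stages limits the scan: the aggregate disk
    rate, the interconnect, or the single host processor.
    """
    return min(n_disks * disk_rate, link_rate, host_cpu_rate)


def active_throughput(n_disks, disk_rate, drive_cpu_rate, link_rate, selectivity):
    """Scan rate when each drive processes its own data.

    Each drive is limited by its media rate or its on-drive processor.
    Only results cross the interconnect; `selectivity` is assumed to be
    input bytes per output byte (>= 1), so the link can sustain an
    input-side rate of link_rate * selectivity.
    """
    per_drive = min(disk_rate, drive_cpu_rate)
    return min(n_disks * per_drive, link_rate * selectivity)


def speedup(n_disks, disk_rate, drive_cpu_rate, link_rate, host_cpu_rate, selectivity):
    """Ratio of the on-drive scan rate to the host-based scan rate."""
    active = active_throughput(n_disks, disk_rate, drive_cpu_rate, link_rate, selectivity)
    server = server_throughput(n_disks, disk_rate, link_rate, host_cpu_rate)
    return active / server


if __name__ == "__main__":
    # Made-up parameters: slow drive processors, one faster host processor.
    params = dict(disk_rate=10.0, drive_cpu_rate=5.0, link_rate=40.0,
                  host_cpu_rate=25.0, selectivity=100.0)
    for n in (10, 20, 100):
        print(n, speedup(n, **params))
```

Under these assumed parameters the host processor saturates first, so the host-based scan rate stays flat as disks are added, while the on-drive rate grows with the disk count until the interconnect limit on results is reached. This is the qualitative behavior the abstract describes, not a reproduction of its measurements.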