Carnegie Mellon University

Practical Coding-Theoretic Tools for Machine Learning Systems and by Machine Learning Systems

thesis
posted on 2023-06-08, 21:00 authored by Jack Kosaian

Machine learning (ML) is now used in many domains, such as web services and safety-critical systems. This has led to the development of ML systems for deploying and training ML models. Beyond achieving high accuracy, ML systems must also use computing infrastructure efficiently and tolerate unreliable infrastructure.


Coding-theoretic tools enable many systems to operate reliably without the significant resource overhead that accompanies replication-based approaches. These tools are used in production storage and communication systems, and there is growing interest in their use for distributed computing.
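
As a concrete illustration of the overhead gap (our example, not one from the thesis): tolerating the loss of any one of k data blocks via replication requires a full copy of every block (100% overhead), whereas a single XOR parity block protects all k at 1/k overhead. A minimal NumPy sketch:

```python
import numpy as np

# A simple XOR parity code: one extra block protects k data blocks
# against the loss of any single block (1/k storage overhead, versus
# 100% for keeping a full replica of each block).
k = 4
rng = np.random.default_rng(0)
data = [rng.integers(0, 256, size=8, dtype=np.uint8) for _ in range(k)]

# Encode: the parity block is the XOR of all k data blocks.
parity = np.bitwise_xor.reduce(data)

# Simulate losing one block, then recover it from the survivors.
lost = 2
survivors = [d for i, d in enumerate(data) if i != lost]
recovered = np.bitwise_xor.reduce(survivors + [parity])
assert np.array_equal(recovered, data[lost])
```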


This thesis explores the interplay between ML systems and practical applications of coding theory. Specifically, we show how ML systems can be made more reliable and efficient via novel uses of coding-theoretic tools, and how coding-theoretic tools can be expanded in reach and made more efficient through techniques from ML and ML systems. We illustrate this interaction via multiple thrusts:


1) We show how properties unique to ML systems can be exploited to efficiently integrate coding-theoretic fault tolerance techniques into ML systems. First, we reduce the execution-time overhead of fault-tolerant inference on GPUs by up to 5.3× by exploiting trends in neural network design and GPU hardware. Second, we show how coding-theoretic tools can be coupled with the unique properties of recommendation models to enable low-overhead fault tolerance in training (first sketch after this list).
2) We demonstrate that co-designing coding-theoretic tools with ML systems offers new opportunities to extend these tools beyond prior limitations. Specifically, we enable resource-efficient fault tolerance in distributed prediction serving systems by using ML to overcome a key barrier in prior coding-theoretic tools (second sketch below).
3) We identify opportunities for ideas inspired by coding theory to improve the performance of ML systems, even when reliability is not a concern. We show that the throughput and GPU utilization of specialized convolutional neural network inference can be improved by up to 2.5× by combining images in a coding-theory-inspired manner and making small modifications to the model architecture (third sketch below).
4) Finally, we show that the encoding and decoding functions of one popular class of coding-theoretic tools, linear codes, can operate at higher throughput and with little developer effort via advancements in ML systems. We show how similarities between operations in linear codes and those in ML libraries enable linear codes to be represented via ML libraries, allowing libraries for computing linear codes to adopt the many optimizations that have gone into ML libraries. This approach outperforms custom libraries for computing linear codes by up to 2.2× (final sketch below).
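
First sketch: a classic algorithm-based fault tolerance (ABFT) checksum for a matrix multiply, the kind of low-overhead coding-theoretic check that fault-tolerant inference builds on. This is a minimal NumPy illustration with arbitrary shapes, not the thesis's GPU implementation.

```python
import numpy as np

# ABFT for C = A @ B, the core operation in neural-network layers.
# Verifying a checksum invariant costs O(n^2) extra work, versus
# O(n^3) to re-execute the multiply (replication in time).
rng = np.random.default_rng(1)
A = rng.standard_normal((64, 32))
B = rng.standard_normal((32, 48))

C = A @ B

# Invariant: (1^T A) B == 1^T (A B), i.e., the column sums of C.
col_check = A.sum(axis=0) @ B                 # computed alongside the multiply
assert np.allclose(col_check, C.sum(axis=0))  # a mismatch signals a fault
```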
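
Second sketch: the idea of coupling an erasure code with a learned model in prediction serving. Queries are combined by addition, a separate "parity model" runs on the combined query, and a missing prediction is approximated by subtraction. The functions `model` and `parity_model` below are hypothetical stand-ins; the role of learning is to make the parity model approximate the sum of the deployed model's outputs, which a hand-designed code cannot do for nonlinear models.

```python
import numpy as np

def model(x):
    # Stand-in for the deployed model's prediction function.
    return np.tanh(x)

def parity_model(xp):
    # Stand-in for a *trained* parity model: ideally,
    # parity_model(x1 + x2) ~ model(x1) + model(x2).
    return np.tanh(xp)

queries = [np.array([0.1]), np.array([0.2])]   # k = 2 queries
parity_query = sum(queries)                    # encoder: addition

preds = [model(q) for q in queries]
parity_pred = parity_model(parity_query)

# If the prediction for query 0 is slow or lost, approximate it by
# decoding: subtract the surviving prediction from the parity output.
reconstructed = parity_pred - preds[1]
```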
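
Third sketch: one plausible reading of "combining images" is stacking several images along the channel dimension so that a single, slightly widened network invocation serves several queries at once. The shapes below are assumptions for illustration, and the accompanying model-architecture changes are omitted.

```python
import numpy as np

# Fold f images (height x width x channels) into one input with f*3
# channels; one inference on the folded input stands in for f separate
# inferences, raising per-invocation work and GPU utilization.
f = 3
images = np.random.rand(f, 32, 32, 3).astype(np.float32)

folded = np.concatenate(list(images), axis=-1)
assert folded.shape == (32, 32, 3 * f)
```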
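
Final sketch: encoding under a linear code is a matrix multiplication of a generator matrix with the data, which is precisely the operation ML libraries optimize most aggressively. The toy integer generator below is hypothetical, and real erasure codes such as Reed-Solomon operate over finite fields rather than the integers; the sketch shows only the structural correspondence.

```python
import numpy as np

k, r = 4, 2                                 # k data symbols, r parity symbols
G = np.arange(1, r * k + 1).reshape(r, k)   # hypothetical parity-generator matrix
data = np.array([[5], [7], [2], [9]])       # one stripe of k data symbols

# Encoding is a matrix multiply, so any matmul-optimized ML library
# (GPU kernels, fused ops, autotuning) can compute it directly.
parities = G @ data
```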

Through these thrusts, this thesis demonstrates the promise of using coding-theoretic tools in ML systems and ideas from ML systems in coding-theoretic tools to bring about the next generation of efficient and reliable systems.

History

Date

2023-04-07

Degree Type

  • Dissertation

Department

  • Computer Science

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Rashmi Vinayak
