On Designing Resource-Constrained CNNs Efficiently

Chin, Ting-wu

doi:10.1184/R1/19249499.v1

Chin_cmu_0041E_10683.pdf (15.07 MB)

thesis

posted on 2022-03-03, 20:45, authored by Ting-wu Chin

Deep Convolutional Neural Networks (CNNs) have been adopted in many computer vision applications to achieve high performance. However, the growing computational demand of CNNs has made it increasingly difficult to deploy state-of-the-art CNNs onto resource-constrained platforms. As a result, model compression/acceleration has emerged as an important field of research. In this thesis, we intend to make CNNs more friendly for resource-limited platforms from two perspectives. The first perspective is to introduce novel ways of compressing/accelerating CNNs, and the second perspective is to reduce the overhead of existing methodologies for constructing resource-constrained CNNs.

In the first perspective, we propose one novel technique for model acceleration and another for model compression. First, we propose AdaScale, an algorithm that automatically scales the resolution of input images to improve both the speed and accuracy of a video object detection system. Second, we identify the Winning-Bitwidth phenomenon, where some weight bitwidths are more efficient than others for model compression when the filter counts of the CNNs are allowed to change.

In the second perspective, we propose three novel algorithms for accelerating existing filter pruning methods for constructing resource-constrained CNNs. First, we propose LeGR, an algorithm that aims to learn a global ranking among filters of a pre-trained CNN so that compressing the CNN to different target constraint levels using filter pruning can be done efficiently by greedily pruning the filters following the learned ranking. Second, we improve upon LeGR and propose Joslim, an algorithm that trains a CNN from scratch by jointly optimizing its weights and filter counts such that the trained CNN can be pruned without fine-tuning. Joslim improves upon LeGR in terms of efficiency, as LeGR requires the pruned models to be fine-tuned to be usable. Lastly, we propose Width Transfer, which improves the efficiency of filter pruning methods that are derived from a neural architecture search perspective. Width Transfer assumes that the optimized filter counts are regular across depths and widths of a CNN architecture and are invariant to the size and the resolution of the training dataset. As a result, Width Transfer performs neural architecture search for filter counts by solving a proxy problem that has a much lower overhead.
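The idea of greedily pruning filters by following a single global ranking can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `greedy_prune`, the per-filter scores, and the constant per-filter cost for each layer are all hypothetical simplifications chosen for illustration (in a real CNN, the cost of a filter also depends on the widths of adjacent layers, and LeGR learns the ranking rather than taking it as given).

```python
from typing import Dict, List


def greedy_prune(scores: Dict[str, List[float]],
                 cost_per_filter: Dict[str, float],
                 budget: float) -> Dict[str, int]:
    """Return how many filters each layer keeps after removing the
    globally lowest-ranked filters until the total cost fits the budget.

    scores: hypothetical importance score of every filter, per layer.
    cost_per_filter: assumed constant cost of one filter in each layer.
    """
    kept = {layer: len(layer_scores) for layer, layer_scores in scores.items()}
    total = sum(kept[layer] * cost_per_filter[layer] for layer in kept)

    # One global ranking over all filters in all layers, lowest score first.
    ranking = sorted(
        (score, layer)
        for layer, layer_scores in scores.items()
        for score in layer_scores
    )

    for _, layer in ranking:
        if total <= budget:
            break
        # Never remove the last filter of a layer, which would cut the network.
        if kept[layer] > 1:
            kept[layer] -= 1
            total -= cost_per_filter[layer]
    return kept


# Toy example: the same ranking is reused for two different budgets.
scores = {"conv1": [0.9, 0.1, 0.5], "conv2": [0.2, 0.8, 0.7, 0.3]}
cost = {"conv1": 2.0, "conv2": 1.0}
mild = greedy_prune(scores, cost, budget=8.0)
aggressive = greedy_prune(scores, cost, budget=6.0)
```

Because the ranking is computed once, meeting a different constraint level only requires walking further down the same sorted list, which is the efficiency argument the abstract makes for a learned global ranking.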

History

Date

2021-07-07

Degree Type

Dissertation

Department

Electrical and Computer Engineering

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Diana Marculescu

Keywords

Automated Machine Learning; Convolutional Neural Networks; Machine Learning for Edge Devices

Licence

In Copyright
