Open-world Object Detection and Tracking

Dave, Achal

doi:10.1184/R1/16660855.v1

thesis

Posted on 2021-09-23, 19:56. Authored by Achal Dave

Computer vision today excels at recognizing narrow slices of the real world: our models seem to accurately detect objects like cats, cars, or chairs in benchmark datasets. However, deploying models requires that they work in the open world, which includes arbitrary objects in diverse settings.

Current methods struggle on both axes: they recognize only a few classes, and they struggle in settings that differ from the training distribution. A model that addresses these challenges can serve as a fundamental building block for downstream applications, including recognizing actions, manipulating objects, and navigating around obstacles.

This thesis presents our work in building robust models for detecting and tracking any object, especially ones with few or even no training examples. We start by exploring how traditional models, which recognize only a small set of object classes, generalize to the real world. We show that current methods are extremely sensitive: even subtle changes in the input image or test distribution can lead to drops in accuracy. Our systematic evaluations show that models, even ones trained for robustness to adversarial or synthetic corruptions, often correctly classify one frame of a video but fail on a perceptually similar nearby frame. A similar phenomenon applies even to small distribution shifts arising from natural variation between datasets. Finally, we present an approach for addressing an extreme form of generalization to object appearance: detecting fully occluded objects.

Next, we explore generalization to large or infinite vocabularies, which contain rare and never-before-seen classes. Since current datasets are largely limited to a small, closed-world set of objects, we first present a large vocabulary benchmark for measuring progress in detection and tracking. We show that current evaluations do not suffice for large vocabulary benchmarks, and present alternative metrics that appropriately evaluate progress in this setting. Finally, we present approaches that leverage advances in closed-world recognition to build accurate, generic detectors and trackers for any object.

History

Date

2021-05-23

Degree Type

Dissertation

Department

Robotics Institute

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Deva Ramanan

Keywords

Open World, detection, tracking, robustness

Licence

CC BY 4.0
