Self-Supervised Learning on Mobile Robots Using Acoustics, Vibration, and Visual Models to Build Rich Semantic Terrain Maps

Libby, Jacqueline

doi:10.1184/R1/11371482.v1

jlibby_phd_ri_2019.pdf (40.4 MB)

thesis

posted on 2019-12-18, 19:24, authored by Jacqueline Libby

Humans and robots would benefit from having rich semantic maps of the terrain in which they operate. Mobile robots equipped with sensors and perception software could build such maps as they navigate through a new environment. This information could then be used by humans or robots for better localization and path planning, as well as a variety of other tasks. However, it is hard to build good semantic maps without a great deal of human effort and robot time. Prior work has addressed this problem, but it does not provide a high level of semantic richness, and in some cases the approaches require extensive human data labeling and robot driving time. We use a combination of better sensors and features, both proprioceptive and exteroceptive, together with self-supervised learning to solve this problem. We enhance proprioception by exploring new sensing modalities such as sound and vibration, which in turn increases the number and variety of terrain types that can be estimated. We build a supervised proprioceptive multiclass model that predicts seven terrain classes. The proprioceptive predictions are then used as labels to train a self-supervised exteroceptive model from camera data. This exteroceptive model can then estimate those same terrain types more reliably in new environments. The exteroceptive semantic terrain predictions are spatially registered into a larger map of the surrounding environment. 3D point clouds from a rolling/tilting ladar are used to register the proprioceptive and exteroceptive data, as well as to register the resulting exteroceptive predictions into the larger map. Our claim is that self-supervised learning makes the exteroception more reliable, since it can be automatically retrained for new locations without human supervision. We conducted experiments to support this claim by collecting data sets from different geographical environments and then comparing classification accuracies. Our results show that our self-supervised learning approach is able to outperform state-of-the-art supervised visual learning techniques.
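
The label-transfer idea in the abstract can be sketched in a few lines. The following is a minimal illustration only, not the thesis implementation: the nearest-centroid classifiers, the synthetic feature vectors, and all function and variable names are assumptions chosen for brevity. It shows a supervised proprioceptive model producing terrain labels that are then used to train a visual model without human labeling.

```python
import numpy as np

N_CLASSES = 7  # the abstract mentions seven terrain classes


def fit_centroids(features, labels, n_classes):
    """Nearest-centroid classifier: one mean feature vector per class."""
    return np.stack([features[labels == c].mean(axis=0) for c in range(n_classes)])


def predict(centroids, features):
    """Assign each sample to the class whose centroid is closest."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def make_paired_samples(rng, n_per_class, n_classes):
    """Synthetic stand-in for spatially registered terrain patches.

    Each patch has a proprioceptive feature vector (e.g. sound/vibration)
    and a visual feature vector. The feature layout is hypothetical.
    """
    terrain = np.repeat(np.arange(n_classes), n_per_class)
    proprio = terrain[:, None] * 10.0 + rng.normal(size=(terrain.size, 4))
    visual = terrain[:, None] * -10.0 + rng.normal(size=(terrain.size, 6))
    return proprio, visual, terrain


rng = np.random.default_rng(0)

# Step 1: supervised proprioceptive model, trained once on hand-labeled data.
proprio_a, _, terrain_a = make_paired_samples(rng, 20, N_CLASSES)
proprio_model = fit_centroids(proprio_a, terrain_a, N_CLASSES)

# Step 2: in a new environment, proprioceptive predictions act as labels
# for the visual features of the same patches (no human labels used).
proprio_b, visual_b, terrain_b = make_paired_samples(rng, 20, N_CLASSES)
pseudo_labels = predict(proprio_model, proprio_b)
visual_model = fit_centroids(visual_b, pseudo_labels, N_CLASSES)

# Step 3: the visual model classifies patches the robot has not driven over.
_, visual_c, terrain_c = make_paired_samples(rng, 20, N_CLASSES)
predicted = predict(visual_model, visual_c)
accuracy = float((predicted == terrain_c).mean())
print(f"visual accuracy on unseen patches: {accuracy:.2f}")
```

In this toy setup the true terrain labels of the new environment are used only to score the result, never to train the visual model. Retraining for another location would repeat steps 2 and 3 with newly collected paired data.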

History

Date

2019-12-16

Degree Type

Dissertation

Department

Robotics Institute

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Anthony Stentz

Keywords

mobile robotics, field robotics, robot perception, Terrain classification, computer vision, proprioceptive classification, acoustic-based classification, deep feature learning, interactive sensing, Machine Hearing, auto classification, data fusion, audio-visual fusion

Licence

In Copyright
