Learning Models that Match

Tyo, Jacob

doi:10.1184/R1/26239931.v1

thesis

posted on 2024-07-22, 19:06 authored by Jacob Tyo

Contrastive learning has emerged as a critical methodology in machine learning applications, offering a pairwise-comparison perspective on data interpretation and model training. This thesis examines contrastive learning models, emphasizing their development, application, and optimization for real-world scenarios. It is structured into two parts: the first explores practical applications in diverse domains such as authorship attribution, verification, and person re-identification, while the second focuses on methodological advancements aimed at enhancing model efficacy and adaptability.

In Part I, the thesis systematically evaluates the application of contrastive learning techniques across various fields, highlighting their strengths and limitations in real-world settings. Through detailed case studies, including the implementation of a photo-searching system for off-road motorcycle racing, this work assesses the adaptability and effectiveness of contrastive models under challenging conditions. The findings underscore the need for a nuanced understanding and strategic application of these models to harness their full potential, particularly when curating the right pairs during training.

Part II develops approaches to overcome the challenges identified in contrastive learning. It introduces new algorithms and frameworks designed to refine the learning process, particularly in handling weakly labeled data and optimizing the influence of each sample on the overall loss (i.e., the pair curation). The proposed methodologies aim to bridge the gap between theoretical principles and practical utility, facilitating the creation of more robust, efficient, and versatile machine learning systems.

This thesis yields highly performant authorship identification and person re-identification models, often achieving a new state of the art. Furthermore, the insights drawn from analysis of these models and applications lead to the introduction of two methodologies that enhance model training. The first is a method for automatically adjusting the influence each data point has on a model at a particular point in training. The second enables contrastive training on weakly labeled data via a contrastive extension to the multiple-instance learning framework. Together, these findings offer insight into the dynamics of contrastive learning and present viable solutions for broadening its real-world applicability.

History

Date

2024-03-14

Degree Type

Dissertation

Department

Machine Learning

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Zachary C. Lipton

Keywords

Machine Learning; Contrastive Learning; Authorship Identification; Meta-Learning; Motorcycle Racing Dataset; Multiple Instance Learning; Text Spotting; Person Search; Artificial Intelligence and Image Processing

Licence

In Copyright
