Sparse Models of Natural Language Text
In statistical text analysis, many learning problems can be formulated as a minimization of a sum of a loss function and a regularization function for a vector of parameters (feature coefficients). The loss function drives the model to learn generalizable patterns from the training data, whereas the regularizer plays two important roles: to prevent the models from capturing idiosyncrasies of the training data (overfitting) and to encode prior knowledge about the model parameters.
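In generic notation (the symbols below are illustrative and are not taken from the thesis itself), such a learning problem can be sketched as

\[
\hat{\mathbf{w}} \;=\; \operatorname*{arg\,min}_{\mathbf{w}} \;\sum_{i=1}^{N} L\big(\mathbf{x}_i, y_i; \mathbf{w}\big) \;+\; \lambda\,\Omega(\mathbf{w}),
\]

where \(L\) is the loss on training example \((\mathbf{x}_i, y_i)\), \(\Omega\) is the regularizer, and \(\lambda \ge 0\) trades off fit to the training data against the prior knowledge encoded by \(\Omega\).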
When learning from high-dimensional data such as text, it has been empirically observed that relatively few dimensions are relevant to the predictive task (Forman, 2003). How can we capitalize on this insight and choose which dimensions are relevant in an informed and principled manner? Sparse regularizers provide a way to select relevant dimensions by means of regularization. However, past work has rarely used the regularization function itself to encode non-trivial prior knowledge that yields sparse solutions. This thesis investigates the applications of sparse models—especially structured sparse models—as a medium to encode linguistically-motivated prior knowledge in textual models to advance NLP systems. We explore applications of sparse NLP models in temporal models of text, word embeddings, and text categorization.
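As standard illustrations of sparse and structured sparse regularizers (given here for context; these are textbook penalties, not necessarily the specific ones developed in this thesis), the lasso penalty zeroes out individual coefficients, while the group lasso zeroes out pre-defined groups of coefficients together:

\[
\Omega_{\text{lasso}}(\mathbf{w}) \;=\; \|\mathbf{w}\|_1 \;=\; \sum_j |w_j|,
\qquad
\Omega_{\text{group}}(\mathbf{w}) \;=\; \sum_{g \in \mathcal{G}} \lambda_g \,\|\mathbf{w}_g\|_2,
\]

where \(\mathcal{G}\) is a collection of (possibly overlapping) groups of feature indices and \(\mathbf{w}_g\) is the subvector of \(\mathbf{w}\) restricted to group \(g\). Because entire groups are driven to zero jointly, the choice of groups is one natural place to encode linguistic structure in the sparsity pattern.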
Sparse models come with their own challenges, since new instantiations of sparse models often require a specialized optimization method. This thesis also presents optimization methods for the proposed instantiations of sparse models. Therefore, the goals of this thesis are twofold: (i) to show how sparsity can be used to encode linguistic information in statistical text models, and (ii) to develop efficient learning algorithms to solve the resulting optimization problems.
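To illustrate why specialized solvers are needed (a generic sketch, not the algorithms developed in the thesis): sparsity-inducing penalties such as those above are non-smooth, so plain gradient descent does not apply directly. A standard remedy is a proximal gradient step, which alternates a gradient step on the loss with the proximal operator of the regularizer:

\[
\mathbf{w}^{(t+1)} \;=\; \operatorname{prox}_{\eta\lambda\Omega}\!\Big(\mathbf{w}^{(t)} - \eta\,\nabla_{\mathbf{w}} \textstyle\sum_i L\big(\mathbf{x}_i, y_i; \mathbf{w}^{(t)}\big)\Big),
\qquad
\operatorname{prox}_{\eta\lambda\Omega}(\mathbf{v}) \;=\; \operatorname*{arg\,min}_{\mathbf{u}} \tfrac{1}{2}\|\mathbf{u}-\mathbf{v}\|_2^2 + \eta\lambda\,\Omega(\mathbf{u}).
\]

For the lasso penalty this proximal operator reduces to coordinate-wise soft-thresholding, but for structured penalties it can be much harder to compute, which is one reason new instantiations of sparse models call for new optimization methods.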
History
Date
- 2015-04-15
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)