Diverse Context for Learning Word Representations
Word representations are mathematical objects that capture a word's meaning and its grammatical properties in a way that computers can read and understand. They map words into equivalence classes such that words with similar properties belong to the same class. Word representations are either constructed manually by humans (in the form of word lexicons, dictionaries, etc.) or obtained automatically using unsupervised learning algorithms. Since manual construction is expensive and does not scale, obtaining word representations automatically is desirable.
Traditionally, automatic learning of word representations has relied on the distributional hypothesis, which states that the meaning of a word is evidenced by the words that occur in its context (Harris, 1954). Thus, existing word representation learning algorithms such as latent semantic analysis (Deerwester et al., 1990; Landauer and Dumais, 1997) derive word meaning from aggregated co-occurrence counts of words extracted from unlabeled monolingual corpora.
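To make the co-occurrence view concrete, the following is a minimal sketch (not taken from the thesis) that builds a window-based co-occurrence matrix over a toy corpus and reduces it with SVD, in the spirit of latent semantic analysis; the corpus, window size, and dimensionality are illustrative assumptions.

```python
# Minimal sketch of distributional word representations: count word co-occurrences
# within a fixed window over a toy corpus, then reduce the count matrix with SVD.
import numpy as np

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat and a dog played",
]
window = 2  # assumed context window size

# Build vocabulary and co-occurrence counts.
tokens = [sent.split() for sent in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[index[w], index[sent[j]]] += 1

# Reduce dimensionality with a truncated SVD (as in LSA); rows are word vectors.
k = 5  # assumed number of latent dimensions
U, S, _ = np.linalg.svd(counts, full_matrices=False)
word_vectors = U[:, :k] * S[:k]
print("cat", word_vectors[index["cat"]])
```

In this sketch, words that occur in similar windows receive similar rows of the count matrix, and the SVD compresses those rows into low-dimensional vectors.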
In this thesis, we diversify the notion of context to include information beyond the monolingual distributional context. We show that information about word meaning is present in other contexts, such as neighboring words in a semantic lexicon, the word's context across different languages, and the word's morphological structure. We show that, in addition to monolingual distributional context, these sources provide complementary information about word meaning, which can substantially improve the quality of word representations. We present methods to augment existing models of word representations to incorporate these knowledge sources.
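As one illustration of how a semantic lexicon can augment distributionally learned vectors, here is a hedged sketch (not the thesis's exact algorithm) that nudges each word's vector toward the vectors of its lexicon neighbors while keeping it close to its original distributional estimate; the toy lexicon, the weights alpha and beta, and the iteration count are assumptions made for the example.

```python
# Hedged illustration: refine pre-trained word vectors so that words linked in a
# semantic lexicon end up closer together, while each vector stays near its
# original distributional estimate.
import numpy as np

# Toy inputs: assumed pre-trained vectors and a tiny synonym lexicon.
vectors = {
    "happy": np.array([1.0, 0.0]),
    "glad":  np.array([0.0, 1.0]),
    "sad":   np.array([-1.0, 0.0]),
}
lexicon = {"happy": ["glad"], "glad": ["happy"], "sad": []}

alpha, beta, iterations = 1.0, 1.0, 10  # assumed weights for the two objectives
refined = {w: v.copy() for w, v in vectors.items()}

for _ in range(iterations):
    for word, neighbors in lexicon.items():
        if not neighbors:
            continue
        # The updated vector balances the original vector and the neighbors'
        # current vectors, weighted by alpha and beta respectively.
        neighbor_sum = sum(refined[n] for n in neighbors)
        refined[word] = (alpha * vectors[word] + beta * neighbor_sum) / (
            alpha + beta * len(neighbors)
        )

print(refined["happy"], refined["glad"])
```

After a few iterations, "happy" and "glad" move toward each other because the lexicon links them, while "sad" is left at its original position.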
Date
- 2016-05-03
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)