A randomized algorithm for learning Mahalanobis metrics : application to classification and regression of biological data

Langmead, Christopher James.

doi:10.1184/R1/6591272.v1

journal contribution

Posted on 1998-04-01; authored by Christopher James Langmead

Abstract: "We present a randomized algorithm for semi-supervised learning of Mahalanobis metrics over $\mathbb{R}^n$. The inputs to the algorithm are a set, $U$, of unlabeled points in $\mathbb{R}^n$; a set of pairs of points, $S = \{(x, y)_i\}$ with $x, y \in U$, that are known to be similar; and a set of pairs of points, $D = \{(x, y)_i\}$ with $x, y \in U$, that are known to be dissimilar. The algorithm randomly samples $S$, $D$, and $m$-dimensional subspaces of $\mathbb{R}^n$ and learns a metric for each subspace. The metric over $\mathbb{R}^n$ is a linear combination of the subspace metrics. The randomization addresses issues of efficiency and overfitting. Extensions of the algorithm to learning non-linear metrics via kernels, and to its use as a pre-processing step for dimensionality reduction, are discussed. The new method is demonstrated on a regression problem (structure-based chemical shift prediction) and a classification problem (predicting clinical outcomes for immunomodulatory strategies for treating severe sepsis)."
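The sample-and-combine structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how each subspace metric is learned or how the coefficients of the linear combination are chosen, so everything specific here is an assumption made for illustration: the subspaces are axis-aligned (random subsets of m coordinates), each subspace metric is diagonal and set by a simple ratio heuristic (dissimilar-pair spread divided by similar-pair spread), and the combination is a uniform average. The function names `learn_subspace_weights`, `randomized_metric`, and `mahalanobis` are hypothetical and do not come from the report.

```python
import random


def learn_subspace_weights(coords, sim_pairs, dis_pairs, eps=1e-9):
    """Diagonal metric restricted to `coords`.

    ASSUMPTION: a ratio heuristic stands in for the report's (unspecified)
    per-subspace learner. A coordinate gets a large weight when dissimilar
    pairs differ a lot on it and similar pairs differ little.
    """
    weights = {}
    for j in coords:
        s = sum((x[j] - y[j]) ** 2 for x, y in sim_pairs) / len(sim_pairs)
        d = sum((x[j] - y[j]) ** 2 for x, y in dis_pairs) / len(dis_pairs)
        weights[j] = d / (s + eps)
    return weights


def randomized_metric(n, S, D, m, rounds, pairs_per_round, seed=0):
    """Average diagonal subspace metrics over random samples.

    Each round samples m coordinates (an axis-aligned subspace, by
    assumption) plus subsets of S and D, learns a subspace metric, and adds
    it to the running combination with uniform coefficient 1/rounds.
    Returns the diagonal of the combined n x n metric matrix.
    """
    rng = random.Random(seed)
    diag = [0.0] * n
    for _ in range(rounds):
        coords = rng.sample(range(n), m)
        s_sub = rng.sample(S, min(pairs_per_round, len(S)))
        d_sub = rng.sample(D, min(pairs_per_round, len(D)))
        for j, w in learn_subspace_weights(coords, s_sub, d_sub).items():
            diag[j] += w / rounds
    return diag


def mahalanobis(x, y, diag):
    """Distance sqrt((x - y)^T M (x - y)) for a diagonal M."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(diag, x, y)) ** 0.5


if __name__ == "__main__":
    # Toy data: coordinate 0 separates the pairs, coordinate 1 is mostly noise.
    S = [((0.0, 0.0, 1.0), (0.1, 5.0, 1.2)), ((1.0, 3.0, 0.0), (1.1, -2.0, 0.3))]
    D = [((0.0, 0.0, 1.0), (1.0, 0.5, 1.1)), ((1.0, 3.0, 0.0), (0.0, 2.5, 0.2))]
    diag = randomized_metric(3, S, D, m=2, rounds=50, pairs_per_round=2)
    print(diag)
    print(mahalanobis(S[0][0], S[0][1], diag), mahalanobis(D[0][0], D[0][1], diag))
```

Because every weight is non-negative, the combined diagonal matrix is positive semi-definite, so the result is a valid (pseudo-)metric. A full-matrix learner or randomly oriented subspaces would replace `learn_subspace_weights` and the coordinate sampling while keeping the same outer loop.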

History

Date

1998-04-01

Keywords

Computational biology; Regression analysis

Licence

In Copyright
