Active Sampling for Rank Learning via Optimizing the Area Under the ROC Curve

Donmez, Pinar; Carbonell, Jaime G.

doi:10.1184/R1/6602975.v1

journal contribution

Posted on 2012-10-01, 00:00; authored by Pinar Donmez and Jaime G. Carbonell

Learning ranking functions is crucial for solving many problems, ranging from document retrieval to building recommendation systems based on an individual user’s preferences or on collaborative filtering. Learning-to-rank is particularly necessary for adaptive or personalizable tasks, including email prioritization, individualized recommendation systems, and personalized news clipping services. Although the learning-to-rank challenge has been addressed in the literature, little work has been done in an active-learning framework, where the required user feedback is minimized by selecting only the most informative instances to train the rank learner. This paper addresses active rank learning head-on, proposing a new sampling strategy based on minimizing hinge rank loss and demonstrating the effectiveness of the active sampling method for rankSVM on two standard rank-learning datasets. The proposed method shows convincing results in optimizing three performance metrics, as well as improvements over four baselines: entropy-based, divergence-based, uncertainty-based and random sampling methods.
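The title ties the hinge rank loss named in the abstract to the area under the ROC curve (AUC). The sketch below illustrates that general relationship only; it is not the authors' sampling strategy, and the function names `pairwise_auc` and `hinge_rank_loss` and the toy scores are hypothetical. It assumes binary relevance labels and a margin of 1, computes AUC as the fraction of correctly ordered (relevant, non-relevant) pairs, and shows that the mean pairwise hinge loss upper-bounds 1 - AUC.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a
    correctly ordered pair.
    """
    pairs = list(product(pos_scores, neg_scores))
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return correct / len(pairs)


def hinge_rank_loss(pos_scores, neg_scores):
    """Mean pairwise hinge loss max(0, 1 - (s_pos - s_neg)) with margin 1.

    Every misordered pair contributes more than 1 and every tie exactly 1,
    so this mean is at least 1 - AUC.
    """
    pairs = list(product(pos_scores, neg_scores))
    return sum(max(0.0, 1.0 - (p - n)) for p, n in pairs) / len(pairs)


# Toy scores (illustrative only): three relevant and two non-relevant items.
pos = [2.0, 0.5, 1.2]
neg = [0.0, 1.0]

auc = pairwise_auc(pos, neg)
loss = hinge_rank_loss(pos, neg)
print(f"AUC = {auc:.4f}, 1 - AUC = {1 - auc:.4f}, hinge rank loss = {loss:.4f}")
```

Because the hinge loss is a convex surrogate that bounds the pairwise ranking error from above, driving it down pushes AUC up, which is one common motivation for using it as an optimization target in rank learning.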

History

Publisher Statement

© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Date

2012-10-01

Keywords

Active learning; document retrieval; rank learning; AUC; hinge loss; performance optimization

Licence

In Copyright
