Carnegie Mellon University

A Framework for Evaluation and Optimization of Relevance and Novelty-based Retrieval

Thesis by Abhimanyu Lad

There has been growing interest in building and optimizing retrieval systems with respect to relevance as well as novelty of information, which more realistically reflects the usefulness of a system as perceived by the user. How to combine these criteria into a single metric that can be used to measure as well as optimize retrieval systems is an open challenge that has so far received only partial solutions. Unlike relevance, which can be measured independently for each document, the novelty of a document depends on the other documents the user has already seen during past interaction with the system. This is especially problematic for assessing retrieval performance across multiple ranked lists, as well as for learning from the user's feedback, which must be interpreted with respect to the other documents the user has seen. Moreover, users often have different tolerances towards redundancy depending on the nature of their information needs, but this factor is not explicitly modeled by existing approaches to novelty-based retrieval.

In this thesis, we develop a new framework for evaluating as well as optimizing retrieval systems with respect to their utility, which is measured in terms of relevance and novelty of information. We combine a nugget-based model of utility with a probabilistic model of user behavior; this leads to a flexible metric that generalizes existing evaluation measures. We demonstrate that our framework naturally extends to the evaluation of session-based retrieval while maintaining a consistent definition of novelty across multiple ranked lists.
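To convey the flavor of such a metric, the sketch below computes the expected utility of a session under a nugget-based gain function combined with a simple geometric browsing model. The stopping probability, the exponential redundancy discount, and the deterministic tracking of already-seen nuggets are illustrative assumptions, not the exact formulation developed in the thesis.

    def expected_utility(ranked_lists, nuggets_per_doc,
                         p_continue=0.8, redundancy_discount=0.5):
        """Expected nugget-based utility over a session of ranked lists.

        ranked_lists        : list of ranked lists, each a list of document ids
        nuggets_per_doc     : dict mapping document id -> set of nugget ids it contains
        p_continue          : probability the user reads the next document
                              (geometric browsing model; assumed, not from the thesis)
        redundancy_discount : multiplicative discount applied each time a nugget
                              reappears (assumed exponential decay)
        """
        seen_counts = {}   # nugget id -> number of times the user has already seen it
        utility = 0.0
        for ranked_list in ranked_lists:
            p_reach = 1.0  # probability the user reaches the current rank
            for doc in ranked_list:
                gain = 0.0
                for nugget in nuggets_per_doc.get(doc, set()):
                    times_seen = seen_counts.get(nugget, 0)
                    # novelty: each repeated occurrence of a nugget is worth less
                    gain += redundancy_discount ** times_seen
                    seen_counts[nugget] = times_seen + 1
                utility += p_reach * gain
                p_reach *= p_continue
            # seen_counts persists across ranked lists, so novelty is defined
            # consistently over the whole session rather than per list
        return utility

Because seen nuggets persist across ranked lists, a document repeated in a later list contributes little additional utility, which is exactly the session-level consistency of novelty described above.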

Next, we address the complementary problem of optimization, i.e., how to maximize retrieval performance for one or more ranked lists with respect to the proposed measure. Since the system does not have knowledge of the nuggets that are relevant to each query, we propose a ranking approach based on the use of observable query and document features (e.g., words and named entities) as surrogates for the unknown nuggets, whose weights are automatically learned from user feedback. However, finding the ranked list that maximizes the coverage of a given set of nuggets leads to an NP-hard problem. We take advantage of the submodularity of the proposed measure to derive lower bounds on the performance of approximate algorithms, and also conduct experiments to assess the empirical performance of a greedy algorithm under various conditions.
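The greedy strategy for a coverage objective of this kind can be sketched as follows, assuming each candidate document is represented by a set of surrogate nuggets (e.g., words or named entities) with learned weights. The plain weighted-coverage objective and all names below are illustrative simplifications rather than the thesis's rank-sensitive measure; for monotone submodular objectives such as weighted coverage, each greedy ranking of length k achieves at least a (1 - 1/e) fraction of the optimal coverage.

    def greedy_rank(candidates, nuggets_per_doc, nugget_weights, k):
        """Greedily build a ranked list of up to k documents that maximizes
        weighted coverage of (surrogate) nuggets.

        candidates      : iterable of candidate document ids
        nuggets_per_doc : dict mapping document id -> set of nugget ids
        nugget_weights  : dict mapping nugget id -> learned weight
        k               : length of the ranked list to produce
        """
        covered = set()
        ranking = []
        remaining = set(candidates)
        for _ in range(min(k, len(remaining))):
            best_doc, best_gain = None, 0.0
            for doc in remaining:
                # marginal gain: total weight of nuggets this document adds
                # beyond those already covered by higher-ranked documents
                gain = sum(nugget_weights.get(n, 0.0)
                           for n in nuggets_per_doc.get(doc, set()) - covered)
                if gain > best_gain:
                    best_doc, best_gain = doc, gain
            if best_doc is None:  # no remaining document adds new coverage
                break
            ranking.append(best_doc)
            covered |= nuggets_per_doc.get(best_doc, set())
            remaining.remove(best_doc)
        return ranking

Re-evaluating the marginal gain of every remaining document at each step costs on the order of k times the candidate-set size, which is the practical reason for preferring the greedy approximation over exact optimization of the NP-hard coverage problem.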

Our framework provides a strong foundation for modeling retrieval performance in terms of non-independent utility of documents across multiple ranked lists. Moreover, it allows accurate evaluation and optimization of retrieval systems under realistic conditions, and hence enables rapid development and tuning of new algorithms for novelty-based retrieval without the need for user-centric evaluations involving human subjects, which, although more realistic, are expensive, time-consuming, and risky in a live environment.

Date

2011-03-01

Degree Type

  • Dissertation

Thesis Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Yiming Yang
