Carnegie Mellon University
Headline Generation using a Training Corpus

journal contribution
posted on 2015-07-01, 00:00 authored by Rong Jin, Alexander Hauptmann
This paper discusses fundamental issues involved in word selection for title generation. We review several methods for title generation, namely extractive summarization and two versions of a Naïve Bayesian approach, and compare their performance using the F1 metric. In addition, we introduce a novel approach to title generation using the k-nearest neighbor (KNN) algorithm. Both the KNN method and a limited-vocabulary Naïve Bayesian method outperform the other evaluated methods, with F1 scores of around 20%. Since KNN produces complete and legible titles, we conclude that KNN is a very promising method for title generation, provided good content overlap exists between the training corpus and the test documents.
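
As a rough illustration of the KNN idea described in the abstract, the sketch below picks the title of the most similar training document and scores it against a reference title. It is not the authors' implementation: the whitespace tokenizer, the cosine similarity over raw word counts, the choice of k = 1, the word-overlap definition of F1, the function names (`knn_title`, `title_f1`), and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_title(document, corpus):
    """Return the title of the training document most similar to `document` (k = 1)."""
    query = Counter(tokenize(document))
    best = max(corpus, key=lambda pair: cosine(query, Counter(tokenize(pair[0]))))
    return best[1]


def title_f1(generated, reference):
    """Word-level F1 between a generated title and a reference title (assumed definition)."""
    gen, ref = Counter(tokenize(generated)), Counter(tokenize(reference))
    overlap = sum((gen & ref).values())  # words shared by both titles
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus of (document text, title) pairs.
corpus = [
    ("the senate passed the budget bill after a long debate", "Senate passes budget bill"),
    ("the home team won the championship game in overtime", "Home team wins championship"),
]

title = knn_title("lawmakers in the senate debate the new budget", corpus)
score = title_f1(title, "Senate debates budget")
```

Because the returned title was written by a person for a similar document, it is always a complete, readable phrase. The sketch also shows why content overlap matters: if no training document resembles the test document, the borrowed title will share few words with the reference.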

History

Publisher Statement

Copyright © 2015 International Joint Conferences on Artificial Intelligence

Date

2015-07-01
