Selecting Text Spans for Document Summaries: Heuristics and Metrics

Mittal, Vibhu; Kantrowitz, Mark; Goldstein, Jade; Carbonell, Jaime G.

doi:10.1184/R1/6625706.v1

journal contribution

posted on 1999-01-01, 00:00 authored by Vibhu Mittal, Mark Kantrowitz, Jade Goldstein, Jaime G. Carbonell

Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for inclusion in a summary. This paper presents an analysis of news-article summaries generated by sentence extraction. Sentences are ranked for potential inclusion in the summary using a weighted combination of linguistic features, derived from an analysis of news-wire summaries. This paper evaluates the relative effectiveness of these features. In order to do so, we discuss the construction of a large corpus of extraction-based summaries, and characterize the underlying degree of difficulty of summarization at different compression levels on articles in this corpus. Results on our feature set are presented after normalization by this degree of difficulty.
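
The ranking step described in the abstract, scoring each sentence by a weighted combination of features and keeping the top-ranked ones at a given compression level, can be sketched roughly as follows. This is only an illustration: the abstract does not list the paper's feature set or weights, so the two features (`position_score`, `length_score`), the weights, and the function names are hypothetical stand-ins.

```python
from typing import Callable, Dict, List

# Hypothetical features: the abstract does not name the actual linguistic
# features, so these two are placeholders chosen for illustration only.
def position_score(index: int, sentence: str, total: int) -> float:
    """Earlier sentences score higher (1.0 for the first sentence)."""
    return 1.0 - index / total


def length_score(index: int, sentence: str, total: int) -> float:
    """Word count divided by 8, capped at 1.0, so very short sentences score low."""
    return min(len(sentence.split()) / 8.0, 1.0)


Feature = Callable[[int, str, int], float]
FEATURES: Dict[str, Feature] = {"position": position_score, "length": length_score}
WEIGHTS: Dict[str, float] = {"position": 0.7, "length": 0.3}  # illustrative values


def rank_sentences(sentences: List[str], weights: Dict[str, float] = WEIGHTS) -> List[int]:
    """Return sentence indices ordered from highest to lowest weighted score."""
    total = len(sentences)
    scored = []
    for i, sentence in enumerate(sentences):
        score = sum(weights[name] * fn(i, sentence, total) for name, fn in FEATURES.items())
        scored.append((score, i))
    # Sort by descending score; ties fall back to document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored]


def extract_summary(sentences: List[str], compression: float = 0.25) -> List[str]:
    """Keep the top-ranked fraction of sentences, restored to document order."""
    keep = max(1, round(len(sentences) * compression))
    chosen = sorted(rank_sentences(sentences)[:keep])
    return [sentences[i] for i in chosen]
```

Calling `extract_summary(sentences, compression=0.5)` on a four-sentence article would return the two highest-scoring sentences in their original order. The normalization by degree of difficulty mentioned in the abstract is not modeled here.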

History

Date

1999-01-01

Keywords

Software Research Computer Software not elsewhere classified

Licence

In Copyright
