Posted on 1976-01-01, 00:00. Authored by Jade Goldstein, Mark Kantrowitz, Vibhu Mittal, Jaime G. Carbonell.
Human-quality text summarization systems
are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage.
Nevertheless, certain cues can often guide the selection of sentences for inclusion in a summary. This paper
presents our analysis of news-article summaries generated
by sentence selection. Sentences are ranked for potential
inclusion in the summary using a weighted combination of
statistical and linguistic features. The statistical features
were adapted from standard IR methods. The potential
linguistic features were derived from an analysis of news-wire
summaries. To evaluate these features, we use a normalized
version of precision-recall curves with a baseline of random
sentence selection, and we analyze the properties of that
baseline. We illustrate our discussion with empirical results showing the importance of corpus-dependent baseline
summarization standards, compression ratios, and carefully
crafted long queries.
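
The ranking and evaluation scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The three features (query overlap, term frequency, sentence position), their weights, and all function names are hypothetical choices made for illustration, and the abstract does not specify the paper's actual features or normalization. The random baseline uses the standard expectation for picking k of N sentences uniformly at random when R of them are relevant: expected precision is R/N and expected recall is k/N.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def score_sentences(sentences, query, w_query=0.6, w_tf=0.3, w_pos=0.1):
    """Score each sentence as a weighted sum of three illustrative features.

    The features and weights are assumptions, not values from the paper.
    """
    doc_tf = Counter(t for s in sentences for t in tokenize(s))
    max_tf = max(doc_tf.values()) if doc_tf else 1
    query_terms = set(tokenize(query))
    scores = []
    for i, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        if not tokens:
            scores.append(0.0)
            continue
        # Fraction of query terms that appear in the sentence.
        q = len(query_terms & set(tokens)) / len(query_terms) if query_terms else 0.0
        # Mean document frequency of the sentence's terms, scaled to [0, 1].
        tf = sum(doc_tf[t] for t in tokens) / (len(tokens) * max_tf)
        # Earlier sentences get a higher position score.
        pos = 1.0 / (i + 1)
        scores.append(w_query * q + w_tf * tf + w_pos * pos)
    return scores


def select_summary(sentences, query, k):
    """Return the indices of the k top-scoring sentences, in document order."""
    scores = score_sentences(sentences, query)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def precision_recall(selected, relevant):
    """Precision and recall of selected sentence indices against a reference set."""
    hits = len(set(selected) & set(relevant))
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def random_baseline(n_sentences, n_relevant, k):
    """Expected (precision, recall) when k sentences are chosen uniformly at random."""
    return n_relevant / n_sentences, k / n_sentences


sentences = [
    "The cat sat on the mat.",
    "Stocks fell sharply on Monday.",
    "Investors blamed rising interest rates.",
]
summary = select_summary(sentences, "stocks interest rates", k=2)
precision, recall = precision_recall(summary, relevant=[1, 2])
base_precision, base_recall = random_baseline(len(sentences), n_relevant=2, k=2)
```

Comparing `precision` with `base_precision` shows why the baseline depends on the corpus and on the compression ratio: a short document with many relevant sentences yields a high random-selection precision, so raw scores are hard to compare across collections without some normalization.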