What's News, What's Not? Associating News Videos with Words

Duygulu, Pinar; Hauptmann, Alexander

doi:10.1184/R1/6612881.v1

journal contribution

posted on 1999-03-01, 00:00 authored by Pinar Duygulu, Alexander Hauptmann

Text retrieval from broadcast news video is unsatisfactory because a transcript word frequently does not directly "describe" the shot during which it was spoken. Extending the retrieved region to a window around the matching keyword improves recall but yields low precision. We improve on text retrieval with the following approach. First, we segment the visual stream into coherent story-like units using a set of visual news story delimiters. After filtering out clearly irrelevant classes of shots, we are still left with ambiguity about how words in the transcript relate to the visual content of the remaining shots in the story. Using a limited set of visual features at different semantic levels, ranging from color histograms to faces, cars, and outdoors, we build an association matrix that captures the correlation of these visual features with specific transcript words. This matrix is then refined using an expectation-maximization (EM) approach. Preliminary results show that this approach has the potential to significantly improve retrieval performance for text queries.
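The abstract does not spell out how the association matrix is built or which EM formulation refines it. As a rough illustration only, the sketch below assumes a translation-style model in the spirit of IBM Model 1: co-occurrence counts within a story initialize p(word | visual token), and EM then redistributes each word among the visual tokens of its story. The function names, the token labels ("face", "car", and so on), and the toy stories are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cooccurrence(stories):
    """Count how often each (visual token, word) pair appears in the same story."""
    counts = defaultdict(float)
    for visual_tokens, words in stories:
        for v in visual_tokens:
            for w in words:
                counts[(v, w)] += 1.0
    return counts


def normalize(counts):
    """Convert counts to p(word | visual token) by normalizing over words per token."""
    totals = defaultdict(float)
    for (v, _w), c in counts.items():
        totals[v] += c
    return {(v, w): c / totals[v] for (v, w), c in counts.items()}


def em_refine(stories, iterations=10):
    """Refine the association table with an assumed IBM Model 1-style EM loop."""
    # Initial association matrix: normalized co-occurrence counts.
    table = normalize(cooccurrence(stories))
    for _ in range(iterations):
        expected = defaultdict(float)
        for visual_tokens, words in stories:
            for w in words:
                # E-step: split one unit of evidence for word w among the
                # visual tokens of this story, in proportion to the table.
                z = sum(table[(v, w)] for v in visual_tokens)
                for v in visual_tokens:
                    expected[(v, w)] += table[(v, w)] / z
        # M-step: re-estimate p(word | visual token) from expected counts.
        table = normalize(expected)
    return table


# Toy data: each story is (visual tokens seen, transcript words spoken).
stories = [
    (["face", "outdoor"], ["president", "garden"]),
    (["face", "car"], ["president", "motorcade"]),
    (["building", "car"], ["embassy", "motorcade"]),
]
table = em_refine(stories, iterations=10)
for (v, w), p in sorted(table.items()):
    print(f"p({w} | {v}) = {p:.3f}")
```

In this toy setting, raw co-occurrence cannot tell whether "garden" belongs with "face" or "outdoor", since both appear in the only story that mentions it. The EM iterations shift probability mass toward the pairing that is consistent across stories, which is the kind of disambiguation the abstract attributes to the refinement step.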

History

Date

1999-03-01

Keywords

computer sciences

Licence

In Copyright
