Posted on 2001-01-01, 00:00; authored by R. S. Jasinschi, N. Dimitrova, T. McGee, L. Agnihotri, John Zimmerman, D. Li
In this paper we describe integrated multimedia processing for
Video Scout, a system that segments and indexes TV programs
according to their audio, visual, and transcript information. Video
Scout represents a future direction for personal video recorders.
In addition to using electronic program guide metadata and a user
profile, Scout lets users request specific topics within a
program. For example, a user can request the clip of the President
speaking from within a half-hour news program.
Video Scout has three modules: (i) Video Pre-Processing, (ii)
Segmentation and Indexing, and (iii) Storage and User Interface.
Segmentation and Indexing, the core of the system, incorporates
a Bayesian framework that integrates information from the audio,
visual, and transcript (closed captions) domains. This framework
uses three layers to process low-, mid-, and high-level multimedia
information. The high-level layer generates semantic information
about TV program topics. This paper describes the elements of the
system and presents results from running Video Scout on real TV
programs.
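
As a rough illustration of how evidence from the audio, visual, and transcript domains could be combined probabilistically, the sketch below fuses per-modality likelihoods into a posterior over program topics. It is not the Video Scout framework itself: it assumes a simple naive-Bayes model in which the three modalities are conditionally independent given the topic, and the function name `fuse_modalities`, the topic labels, and all numbers are hypothetical values chosen for the example.

```python
from math import prod


def fuse_modalities(prior, likelihoods):
    """Return P(topic | audio, visual, transcript) under a naive-Bayes assumption.

    prior:       dict mapping topic -> P(topic)
    likelihoods: dict mapping modality -> dict mapping topic -> P(evidence | topic)
    Both structures are hypothetical stand-ins for real detector outputs.
    """
    # Multiply the prior by each modality's likelihood for every topic.
    unnormalized = {
        topic: prior[topic] * prod(m[topic] for m in likelihoods.values())
        for topic in prior
    }
    # Normalize so the posterior sums to one.
    total = sum(unnormalized.values())
    return {topic: value / total for topic, value in unnormalized.items()}


# Illustrative (made-up) numbers for a two-topic decision.
prior = {"news": 0.5, "sports": 0.5}
likelihoods = {
    "audio": {"news": 0.6, "sports": 0.4},
    "visual": {"news": 0.7, "sports": 0.3},
    "transcript": {"news": 0.9, "sports": 0.1},
}

posterior = fuse_modalities(prior, likelihoods)
best_topic = max(posterior, key=posterior.get)
```

In this toy setting each modality only weakly favors one topic, but their product gives a much sharper posterior, which is the basic appeal of integrating several information domains rather than relying on any single one.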