Posted on 2006-10-01, 00:00. Authored by Ming-Yu Chen, Huan Li, Alexander Hauptmann
The Informedia team participated in the tasks of high-level feature extraction and event
detection in surveillance video. This year, we focused especially on analyzing motion in
video. We developed a robust new descriptor called MoSIFT, which explicitly encodes
appearance features together with motion information. For the high-level feature detection, we
trained multi-modality classifiers that combine traditional static features with MoSIFT.
Experimental results show that MoSIFT performs solidly on motion-related concepts and
is complementary to static features. For event detection, we trained event classifiers in sliding
windows using a bag-of-video-word approach. To reduce the number of false alarms, we
aggregated short positive windows to favor long segments and applied a cascade classifier
approach. Performance on the event detection task improved dramatically over last year.
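
The two event-detection steps described above, building a bag-of-video-words histogram per sliding window and aggregating positive windows into longer segments, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `bow_histogram` and `merge_positive_windows`, the nearest-codeword assignment, the L1 normalization, and the `max_gap` and `min_length` parameters are all assumptions introduced for illustration. The abstract does not specify the codebook, the merging rule, or any thresholds, and the cascade classifier is not covered here.

```python
from typing import List, Sequence, Tuple


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[float]:
    """Quantize descriptors to their nearest codeword (hypothetical helper).

    Returns an L1-normalized histogram with one bin per codeword.
    Nearest-codeword assignment is an assumption, not taken from the source.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance to every codeword; keep the closest one.
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def merge_positive_windows(flags: Sequence[bool], window: int, stride: int,
                           max_gap: int = 0,
                           min_length: int = 0) -> List[Tuple[int, int]]:
    """Merge positive sliding windows into (start, end) frame segments.

    Window i is assumed to cover frames [i * stride, i * stride + window).
    Positive windows separated by at most max_gap frames are joined, and
    segments shorter than min_length frames are dropped, which favors long
    segments. Both thresholds are illustrative assumptions.
    """
    segments: List[Tuple[int, int]] = []
    for i, positive in enumerate(flags):
        if not positive:
            continue
        start, end = i * stride, i * stride + window
        if segments and start - segments[-1][1] <= max_gap:
            # Overlapping or close enough: extend the previous segment.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return [(s, e) for s, e in segments if e - s >= min_length]


if __name__ == "__main__":
    # Toy 2-D descriptors and a 2-word codebook (made-up values).
    print(bow_histogram([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]],
                        [[0.0, 0.0], [1.0, 1.0]]))
    # Per-window classifier decisions for five windows (made-up values).
    print(merge_positive_windows([True, True, False, False, True],
                                 window=10, stride=5, min_length=15))
```

In this sketch, raising `min_length` suppresses isolated positive windows, which is one simple way to trade recall for fewer false alarms. Raising `max_gap` bridges short negative runs between detections.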