posted on 2012-01-01, 00:00 authored by Susanne Burger, Qin Jin, Peter F. Schulam, Florian Metze
Audio information retrieval is a difficult problem due to the highly unstructured nature of the data. A general labeling system for identifying audio patterns could unite research efforts in the field. This paper introduces 42 distinct labels, the “noisemes”, developed for the manual annotation of noise segments as they occur in the audio streams of consumer-captured and semiprofessionally produced videos. The labels describe distinct noise units based on audio concepts, as independent of visual concepts as possible. We trained a recognition system on 5.6 hours of manually labeled data and present recognition results.