Intelligently Integrating Information from Speech and Vision Processing to Perform Light-weight Meeting Understanding
Important information is often generated at meetings but identifying, and retrieving that information after the meeting is not always simple. Automatically capturing such information and making it available for later retrieval has therefore become a topic of some interest. Most approaches to this problem have involved constructing specialized instrumented meeting rooms that allow a meeting to be captured in great detail. We propose an alternate approach that focuses on people’s information retrieval needs and makes use of a light-weight data collection system that allows data acquisition on portable equipment, such as personal laptops. Issues that arise include the integration of information from diﬀerent audio and video streams and optimum use of sparse computing resources. This paper describes our current development of a light-weight portable meeting recording infrastructure, as well as the use of streams of visual and audio information to derive structure from meetings. The goal is to make meeting contents easily accessible to people.