Posted on 2006-01-01, 00:00. Authored by Jiazhi Ou, Yanxin Shi, Jeffrey Wong, Susan R. Fussell, Jie Yang.
The increasing interest in supporting multiparty remote collaboration has created both opportunities and challenges for the research community. The research reported here aims to develop tools to support multiparty remote collaborations and to study
human behaviors using these tools. In this paper we first introduce an experimental multimedia (video and audio) system with which an expert can collaborate with several novices. We then use this
system to study helpers’ focus of attention (FOA) during a collaborative circuit assembly task. We investigate the
relationship between FOA and both language and activities using multimodal (audio and video) data, and use learning methods to predict helpers' FOA. We process the different modalities separately
and fuse the results to make a final decision. We employ a sliding-window-based delayed labeling method to automatically predict changes in FOA in real time using only the dialogue
between the helper and the workers. We apply an adaptive background subtraction method and a support vector machine to recognize the workers' activities from the video. To predict the helper's FOA,
we make decisions based on joint project boundaries and the workers' recent activities. The overall prediction
accuracies are 79.52% using audio only and 81.79% using audio and video combined.
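
The paper itself does not include an implementation here; the following is a minimal sketch of what sliding-window delayed labeling over the dialogue stream might look like. The window length, the delay, the utterance feature representation, and the choice of an SVM classifier are assumptions for illustration, not the authors' actual pipeline.

from collections import deque

import numpy as np
from sklearn.svm import SVC

WINDOW = 5  # utterances of context per decision (assumed value)
DELAY = 2   # utterances of look-ahead before a label is committed (assumed value)

class DelayedFOAPredictor:
    """Predict the helper's focus of attention for an utterance only after
    DELAY further utterances have arrived, using a WINDOW-utterance context."""

    def __init__(self, model: SVC):
        self.model = model
        self.buffer = deque(maxlen=WINDOW + DELAY)

    def push(self, utterance_features: np.ndarray):
        """Add features for the newest utterance; return the FOA label for the
        utterance that is now DELAY steps old, or None while the buffer fills."""
        self.buffer.append(utterance_features)
        if len(self.buffer) < WINDOW + DELAY:
            return None
        context = np.concatenate(list(self.buffer)[:WINDOW])
        return self.model.predict(context.reshape(1, -1))[0]

In such a setup the model would be fit offline on labeled windows of dialogue features and then queried utterance by utterance at run time, which is what allows prediction to proceed from the conversation alone.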
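For the video side, a sketch under similar caveats: adaptive background subtraction (here OpenCV's MOG2, which is one adaptive method and not necessarily the one used in the paper) isolates motion in the workspace, and simple foreground statistics feed a support vector machine that labels the workers' activities. The feature set and parameters below are assumptions.

import cv2
import numpy as np
from sklearn.svm import SVC

backsub = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

def frame_features(frame):
    """Adaptive background subtraction followed by crude foreground statistics."""
    mask = cv2.medianBlur(backsub.apply(frame), 5)   # update model, denoise mask
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return np.zeros(4)
    h, w = mask.shape
    return np.array([xs.size / mask.size,            # amount of motion
                     xs.mean() / w, ys.mean() / h,   # where the motion is
                     (xs.max() - xs.min()) / w])     # horizontal extent of motion

# activity_clf = SVC(kernel="rbf").fit(train_features, train_labels)  # assumed training data
# label = activity_clf.predict(frame_features(frame).reshape(1, -1))[0]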
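Finally, a hedged illustration of how the two streams might be combined around joint project boundaries; the confidence threshold and the override rule are assumptions, since the abstract states only that the decision uses project boundaries together with the workers' recent activities.

def fuse_foa(audio_label, audio_confidence, at_project_boundary,
             recently_active_worker, threshold=0.6):
    """Late fusion: trust the dialogue-based prediction unless it is uncertain
    at a joint project boundary and the video clearly singles out one worker."""
    if (at_project_boundary and audio_confidence < threshold
            and recently_active_worker is not None):
        return recently_active_worker
    return audio_label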