Posted on 01.01.2007, 00:00 by Darren Gergle, Carolyn P. Rose, Robert E. Kraut
A number of recent studies have demonstrated that groups benefit considerably from access to shared visual
information. This is due, in part, to the communicative efficiencies provided by the shared visual context. However, a large gap exists between our current theoretical
understanding and our existing models. We address this gap by developing a computational model that integrates
linguistic and visual cues to effectively model reference during tightly coupled, task-oriented interactions. The results demonstrate that the integrated
model significantly outperforms existing language-only and visual-only models. The findings can be used to inform and
augment the development of conversational agents, applications that dynamically track discourse and collaborative interactions, and dialogue managers for natural language interfaces.