%0 Journal Article
%A Lasecki, Walter S.
%A Thiha, Phyo
%A Zhong, Yu
%A Brady, Erin
%A Bigham, Jeffrey P.
%D 2013
%T Answering Visual Questions with Conversational Crowd Assistants
%U https://kilthub.cmu.edu/articles/journal_contribution/Answering_Visual_Questions_with_Conversational_Crowd_Assistants/6469823
%R 10.1184/R1/6469823.v1
%2 https://kilthub.cmu.edu/ndownloader/files/11898377
%K Assistive Technology
%K Crowdsourcing
%K Human Computation
%X

Blind people face a range of accessibility challenges in their everyday lives, from reading the text on a package of food to traveling independently in a new place. Answering general questions about one’s visual surroundings remains well beyond the capabilities of fully automated systems, but recent systems are showing the potential of engaging on-demand human workers (the crowd) to answer visual questions. The input to such systems has generally been either a single image, which can limit the interaction with a worker to one question, or a video stream in which the system pairs the end user with a single worker, limiting the benefits of the crowd. In this paper, we introduce Chorus:View, a system that assists users over the course of longer interactions by engaging workers in a continuous conversation with the user about a video stream from the user’s mobile device. We demonstrate the benefit of using multiple crowd workers instead of just one in terms of both latency and accuracy, then conduct a study with 10 blind users that shows Chorus:View answers common visual questions more quickly and accurately than existing approaches. We conclude with a discussion of users’ feedback and potential future work on interactive crowd support of blind users.

%I Carnegie Mellon University