Automated analysis of the Cinderella story

<p><em>Background</em>: AphasiaBank is a collaborative project whose goal is to develop an archival database of the discourse of individuals with aphasia. Along with databases on first language acquisition, classroom discourse, second language acquisition, and other topics, it forms a component of the general TalkBank database. AphasiaBank uses tools from the wider TalkBank system, further adapted to the particular goal of studying language use in aphasia.</p>
<p><em>Aims</em>: The goal of this paper is to illustrate how TalkBank analytic tools can be applied to AphasiaBank data.</p>
<p><em>Methods &amp; Procedures</em>: Both aphasic (<em>n</em> = 24) and non-aphasic (<em>n</em> = 25) participants completed a 1-hour standardised videotaped data elicitation protocol. These sessions were transcribed and tagged automatically for part of speech. One component of the larger protocol was the telling of the Cinderella story. For these narratives we compared lexical diversity across the groups and computed the top 10 nouns and verbs across both groups. We then examined the profiles for two participants in greater detail.</p>
<p><em>Conclusions</em>: Using these tools, we showed that, in a story-retelling task, aphasic speakers had a marked reduction in lexical diversity and a greater use of light verbs. For example, aphasic speakers often substituted “girl” for “stepsister” and “go” for “disappear”. These findings illustrate how TalkBank tools can be used to analyse AphasiaBank data.</p>
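<p><em>Illustrative sketch</em>: The lexical-diversity comparison and the top-10 noun and verb counts described in the Methods can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the TalkBank tooling itself: it assumes each narrative is already available as a list of (word, part-of-speech) pairs, it uses a plain type-token ratio as a simple stand-in for whatever lexical diversity measure was actually computed, and the function names, the tag labels <code>"n"</code> and <code>"v"</code>, and the sample tokens are all hypothetical.</p>

```python
from collections import Counter


def type_token_ratio(tokens):
    """Return distinct words / total words for a list of (word, pos) pairs.

    Assumption: a plain type-token ratio stands in for lexical diversity.
    """
    words = [word.lower() for word, _ in tokens]
    return len(set(words)) / len(words) if words else 0.0


def top_words(tokens, pos, n=10):
    """Return the n most frequent words carrying the given part-of-speech tag."""
    counts = Counter(word.lower() for word, tag in tokens if tag == pos)
    return counts.most_common(n)


# Hypothetical tagged tokens, invented for illustration only
# (not AphasiaBank data). Tags "n" and "v" are assumed labels.
sample = [
    ("Cinderella", "n"), ("go", "v"), ("ball", "n"), ("girl", "n"),
    ("go", "v"), ("home", "n"), ("girl", "n"),
]

ttr = type_token_ratio(sample)      # 5 distinct words over 7 tokens
top_verbs = top_words(sample, "v")  # most frequent verbs
top_nouns = top_words(sample, "n")  # most frequent nouns
```

<p>A raw type-token ratio falls as samples get longer, so group comparisons on narratives of different lengths would need a length-corrected measure; the sketch only shows the shape of the computation.</p>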