The goal of the TalkBank project (http://talkbank.org) is to support data-sharing and direct, community-wide access to naturalistic
recordings and transcripts of human and animal communication. Toward this end, we have constructed a web-accessible database of
transcripts linked to audio and video media within fields such as conversation analysis, classroom discourse, animal communication,
gesture, meetings, second language acquisition, first language acquisition, bilingualism, tutoring, and legal oral argumentation. We
discuss how we have taken disparate databases from dozens of individual projects and merged them into a well-structured,
uniform database in which transcripts can be opened online through browsers, allowing direct multimedia playback. To achieve
translation across corpora, we have defined a general XML schema. The validity of this schema is checked by bidirectional conversion
from alternative input formats to XML and back. The resulting transcripts are then linked to hinted media, and XSLT is used to format
web-readable, browsable multimedia transcripts playable through SMIL. A parallel pathway is used to support collaborative
commentary and publication of PDF documents linked to media through special issues of journals in the relevant fields.
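
As an illustration of the bidirectional check described above, the following minimal Python sketch converts a toy transcript format to XML and back, then reports whether the text survives unchanged. The "SPEAKER: utterance" line format, the element and attribute names (transcript, u, who), and the function names are hypothetical; they are not drawn from the TalkBank schema or its tools.

```python
import xml.etree.ElementTree as ET


def to_xml(text: str) -> str:
    """Convert toy 'SPEAKER: utterance' lines to an XML string (hypothetical format)."""
    root = ET.Element("transcript")
    for line in text.splitlines():
        # Raises ValueError if a line lacks the ': ' separator.
        speaker, utterance = line.split(": ", 1)
        u = ET.SubElement(root, "u", who=speaker)
        u.text = utterance
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text: str) -> str:
    """Convert the XML produced by to_xml back to the toy line format."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{u.get('who')}: {u.text or ''}" for u in root.iter("u"))


def round_trips(text: str) -> bool:
    """Return True if converting to XML and back reproduces the input exactly."""
    return from_xml(to_xml(text)) == text


if __name__ == "__main__":
    sample = "CHI: more cookie\nMOT: you want more?"
    print(to_xml(sample))
    print(round_trips(sample))
```

A failed round trip in a check of this kind points to information that one of the two formats cannot represent, which is the sort of gap a shared schema would need to close.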