Enriching CHILDES for Morphosyntactic Analysis

MacWhinney, Brian

doi:10.1184/R1/6614687.v1

journal contribution

Posted on 2009-01-01, authored by Brian MacWhinney

This paper examines a particular approach to morphosyntactic analysis that has been developed in the context of the CHILDES (Child Language Data Exchange System) database. Readers unfamiliar with this database and its role in child language acquisition research may find it useful to download and study the materials (manuals, programs, and database), which are freely available on the web at http://childes.psy.cmu.edu. Before doing so, however, users should read the "Ground Rules" for proper usage of the system. The database now contains over 44 million spoken words from 28 different languages, making CHILDES the largest corpus of conversational spoken language data currently in existence. The next largest collection of conversational data is the British National Corpus, with 5 million words. What makes CHILDES a single corpus is that all of the data in the system are consistently coded using a single transcript format called CHAT. Moreover, for several languages, all of the corpora have been tagged for part of speech using an automatic tagging program called MOR.

History

Date

2009-01-01

Keywords

psychology

Licence

In Copyright
