Carnegie Mellon University

Automated Corpus Analysis and the Acquisition of Large, Multi-Lingual Knowledge Bases for MT

journal contribution
posted on 1993-01-01, 00:00 authored by Teruko Mitamura, Eric Nyberg, Jaime G. Carbonell

Although knowledge-based MT systems have the potential to achieve high translation accuracy, each successful application system requires a large amount of hand-coded knowledge (lexicons, grammars, mapping rules, etc.). Systems like KBMT-89 and its descendants have demonstrated how knowledge-based translation can produce good results in technical domains with tractable domain semantics. Nevertheless, the cost of developing large-scale applications with tens of thousands of domain concepts precludes a purely hand-crafted approach. The current challenge for the "next generation" of knowledge-based MT systems is to utilize on-line textual resources and corpus analysis software in order to automate the most laborious aspects of the knowledge acquisition process. This partial automation can in turn maximize the productivity of human knowledge engineers and help to make large-scale applications of knowledge-based MT an economic reality. In this paper we discuss the corpus-based knowledge acquisition methodology used in KANT, a knowledge-based translation system for multi-lingual document production. This methodology can be generalized beyond the KANT interlingua approach for use with any system that requires similar kinds of knowledge.

