Posted on 2008-04-01, 00:00. Authored by Teruko Mitamura, Eric Nyberg, Jaime G. Carbonell
Although knowledge-based MT systems have the potential to achieve high translation accuracy,
each successful application system requires a large amount of hand-coded knowledge
(lexicons, grammars, mapping rules, etc.). Systems like KBMT-89 and its descendants have
demonstrated how knowledge-based translation can produce good results in technical domains
with tractable domain semantics. Nevertheless, the cost of developing large-scale applications
with tens of thousands of domain concepts precludes a purely hand-crafted approach. The current
challenge for the "next generation" of knowledge-based MT systems is to utilize on-line textual
resources and corpus analysis software in order to automate the most laborious aspects of the
knowledge acquisition process. This partial automation can in turn maximize the productivity of
human knowledge engineers and help to make large-scale applications of knowledge-based MT
an economic reality. In this paper we discuss the corpus-based knowledge acquisition methodology
used in KANT, a knowledge-based translation system for multi-lingual document production.
This methodology can be generalized beyond the KANT interlingua approach for use with any
system that requires similar kinds of knowledge.
Publisher Statement
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.