posted on 2007-01-01, 00:00authored byMaxim Makatchev, Kurt VanLahn
We describe a combination of a statistical and symbolic approaches for
automated scoring of student utterances according to their semantic content. The
proposed semantic classifier overcomes the limitations of bag-of-words methods by
mapping natural language sentences into predicate representations and matching
them against the automatically generated deductive closure of the domain givens,
buggy assumptions and domain rules. With the goal to account for uncertainties in
both symbolic representations of natural language sentences and logical relations
between domain statements, this work extends the deterministic symbolic approach
by augmenting the deductive closure graph structure with conditional probabilities,
thus creating a Bayesian network. By deriving the structure of the network formally,
instead of estimating it from data, we alleviate the problem of sparseness of
training data. We compare the performance of the Bayesian network classifier with
the deterministic graph matching-based classifiers and baselines.