Adversarial Evaluation for Models of Natural Language
We now have a rich and growing set of modeling tools and algorithms for inducing linguistic structure from text that is less than fully annotated. In this paper, we discuss some of the weaknesses of our current methodology. We present a new abstract framework for evaluating natural language processing (NLP) models in general and unsupervised NLP models in particular. The central idea is to make explicit certain adversarial roles among researchers, so that the different roles in an evaluation are more clearly defined and performers of all roles are offered ways to make measurable contributions to the larger goal. Adopting this approach may help to characterize model successes and failures by encouraging earlier consideration of error analysis. The framework can be instantiated in a variety of ways, simulating some familiar intrinsic and extrinsic evaluations as well as some new evaluations.