Posted on 2011-07-01, 00:00. Authored by Desai Chen, Chris Dyer, Shay B. Cohen, Noah A. Smith
In this paper, we address the problem of bilingual part-of-speech induction with parallel data. We demonstrate that naïve optimization of log-likelihood with joint MRFs suffers from a severe problem of local maxima, and we suggest an alternative: estimating the parameters with contrastive estimation. Our experiments show that estimating the parameters this way, using overlapping features with joint MRFs, performs better than previous work on the 1984 dataset.