Posted on 2010-12-01, 00:00. Authored by Brendan O'Connor, Jacob Eisenstein, Eric P Xing, Noah A. Smith
We propose a Bayesian generative model of how demographic social factors influence lexical choice. We apply the method to a corpus of geo-tagged Twitter messages originating from mobile phones, cross-referenced against U.S. Census demographic data. Our method discovers communities jointly defined by linguistic and demographic properties.