Hi All!

Our speaker this Wednesday (4/23) will be Nick Beauchamp from Northeastern University. Nick will be giving a talk entitled "Predicting, Extrapolating and Interpolating State-level Polls using Twitter." The abstract is included below.

As usual, we will meet in CGIS K354 at 12 noon and lunch will be served.

Looking forward to seeing you all there!

Tess
-----------------
Tess Wise
PhD Candidate
Harvard Department of Government
http://tesswise.com


ABSTRACT:

Predicting, Extrapolating and Interpolating State-level Polls using Twitter

Presidential, gubernatorial, and senatorial elections all require state-level polling, but even during presidential campaigns, state-level surveys remain sparse, erratically timed, and entirely neglected in uncompetitive states. Partly in response to these unmet needs in political and other domains, there have been numerous efforts to approximate various survey measures using social media data, but most of these approaches remain distinctly flawed, owing both to methodological weaknesses and to insufficient training data. To remedy these flaws, this paper combines 1200 state-level polls during the 2012 presidential campaign with over 100 million state-located political Tweets; models the former as a function of the latter using a new linear regularization feature-selection method; and shows via forward-in-time rolling-window out-of-sample testing that, properly modeled, the Twitter textual data tracks polling variation both across states and within states over time, predicting short-term changes in polls with greater accuracy than is possible using past polling data alone. Thus validated, these measures can be extended to unpolled states and, given the density of the Twitter data, potentially to sub-state regions and sub-day timescales. In addition, an examination of the textual features most strongly associated with changes in surveyed vote intention reveals the topics, events, and concerns associated with the rapidly shifting national debate, making this not just a measurement tool but also a potential aid to real-time campaign strategy.