BigSurv18 Program


Using Big Data for Electoral Research I: What's the Sentiment for Using Sentiment in Electoral Research?

Chair: Professor Susan Banducci (University of Exeter)
Time: Saturday 27th October, 14:00 - 15:30
Room: 40.012

Political Sentiment and Election Forecasting

Mr Niklas M. Loynes (University of Manchester / NYU) - Presenting Author
Professor Mark Elliot (University of Manchester)

Several papers have sought to forecast election results using Twitter data, with a mixed record. To date, there is no clear understanding of why such forecasts fail or succeed in accurately predicting a given election's outcome. The vast majority of these papers draw upon one of two methodological approaches: an aggregate measure of the volume of mentions of a given party or candidate, or a measure of the volume of positive sentiment in such tweets. We argue that neither method is fit for purpose, as both lack a theoretical conceptualisation of the link between offline and online acts of political participation.

In this paper, we introduce a new method of forecasting elections using simulated "approval rating" scores for all available candidates in a given election, derived from the probability density function of "Political Sentiment" (PS) scores for text units mentioning a given candidate within a sample of election-related tweets. We also introduce a new method of extracting this underlying political sentiment, i.e. sentiment directed towards politically relevant actors, events or topics, from salient text units such as tweets. We argue that PS is a more reliable way of capturing the sentiment polarity of such text units, because the language typical of online political speech (e.g. sarcasm and context-specific, referential language) differs markedly from the everyday language that off-the-shelf sentiment analysis tools are designed to classify.

We trial both of these methodological innovations on a sample of tweets pertaining to the 2016 New Hampshire Democratic Primary. Our forecast correctly predicts the election’s eventual winner using a basic model of vote choice, while slightly underperforming the poll of polls. We argue that a more fully specified model should exceed this performance.
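The forecasting step described above can be sketched roughly as follows. The PS scores here are synthetic stand-ins (the paper derives them from tweets with a purpose-built classifier), and the positive-share rule for turning a sentiment density into an "approval"-style figure is our illustrative assumption, not necessarily the authors' exact procedure.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical PS scores in [-1, 1] for tweets mentioning each candidate
# (synthetic stand-ins for classifier output on real tweets).
ps_scores = {
    "Sanders": rng.normal(0.25, 0.4, 500).clip(-1, 1),
    "Clinton": rng.normal(0.10, 0.4, 500).clip(-1, 1),
}

def simulated_approval(scores, n_draws=10_000):
    """Estimate the PS-score density with a KDE, draw from it, and
    report the share of positive draws -- one way to turn a sentiment
    density into an 'approval rating'-style figure."""
    kde = gaussian_kde(scores)
    draws = kde.resample(n_draws, seed=1)[0]
    return (draws > 0).mean()

approval = {cand: simulated_approval(s) for cand, s in ps_scores.items()}
forecast_winner = max(approval, key=approval.get)
```

A vote-choice model, as in the paper, would then map these simulated approval figures onto predicted vote shares rather than declaring the winner directly.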

A Full Spectrum Approach to Election Polling and Forecasting

Mr Chris Jackson (Ipsos) - Presenting Author
Mr Mark Polyak (Ipsos)
Dr Clifford Young (Ipsos)
Ms Mallory Newall (Ipsos)

Election polling has encountered several very public missteps in recent years as hyper-polarized electorates have collided with declining methodological robustness. The path forward for election research is to step beyond a polls-alone approach and embrace a full-spectrum technique for charting and forecasting elections using a wide range of model-, survey-, and big-data-based indicators. While survey research has grown in sophistication, it still struggles to clearly identify and represent small sectors of the population, and it is only as robust as the assumptions underlying the research design. Big-data analysis of real-time social media and Internet-of-Things data provides a qualitative and behavioral check on survey data, improving the accuracy of estimates and confidence in election forecasts.

This paper will present Ipsos' experience and success implementing this full-spectrum approach in the 2018 Mexican presidential election. In this election, we combined high-quality surveys with social media tracking, regular engagement via social media, MRP-based modeling of the electorate, and econometric models derived from the academic literature. This approach gave us greater clarity about the accuracy of our assumptions and provided multiple checks before we made a forecast of the results.

I Get It! Using Qualitative and Quantitative Data to Investigate Comprehension Difficulties in Political Attitude Questions

Dr Naomi Kamoen (Tilburg University) - Presenting Author
Dr Bregje Holleman (Utrecht University)

Voting Advice Applications (VAAs) are online tools with attitude questions about political issues, such as “All coffee shops should be closed down. Agree-Disagree/no-opinion”. Users visit these survey tools spontaneously to obtain voting advice based on a comparison between the user’s answers and the issue positions of the political parties. While VAAs have become a central source of political information in many European countries, little is known about how political opinion questions in VAAs are understood and used. We employed cognitive interviewing as well as statistical analyses of large numbers of VAA answers to evaluate comprehension difficulties in VAAs.

We collected data during three elections in the Netherlands: the national elections (2012), the municipal elections (2014), and the provincial elections (2015). In each election, we asked between 40 and 80 people to fill out a VAA while thinking aloud (Willis 2005); these verbalizations were recorded so that we could interpret them afterwards. Moreover, we gained access to all answers provided in 34 VAAs in the 2014 Dutch municipal elections, as well as to answers provided in 11 provincial elections and 4 national elections. This allows us to combine the qualitative think-aloud data with analyses of these big datasets.

Analyses of the qualitative data show that users encounter comprehension difficulties for about 1 in every 5 VAA questions. About two-thirds of the comprehension problems are related to the semantic meaning of the question; these often encompass a lack of understanding of technical political terms (e.g., dog tax). Problems with the pragmatic meaning of the question (about a third of the problems) included having too little information about the current situation or lacking background information about the reason for a proposal. When comprehension problems arise, VAA users often make assumptions about the meaning of the question (“welfare work… do they mean health care with that? They probably do.”). Rather than looking for additional information, VAA users nevertheless proceed to supply an answer, which is disproportionately often a neutral or no-opinion answer.

A drawback of these qualitative analyses is that they rely on relatively few respondents. We therefore conducted quantitative analyses for each election to investigate whether the question characteristics associated with difficulties in the qualitative study indeed led to more neutral and no-opinion responding in a large, real-life dataset of people’s actual responses to VAA questions. For the municipal elections, for example, we analyzed all answers provided to VAAs during the Dutch municipal elections (34 VAAs × 30 questions, answered by over 300,000 respondents). This confirmed that mentioning political terms and locations was correlated with larger proportions of no-opinion and neutral responses. We also found that while semantic comprehension problems are more often correlated with no-opinion answers, pragmatic problems more frequently translate into a neutral answer. This refines existing findings on the use of non-substantive answer options (e.g. Sturgis, Roberts & Smith, 2014).
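As a minimal illustration of the kind of quantitative check described above, one can compare the share of non-substantive responses between questions that do and do not mention a technical political term. The toy answers below are invented stand-ins, not the actual VAA data.

```python
import pandas as pd

# Synthetic stand-in for VAA answer data: one row per (question, answer).
# 'has_term' flags questions mentioning a technical political term.
df = pd.DataFrame({
    "question": ["q1"] * 4 + ["q2"] * 4,
    "has_term": [True] * 4 + [False] * 4,
    "answer":   ["no_opinion", "neutral", "agree", "disagree",
                 "agree", "agree", "disagree", "neutral"],
})

# Proportion of non-substantive (neutral or no-opinion) answers,
# split by whether the question contains a technical political term.
rates = (
    df.assign(nonsub=df["answer"].isin(["no_opinion", "neutral"]))
      .groupby("has_term")["nonsub"]
      .mean()
)
```

On real data, the same comparison would be run per election with question-level covariates, but the grouping-and-proportion logic is the core of it.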

Voter Information and Learning in the U.S. 2016 Presidential Election: Evidence From a Panel Survey Combined With Direct Observation of Social Media Activity

Professor Jonathan Nagler (NYU, Social Media and Political Participation Lab) - Presenting Author
Dr Gregory Eady (NYU, Social Media and Political Participation Lab)
Professor Patrick Egan (NYU, Dept of Politics)
Professor Josh Tucker (NYU, Social Media and Political Participation Lab)

We combine a three-wave panel of 3,500 respondents collected during the 2016 US presidential election with data on respondents' social media activity. For over 1,800 respondents we know whom they followed on Twitter, as well as their own tweeting activity, and for over 1,500 respondents we have information about their Facebook feeds, including accounts they have liked, profile data, permissions, and posts. For both Twitter and Facebook, this information was recovered directly, based on users voluntarily supplying their Twitter ID and/or agreeing to the use of a Facebook app developed by the SMaPP lab to retrieve information from Facebook. We also have respondents’ self-reports of traditional media use.

For respondents who supplied their Twitter IDs, we recovered a list of all accounts each respondent followed and classified those accounts as media, political, or non-elite. We collected every tweet sent by those accounts (i.e., the set of tweets each user could see) and coded them into topics, producing for each user a set of tweets by topic by source. Respondents were asked factual (surveillance) knowledge questions in waves 1 and 3, and were asked about the candidates’ positions as well as their own attitudes on a wide set of political issues in multiple waves. We then estimated the impact of information seen on Twitter on respondents’ knowledge of politically relevant facts, their views of party positions, and their attitudes on political issues.
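The per-user aggregation described above (tweets by topic by source) could be organized along these lines. The records and category labels below are hypothetical stand-ins for the classified accounts and coded topics, not the paper's actual data structures.

```python
from collections import Counter

# Hypothetical records: (user, source type, topic) for each tweet a
# respondent could see; source types follow the paper's three classes
# (media, political, non-elite).
tweets_seen = [
    ("u1", "media",     "economy"),
    ("u1", "media",     "immigration"),
    ("u1", "political", "economy"),
    ("u2", "non-elite", "economy"),
]

# Per-user counts of exposure, keyed by (source type, topic).
exposure = {}
for user, source, topic in tweets_seen:
    exposure.setdefault(user, Counter())[(source, topic)] += 1
```

These per-user exposure counts would then enter the wave-to-wave models as predictors of changes in knowledge and attitudes.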

We tested several key hypotheses. First, we tested whether respondents who followed more online media outlets learned more politically relevant facts, and more about the candidates’ positions, than respondents who followed fewer media outlets online. Second, we tested whether respondents who followed political candidates updated their beliefs about the candidates’ positions, or their own political attitudes. Third, we tested whether respondents who followed more ideologically homogeneous sets of accounts online changed their beliefs differently than respondents who followed a more ideologically diverse set of accounts.

This paper makes use of a unique dataset combining high-quality survey data with social media data to study information acquisition and opinion formation during an important election campaign. We feel it fits well with the conference theme of Big Data Meets Survey Science. However, since the paper focuses heavily on political outcomes, we are planning to seek a different publication outlet than the two conference outlets. We nevertheless very much hope to have the opportunity to present at BigSurv18, where we would expect useful feedback from a diverse audience.