With the failure of traditional forecasting methods to accurately predict the outcomes of the UK General Election of May 2015, can social media-based predictions do any better? In this article, Andrea Ceron, Luigi Curini, and Stefano M. Iacus (University of Milan and VOICES from the Blogs) find that supervised and aggregated sentiment analysis (SASA), applied in proportional electoral systems, produces the most accurate forecasts of election results.
The exponential growth of social media and social network sites, such as Facebook and Twitter, and their potential impact on real-world politics has increasingly attracted the attention of scholars in recent years. Among other things, researchers have started to explore social media as a device to assess the popularity of politicians, to track the political alignment of social media users, and to compare citizens’ political preferences expressed online with those reported by polls. Analysing social media during an electoral campaign can indeed be very interesting for a number of reasons. Besides being cheaper and faster than traditional surveys, social media analysis can monitor an electoral campaign on a daily (or even hourly) basis. Consequently, the possibility to nowcast a campaign, that is, to track trends in real time and capture (eventual) sudden changes (so-called “momentum”) in public opinion faster than is possible through traditional polls (for example, after a TV debate), becomes a reality. Some scholars, however, go even further, claiming that analysing social media allows a reliable forecast of the final result. This is quite fascinating, as forecasting an election is one of the few exercises in social science where an independent measure of the outcome a model is trying to predict is clearly and indisputably available, i.e., the vote share of candidates (and/or parties) at the ballot box.
To achieve this, however, at least two challenges need to be overcome. Last year, while attending a conference, we heard a speaker arguing that Giuseppe Civati had won the primary election of the Italian Democratic Party, at least on Twitter. The speaker justified this statement by asserting that all the people the speaker followed on Twitter were posting messages in favour of Civati: therefore, Civati should have won! After collecting and analysing, through VOICES from the Blogs, almost 600,000 tweets discussing the primary election posted in the three weeks leading up to polling day, we can confidently say that this was not the case. In fact, Civati was the third (and therefore, the last) candidate in terms of declared support on Twitter, clearly behind both Matteo Renzi and Gianni Cuperlo. This example warns us against the risk of political homophily and selective exposure, which is always present regardless of the promise of a virtual world where everyone can freely connect with anyone else.
Moreover, relying on random sampling of Internet Big Data is extremely complex, more so than working with traditional surveys. There is no comprehensive phone list of the entire Internet community to which the standard techniques of sampling can be applied. In addition, no reliable information about the individual traits of social media users is currently accessible, making a stratified sample unfeasible. However, unlike traditional surveys, where we have to rely on a sample precisely because analysing the universe is unattainable, when we talk about social media the entire universe is in principle available, at least the universe of public posts. Let’s leave aside the technical challenge of getting access to such a “universe” (a far from irrelevant task), and let’s suppose that we were able to collect it. Only then would the difficult part begin for the researcher: how does one analyse such a large amount of data? How does one extract politically significant meaning from it?
This is clearly a methodological problem. For example, is it enough to count the volume of data related to candidates or parties to predict the final electoral result? Let us revisit the example of the Italian primary election, but this time concentrate on the 2012 centre-left primary. In November 2012, Matteo Renzi had approximately 73,000 mentions on Twitter (i.e., posts that contained the word “Renzi”), while Pierluigi Bersani reached approximately 26,000 mentions. According to these numbers, Renzi, with approximately 73 per cent of the mentions, should have comfortably beaten Bersani; however, Bersani won the primary with a 10-point margin in the first round (and over 20 points in the second round). Of course, this should not be that surprising. Indeed, the number of mentions is indicative only of notoriety (positive or negative alike), not of the popularity or the (potential) support (at least online) for a politician (Ceron et al., 2015a).
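The gap between mention counts and actual votes in this example is easy to quantify. The sketch below uses the approximate mention counts cited above and computes the naive volume-based “forecast”, i.e., each candidate’s share of total mentions:

```python
# Approximate Twitter mention counts from the 2012 centre-left primary
# campaign, as cited in the text.
mentions = {"Renzi": 73_000, "Bersani": 26_000}

total = sum(mentions.values())
# Naive volume-based "forecast": treat each candidate's share of
# mentions as a predicted vote share.
shares = {name: round(100 * n / total, 1) for name, n in mentions.items()}
print(shares)  # → {'Renzi': 73.7, 'Bersani': 26.3}
```

The volume-based forecast puts Renzi near 74 per cent, yet Bersani won the first round comfortably: mention counts measure notoriety, not support.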
We recently conducted a meta-analysis of 219 social media electoral forecasts related to 89 different elections held between 2008 and 2015 (Ceron et al., 2015b). Overall, the Mean Absolute Error (MAE) of social media-based predictions was higher than 7 percentage points, while survey polls in the same subset of elections produced a MAE slightly lower than 2. Compared with surveys, the predictive power of social media therefore appears rather poor prima facie. However, in some cases social media predictions were actually comparable to (if not better than) survey polls.
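As a reminder of the metric, the Mean Absolute Error averages the absolute gap between forecast and actual vote shares across parties. A minimal sketch, with purely illustrative numbers (not taken from the meta-analysis):

```python
# Hypothetical forecast vs. actual vote shares (percentage points) for
# three parties in a single election; the figures are illustrative only.
forecast = [38.0, 30.0, 22.0]
actual   = [31.0, 34.0, 25.0]

# Mean Absolute Error: average absolute deviation per party.
mae = sum(abs(f - a) for f, a in zip(forecast, actual)) / len(forecast)
print(round(mae, 2))  # → 4.67
```

An MAE above 7 therefore means that, on average, a forecast missed each party’s vote share by more than 7 percentage points.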
Our aim has therefore been to ascertain the reasons that could explain the accuracy of an electoral forecast, focusing in particular on the method adopted to analyse social media. We differentiated between three families of approaches. Computational approaches rely either on volume data (such as the number of mentions of a party or candidate, or the occurrence of particular hashtags) or on endorsement data (such as the number of Twitter followers, Facebook friends, or “likes” received on Facebook walls). Sentiment analysis approaches pay attention to the language and try to attach a qualitative meaning to the comments (posts, tweets) published by social media users, employing automated tools such as natural language processing models or pre-defined ontological dictionaries. Finally, what we call supervised and aggregated sentiment analysis (SASA) comprises techniques that exploit human codification in the process and focus on estimating the aggregated distribution of opinions, rather than on classifying each single text individually (Ceron et al. 2016). In more detail, the SASA method is based on a two-stage process (Ceron et al. 2015a). In the first step, human coders read and codify a subsample of the documents. This subsample, which need not have any particular statistical property, serves as a training set for the second step, in which the algorithm classifies all the unread documents (the test set). At this second stage, the aggregated statistical estimation of the SASA algorithm extends the accuracy of the human coding to the whole population of posts, allowing one to properly recover the opinions expressed on social networks.
The SASA approach, first introduced by Hopkins and King (2010), aims to solve two different problems. First, users on social media use natural language, which evolves continuously and varies depending on who is actually writing (male, female, young, old, officer, journalist, etc.) and on the particular topic (soccer, politics, music, etc.). In addition, metaphoric or ironic sentences, as well as jargon, contractions and neologisms, are used in different and new ways every time. This puts all unsupervised methods, whether based on ontological dictionaries or on statistical natural language processing (NLP) models, under stress when it comes to accurately capturing sentiment. For these reasons, supervised human coding of a training set is a cornerstone of the SASA methodology. Human coding, in fact, reduces misclassification errors, given that human coders are more effective than ontological dictionaries at recognising all the specificities of the language and at interpreting the texts and the author’s attitude. Second, by directly estimating the aggregated distribution of opinions, SASA produces more reliable aggregate results in a context where the aggregates (i.e., the final vote shares of parties and/or candidates) are precisely what concern us.
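The aggregation step at the heart of the Hopkins-King approach can be sketched in a few lines. The idea is that the distribution of observable text features P(S) in the whole corpus is a mixture of the feature distributions within each opinion category, P(S|D), which is estimated from the hand-coded training set; the aggregate distribution of opinions P(D) is then recovered by solving P(S) = P(S|D)·P(D), with no need to classify any individual tweet. The numbers below are invented for illustration (three opinion categories, four observable word-stem profiles):

```python
import numpy as np

# P(S|D): probability of each of 4 word-stem profiles (rows) within each
# of 3 opinion categories (columns). In practice this matrix is estimated
# from the hand-coded training set; here it is simply assumed.
p_s_given_d = np.array([
    [0.6, 0.1, 0.2],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.4],
    [0.1, 0.1, 0.3],
])

# True aggregate distribution of opinions (unknown in a real application).
p_d_true = np.array([0.5, 0.3, 0.2])

# P(S): profile frequencies observed on the whole (unlabelled) corpus.
p_s = p_s_given_d @ p_d_true

# Recover P(D) by inverting the mixture relation P(S) = P(S|D) @ P(D).
# With consistent, noise-free inputs, plain least squares suffices.
p_d_hat, *_ = np.linalg.lstsq(p_s_given_d, p_s, rcond=None)
print(np.round(p_d_hat, 3))  # recovers [0.5, 0.3, 0.2]
```

With real data, P(S|D) and P(S) are noisy estimates, so the full method uses constrained estimation to keep the recovered shares non-negative and summing to one; the sketch only illustrates the aggregation logic.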
Our meta-analysis, in this respect, shows that SASA increases the accuracy of the forecasts by a remarkable 3.7 points compared with forecasts based on a mere computational approach, and by 2.6 points compared with other sentiment analysis techniques based on ontological dictionaries, which are no more effective than computational methods in improving the accuracy of the prediction.
Although highly relevant, the method is not the only factor affecting the accuracy of the prediction. The electoral system matters too. When elections are held under proportional representation, social media forecasts are remarkably more precise. This effect is due to the lower incentive to cast a strategic vote. Because every vote counts in proportional electoral systems, citizens are freer to behave according to their sincere preferences. As a consequence, we observe a higher congruence between opinions expressed online and actual voting behaviour. Conversely, when there is an incentive to behave strategically, the analysis of the opinions expressed online becomes less relevant, because voters may express their sincere preference online while casting a strategic vote at the polls. This suggests that when some elements promote coherence between online opinions and offline behaviour, the accuracy of social media-based predictions is heightened. The fact that our analysis consistently shows that the error is lower in elections with a high turnout and, at the same time, a huge volume of comments points in the same direction.
In sum, despite the well-known limits and challenges faced by social media analysis, there are reasons to be optimistic about sentiment analysis becoming (if it is not already) a useful complement to traditional offline polls. In this respect, however, a word of caution is needed: Big Data is likely to contribute only so long as the desired qualities of the data are not negatively correlated with the quantity of data (Clark and Golder 2015). The method employed, as well as the (institutional) context in which the analysis is run, makes a difference!
Ceron, Andrea, Luigi Curini and Stefano M. Iacus (2015a). “Using sentiment analysis to monitor electoral campaigns: method matters. Evidence from the United States and Italy”, Social Science Computer Review, 33(1), 3-20.
Ceron, Andrea, Luigi Curini and Stefano M. Iacus (2015b). “Social Media and Elections. A meta-analysis of online-based electoral forecasts”, in Kai Arzheimer, Jocelyn Evans and Michael Lewis-Beck (eds.), The Handbook of Electoral Behaviour, Sage, forthcoming.
Ceron, Andrea, Luigi Curini and Stefano M. Iacus (2016). Forecasting and Nowcasting Elections Using Social Media: Just By Chance? London: Ashgate, forthcoming.
Clark, William Roberts, and Matt Golder (2015). “Big Data, Causal Inference, and Formal Theory: Contradictory Trends in Political Science?”, PS: Political Science & Politics, 48(1), 65-70.
Hopkins, Daniel J., and Gary King (2010). “A Method of Automated Nonparametric Content Analysis for Social Science”, American Journal of Political Science, 54(1), 229-247.