Posts In Category

Advances in Political Science Methods

When does traditional statistical modelling (TSM) become machine learning (ML)?[i] “Machine learning” has truly become a buzzword that is applied rather liberally to a wide range of modelling applications. But the difference is far from a question of semantics: there are fundamental differences between ML and TSM that data practitioners should keep in mind.

Similarities

But let’s start off with some commonalities between ML and TSM. In both disciplines our aim is to build a (statistical) model (to use TSM terminology) that minimises loss, that is, that achieves the smallest possible difference between observed values and the values estimated by the model. In so doing, we have to achieve a successful balance between model complexity and generalisability: pick too complex a …
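To make the trade-off between loss and generalisability concrete, here is a minimal sketch. The data are invented for illustration, and the two models (a least-squares line and a nearest-neighbour "memoriser") are assumptions chosen for brevity, not models taken from the post.

```python
from statistics import mean

def mse(y_true, y_pred):
    """Mean squared error: the loss both models are judged on."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

def fit_line(xs, ys):
    """Simple model: ordinary least-squares line through the training data."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_memoriser(xs, ys):
    """Overly complex model: returns the outcome of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

# Hypothetical data: y is roughly equal to x, with noise in the training set.
train_x = [0, 1, 2, 3, 4]
train_y = [0.0, 1.5, 1.5, 3.5, 3.5]
test_x = [0.4, 1.4, 2.4, 3.4]
test_y = [0.4, 1.4, 2.4, 3.4]

line = fit_line(train_x, train_y)
memo = fit_memoriser(train_x, train_y)

linear_train = mse(train_y, [line(x) for x in train_x])
memo_train = mse(train_y, [memo(x) for x in train_x])
linear_test = mse(test_y, [line(x) for x in test_x])
memo_test = mse(test_y, [memo(x) for x in test_x])

print(f"line:      train={linear_train:.4f}  test={linear_test:.4f}")
print(f"memoriser: train={memo_train:.4f}  test={memo_test:.4f}")
```

On this toy data the memoriser reaches zero training loss because it reproduces the noise exactly, yet it has the larger loss on the held-out points, while the simpler line accepts some training error and generalises better.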

A few months ago, I published an exceptionally short paper presenting experimental evidence on a particular issue of survey methodology. This experience has taught me valuable lessons about conveying the necessary information under extreme restrictions on the word count. In its original version, my paper was 2,300 words long and it was formatted in accordance with the convention of the field: an introduction highlighting the relevance of the research question and the gaps in the existing literature, an empirical section describing the methods and presenting the results, and a conclusion discussing implications, limitations, and offering directions for future research. Since the study relied on experimental methods and was already short, I decided to submit this contribution to the Journal of …

On 18th April 2017 Theresa May announced a snap general election to take place on 8th June. The announcement came as a surprise and was widely believed to be motivated by the large lead in the polls (approximately twenty points) that Ms May holds over her main rival, Labour Party leader Jeremy Corbyn. In calling the snap election at this point, Theresa May has put a lot of confidence in her projected lead in the polls. This is interesting because British election polls have previously been met with a large degree of skepticism and distrust. In this blogpost, I briefly explore the British polling experience and highlight the various explanations that have been provided for the UK’s poor track record …

Below, I discuss and analyse pre-processing decisions in relation to an often-used application of text analysis: scaling. Here, I’ll be using a new tool, called preText (for R statistical software), to investigate the potential effect of different pre-processing options on our estimates. Replication material for this post may be found on my GitHub page.

Feature Selection and Scaling

Scaling algorithms rely on the bag-of-words (BoW) assumption, i.e. the idea that we can reduce text to individual words and sample them independently from a “bag” and still get some meaningful insights from the relative distribution of words across a corpus. For the demonstration below, I’ll be using the same selection of campaign speeches from one of my earlier blog posts, in which I used a …
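The bag-of-words assumption can be illustrated with a short sketch. This is plain Python rather than preText or R, the two sentences are invented, and the tokeniser is a deliberately crude assumption (lowercasing and keeping runs of letters).

```python
import re
from collections import Counter

def bag_of_words(text):
    """Reduce a text to word counts, discarding word order entirely."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

def relative_frequencies(bag):
    """Share of each word in the bag; this is what scaling methods compare."""
    total = sum(bag.values())
    return {word: count / total for word, count in bag.items()}

# Hypothetical sentences: same words, different order.
speech_a = "We will cut taxes and we will create jobs"
speech_b = "Jobs we will create and taxes we will cut"

bag_a = bag_of_words(speech_a)
bag_b = bag_of_words(speech_b)
freq_a = relative_frequencies(bag_a)

print(bag_a == bag_b)  # identical bags despite different word order
print(sorted(freq_a.items(), key=lambda item: -item[1])[:3])
```

Because both sentences yield the same bag, any method built on this representation treats them as identical. Pre-processing choices such as lowercasing or dropping punctuation change which tokens end up in the bag, which is why they can affect the resulting estimates.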

About thirty major pieces of government legislation are produced annually in the UK. As there are five main opportunities to amend each bill (two stages in the Commons and three in the Lords) and bills may undergo hundreds, even thousands, of amendments, comprehensive quantitative analysis of legislative changes is almost impossible by manual methods. We used insights from bioinformatics to develop a semi-automatic procedure to map the changes in successive versions of the text of a bill as it passes through parliament. This novel tool for scholars of the parliamentary process could be used, for example, to compare amendment patterns over time, between different topics or governments, and between legislatures.

Parliamentary amendments

A major role of parliament is to scrutinize and amend …
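As a rough illustration of mapping changes between two versions of a bill, the sketch below uses Python's standard `difflib` module. This is a stand-in technique, not the bioinformatics-based procedure described in the post, and the clauses are invented for the example.

```python
from difflib import SequenceMatcher

# Hypothetical clauses from two successive versions of a bill.
version_1 = [
    "The Secretary of State may make regulations.",
    "Regulations must be laid before Parliament.",
    "This Act extends to England and Wales.",
]
version_2 = [
    "The Secretary of State must make regulations.",
    "Regulations must be laid before Parliament.",
    "A report must be published annually.",
    "This Act extends to England and Wales.",
]

# Align the two versions clause by clause; autojunk is disabled so that
# no clause is ever treated as ignorable noise.
matcher = SequenceMatcher(a=version_1, b=version_2, autojunk=False)
opcodes = matcher.get_opcodes()

# Keep only the spans that differ between the versions.
changes = [op for op in opcodes if op[0] != "equal"]

for tag, i1, i2, j1, j2 in changes:
    print(tag)
    print("  before:", version_1[i1:i2])
    print("  after: ", version_2[j1:j2])
```

Here the first clause is reported as a replacement and the new reporting clause as an insertion. Counting such operations per stage would give a crude measure of how heavily a bill was amended.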

The 2016 United States presidential election (or, in John Oliver’s most recent definition, ‘lice-on-a-rat-on-a-horse-corpse-on-fire-2016’) has reached its final leg. As a political scientist and a computational text analyst, I cannot resist sharing my two cents on an election that has certainly broken a model or two. Following in the footsteps of two colleagues who recently produced two excellent articles (you can read them here and here), in this post I’d like to analyse a few examples of the exceptional language used in this election cycle. Text analysis can help us understand two commonly held beliefs or facts (the distinction has become a bit blurred over the course of this year’s election cycle) about the US elections: Donald Trump is running a negative …

The final Presidential debate of 2016 was as heated as the previous two, as the following name-calling exchange demonstrates:

CLINTON: …[Putin would] rather have a puppet as president of the United States.
TRUMP: No puppet. No puppet.
CLINTON: And it’s pretty clear…
TRUMP: You’re the puppet!
CLINTON: It’s pretty clear you won’t admit …
TRUMP: No, you’re the puppet.

It is easy to base our opinions of the debate, and of the differences between the Presidential candidates, on excerpts like this and on memorable one-liners. But are small extracts representative of the debate as a whole? Moreover, how can we objectively analyse what was said, who got to say the most, and how the candidates differed in their responses? One approach is …

How can we improve the quality of post-election survey data on electoral turnout? That is the core question of our recent paper. We present a novel way to question citizens about their voting behaviour that increases the truthfulness of responses. Our research finds that the inclusion of “face-saving” response items can drastically improve the accuracy of reported turnout. Usually, the turnout reported in post-election surveys is much higher than in reality, and this is partly due to actual abstainers pretending that they have voted. Why do they lie? In many countries, voting is a social norm widely shared by the population. In established democracies voting is considered a duty, and is part and parcel of being a “good citizen”. Public …