The Evolution of Legislation: a Bioinformatics Approach

About thirty major pieces of government legislation are produced annually in the UK. As there are five main opportunities to amend each bill (two stages in the Commons and three in the Lords) and bills may undergo hundreds, even thousands, of amendments, comprehensive quantitative analysis of legislative changes is almost impossible by manual methods. We used insights from bioinformatics to develop a semi-automatic procedure to map the changes in successive versions of the text of a bill as it passes through parliament. This novel tool for scholars of the parliamentary process could be used, for example, to compare amendment patterns over time, between different topics or governments, and between legislatures.

Parliamentary amendments

A major role of parliament is to scrutinize and amend legislation. Analysis of legislative amendments throws light on the political process (Russell 2013). It has been suggested, for example, that UK legislation is increasingly poorly prepared, with more government amendments being used to repair defects during the passage of a bill, reducing the quality of parliamentary scrutiny (Foster 2005).

Such assertions are difficult to test, however, as no effective methods for quantifying legislative changes yet exist. The few existing quantitative studies of legislative amendments generally include only a few pieces of legislation (e.g. Hood and Dixon 2015 (29 UK bills); Kreppel 1999 (24 EU legislative proposals)). Exceptions are Tsebelis and colleagues (2001) who studied amendments to 231 EU legislative proposals, and Martin and Vanberg (2005) who looked at changes to 336 German and Dutch bills. This lack of quantitative research is partly because the process of detecting agreed amendments (e.g. from Hansard reports and minutes of committee proceedings) is extremely laborious. There is also little administrative data available: for instance the UK Public Bill Office provides statistics on the number of amendments made in the House of Lords, but there is no equivalent information from the House of Commons.

Bioinformatics – the study of mutation

My background in biochemistry led me to see that a bill’s passage through parliament can be likened to biological evolution, which proceeds by the accumulation of mutations. DNA encodes genetic information in long sequences of four ‘bases’ (A, C, G, T). Bioinformatics, the field that develops procedures for comparing DNA sequences, can be used to identify mutations. For instance, van’t Hof and colleagues (2016) recently showed that the dark colouration of the peppered moth is caused by the insertion of 22,000 bases into the DNA of a gene involved in wing development (Figure 1).

Figure 1: The mutation for wing colouration in the British peppered moth (part of Figure 1 from van’t Hof et al. 2016).

Semi-automated procedure for detecting ‘mutations’ to legislation

Like a gene, a bill “mutates” when the information it contains is altered by insertion, deletion, and substitution of text. Bills typically contain less than a megabyte of data, and should be amenable to comparison methods similar to those used in bioinformatics. Accordingly, we developed a novel procedure which generates a visual display of the evolution of successive versions of a bill (see Figure 2). As well as the graphical output, detailed reports are produced of the number, length, and content of the text differences which can be used for further statistical and content analysis.
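To make the three kinds of ‘mutation’ concrete, here is a minimal sketch in Python. It is not the procedure we used (that relied on WinMerge); it swaps in the standard library's `difflib.SequenceMatcher`, compares word by word, and labels each difference as an insertion, deletion, or substitution. The function name `classify_changes` and the two example sentences are invented for illustration.

```python
from difflib import SequenceMatcher

# Map difflib's opcode tags onto the vocabulary used in the text.
LABELS = {"insert": "insertion", "delete": "deletion", "replace": "substitution"}


def classify_changes(old_text, new_text):
    """Return (kind, old_words, new_words) for each word-level difference."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged stretch of text
        changes.append(
            (LABELS[tag], " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
        )
    return changes


# Invented sentences, not quotations from any real bill.
old = "A police authority must publish a plan each year"
new = "A police and crime commissioner must publish a detailed plan"

for kind, removed, added in classify_changes(old, new):
    print(f"{kind}: '{removed}' -> '{added}'")
```

Comparing at the level of words rather than characters is a design choice in this sketch: it keeps each reported difference readable, at the cost of treating a one-letter correction as a whole-word substitution.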

Figure 2: Changes to the text of the Police Reform and Social Responsibility Act 2011 during House of Commons stages.

To exclude irrelevant differences between versions, such as page headers and renumbering of clauses, we developed a Python script to ‘clean up’ the texts so that conventional text-comparison software (WinMerge, in this case) could be applied. The clean-up process required some user input and, even so, failed to exclude certain types of irrelevant difference, such as typo corrections and some formatting changes. Such differences had to be screened out manually before a second Python script was used to analyse the list of differences (the ‘patch file’) and produce the graphical output. Hence the process is ‘semi-automated’ rather than fully automatic. Nevertheless, it can map text alterations in a tiny fraction of the time that the same task would take by hand. Further details of the procedure are in my slides.
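The clean-up-then-compare idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our actual scripts: the regular expressions `PAGE_HEADER` and `CLAUSE_NUMBER`, the function names, and the toy bill text are all hypothetical, and `difflib.unified_diff` stands in for the WinMerge patch file.

```python
import difflib
import re

# Hypothetical patterns for this toy example; real bill layouts need their own.
PAGE_HEADER = re.compile(r"^(Example Bill|Page \d+)$")
CLAUSE_NUMBER = re.compile(r"^\(?\d+[A-Z]?\)?\s+")


def clean_lines(text):
    """Drop blank lines and page headers, strip clause numbers, normalise spaces."""
    cleaned = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or PAGE_HEADER.match(line):
            continue
        line = CLAUSE_NUMBER.sub("", line)
        cleaned.append(" ".join(line.split()))
    return cleaned


def diff_versions(old_text, new_text):
    """Return only the added (+) and removed (-) lines between two versions."""
    patch = list(
        difflib.unified_diff(
            clean_lines(old_text), clean_lines(new_text), lineterm="", n=0
        )
    )
    body = patch[2:]  # skip the two file-header lines ('---' and '+++')
    return [line for line in body if line.startswith(("+", "-"))]


old_version = """Example Bill
Page 1
1 The authority must publish a plan.
2 The plan must be reviewed annually.
"""

new_version = """Example Bill
Page 2
1 The authority must publish a plan.
2 A new duty is inserted here.
3 The plan must be reviewed annually.
"""

for line in diff_versions(old_version, new_version):
    print(line)
```

In this toy case the changed page number and the renumbering of the final clause are ignored, and only the inserted clause is reported. Patterns this simple would not catch typo corrections or formatting changes, which is why a manual screening step remained necessary.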

The number of text differences detected by this procedure is not precisely the same as the number of parliamentary amendments. This is because several parliamentary amendments may alter the same short section of text (resulting in a single difference being detected), or alternatively a single parliamentary amendment can result in several differences (as, for example, when a long section of text replaces a similar one). Our procedure, however, accurately portrays the actual cumulative effect of the parliamentary amendments on the text.
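A small Python sketch shows why the two counts diverge. It assumes a word-level comparison with the standard library's `difflib.SequenceMatcher` rather than the tool we actually used, and the sentences and the function name `count_differences` are invented for illustration.

```python
from difflib import SequenceMatcher


def count_differences(old_text, new_text):
    """Count contiguous changed regions between two texts, word by word."""
    matcher = SequenceMatcher(a=old_text.split(), b=new_text.split(), autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


# Case 1: two separate (hypothetical) amendments touch adjacent words,
# "may" -> "must" and "make" -> "lay", but appear as one difference.
two_amendments = count_differences(
    "the Secretary of State may make regulations",
    "the Secretary of State must lay regulations",
)

# Case 2: one (hypothetical) amendment replaces a sentence with a similar
# one, and the unchanged words in the middle split it into two differences.
one_amendment = count_differences(
    "the authority must consult local residents before publishing the plan",
    "the commissioner must consult local residents before issuing the plan",
)

print(two_amendments, one_amendment)
```

Either way, the comparison reflects the net change to the text, which is the quantity the procedure is designed to capture.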

Note: this article was also published on the LSE’s British Politics and Policy Blog and on Ruth’s personal blog. It summarises her presentation (co-authored with Jonathan A. Jones) to the Political Studies Association Political Methodology Group Conference at University College, London, on 27 June 2016 (slides available here).

Ruth Dixon
