The Evidence Paradox in Performance Measurement: When is a series not a series?

In a recent briefing paper on breaks and discontinuities in official data series in the UK, two of us [Dixon and Hood] highlighted the tension between the demand for quantitative evidence to drive performance improvement and the tendency to systematically destroy the very evidence by which performance can be evaluated. This paper was discussed, and further examples of data breaks across the public sector were explored, at a seminar at LSE in April, attended by senior civil servants and academics. The ensuing discussion embodied the same tensions, with some participants emphasising the need for indicator continuity, and others stressing that indicators must change as methodologies, purposes, and audiences evolve. Can this tension be resolved? In this article we suggest that recommendations arising from the seminar might point to a way to reconcile these demands.

Continuity versus Responsiveness

At the LSE event, most of the panellists argued that a degree of indicator continuity is essential to measure government performance or to track headline public spending numbers. Tony Travers showed examples of how (apparently) final ‘outturn’ public spending data in the official Public Expenditure Statistical Analyses (PESA) are altered in subsequent years. Ruth Dixon showed how the definition of central government administration costs in PESA changed so much over the 2000s that no meaningful comparison can be made over that period (see Figure 1). Nor did EU or OECD data series seem to be any more stable or long-lasting than national datasets.

Figure 1. Breaks and Discontinuities in Reported Government Administration Costs.

Alexander Jan, of the consultancy Ove Arup, described how Arup, when commissioned by the Office of Rail Regulation to review the costs and efficiency of Network Rail, found that the calculations changed radically at each reporting round. In some cases the data supplied were arguably not suitable for some of the purposes for which they were intended. Alex quoted Roger Ford, editor of Modern Railways: ‘this is the fourth year in a row that Arup has provided a qualified opinion for regulatory financial statements. Well, if Arup can’t work out what’s going on after four attempts, what hope for the rest of us?’ As Alex noted, the remuneration of some senior managers was linked to these indicators. Ideally, such indicators should be readily understood by those holding the organization to account, such as ministers, non-executive directors and regulators.

Several participants noted that there is little incentive for rigorous data archiving, continuity or ‘ownership’ of data series in government. Some of the audience agreed that policymakers need to know whether a time-series used to inform public policy decisions ‘really is a series’ in any meaningful sense.

Other participants questioned the value of stability in every data series, arguing that many retrospective comparisons are of limited value. A former government accountant pointed out that financial reporting methods continue to mature, and the best (or newest) systems should not be rejected simply to ensure continuity. For instance, the UK government moved from cash-based to resource-based accounting in the 2000s, and latterly moved to whole-of-government accounts. Others pointed out that the value of any indicator as an instrument of control over public bodies depends on its being changed to counter gaming and reflect current concerns. Indicator change may also reflect major changes in policy, as when the definitions of employment and benefit changed with the introduction of Universal Credit.

It is undeniable that indicators that never change become subject to gaming, cause output distortions, or simply cease to reflect the concept that they were originally designed to measure. But the question remains whether current practice represents the optimal balance between continuity and change. If the main official national records that purport to offer an authoritative picture of the state of the public finances (for instance, PESA and the Local Government Financial Statistics) cannot even be compared from year to year by experts, where should we turn? The data series in which we found serious discontinuities or complete breaks are not trivial or obscure. They are (or should be) of major importance to the evaluation of government of any political stripe: total public sector capital investment, local government spending (on which the Office for Budget Responsibility and the government differ as to whether it is rising or falling), how much the government costs to run, and how much the civil service costs to employ. Yet it is possible to provide long-running, consistent official series. For instance, civil service staff numbers have been reported in Civil Service Statistics since it was first published in 1970. And consistent records of tax revenues have been included in the annual budget reports (the Treasury’s Red Book) for many decades.

Overlapping Series Provide Transparency

In our briefing paper [Dixon and Hood], we suggested that overlapping stepped series offer a possible reconciliation between continuity and responsiveness. By this, we meant that when the definition or methodology of an indicator changes, the effects of the change must be shown by calculating the metric in both the old and the new way for several overlapping years. If transparently reported, this method allows us to identify significant changes and to understand their consequentiality. That technique was used by the Office for National Statistics (ONS) to show Total Managed Expenditure with and without the support given to banks after the financial crisis in 2007, as shown in Figure 2.

Figure 2. Example of Good Practice: Transparently Showing Effect of Classification Change.
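
The overlap check described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by the ONS or in the briefing paper: the figures are invented, and the names `old_definition`, `new_definition` and `overlap_report` are hypothetical.

```python
# Invented figures for illustration only; they are NOT taken from PESA or ONS data.
# Each dict maps a reporting year to the value of one indicator under one definition.
old_definition = {2005: 100.0, 2006: 104.0, 2007: 108.0, 2008: 112.0}
new_definition = {2007: 120.0, 2008: 125.0, 2009: 130.0}


def overlap_report(old, new):
    """For every year reported under BOTH definitions, show the step caused by the change."""
    rows = []
    for year in sorted(set(old) & set(new)):
        step = new[year] - old[year]
        rows.append({
            "year": year,
            "old": old[year],
            "new": new[year],
            "step": step,                          # absolute effect of the redefinition
            "pct_change": 100.0 * step / old[year],  # relative effect, in per cent
        })
    return rows


report = overlap_report(old_definition, new_definition)
for row in report:
    print(row)
```

If `overlap_report` returns an empty list, the two definitions share no years, and the 'series' cannot be bridged without further information.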

Some participants argued that this requirement is simply too onerous, because unavoidable changes are too frequent and complex for the tension between continuity and change to be reconciled in this way, and that officials are required to report the situation ‘as it is now’. Policymakers, they said, are not interested in some hypothetical ‘might have been’. But we argue that if that is the case for indicators on which public policy depends, the data should not be presented as a time-series at all. As Iain McLean demonstrated in his presentation, sometimes the data turn out to be so corrupt that there is no remedy other than ‘breaking’ the time-series and starting again, as he and his colleagues found when they looked into EU regional spending figures reported in PESA in the mid-2000s. In that case, they had to inform the Treasury that the data prior to 2004 were ‘irretrievable’.

Our Recommendations

In summary, we find ample evidence that breaks and discontinuities are so ubiquitous in official data sets that tracking government performance over time is severely compromised. Of course, there is no easy solution for reconciling the need for continuity with the demand to update and modernize indicators. Nevertheless, we suggest that at least for key or leading official statistics such as those summarizing the state of the public finances, public organizations should (i) consider carefully whether the ‘pros’ of updating an indicator outweigh the ‘cons’ caused by the discontinuity; (ii) carefully report, explain, and justify each change in methodology or definition, properly archiving such documentation; and (iii) demonstrate the consequentiality of each change by means of overlapping stepped series. Oversight bodies such as parliamentary select committees and audit bodies should press for such practices to be followed, in consultation with professional authorities such as the Royal Statistical Society and the Chartered Institute of Public Finance and Accountancy.

Tony Travers

Christopher Hood

Ruth Dixon
