Public Health England has admitted that 16,000 confirmed coronavirus cases in the UK were left out of the daily figures reported between September 25 and October 2. The missing figures were subsequently added to the daily totals, but given the importance of these numbers for monitoring the outbreak and making key decisions, the consequences of the error are far-reaching.
Not only does the error lead to an underestimate of the scale of coronavirus in the UK but, perhaps more importantly, it delayed the entry of positive cases into the NHS Test and Trace system, which is used by a team of contact tracers. Although all those who tested positive had been informed of their results, the people in close contact with them, who were potentially at risk of exposure, were not followed up immediately (ideally this happens within 48 hours). This was a serious error. What could have caused it?
It emerged later that day that a “technical glitch” was to blame. To be more specific, the lab test results were being transferred to Excel templates. The templates hit a limit on the number of rows they could handle and then failed to update as more cases were added. The issue was resolved by breaking the data down across smaller spreadsheets, and all the missing cases were added to the totals reported over the weekend.
The issue may have been fixed, but people’s confidence in the testing system in place in England will undoubtedly take a knock. It’s also likely that politicians and media will use this as political ammunition to argue the incompetence of government and Public Health England. Is this the right response? What should we take away from this mistake?
An avoidable mistake
We should not forget that the government and public health workers are doing an incredibly challenging and demanding job dealing with a pandemic. But this kind of mistake was avoidable. We live in a world of big data, with artificial intelligence and machine learning permeating all aspects of our lives. We have smart factories and smart cities; we have self-driving cars and machines trained to exhibit human intelligence. And yet Public Health England used Microsoft Excel as an intermediary to manage a large volume of sensitive data. And herein lies the problem.
Although Excel is popular and commonly used for analysis, it has several limitations that make it unsuitable for large amounts of data and more sophisticated analyses.
The companies that analysed the swab tests to identify who had the virus submitted their results as comma-separated text files to PHE. These were then ingested into Excel templates to be uploaded to a central system to be made available to the Test and Trace team and government. Although today’s Excel spreadsheets can handle 1,048,576 rows and 16,384 columns, developers at PHE used an older Excel file format (XLS instead of XLSX) resulting in each template being able to store only around 65,000 rows of data (or around 1,400 cases). When the limit was reached, any further cases were left off the template and therefore positive cases of coronavirus were missed in the daily reporting.
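The arithmetic behind the row limit is simple enough to sketch. The legacy XLS worksheet format stops at 65,536 rows, and the figures above imply that each case took up roughly 47 rows (65,536 divided by about 1,400). The function names below are hypothetical and are not taken from the PHE system; the point is only that a pipeline can compare the incoming row count with the sheet limit and fail loudly instead of dropping rows silently.

```python
XLS_MAX_ROWS = 65_536      # row limit of a legacy .xls worksheet
XLSX_MAX_ROWS = 1_048_576  # row limit of a current .xlsx worksheet


def rows_dropped(total_rows: int, max_rows: int = XLS_MAX_ROWS) -> int:
    """Return how many rows would not fit on a sheet capped at max_rows."""
    return max(0, total_rows - max_rows)


def check_fits(total_rows: int, max_rows: int = XLS_MAX_ROWS) -> None:
    """Raise an error rather than silently truncating oversized data."""
    dropped = rows_dropped(total_rows, max_rows)
    if dropped:
        raise ValueError(
            f"{total_rows} rows exceed the {max_rows}-row limit; "
            f"{dropped} rows would be lost"
        )
```

A batch of 70,000 rows, for example, would lose 4,464 rows on an XLS sheet, while the same batch fits comfortably within the XLSX limit.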
The bigger issue is that, in the data-driven and technologically advanced age in which we live, a system based on shipping around Excel templates was deemed suitable in the first place. Data engineers have for a long time been supporting businesses with managing, transforming and serving up data, and developing methods for building efficient, robust and accurate data pipelines. Data professionals have also developed approaches to information governance, including assessing data quality and developing appropriate security protocols.
For this kind of custom application there are plenty of data management technologies that could have been used, ranging from on-site to cloud-based solutions that can scale and provide managed data storage for subsequent reporting and analysis. The Public Health England developers no doubt had some reason to transform the text files into Excel templates, presumably to fit with legacy IT systems. But avoiding Excel altogether and shipping the data from source (with appropriate cleaning and checks) into the system would have been better and would have reduced the number of steps in the pipeline.
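As a rough illustration of what loading from source with checks might look like, the sketch below reads comma-separated results straight into a database table and confirms that every row arrived. It uses SQLite from the Python standard library purely for convenience; the column names `specimen_id` and `result` and the function `load_results` are assumptions made for this example and do not describe the actual lab files or the PHE system.

```python
import csv
import io
import sqlite3


def load_results(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load lab results from CSV text into SQLite and verify nothing was lost."""
    # Hypothetical column names; the real lab file layout is not known here.
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["specimen_id"], r["result"]) for r in reader]

    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "specimen_id TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO results VALUES (?, ?)", rows)
    after = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]

    # Reconcile: the table must have grown by exactly the number of rows read.
    if after - before != len(rows):
        raise RuntimeError(
            f"read {len(rows)} rows but stored {after - before}"
        )
    return len(rows)
```

The reconciliation step is the important part of the design: whatever storage is used, comparing the number of records received with the number stored turns a silent loss into an immediate, visible failure.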
The blame game
Despite the benefits and widespread use of Excel, it is not always the right tool for the job, especially for a data-driven system with such an important function. You can’t accurately report, model or make decisions on inaccurate or poor-quality data.
During this pandemic we are all on a journey of discovery. Rather than point the finger and play the blame game, we need to reflect and learn from our mistakes. From this incident, we need to work on getting the basics right – and that includes robust data management. Perhaps rather concerning are reports that Public Health England is now breaking the lab data into smaller batches to create a larger number of Excel templates. This seems a poor fix and doesn’t really get to the root of the problem – the need for a robust data management infrastructure.
It is also remarkable how quickly technology or the algorithm is blamed (especially by politicians), but herein lies another fundamental issue – accountability and taking responsibility. In the face of a pandemic we need to work together, take responsibility, and handle data appropriately.