Wednesday, 26 January, 2022

Why Microsoft Excel isn’t good enough to count coronavirus cases

Although Excel is popular and commonly used for analysis, it has several limitations, as the UK found out.


Public Health England has admitted that 16,000 confirmed coronavirus cases in the UK were missed from daily figures being reported between September 25 and October 2. The missing figures were subsequently added to the daily totals, but given the importance of these numbers for monitoring the outbreak and making key decisions, the results of the error are far-reaching.

Not only did the error lead to an underestimate of the scale of coronavirus in the UK; perhaps more important was the subsequent delay in entering the details of positive cases into the NHS Test and Trace system, which is used by a team of contact tracers. Although all those who tested positive had been informed of their results, other people in close contact with them and potentially at risk of exposure were not followed up promptly (ideally, contacts are traced within 48 hours). This was a serious error. What could have caused it?

It emerged later that day that a “technical glitch” was to blame. To be more specific, the lab test results were being transferred to Excel templates. The templates hit a limit in the number of rows they could handle and then failed to update as more cases were added. The issue was resolved by breaking the data down across smaller spreadsheets, with all the missing cases added to the totals reported over the weekend.

The issue may have been fixed, but people’s confidence in the testing system in place in England will undoubtedly take a knock. It’s also likely that politicians and media will use this as political ammunition to argue the incompetence of government and Public Health England. Is this the right response? What should we take away from this mistake?


An avoidable mistake

We should not forget that the government and public health workers are doing an incredibly challenging and demanding job dealing with a pandemic. But this kind of mistake was avoidable. We live in a world of big data, with artificial intelligence and machine learning permeating all aspects of our lives. We have smart factories and smart cities; we have self-driving cars and machines trained to exhibit human intelligence. And yet Public Health England used Microsoft Excel as an intermediary to manage a large volume of sensitive data. And herein lies the problem.

Although Excel is popular and commonly used for analysis, it has several limitations that make it unsuitable for large amounts of data and more sophisticated analyses.

The companies that analysed the swab tests to identify who had the virus submitted their results as comma-separated text files to PHE. These were then ingested into Excel templates to be uploaded to a central system and made available to the Test and Trace team and government. Although today’s Excel spreadsheets can handle 1,048,576 rows and 16,384 columns, developers at PHE used an older Excel file format (XLS instead of XLSX), which meant each template could store only 65,536 rows of data (around 1,400 cases, since each case took up several rows). When the limit was reached, any further cases were left off the template and therefore positive cases of coronavirus were missed in the daily reporting.
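To make the arithmetic concrete, here is a minimal Python sketch of the kind of guard that could sit in front of such a template. It is illustrative only and not PHE's actual process: the function and exception names are hypothetical. The two row limits are those of the XLS and XLSX worksheet formats, and the 1,400 figure is the one quoted above.

```python
XLS_MAX_ROWS = 65_536      # row limit of a worksheet in the legacy .xls format
XLSX_MAX_ROWS = 1_048_576  # row limit of a worksheet in the newer .xlsx format


class RowLimitExceeded(Exception):
    """Raised when the data would not fit in a single worksheet."""


def check_fits(n_rows: int, max_rows: int = XLS_MAX_ROWS) -> int:
    """Return the number of spare rows, or raise rather than silently truncate."""
    if n_rows > max_rows:
        raise RowLimitExceeded(
            f"{n_rows} rows exceed the {max_rows}-row limit; "
            f"{n_rows - max_rows} rows would be dropped"
        )
    return max_rows - n_rows


# The article's figures imply many rows per case: 65,536 rows / ~1,400 cases.
rows_per_case = XLS_MAX_ROWS // 1_400  # integer division, roughly 46 rows
```

The point of the sketch is the failure mode: a check like this stops the process with an explicit error when the limit is reached, whereas the templates simply left the extra rows off.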

The bigger issue is that, in light of the data-driven and technologically advanced age in which we live, a system based on shipping around Excel templates was even deemed suitable in the first place. Data engineers have for a long time been supporting businesses with managing, transforming and serving up data, and developing methods for building efficient, robust and accurate data pipelines. Data professionals have also developed approaches to information governance, including assessing data quality and developing appropriate security protocols.

For this kind of custom application there are plenty of data management technologies that could have been used, ranging from on-site to cloud-based solutions that can scale and provide managed data storage for subsequent reporting and analysis. The Public Health England developers no doubt had some reason to transform the text files into Excel templates, presumably to fit with legacy IT systems. But avoiding Excel altogether and shipping the data from source (with appropriate cleaning and checks) into the system would have been better and would have reduced the number of steps in the pipeline.
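As a rough illustration of what shipping the data from source with checks might look like, the sketch below loads comma-separated results straight into a database table and verifies that every row arrived. Everything in it is an assumption made for the example: the column names test_id and result are hypothetical, and SQLite (via Python's standard library) merely stands in for whatever central system is actually in use.

```python
import csv
import io
import sqlite3


def load_results(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load lab results from CSV text into a table and verify the row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "test_id TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO results (test_id, result) VALUES (:test_id, :result)",
            rows,
        )
        after = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
        # Fail loudly if any row went missing between the file and the table.
        if after - before != len(rows):
            raise RuntimeError(
                f"expected {len(rows)} new rows, found {after - before}"
            )
    return len(rows)


# Hypothetical sample data, used only to exercise the function.
sample = "test_id,result\nA1,positive\nA2,negative\nA3,positive\n"
conn = sqlite3.connect(":memory:")
loaded = load_results(sample, conn)
```

There is no row ceiling to hit here, and the count check turns a silent loss of data into an immediate error, which is the property the spreadsheet step lacked.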

The blame game

Despite the benefits and widespread use of Excel, it is not always the right tool for the job, especially for a data-driven system with such an important function. You can’t accurately report, model or make decisions on inaccurate or poor quality data.

During this pandemic we are all on a journey of discovery. Rather than point the finger and play the blame game, we need to reflect and learn from our mistakes. From this incident, we need to work on getting the basics right – and that includes robust data management. Perhaps rather concerning are reports that Public Health England is now breaking the lab data into smaller batches to create a larger number of Excel templates. This seems a poor fix and doesn’t really get to the root of the problem – the need for a robust data management infrastructure.

It is also remarkable how quickly technology or the algorithm is blamed (especially by politicians), but herein lies another fundamental issue – accountability and taking responsibility. In the face of a pandemic we need to work together, take responsibility, and handle data appropriately.

Paul Clough, Professor in Search & Analytics, University of Sheffield

This article is republished from The Conversation under a Creative Commons license. Read the original article.





