Incident: Centralized NHS Contact-Tracing App Failure due to Incompatibility with Apple System

Published Date: 2020-05-06

Postmortem Analysis
Timeline:
1. The software failure incident regarding the NHS's contact-tracing app happened in May 2020 [Article 99691].
2. The government's decision to abandon the centralised coronavirus contact-tracing app and switch to an alternative designed by Apple and Google was announced in June 2020 [Article 100983].
System:
1. NHS's centralised contact-tracing app system [99691]
2. Apple's iPhone operating system design [100983]
Responsible Organization:
1. NHSX, the digital arm of the health service, for relying on an "Android herd immunity" strategy and using workarounds that were deemed fragile and disruptive [99691].
2. The UK government, specifically the Department of Health and Social Care, for insisting on a centralised version of the contact-tracing app that was not supported by Apple and Google, leading to the failure of the app [100983].
Impacted Organization:
1. The UK government [Article 100983]
2. NHSX, the digital arm of the health service [Article 99691]
Software Causes:
1. The NHS chose to build a centralised contact-tracing app, which limited the app's functionality on iOS devices because of restrictions imposed by Apple's operating system [99691].
2. The centralised design required data to be sent back to the health service for analysis, so the app could not use the decentralised contact-tracing tools provided by Apple and Google [99691].
3. The app could not trace contacts effectively on Apple iPhones: the iPhone operating system puts apps to sleep quickly when they are not in use, and they cannot then be activated by Bluetooth. As a result, the app recognized only 4% of Apple phones during testing on the Isle of Wight [100983].
4. The app could not measure the distance between devices accurately, a critical aspect of contact tracing. Because it could not distinguish between phones at different distances, it was less effective at identifying close contacts [100983].
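The difference between the two designs named in the software causes can be illustrated with a minimal sketch. It is not the NHS app's code or the Apple-Google protocol: the `Phone` class, `rolling_ids`, `centralised_contacts`, and `decentralised_exposed` are hypothetical names, and the HMAC-based identifier derivation is a simplified stand-in for the real cryptography. The only point it makes is where the matching happens: on a central server that can resolve identifiers to people, or on each phone against published keys.

```python
from __future__ import annotations

import hashlib
import hmac
import os


def rolling_ids(daily_key: bytes, intervals: int = 4) -> list[bytes]:
    """Derive short-lived broadcast identifiers from a daily key (simplified stand-in)."""
    return [
        hmac.new(daily_key, i.to_bytes(2, "big"), hashlib.sha256).digest()[:16]
        for i in range(intervals)
    ]


class Phone:
    """Hypothetical handset: broadcasts rolling identifiers and records those it hears."""

    def __init__(self) -> None:
        self.daily_key = os.urandom(16)
        self.heard: set[bytes] = set()

    def broadcast(self, interval: int) -> bytes:
        return rolling_ids(self.daily_key)[interval]

    def hear(self, identifier: bytes) -> None:
        self.heard.add(identifier)


def centralised_contacts(uploaded_log: set[bytes], registry: dict[bytes, str]) -> set[str]:
    """Centralised model: the server resolves an uploaded contact log against its own registry."""
    return {registry[i] for i in uploaded_log if i in registry}


def decentralised_exposed(phone: Phone, published_keys: list[bytes]) -> bool:
    """Decentralised model: the phone re-derives identifiers from published keys and matches locally."""
    return any(rid in phone.heard for key in published_keys for rid in rolling_ids(key))


# Alice and Bob meet; Carol meets nobody.
phones = {"alice": Phone(), "bob": Phone(), "carol": Phone()}
alice, bob, carol = phones["alice"], phones["bob"], phones["carol"]
bob.hear(alice.broadcast(0))
alice.hear(bob.broadcast(0))

# Centralised: the server knows which identifier belongs to whom.
registry = {rid: name for name, p in phones.items() for rid in rolling_ids(p.daily_key)}
notified_by_server = centralised_contacts(alice.heard, registry)

# Decentralised: only Alice's key is published; each phone checks its own log.
bob_exposed = decentralised_exposed(bob, [alice.daily_key])
carol_exposed = decentralised_exposed(carol, [alice.daily_key])
```

In the centralised variant the server needs the full identifier registry and the uploaded log, which is the data flow the Apple and Google tools did not support. In the decentralised variant the server only relays keys.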
Non-software Causes:
1. Lack of a critical mass of Android phone users signing up for the contact-tracing app, leading to potential failure [99691].
2. Reliance on a centralised database approach for contact tracing, which was not supported by the tools Apple and Google built for privacy-first decentralised contact tracing [99691].
3. Inability of the NHS app to recognize a significant percentage of devices during testing (it recognized 4% of Apple phones and 75% of Google Android devices), with the iPhone shortfall attributed to the design of Apple's iPhone operating system [100983].
4. Delay in switching to an alternative contact-tracing app designed by Apple and Google, resulting in wasted time and resources on the centralised app [100983].
Impacts:
1. The UK government abandoned its centralised coronavirus contact-tracing app after spending three months and millions of pounds on technology that experts had repeatedly warned would not work [Article 100983].
2. The NHS app's failure to recognize a significant portion of Apple phones, and to measure distance accurately on Google Android devices, during testing on the Isle of Wight reduced its effectiveness in contact tracing [Article 100983].
3. The centralised NHS app's failure to meet the necessary standards led to an embarrassing U-turn by the government and a shift to an alternative app designed by Apple and Google, which was months away from being ready [Article 100983].
4. The delays and technical issues with the NHS app wasted precious time and public money, and the government was criticized for sticking to a flawed approach despite warnings from experts [Article 100983].
5. The failure of the centralised NHS app highlighted the importance of adopting a decentralised approach in line with the tools provided by Apple and Google for effective contact tracing during the pandemic [Article 99691, Article 100983].
Preventions:
1. Using the decentralized approach for contact tracing apps: The failure of the centralized NHS contact-tracing app in Article #100983 could have been prevented by adopting the decentralized approach supported by Apple and Google, as it proved to be more effective and compatible with various devices [100983].
2. Conducting thorough testing and evaluation: Proper testing and evaluation of the app's functionality, especially in real-world scenarios, could have identified the limitations and shortcomings of the software before its deployment, potentially preventing the failure incident [100983].
3. Listening to experts' warnings and recommendations: The government could have prevented the software failure by heeding the warnings and advice of experts who had raised concerns about the centralised approach and technical issues with the app design [100983].
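As a rough illustration of the testing point, the sketch below checks per-platform detection rates against an acceptance threshold before a wider rollout. The 4% and 75% figures are the Isle of Wight results reported in Article 100983; the 90% threshold and the function names are assumptions made for this example, since the articles do not state what rate would have been acceptable.

```python
from __future__ import annotations

# Detection rates reported from the Isle of Wight trial [Article 100983].
TRIAL_DETECTION_RATES = {"Apple iPhone": 0.04, "Google Android": 0.75}

# Hypothetical acceptance threshold; the articles do not state one.
ASSUMED_MINIMUM_RATE = 0.90


def failing_platforms(rates: dict[str, float], minimum: float) -> list[str]:
    """Return the platforms whose detection rate falls below the minimum."""
    return sorted(name for name, rate in rates.items() if rate < minimum)


def release_ready(rates: dict[str, float], minimum: float) -> bool:
    """A release is ready only if no platform falls below the minimum."""
    return not failing_platforms(rates, minimum)


blocked = failing_platforms(TRIAL_DETECTION_RATES, ASSUMED_MINIMUM_RATE)
ready = release_ready(TRIAL_DETECTION_RATES, ASSUMED_MINIMUM_RATE)
```

Under the assumed threshold both platforms fail the gate, so `ready` is false. A check of this kind only helps if it runs on real-world trial data early enough to change the design decision.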
Fixes:
1. Switching to an alternative contact-tracing app designed by Apple and Google [Article 100983].
2. Implementing the Exposure Notification API developed by Apple and Google in consultation with public health experts [Article 100983].
References:
1. Experts who have examined the trial use of the NHS contact-tracing app on the Isle of Wight, including Michael Veale from UCL [Article 99691].
2. NHSX, the digital arm of the health service [Article 99691].
3. Oxford University experts [Article 99691].
4. French minister for digital technology, Cédric O [Article 99691].
5. Matt Hancock, the UK Health Secretary [Article 100983].
6. Silkie Carlo, the director of the privacy charity Big Brother Watch [Article 100983].
7. Apple and Google [Article 100983].
8. James Bethell, a junior health minister [Article 100983].
9. Jonathan Ashworth, the shadow health secretary [Article 100983].
10. Sal Brinton, the Liberal Democrat health spokesperson in the Lords [Article 100983].

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) The software failure incident having happened again at one_organization: - The UK government had to abandon a centralised coronavirus contact-tracing app after spending months and millions of pounds on technology that experts had warned would not work [Article 100983]. - The NHS's contact-tracing app faced challenges due to relying on a centralised database approach, which led to issues with iPhone functionality and effectiveness [Article 99691]. (b) The software failure incident having happened again at multiple_organization: - Other countries that have made similar decisions to the UK in building centralised contact-tracing apps have faced limitations and challenges with their apps as well, such as Singapore and Australia [Article 99691]. - Italy and Germany launched their own contact-tracing apps based on the Google-Apple model, indicating a shift towards a more decentralised approach in contrast to the centralised approach taken by the UK government [Article 100983].
Phase (Design/Operation) design, operation (a) The software failure incident related to the design phase can be observed in Article #100983, where the government had to abandon a centralised coronavirus contact-tracing app after spending months and millions of pounds on technology that experts had repeatedly warned would not work. The centralised version of the app, which held anonymised data in an NHS database for better tracing and data analysis, was not supported by Apple and Google, leading to its failure [100983]. (b) The software failure incident related to the operation phase is evident in Article #99691, where the NHS's contact-tracing app faced challenges in operation due to the design decisions made. The app had issues with iPhones going into a "listen-only" mode after a period of inactivity, affecting its contact-tracing function. This operational issue was exacerbated by the need for workarounds to keep iPhones registering contact events and ensuring the app remains effective in the background on both iOS and Android phones [99691].
Boundary (Internal/External) within_system, outside_system (a) within_system: The software failure incident related to the NHS's contact-tracing app can be attributed to factors within the system. The centralised design of the app, which required a critical mass of Android users to ensure iPhone compatibility, led to issues with the app's functionality on iOS devices. The workaround implemented by the NHSX to create a centralised database and the decision not to use the decentralized approach recommended by Apple and Google contributed to the failure [99691]. (b) outside_system: The failure of the centralised coronavirus contact-tracing app developed by the UK government was also influenced by factors outside the system. The app's reliance on a centralised version of technology not supported by Apple and Google, as well as the limitations imposed by Apple's operating system on app functionality, were external factors that impacted the app's effectiveness [100983].
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident occurring due to non-human actions: - The failure of the NHS contact-tracing app was attributed to the design of Apple's iPhone operating system, which caused apps to quickly go to sleep when not in use, preventing them from being activated by Bluetooth [Article 100983]. - The centralised version of the NHS app, which relied on anonymised data stored in an NHS database for tracing and data analysis, was not supported by Apple and Google, leading to compatibility issues [Article 100983]. (b) The software failure incident occurring due to human actions: - The UK government decided to pursue a centralised approach for the contact-tracing app despite warnings from experts that it would not work, leading to wasted time and money on a design that ultimately failed [Article 100983]. - Matt Hancock, the Health Secretary, had been enthusiastic about the centralised NHS app and had set ambitious rollout targets, but the government's insistence on this approach led to delays and eventual abandonment in favor of the Apple-Google alternative [Article 100983].
Dimension (Hardware/Software) hardware, software (a) The software failure incident occurring due to hardware: - The failure of the NHS contact-tracing app was partly attributed to how Apple iPhone devices behave under the design of the iPhone operating system. The app only recognized 4% of Apple phones during testing on the Isle of Wight because apps quickly go to sleep on iPhones when not in use and cannot be activated by Bluetooth [Article 100983]. (b) The software failure incident occurring due to software: - The failure of the NHS contact-tracing app was primarily attributed to the centralised design of the app, which did not align with the privacy-first "decentralised" approach recommended by Apple and Google. The centralised design limited the app's functionality and its compatibility with Apple and Google's tools for contact tracing [Article 99691].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident related to the NHS's contact-tracing app can be categorized as non-malicious. The failure was primarily due to the decision to build a centralised app that did not align with the tools provided by Apple and Google for contact tracing [99691]. The centralised approach led to technical challenges, such as the app not being able to work effectively on iPhones due to their operating system shutting down app functions in the background, and the app only recognizing a small percentage of Apple phones during testing [100983]. The failure was a result of design choices and technical limitations rather than any malicious intent to harm the system.
Intent (Poor/Accidental Decisions) poor_decisions, accidental_decisions (a) The intent of the software failure incident related to poor decisions can be seen in Article #100983, where the government decided to pursue a centralised coronavirus contact-tracing app despite experts warning that it would not work. The government insisted on using a centralised version of the technology, which was not supported by Apple and Google, leading to an embarrassing U-turn after spending months and millions of pounds on a system that ultimately failed [100983]. (b) The intent of the software failure incident related to accidental decisions can be observed in Article #99691, where the NHS's contact-tracing app faced challenges due to the decision to build a centralised app instead of adopting the privacy-first "decentralised" approach recommended by Apple and Google. This decision led to workarounds and limitations in the app's functionality, ultimately impacting its effectiveness [99691].
Capability (Incompetence/Accidental) development_incompetence (a) The software failure incident related to development incompetence is evident in Article #100983, where the government had to abandon a centralised coronavirus contact-tracing app after spending three months and millions of pounds on technology that experts had repeatedly warned would not work. Despite weeks of work, officials admitted that the NHS app only recognized 4% of Apple phones and 75% of Google Android devices during testing on the Isle of Wight. This failure was attributed to the design of Apple's iPhone operating system, which caused apps to quickly go to sleep when not in use, making it impossible to activate them via Bluetooth [100983]. (b) The software failure incident related to accidental factors is seen in Article #99691, where the NHS's contact-tracing app faced challenges due to the decision to build a centralised app, which led to the app not being able to use tools built by Apple and Google for contact tracing. This decision resulted in workarounds that were described as fragile, disruptive to users, and risking apps not registering contacts when they should. The workaround required Android 'herd immunity' to ensure iPhone owners remain covered by the app's contact-tracing ability, showcasing unintended consequences of the centralised approach [99691].
Duration temporary The software failure incident related to the NHS's contact-tracing app can be categorized as a temporary failure. The initial centralised app designed by the UK government faced issues with recognition on different devices, particularly Apple phones and Google Android devices. Despite weeks of work, the centralised app only recognized 4% of Apple phones and 75% of Google Android devices during testing on the Isle of Wight [Article 100983]. This failure was due to the design of Apple's iPhone operating system, which caused apps to quickly go to sleep when not in use and could not be activated by Bluetooth. The centralised approach was not supported by Apple and Google, leading to the need for an alternative solution [Article 100983]. Additionally, the government had to abandon the centralised app after spending three months and millions of pounds on technology that experts had warned would not work. The Health Secretary mentioned that the government would switch to an alternative app designed by Apple and Google, which was still months away from being ready [Article 100983]. This shift indicates a temporary failure in the initial approach, leading to the need for a different solution.
Behaviour crash, omission, other (a) crash: The software failure incident described in Article 100983 can be categorized as a crash. The centralised coronavirus contact-tracing app developed by the UK government failed to work as intended, recognizing only 4% of Apple phones and 75% of Google Android devices during testing on the Isle of Wight. This failure led to the abandonment of the app after spending months and millions of pounds on technology that experts had warned would not work [100983]. (b) omission: The software failure incident can also be categorized as an omission. The NHS app, which was intended to trace close contacts of individuals with coronavirus symptoms using Bluetooth connectivity, failed to recognize a significant portion of devices during testing, indicating an omission in performing its intended function [100983]. (c) timing: The software failure incident does not align with the timing category as there is no indication in the articles that the system performed its intended functions either too late or too early. (d) value: The software failure incident does not align with the value category as there is no indication in the articles that the system performed its intended functions incorrectly. (e) byzantine: The software failure incident does not align with the byzantine category as there is no indication in the articles that the system behaved erroneously with inconsistent responses and interactions. (f) other: The other behavior exhibited by the software failure incident is the reliance on a centralised approach for contact tracing, which led to limitations in the app's functionality and compatibility with Apple and Google systems. This decision resulted in the app not being able to perform its intended functions effectively, ultimately leading to its abandonment in favor of an alternative developed by Apple and Google [99691, 100983].

IoT System Layer

Layer          Option  Rationale
Perception     None    None
Communication  None    None
Application    None    None

Other Details

Category Option Rationale
Consequence property, delay, theoretical_consequence
(a) death: People lost their lives due to the software failure - No information in the articles suggests that people lost their lives due to the software failure incident. [99691, 100983]
(b) harm: People were physically harmed due to the software failure - No information in the articles suggests that people were physically harmed due to the software failure incident. [99691, 100983]
(c) basic: People's access to food or shelter was impacted because of the software failure - No information in the articles suggests that people's access to food or shelter was impacted due to the software failure incident. [99691, 100983]
(d) property: People's material goods, money, or data was impacted due to the software failure - The software failure incident resulted in wasted public money on a design that was predicted to fail, amounting to millions of pounds spent on technology that did not work. [100983]
(e) delay: People had to postpone an activity due to the software failure - The government had to abandon the centralised coronavirus contact-tracing app after spending three months on technology that experts warned would not work, resulting in a delay in the implementation of an effective app. [100983]
(f) non-human: Non-human entities were impacted due to the software failure - No information in the articles suggests that non-human entities were impacted due to the software failure incident. [99691, 100983]
(g) no_consequence: There were no real observed consequences of the software failure - The software failure incident had real observed consequences, such as wasted time, money, and effort on a design that ultimately failed to work effectively. [100983]
(h) theoretical_consequence: There were potential consequences discussed of the software failure that did not occur - The articles discuss potential consequences of the software failure, such as the inability to effectively trace contacts of individuals with coronavirus symptoms, which could have led to further spread of the disease. However, these potential consequences did not occur as the government decided to switch to an alternative app. [99691, 100983]
(i) other: Was there consequence(s) of the software failure not described in the (a to h) options? What is the other consequence(s)? - No other consequences of the software failure were described in the articles. [99691, 100983]
Domain health (a) The failed system was intended to support the health industry, specifically in the context of contact tracing for the coronavirus pandemic. The NHS's contact-tracing app was designed to track and notify individuals who may have come into contact with someone infected with COVID-19 [Article 99691], [Article 100983]. (j) The software failure incident was directly related to the health industry, specifically the NHS's efforts to implement a contact-tracing app to help combat the spread of COVID-19 [Article 99691], [Article 100983].
