Incident: NHS Wales GP Appointment System Failure Impacts Thousands.

Published Date: 2021-11-11

Postmortem Analysis
Timeline
1. According to the article, the software failure incident happened on a Thursday.
2. The article was published on 2021-11-11, which was itself a Thursday.
3. Estimate: the incident occurred on Thursday, November 11, 2021 [120908].
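The weekday reasoning in the timeline can be checked with a minimal sketch. It assumes only the publication date given in this report and uses Python's standard `datetime` module; the variable names are illustrative. It confirms that the date is consistent with "Thursday", not that the incident necessarily occurred on that day.

```python
from datetime import date

# Publication date as stated in this report's metadata.
published = date(2021, 11, 11)

# date.weekday() returns 0 for Monday through 6 for Sunday, so Thursday is 3.
THURSDAY = 3
is_thursday = published.weekday() == THURSDAY

print(published.isoformat(), "falls on a Thursday:", is_thursday)
```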
System
1. VisionHealth computer system, used by about a third of GP surgeries across Wales [120908].
Responsible Organization
1. The incident is attributed to a power failure rather than to a named organization. The power failure affected the VisionHealth computer system, which is owned by NHS Wales and used by about 124 surgeries [120908].
Impacted Organization
1. Patients trying to book GP appointments
2. GP surgeries using the VisionHealth computer system
3. Institute of General Practice Management Wales
4. Digital Health and Care Wales
5. Practices affected by the downtime, including the surgery in Barry in the Vale of Glamorgan [120908]
Software Causes
1. Technical issues with the VisionHealth computer system, which is owned by NHS Wales and used by about 124 GP surgeries in Wales [120908]. This report identifies no more specific software fault.
Non-software Causes
1. A power failure affecting systems at about 124 surgeries that use a system owned by NHS Wales [120908].
Impacts
1. Tens of thousands of people were unable to book GP appointments [120908].
2. Practices could not manage appointments for patients or access medical records during the downtime [120908].
3. Some practices reverted to emergency-only systems for patient safety reasons, so planned appointments were canceled and rearranged [120908].
4. The downtime caused a significant backlog in general practices during an already busy time of year, with winter pressures and increased demand for GP appointments [120908].
5. Significant rework had to be managed over the following days to address the effects of the incident [120908].
Preventions
1. Implementing robust backup power systems or uninterruptible power supplies (UPS) to prevent disruptions due to power failures [120908].
2. Conducting regular system maintenance and updates to ensure the software is functioning properly and to address any potential vulnerabilities [120908].
3. Having contingency plans in place to quickly switch to manual systems or alternative methods of appointment management in case of software failures [120908].
Fixes
1. Implementing robust backup power systems to prevent disruptions in case of power failures [120908].
2. Conducting regular system maintenance and testing to identify and address potential technical issues before they escalate [120908].
3. Enhancing system redundancy and failover mechanisms to ensure continuity of service even during unexpected incidents [120908].
References
1. Gareth Thomas of the Institute of General Practice Management Wales [120908]

Software Taxonomy of Faults

Category Option Rationale
Recurring unknown The article does not say whether a similar incident has happened again, either at the same organization or at other organizations, so recurrence is unknown.
Phase (Design/Operation) design (a) design: The article attributes the incident to a power failure that affected the systems used by about 124 GP surgeries in Wales, preventing practices from managing appointments and accessing medical records. Some practices reverted to emergency-only systems for patient safety reasons, planned appointments were canceled and rearranged, and the resulting backlog required significant rework over the coming days. The incident is classified under the design phase on the reading that the system, as developed and provisioned, could not stay available through a power failure; the article does not state this explicitly [120908]. (b) operation: There is no indication in the article that the failure was caused by the operation or misuse of the system by its users [120908].
Boundary (Internal/External) within_system (a) within_system: The failure is attributed to a power failure that affected the systems used by about 124 GP surgeries in Wales. The fault is treated as originating within the system, and it prevented the affected practices from managing appointments and accessing medical records [120908].
Nature (Human/Non-human) non-human_actions (a) non-human_actions: The incident occurred because of a power failure affecting systems at about 124 GP surgeries using a system owned by NHS Wales, and the power failure itself was a non-human action. The resulting technical issues stopped practices from managing appointments and accessing medical records, led to planned appointments being canceled and rearranged, and left significant rework to be managed over the coming days [120908].
Dimension (Hardware/Software) hardware (a) The software failure incident in Article 120908 was due to a hardware issue. The article mentions that a power failure affected systems in about 124 surgeries using a system owned by NHS Wales, leading to technical issues and the inability to manage appointments and access medical records. This indicates that the root cause of the failure originated in hardware, specifically the power failure that impacted the systems [120908].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident described in the article does not indicate any malicious intent. It was primarily caused by a power failure that affected the systems used by GP surgeries in Wales, leading to technical issues and downtime in managing appointments and accessing medical records [120908].
Intent (Poor/Accidental Decisions) unknown (a) The article gives no specific information indicating that the failure resulted from poor decisions. It attributes the incident primarily to a power failure affecting systems, which led to technical issues in managing appointments and accessing medical records at GP surgeries across Wales. Whether any decision contributed to the failure is therefore unknown [120908].
Capability (Incompetence/Accidental) accidental (a) incompetence: The article does not attribute the incident to development incompetence. It focuses on a power failure affecting the systems used by GP surgeries in Wales, which led to technical issues and downtime in managing appointments and accessing medical records [120908]. (b) accidental: The incident is attributed to an accidental power failure that affected the systems used by about 124 GP surgeries in Wales. The article notes that the issue was resolved and that efforts were being made to bring all systems back online. The downtime significantly impacted the affected practices, leading to the cancellation and rearrangement of planned appointments [120908].
Duration temporary (a) The software failure incident described in the article was temporary. The issue was caused by a power failure that affected systems in about 124 GP surgeries using a system owned by NHS Wales. The article mentions that the problem was resolved, and efforts were being made to bring all systems back online. One practice manager mentioned that the downtime lasted for about five hours, impacting general practices and causing planned appointments to be cancelled and rearranged. The incident resulted in a backlog that would need to be managed over the coming days, indicating a temporary disruption rather than a permanent failure [120908].
Behaviour crash, omission, other (a) crash: The system went offline and could not perform its intended functions, leaving GP surgeries unable to manage appointments for patients or access medical records [120908]. (b) omission: The system omitted to perform its intended functions during the outage, as practices were not able to manage appointments for patients because of the technical issues caused by the power failure [120908]. (c) timing: The incident began at 08:00 GMT, during a busy time of year with winter pressures and increased demand for GP appointments, but the article does not describe the system performing its functions too early or too late, so there is no direct evidence of a timing failure [120908]. (d) value: The article does not mention the system performing its intended functions incorrectly, so there is no direct evidence of a value failure. (e) byzantine: The incident did not exhibit the inconsistent or erratic responses that would indicate a byzantine failure. (f) other: Practices reverted to emergency-only systems for patient safety reasons, and planned appointments were canceled and rearranged. This knock-on effect on the workflow of GP practices is categorized as an "other" behaviour [120908].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence delay The consequence of the software failure incident described in the article [120908] was primarily related to delays in accessing medical services. The software failure led to the inability of people to book GP appointments, manage appointments, and access medical records at about 124 surgeries using the NHS Wales system. This resulted in practices having to cancel and rearrange planned appointments, causing a backlog and significant rework that needed to be managed over the coming days. Additionally, some practices had to revert to emergency-only systems for patient safety reasons. Overall, the impact was on the efficiency and effectiveness of healthcare services rather than direct harm or loss of life.
Domain health (a) The software failure incident reported in Article 120908 was related to the healthcare industry. The system that failed was used by about 124 GP surgeries in Wales, impacting their ability to manage appointments for patients and access medical records [120908].
