Incident: IT System Failure at Guy's and St Thomas' Hospital Amid Heatwave

Published Date: 2022-07-26

Postmortem Analysis
Timeline:
1. The software failure incident at Guy's and St Thomas' Hospitals in south London occurred during the heatwave on 19 July 2022 [130057].

System:
1. IT servers at Guy's and St Thomas' Hospital [130057]
2. IT infrastructure of Guy's and St Thomas' NHS Foundation Trust [130057]

Responsible Organization:
1. Guy's and St Thomas' NHS Foundation Trust, where a whistleblower blamed chronic underfunding and poor planning for the failure [130057]

Impacted Organization:
1. Patients at Guy's and St Thomas' Hospitals in south London, who faced delays in receiving important medical results and reported effects on their mental health [130057]

Software Causes:
1. IT servers breaking down during the heatwave, which a whistleblower attributed to chronic underfunding and poor planning [130057]
2. Outdated IT infrastructure that, according to an internal report, had reached the end of its life [130057]

Non-software Causes:
1. Poor planning and chronic underfunding at Guy's and St Thomas' Hospitals [130057]

Impacts:
1. Operations at Guy's and St Thomas' Hospitals were cancelled after the IT servers broke down in high temperatures, creating a backlog in patient care and potentially compromising patient safety [130057].
2. Staff had to resort to paper notes while the IT system was down, which led to difficulty tracking patients, misspelt names, and misplaced test results [130057].
3. The labs experienced significant delays in providing urgent blood results, which hampered the interpretation of time-sensitive information [130057].
4. Patients faced delays in receiving critical medical information, such as biopsy results, leading to increased anxiety and mental health concerns [130057].
5. The hospital was unable to access patient contact information needed for communications such as appointment cancellations [130057].

Preventions:
1. Regular maintenance and timely upgrades of the IT infrastructure could have prevented the software failure incident at Guy's and St Thomas' Hospital [130057].
2. Adequate funding and resources for the IT systems would have addressed the chronic underfunding that contributed to the failure [130057].
3. Improved disaster recovery and backup systems could have reduced the risk of losing critical data as a result of the incident [130057].
4. Better planning and preparedness for extreme weather, such as the heatwave described in the article, could have helped prevent the IT servers from breaking down [130057].

Fixes:
1. Upgrading the IT infrastructure at Guy's and St Thomas' Hospital, which an internal report earlier in 2022 said had reached the end of its life [130057].
2. Improving planning and addressing chronic underfunding to prevent similar issues in the future [130057].
3. Ensuring proper backup and recovery mechanisms for shared drives to prevent loss of research data [130057].
4. Resolving the backlog and communication issues with the labs so that time-sensitive results can be interpreted in time [130057].
5. Addressing the mental health impact on patients by improving communication and ensuring timely delivery of results [130057].

References:
1. Whistleblower at Guy's and St Thomas' Hospital [130057]
2. Doctor at Guy's and St Thomas' NHS Foundation Trust [130057]
3. Second doctor, Jamie Wallis, a GP in Waterloo [130057]
4. Patients, including Claire and Stephanie Andrews [130057]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) The software failure incident at Guy's and St Thomas' Hospital in London appears to follow earlier IT problems within the organization. An internal report earlier in the year stated that the hospital's IT infrastructure "had reached the end of its life" [130057], which suggests that weaknesses in the IT systems were known before the incident. (b) The whistleblower noted that other, less well-resourced district hospitals were functioning fine despite the challenges faced by Guy's and St Thomas' [130057]. The article does not describe comparable failures at other healthcare organizations, so the evidence for recurrence across multiple organizations is limited.
Phase (Design/Operation) design, operation (a) The software failure incident at Guy's and St Thomas' Hospitals was attributed to poor planning and chronic underfunding, indicating issues related to the design phase of the IT system [130057]. An internal report earlier in the year had stated that the hospital's IT infrastructure had reached the end of its life, suggesting a lack of proper design and maintenance of the system. (b) The operation phase also played a significant role in the software failure incident. The breakdown of IT servers during the heatwave led to operations being cancelled, and staff had to resort to paper notes because the IT system was down [130057]. Patients reported delays in receiving critical test results and information, which affected their mental health and well-being. The inability to access necessary data and communicate effectively with patients highlighted the operational challenges caused by the software failure.
Boundary (Internal/External) within_system (a) The software failure incident at Guy's and St Thomas' Hospitals was primarily within the system. The failure was attributed to poor planning, chronic underfunding, and the hospital's outdated IT infrastructure [130057]. The internal report stated that the IT infrastructure had reached the end of its life, indicating that the root cause of the failure lay within the hospital's own systems and management. The effects of the failure, such as the reliance on paper notes, misspelt names, and delays in accessing critical patient data, all followed from the breakdown of the hospital's own IT servers during the heatwave.
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident at Guy's and St Thomas' Hospitals was triggered by non-human actions. The IT servers stopped working during a heatwave in which temperatures reached 40C (104F) on 19 July [130057]. The immediate cause was therefore environmental conditions acting on ageing hardware rather than a direct human action. (b) However, human actions also played a role in the software failure incident. The whistleblower attributed the failure to poor planning and chronic underfunding, and said that the hospital had not been upfront about the IT problems, indicating a lack of transparency and potentially poor communication from management [130057]. Additionally, the article highlighted misspelt names, misplaced patient results, and the inability to access critical patient data, which could be attributed to human errors in data entry or system management.
Dimension (Hardware/Software) hardware, software (a) The software failure incident at Guy's and St Thomas' Hospitals was primarily attributed to hardware issues. The IT servers stopped working during a heatwave, with temperatures reaching 40C (104F) on 19 July. An internal report earlier in the year had highlighted that the hospital's IT infrastructure "had reached the end of its life" [130057]. (b) The software failure incident also had software-related contributing factors. The breakdown of the IT servers led to a situation where staff had to resort to paper notes as the IT system was not functioning. This caused issues such as misspelling of names, scans not showing up, difficulty in tracking patients, and delays in accessing critical test results. The broken IT system affected various aspects of patient care and hospital operations, indicating software-related challenges in data management and system functionality [130057].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident at Guy's and St Thomas' Hospitals in London was non-malicious. The failure was primarily attributed to poor planning, chronic underfunding, and an outdated IT infrastructure that had reached the end of its life [130057]. The incident occurred during a heatwave, causing IT servers to stop working, leading to the cancellation of operations and impacting various aspects of patient care and hospital operations. The whistleblower highlighted issues such as staff reverting to paper notes, misspelling of names, difficulty in tracking patients, delays in accessing test results, and potential loss of research data. Patients also reported delays in receiving critical medical information, affecting their mental health. The hospital's spokesperson acknowledged the ongoing impact of the IT problems on services, including the postponement of operations and appointments.
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident at Guy's and St Thomas' Hospitals in London seems to have been influenced by poor decisions. A whistleblower mentioned that "poor planning" and "chronic underfunding" were contributing factors to the IT issues that arose during the heatwave [130057]. Additionally, an internal report earlier in the year highlighted that the hospital's IT infrastructure "had reached the end of its life," indicating a lack of proactive decision-making regarding IT system upgrades or replacements. These instances suggest that poor decisions played a role in the software failure incident.
Capability (Incompetence/Accidental) development_incompetence, accidental (a) The software failure incident at Guy's and St Thomas' Hospitals was attributed to "poor planning" and "chronic underfunding" according to a whistleblower who works as a doctor at the hospital [130057]. The incident was exacerbated by the fact that the hospital's IT infrastructure was reported to have "reached the end of its life" earlier in the year, indicating a lack of proper maintenance and upgrades [130057]. (b) The software failure incident at the hospital was worsened by the extreme heatwave conditions, with IT servers breaking down in temperatures of 40C (104F) [130057]. This accidental factor of the heatwave impacting the IT system's functionality contributed to the overall failure experienced by the hospital.
Duration temporary The software failure incident at Guy's and St Thomas' Hospital was temporary rather than permanent. It arose from specific circumstances, particularly the extreme heatwave that caused the IT servers to break down. At the time of the article the IT issues were still ongoing and affecting services, which suggests that efforts were under way to address and resolve the problem [130057].
Behaviour omission, other (a) crash: The software failure incident at Guy's and St Thomas' Hospital resulted in operations being cancelled after IT servers broke down in high temperatures, leading the hospital to revert to paper notes and causing problems with tracking patients, misspelt names, and misplaced test results [130057]. (b) omission: The broken IT system at the hospital caused delays in providing test results to patients, with some patients not being called for appointments and individuals like Claire experiencing anxiety and mental health issues due to the lack of communication and information about their health status [130057]. (c) timing: The software failure incident led to delays in accessing critical information such as urgent blood results, causing a backlog and making time-sensitive results impossible to interpret, which affected patient care and treatment timelines [130057]. (d) value: The malfunctioning IT system left the hospital struggling to see patients and affected the ability to request routine blood tests or urine cultures, which are essential for diagnosing and monitoring patients' health conditions [130057]. (e) byzantine: The labs at the hospital faced challenges due to the software failure, with staff spending hours trying to reach the lab for urgent blood results, encountering backlogs, and having difficulty interpreting time-sensitive results, leading to inconsistencies and delays in patient care [130057]. (f) other: The software failure incident also put months or even years of research data at risk of loss if shared drives were not recovered, highlighting the potential long-term consequences for data integrity and research activities [130057].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence harm, delay The consequences of the software failure incident described in the article are primarily harm and delays to patients: (b) harm: Patients were harmed or put at risk by the software failure incident. For instance, Stephanie Andrews had to wait nearly a month to be told that she was cancer-free after surgery, highlighting the stress caused by delayed communication and access to information [130057]. (e) delay: Patients experienced delays in receiving critical medical results and information. For example, Claire described waiting a week for biopsy results without any response, which caused distress and anxiety [130057].
Domain health The software failure incident reported in Article 130057 is related to the health industry. The incident occurred at Guy's and St Thomas' Hospitals in south London and affected many aspects of patient care and hospital operations [130057]. The malfunctioning IT system led to cancelled operations, delays in accessing patient information, misplaced test results, an inability to interpret time-sensitive results, and difficulties in communicating with patients about their health status [130057]. Patients expressed concerns about the impact on their mental health of delays in receiving critical medical information, such as biopsy results and cancer diagnosis updates [130057]. The incident highlights the critical role of IT systems in supporting healthcare services and the potential consequences of software failures in the health industry.
