Incident: Organ Transplant System Failure Due to Outdated Technology and Lack of Oversight

Published Date: 2022-07-31

Postmortem Analysis
Timeline
1. The outage cited in the article, a crash lasting about three hours, occurred in February 2021 [130083].

System
1. The system for getting donated kidneys, livers, and hearts to patients, which relies on out-of-date technology that has crashed for hours at a time [130083].
2. The technology powering the United Network for Organ Sharing (UNOS) transplant system, including aged software, periodic system failures, mistakes in programming, and overreliance on manual data input [130083].

Responsible Organization
1. The United Network for Organ Sharing (UNOS), which operates the transplant system, was responsible for the software failure incident [130083].

Impacted Organization
1. The United Network for Organ Sharing (UNOS) [130083]
2. Health Resources and Services Administration (HRSA) [130083]

Software Causes
1. Aged software, periodic system failures, mistakes in programming, and overreliance on manual input of data were identified as software causes of the failure incident in the transplant system [130083].

Non-software Causes
1. Out-of-date technology and aged software [130083]
2. Overreliance on manual input of data [130083]
3. Lack of oversight and regulation by federal officials [130083]
4. Resistance to reform and modernization by UNOS [130083]
5. Inadequate organizational structure and priorities in technology development [130083]

Impacts
1. The system crashed for hours at a time; the critical computers connecting the transplant network have crashed for a total of 17 days since 1999, affecting the timely delivery of organs to patients in need [130083].
2. Manual data entry requirements could lead to mistakes and narrowed the timing window for successful organ matches, reducing the efficiency and accuracy of the organ allocation process [130083].
3. The outdated technology and clunky organizational structure of the software delayed policy changes: even a single change in priority policy took a full year to be reflected in the code, hindering the system's adaptability and responsiveness [130083].
4. The reliance on manual data input and outdated technology made it difficult to coordinate transportation for organs, as highlighted by a case in which delays in transporting an organ resulted in a patient's death [130083].
5. The lack of modernization and automation hindered efforts to improve patient care and optimize organ allocation, reducing the overall effectiveness and efficiency of the transplant system [130083].

Preventions
1. Regular security audits and assessments by federal officials to identify and address security weaknesses in the software system [130083].
2. Implementing a more modern and automated data entry system to reduce manual errors and improve efficiency [130083].
3. Breaking up the monopoly held by UNOS and separating the contract for the technology that powers the network from UNOS's policy responsibilities, to encourage modernization and improvement of processes and technology [130083].
4. Enforcing cybersecurity requirements on UNOS and conducting security assessments to protect the system from cyber-attacks [130083].
5. Transitioning to cloud computing systems to reduce system lags and downtime and to improve automated access and computing power [130083].

Fixes
1. Overhauling the mechanics of the entire transplant system, including addressing aged software, periodic system failures, mistakes in programming, and overreliance on manual data input [130083].
2. Breaking up the current monopoly held by UNOS and separating the contract for the technology that powers the network from UNOS's policy responsibilities [130083].
3. Implementing a state-of-the-art information technology infrastructure for the Organ Procurement and Transplantation Network to optimize the use of new technologies and improve oversight [130083].
4. Transitioning to cloud computing systems to reduce system lags and downtime, allow greater automated access, and support machine learning [130083].
5. Moving the system away from manual data entry to automated processes to reduce errors and improve efficiency [130083].

References
1. The White House’s U.S. Digital Service [130083]
2. Health Resources and Services Administration (HRSA) [130083]
3. United Network for Organ Sharing (UNOS) [130083]
4. Senate Finance Committee [130083]
5. Department of Homeland Security [130083]
6. Intelligence agencies [130083]
7. Federal Chief Information Officer Clare Martorana [130083]
8. Office of Management and Budget (OMB) [130083]
9. National Academies of Sciences, Engineering and Medicine [130083]
10. Department of Health and Human Services (HHS) [130083]
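
To put the reported downtime in perspective, the figure of 17 days of crashes since 1999 can be turned into an average availability. The sketch below is illustrative only: the measurement window (1 January 1999 to the record's published date, 2022-07-31) is an assumption, since the article's exact start and end dates are not given here, and the function name is hypothetical.

```python
from datetime import date


def availability(downtime_days: float, start: date, end: date) -> float:
    """Return the fraction of the period during which the system was up."""
    total_days = (end - start).days
    if total_days <= 0:
        raise ValueError("end must be after start")
    return 1.0 - downtime_days / total_days


# Assumed window: start of 1999 to this record's published date.
START = date(1999, 1, 1)
END = date(2022, 7, 31)
DOWNTIME_DAYS = 17  # total crash time reported in the article

uptime = availability(DOWNTIME_DAYS, START, END)
print(f"Average availability over the assumed window: {uptime:.2%}")
```

Under these assumptions the average comes out at roughly 99.8%. An average like this says nothing about how long individual outages lasted; the article reports crashes of hours at a time, including one of about three hours in February 2021.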

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization (a) one_organization: The outdated technology and system crashes have reportedly recurred multiple times within the United Network for Organ Sharing (UNOS), the nonprofit agency that operates the transplant system. The system has experienced periodic failures, mistakes in programming, and overreliance on manual data input, leading to the conclusion that the mechanics of the entire system need to be overhauled [130083]. (b) multiple_organization: The article does not mention similar failures at other organizations or in their products and services; it focuses on the organ transplant system operated by UNOS.
Phase (Design/Operation) design, operation (a) The software failure incident related to the design phase is evident in the outdated technology and aged software used in the transplant system, as highlighted in the confidential government review obtained by The Washington Post [130083]. The review pointed out mistakes in programming, periodic system failures, and overreliance on manual data input, indicating design flaws that have contributed to system crashes and inefficiencies. (b) The software failure incident related to the operation phase is reflected in the reliance on manual data entry, which can lead to mistakes and narrow the timing window for successful organ matches. The need for hospitals to modernize to automate data entry also suggests operational challenges impacting the efficiency of the system [130083].
Boundary (Internal/External) within_system, outside_system The software failure incident related to the transplant system for organs can be categorized as both within_system and outside_system. (a) within_system: The failure was attributed to factors originating from within the system itself, such as aged software, periodic system failures, mistakes in programming, and overreliance on manual input of data [130083]. (b) outside_system: The failure was also influenced by factors originating from outside the system, such as the lack of federal audits for security weaknesses, the outdated technology used, and the resistance from the United Network for Organ Sharing (UNOS) to modernize the operations of the system [130083].
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident occurring due to non-human actions: The software failure in the organ transplant system was attributed to aged software, periodic system failures, mistakes in programming, and overreliance on manual input of data. The system experienced crashes for hours at a time, with one outage lasting about three hours in February 2021. Additionally, a firewall failure was blamed for one of the crashes. These issues point to failures caused by technical deficiencies and system vulnerabilities rather than direct human actions [130083]. (b) The software failure incident occurring due to human actions: Human actions were also implicated in the software failure incident. The report highlighted that UNOS had allowed a programming error to push some lung patients lower on the priority list than they should have been, which was eventually caught by a different federal contractor analyzing patient data. This indicates that human errors in programming and data management contributed to the failure incident [130083].
Dimension (Hardware/Software) software (a) The articles do not provide information about a software failure incident occurring due to contributing factors originating in hardware. (b) The software failure incident reported in the articles is primarily due to contributing factors that originate in software. The system for getting donated organs to patients experienced crashes for hours at a time due to aged software, periodic system failures, mistakes in programming, and overreliance on manual input of data [130083]. The software system was described as clunky, requiring manual data entry that could lead to mistakes or narrow the timing window for successful organ matches. Additionally, the software's organizational structure was criticized for being slow to reflect policy changes, taking up to a year for a single change in priority policy to be implemented in the code [130083].
Objective (Malicious/Non-malicious) non-malicious (a) The article gives no indication that the software failure incident was malicious, i.e., caused by factors introduced by humans with the intent to harm the system [130083]. (b) The software failure incident is non-malicious in nature. It is attributed to out-of-date technology, aged software, periodic system failures, mistakes in programming, and overreliance on manual data input and human intervention for critical processes within the transplant system [130083].
Intent (Poor/Accidental Decisions) poor_decisions, accidental_decisions The software failure incident related to the transplant system's technology issues can be attributed to both poor decisions and accidental decisions: (a) poor_decisions: The failure can be linked to poor decisions such as overreliance on manual data input, aged software, mistakes in programming, and the lack of modernization efforts despite complaints from transplant doctors about outdated technology [130083]. (b) accidental_decisions: The incident also involved accidental decisions or unintended consequences, as highlighted by the reliance on outdated technology, system failures, and the lack of proper oversight and regulation by federal agencies, leading to vulnerabilities in the system [130083].
Capability (Incompetence/Accidental) development_incompetence, accidental (a) The software failure incident related to development incompetence is evident in the article as it highlights various issues with the outdated technology and processes in the organ transplant system. The system relies on aged software, periodic system failures, mistakes in programming, and overreliance on manual input of data [130083]. The report from the U.S. Digital Service recommended breaking up the current monopoly held by the United Network for Organ Sharing (UNOS) due to the lack of incentives for modernizing operations and improving technology [130083]. Additionally, the article mentions that UNOS considers its code to be a trade secret and would require $55 million if the contract were to be given to someone else, indicating a lack of transparency and potential barriers to technological advancement [130083]. (b) The software failure incident related to accidental factors is also apparent in the article. For example, it mentions a three-hour system crash in February 2021 attributed to a firewall failure, which could be considered an accidental technical glitch [130083]. Furthermore, the article discusses a case where a programming error pushed some lung patients lower on the priority list, indicating accidental mistakes in the system that affected patient care [130083]. These incidents highlight how accidental factors can contribute to software failures in critical systems like the organ transplant network.
Duration temporary The software failure incident related to the transplant system for organs was temporary. The system experienced crashes for hours at a time, with one specific outage lasting about three hours in February 2021. The three-hour crash was attributed to a firewall failure [130083].
Behaviour crash, omission, other (a) crash: The critical computers connecting the transplant network crashed for a total of 17 days since 1999, with one outage lasting about three hours in February 2021 [130083]. (b) omission: After an organ was offered, difficulties in getting the surgeons to a very sick woman on life support in the ICU led to delays and ultimately to the woman's death [130083]. (c) timing: The article does not describe failures in which the system performed its intended functions correctly but too late or too early. (d) value: The article does not describe failures in which the system performed its intended functions incorrectly. (e) byzantine: The article does not describe failures in which the system behaved erroneously with inconsistent responses and interactions. (f) other: Manual data entry could lead to mistakes or narrow the timing window for successful organ matches, and the clunky organizational structure of the software meant that a change in priority policy took a full year to be reflected in the code [130083].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence death, harm, delay (a) death: People lost their lives due to the software failure - The article mentions a case where difficulties in getting surgeons to a patient and transporting a liver resulted in the death of a very sick woman in the ICU [130083]. (b) harm: People were physically harmed due to the software failure - The incident where a sick woman in the ICU died due to difficulties in getting the surgeons and the liver in time can be considered as physical harm resulting from the software failure [130083]. (e) delay: People had to postpone an activity due to the software failure - The delays in getting organs to the right place quickly due to outdated technology and manual data entry processes were highlighted in the article, impacting the timely delivery of organs for transplantation [130083].
Domain health The failed system belongs to the health industry: it is the system responsible for getting donated kidneys, livers, and hearts to desperately ill patients awaiting organ transplants [130083].
