Incident: Southwest Airlines Faces Flight Disruptions Due to Computer Issues

Published Date: 2021-06-15

Postmortem Analysis
Timeline 1. The software failure incident happened on June 14, 2021 [115486].
System 1. Computer reservation system 2. Network connectivity 3. Third-party weather data provider [115486]
Responsible Organization 1. Southwest Airlines, whose computer reservation issues and network connectivity problems led to the flight cancellations and delays [115486].
Impacted Organization 1. Southwest Airlines, which canceled about 500 flights and delayed hundreds of others [115486].
Software Causes 1. Intermittent performance issues with Southwest Airlines' network connectivity caused a computer reservation issue that forced the airline to temporarily halt operations [115486].
Non-software Causes 1. A computer reservation issue led to a temporary nationwide groundstop at the request of Southwest Airlines [115486]. 2. A third-party weather data provider experienced intermittent performance issues, which prevented the transmission of weather information necessary for safe aircraft operation [115486].
Impacts 1. Southwest Airlines had to cancel about 500 flights and delay hundreds of others due to a computer issue, leading to significant disruptions in their operations [115486]. 2. The Federal Aviation Administration issued a temporary nationwide groundstop at the request of Southwest Airlines, causing further delays and interruptions in air travel [115486]. 3. The computer reservation issue resulted in Southwest delaying nearly 1,300 flights, which accounted for 37% of its total flights on that day [115486].
Preventions To prevent the software failure incident experienced by Southwest Airlines, the following measures could have been taken: 1. Implementing robust network connectivity solutions to avoid intermittent performance issues that impact operations [115486]. 2. Conducting thorough testing and monitoring of third-party systems, like the weather data provider, to ensure their reliability and prevent disruptions in transmitting critical information [115486].
Fixes 1. Improving Southwest's network connectivity to eliminate the intermittent performance issues [115486]. 2. Ensuring the reliability and performance of the third-party weather data provider to prevent future interruptions in transmitting weather information [115486].
References 1. Southwest Airlines (LUV.N) [115486] 2. Federal Aviation Administration [115486] 3. FlightAware [115486]
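
The Preventions and Fixes entries above recommend monitoring third-party systems such as the weather data provider and hardening against intermittent connectivity. The article does not describe how Southwest's systems are built, so the following is only a minimal sketch of one common pattern: retrying a flaky call with exponential backoff and flagging the feed as degraded as soon as a retry is needed. Every name here (`FeedResult`, `fetch_with_retry`, the `fetch` callable, the retry limits) is hypothetical and chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FeedResult:
    payload: Optional[str]  # data from the provider, or None if every attempt failed
    attempts: int           # number of calls made to the provider
    degraded: bool          # True if the caller should raise an alert


def fetch_with_retry(
    fetch: Callable[[], str],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> FeedResult:
    """Call a hypothetical provider with exponential backoff; never raises ConnectionError."""
    for attempt in range(1, max_attempts + 1):
        try:
            payload = fetch()
            # Succeeding only after a retry still counts as degraded, so
            # intermittent issues become visible before a full outage.
            return FeedResult(payload, attempt, degraded=attempt > 1)
        except ConnectionError:
            if attempt < max_attempts:
                # Wait 0.5 s, 1 s, 2 s, ... between attempts (assumed values).
                sleep(base_delay_s * 2 ** (attempt - 1))
    # All attempts failed: return an empty result and let the caller decide.
    return FeedResult(None, max_attempts, degraded=True)
```

A caller would pass its own provider function as `fetch` and treat `degraded=True` as a signal to alert operations staff or switch to a backup source. Injecting `sleep` keeps the function testable without real delays.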

Software Taxonomy of Faults

Category | Option | Rationale
Recurring | one_organization | (a) one_organization: Southwest Airlines experienced a computer issue that forced it to temporarily halt operations for the second time in 24 hours [115486]. The first issue stemmed from intermittent performance issues at a third-party weather data provider; the second stemmed from intermittent performance issues with Southwest's network connectivity. (b) multiple_organization: The article gives no indication that the incident recurred at other organizations.
Phase (Design/Operation) | operation | (a) The incident was primarily related to the operation phase. Southwest Airlines experienced computer reservation issues and network connectivity problems that led to flight cancellations and delays. The groundstop was requested by Southwest Airlines to resolve a computer reservation issue, which the airline attributed to intermittent performance issues with its network connectivity. Separately, performance issues at a third-party weather data provider affected flight operations. These issues point to operational challenges rather than design-related failures [115486].
Boundary (Internal/External) | within_system | (a) The incident occurred within the system. Southwest Airlines experienced computer issues that led to the cancellation of about 500 flights and delays for hundreds more. The Federal Aviation Administration issued a temporary nationwide groundstop at the request of Southwest Airlines to resolve a computer reservation issue, indicating that the problem originated within the airline's own systems [115486].
Nature (Human/Non-human) | non-human_actions | (a) The incident was primarily attributed to non-human actions. The article cites "intermittent performance issues with our network connectivity" as the cause and separately reports that a "third-party weather data provider experienced intermittent performance issues" [115486]. Both point to technical or system-related issues rather than human actions as the root cause.
Dimension (Hardware/Software) | software | (a) The incident was attributed to software rather than hardware issues. Southwest Airlines experienced computer reservation issues and intermittent performance issues with network connectivity, leading to flight cancellations and delays [115486].
Objective (Malicious/Non-malicious) | non-malicious | (a) The incident was non-malicious. Southwest Airlines experienced computer issues that led to the cancellation of about 500 flights and delays for hundreds of others. The Federal Aviation Administration issued a temporary nationwide groundstop at the request of Southwest Airlines to resolve a computer reservation issue, which the airline described as "intermittent performance issues with our network connectivity." A separate issue, in which a third-party weather data provider experienced intermittent performance issues, also affected the airline's operations [115486].
Intent (Poor/Accidental Decisions) | accidental_decisions | (a) The flight cancellations and delays were not explicitly attributed to poor decisions. Instead, the article attributes them to "intermittent performance issues with our network connectivity" and to problems with a third-party weather data provider [115486].
Capability (Incompetence/Accidental) | accidental | (a) The article offers no specific evidence of development incompetence as the cause. The reported computer reservation and network connectivity problems appear to be technical glitches rather than the result of incompetence in development [115486]. (b) The incident aligns more closely with accidental factors. The intermittent performance problems with network connectivity and at the third-party weather data provider point to accidental technical glitches rather than deliberate actions or incompetence [115486].
Duration | temporary | The incident was temporary. Southwest Airlines experienced a computer issue that led to the cancellation of about 500 flights and delays for hundreds of others. The Federal Aviation Administration issued a temporary nationwide groundstop at the request of Southwest Airlines to resolve the computer reservation issue. The groundstop lasted about 45 minutes, indicating that the failure was temporary rather than permanent [115486].
Behaviour | crash, omission, other | (a) crash: The incident led to the cancellation of about 500 flights and the delay of hundreds more, and forced a temporary halt of operations. This may indicate a crash in which the system lost its state and could not perform its intended functions [115486]. (b) omission: Southwest Airlines had to stop flights because of a computer reservation issue and a separate problem with a third-party weather data provider. The systems therefore omitted their intended functions of transmitting weather information and managing reservations, causing delays and cancellations [115486]. (c) timing: The article does not mention any timing-related failure. (d) value: The article gives no indication that the system performed its intended functions incorrectly. (e) byzantine: The article does not suggest inconsistent or erroneous responses. (f) other: The incident could also be categorized as a network connectivity issue that caused intermittent performance problems and disrupted the airline's operations [115486].
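
The classification above can also be captured as a structured record, which makes it easy to query across incidents. The following is a minimal sketch, not the database's actual schema: the class and field names are hypothetical, while the option values are copied from the table for incident 115486.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List


class Behaviour(Enum):
    """The six behaviour options listed in the Behaviour row."""
    CRASH = "crash"
    OMISSION = "omission"
    TIMING = "timing"
    VALUE = "value"
    BYZANTINE = "byzantine"
    OTHER = "other"


@dataclass(frozen=True)
class FaultClassification:
    # Hypothetical field names mirroring the Category column.
    recurring: str
    phase: str
    boundary: str
    nature: str
    dimension: str
    objective: str
    intent: str
    capability: str
    duration: str
    behaviour: FrozenSet[Behaviour]


# Option values as recorded for incident 115486.
incident_115486 = FaultClassification(
    recurring="one_organization",
    phase="operation",
    boundary="within_system",
    nature="non-human_actions",
    dimension="software",
    objective="non-malicious",
    intent="accidental_decisions",
    capability="accidental",
    duration="temporary",
    behaviour=frozenset({Behaviour.CRASH, Behaviour.OMISSION, Behaviour.OTHER}),
)


def unobserved_behaviours(record: FaultClassification) -> List[str]:
    """Return the behaviour options that were not assigned, sorted by name."""
    return sorted(b.value for b in Behaviour if b not in record.behaviour)
```

For this record, `unobserved_behaviours(incident_115486)` lists the options the rationale rules out: byzantine, timing, and value.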

IoT System Layer

Layer | Option | Rationale
Perception | None | None
Communication | None | None
Application | None | None

Other Details

Category | Option | Rationale
Consequence | delay | The primary consequence was delays in flight operations. Southwest Airlines had to cancel about 500 flights and delay hundreds more because of a computer issue that forced the airline to temporarily halt operations. The Federal Aviation Administration issued a temporary nationwide groundstop to resolve the computer reservation issue, causing delays for passengers [115486].
Domain | transportation | (a) The failed system belongs to the transportation industry, specifically Southwest Airlines. The incident caused the airline to cancel about 500 flights and delay hundreds of others [115486].

