Incident: Aerodata System Outage Grounds Flights Across Major Airlines

Published Date: 2019-04-01

Postmortem Analysis
Timeline 1. The software failure incident happened on Monday morning [82538]. The article was published on April 1, 2019, which was a Monday, so the incident is estimated to have occurred on the morning of April 1, 2019.
System 1. Aerodata system produced by a vendor for tracking a plane's weight and balance [82538]
Responsible Organization 1. The vendor that produces Aerodata, the system airlines use to track a plane's weight and balance [82538].
Impacted Organization 1. Several major airlines nationwide, including Southwest, Delta, JetBlue, and United [82538].
Software Causes 1. The software outage was caused by a problem with the Aerodata system, which is used by airlines to track a plane's weight and balance during flight planning [82538].
Non-software Causes 1. The article identifies no non-software cause; it attributes the outage to a problem with Aerodata, a vendor-produced system that tracks a plane's weight and balance [82538]. 2. The technical problems at Sabre, a company that airlines use for printing baggage tickets, check-in, and making reservations, occurred the previous week and were a separate incident rather than a cause of this outage [82538].
Impacts 1. Several major airlines nationwide, including Southwest, Delta, JetBlue, and United, grounded their planes because of an outage in the Aerodata system used for flight planning [82538]. 2. Southwest held an internal ground stop for about 40 minutes and anticipated scattered flight delays [82538]. 3. Delta reported that some connecting flights were delayed but expected no cancellations [82538]. 4. JetBlue also faced delays on some flights due to the Aerodata problem [82538]. 5. United said about 150 of its flights were affected; some had since departed, and the airline was working to get the rest back on schedule [82538]. 6. FlightAware.com reported around 100 late or canceled flights at Baltimore-Washington International, Dulles International, and Reagan National airports [82538].
Preventions 1. Regular software testing and quality assurance could have caught the fault in the Aerodata system before it disrupted operations [82538]. 2. Redundancy or a backup system for a critical component like Aerodata could have limited the impact of the outage by letting airlines keep planning flights while the primary system was down [82538]. 3. Closer communication and coordination between the Aerodata vendor and the airlines could have sped up resolution and reduced the disruption to flight operations [82538].
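The redundancy idea in prevention 2 can be sketched as a simple fallback chain. This is a minimal illustration only, not a description of how Aerodata or any airline system works: `LoadSheet`, `primary_provider`, `backup_provider`, `load_sheet_with_fallback`, the flight number, and the weight figure are all hypothetical names and values invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised by a provider that cannot answer right now."""


@dataclass(frozen=True)
class LoadSheet:
    flight: str
    takeoff_weight_kg: float
    source: str  # which provider produced the figures


Provider = Callable[[str], LoadSheet]


def primary_provider(flight: str) -> LoadSheet:
    # Hypothetical stand-in for the vendor service; it simulates an outage.
    raise ProviderUnavailable("primary weight-and-balance service is down")


def backup_provider(flight: str) -> LoadSheet:
    # Hypothetical backup source; the weight is a made-up placeholder.
    return LoadSheet(flight=flight, takeoff_weight_kg=70000.0, source="backup")


def load_sheet_with_fallback(flight: str, providers: Sequence[Provider]) -> LoadSheet:
    """Try each provider in order and return the first successful answer."""
    errors = []
    for provider in providers:
        try:
            return provider(flight)
        except ProviderUnavailable as exc:
            errors.append(f"{provider.__name__}: {exc}")
    # Every provider failed: surface all reasons so operators can act on them.
    raise ProviderUnavailable("; ".join(errors))


sheet = load_sheet_with_fallback("XX123", [primary_provider, backup_provider])
print(sheet.source)
```

Recording `source` on the result is a deliberate choice in this sketch: it lets downstream checks see when figures came from the backup path, which matters for safety-relevant data.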
Fixes 1. Updating the Aerodata software system to address the issue causing the outage [82538] 2. Implementing a backup system or redundancy measures to prevent similar incidents in the future [82538]
References 1. Greg Martin, spokesman for the Federal Aviation Administration [82538] 2. Southwest Airlines officials [82538] 3. Delta Air Lines officials [82538] 4. JetBlue officials [82538] 5. United Airlines officials [82538] 6. FlightAware.com [82538]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) one_organization: The week before this incident, several airlines had trouble after Sabre, a company that airlines use for printing baggage tickets, check-in, and making reservations, had technical problems. The affected airlines had therefore already experienced a comparable software outage, although in a different system [82538]. (b) multiple_organization: The Aerodata outage hit several major airlines at once, including Southwest, Delta, JetBlue, and United, so the same failure occurred across multiple organizations in the airline industry [82538].
Phase (Design/Operation) operation (a) The article does not link the software failure incident to the design phase. (b) The incident relates to the operation phase: Aerodata, a vendor-produced system used in flight planning to track a plane's weight and balance, failed while in use, causing several major airlines to ground their planes and leading to delays and disruptions to flight schedules [82538].
Boundary (Internal/External) within_system (a) within_system: The software failure incident was within the system as it was caused by a problem with a system called Aerodata that's produced by a vendor to track a plane's weight and balance [82538]. The issue originated from within the software system itself, affecting multiple airlines and causing flight delays and cancellations.
Nature (Human/Non-human) non-human_actions (a) The software failure incident was due to non-human actions, specifically a software outage related to the Aerodata system used by several major airlines to track a plane's weight and balance [82538]. The problem was with the system itself, not introduced by human actions.
Dimension (Hardware/Software) software (a) The article reports no hardware fault. (b) The failure originated in software: the outage was a problem with Aerodata, a vendor-produced system that airlines use in flight planning to track a plane's weight and balance [82538].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident reported in Article 82538 was non-malicious. The issue was attributed to a software outage related to the Aerodata system used by several major airlines to track a plane's weight and balance during flight planning. The problem was identified as a system glitch rather than a deliberate act to harm the system. The outage caused delays and disruptions in flight operations for airlines such as Southwest, Delta, JetBlue, and United [82538].
Intent (Poor/Accidental Decisions) accidental_decisions (a) The software failure incident related to the grounding of planes due to a software outage was not explicitly linked to poor decisions. The incident was primarily attributed to a system called Aerodata, produced by a vendor, which is used for tracking a plane's weight and balance in flight planning. The outage affected several major airlines, including Southwest, Delta, JetBlue, and United, leading to delays and disruptions in flight operations [82538]. (b) The software failure incident was more aligned with accidental decisions or unintended consequences rather than poor decisions. The issue with the Aerodata system was described as a software outage, and efforts were being made to quickly resolve the problem to minimize the impact on flight operations. The incident was not portrayed as a result of deliberate poor decisions but rather as an unexpected technical problem affecting multiple airlines [82538].
Capability (Incompetence/Accidental) accidental (a) The software failure incident reported in Article 82538 was not attributed to development incompetence. The issue was specifically mentioned to be related to a system called Aerodata, produced by a vendor, which is used for tracking a plane's weight and balance in flight planning. The incident was described as a software outage affecting several major airlines, including Southwest, Delta, JetBlue, and United. The spokesperson for the Federal Aviation Administration mentioned that the problem was with the Aerodata system, and it was expected to be resolved quickly [82538]. (b) The software failure incident in Article 82538 was categorized as accidental. The issue was described as a software outage related to the Aerodata system used by airlines for flight planning. The outage resulted in grounding planes and causing delays for several airlines, including Southwest, Delta, JetBlue, and United. The incident was not attributed to intentional actions or malicious intent but rather to an unexpected technical problem with the software system [82538].
Duration temporary (a) The software failure incident described in the articles was temporary. The article mentions that several major airlines grounded their planes due to a software outage caused by a problem with the Aerodata system. The spokesman for the Federal Aviation Administration mentioned that the issue was expected to be resolved quickly, and airlines like Southwest, Delta, JetBlue, and United were working to get affected flights back on schedule. Additionally, FlightAware.com reported about 100 flights being late or canceled at certain airports just after 8 a.m., indicating a temporary disruption [82538].
Behaviour value, other (a) crash: The software failure incident in the article was not described as a crash where the system loses state and does not perform any of its intended functions [82538]. (b) omission: The software failure incident in the article did not involve the system omitting to perform its intended functions at an instance(s) [82538]. (c) timing: The software failure incident in the article did not involve the system performing its intended functions correctly, but too late or too early [82538]. (d) value: The software failure incident in the article involved the system performing its intended functions incorrectly, specifically related to tracking a plane's weight and balance, which caused delays and flight disruptions for several major airlines [82538]. (e) byzantine: The software failure incident in the article was not described as a byzantine failure where the system behaves erroneously with inconsistent responses and interactions [82538]. (f) other: The article describes the incident only as a software outage and gives no further detail on how the system misbehaved, so the failure mode beyond (d) remains unspecified [82538].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence delay The main consequence of the software failure incident was delay for airlines and passengers. The outage affected several major airlines, including Southwest, Delta, JetBlue, and United. Southwest had an internal ground stop for about 40 minutes, Delta said some connecting flights were affected, and United reported that about 150 of its flights were impacted. FlightAware.com also indicated that around 100 flights were late or canceled at Baltimore-Washington International, Dulles International, and Reagan National airports [82538]. The relevant consequence is therefore (e) delay.
Domain transportation (a) The failed system was intended to support the transportation industry. The software outage affected several major airlines, including Southwest, Delta, JetBlue, and United, grounding planes and causing delays in flight operations [82538].
