Incident: Sabre System Glitch Causes Airline Delays and Frustration

Published Date: 2019-04-29

Postmortem Analysis
Timeline 1. The delays occurred on a Monday; the article also notes that the same Sabre system had failed in March [83128].
System 1. Sabre system: Sabre, a software system used by airlines for various purposes, experienced a technical glitch that caused delays for multiple airlines [83128].
Responsible Organization 1. Sabre Corporation [83128]
Impacted Organization 1. Airlines such as Alaska, American, and JetBlue were impacted by the software failure incident [83128].
Software Causes 1. A technical glitch in the Sabre system used by airlines, which caused delays and backups across the country [83128].
Non-software Causes 1. A backlog of transactions that built up while the system was down [83128].
Impacts
1. The software failure incident caused delays for a number of airlines, leading to backups across the country [83128].
2. Travelers vented their frustration on social media platforms like Twitter, blaming airlines such as Alaska, American, and JetBlue for the disruptions caused by the technical glitch [83128].
3. The failure of the Sabre system, crucial software used by airlines for functions like tracking bookings and calculating baggage weight, disrupted reservation taking and flight crew scheduling, impacting the smooth operation of airlines [83128].
4. The effects of the software failure incident rippled across the global flight network, causing long-lasting disruptions even after the initial problem was resolved [83128].
5. The incident highlighted how hard it is to recover from a glitch in systems like Sabre, which track hundreds of millions of data points: an outage creates a backlog that makes recovery challenging [83128].
Preventions
1. Implementing more robust testing procedures to catch potential glitches before they impact the system [83128].
2. Enhancing redundancy and failover mechanisms within the software system to minimize the impact of any failures [83128].
3. Continuous monitoring and proactive maintenance to identify and address any potential issues before they escalate [83128].
Fixes
1. Continuous improvement efforts by companies like Sabre to enhance their software reliability and performance [83128].
2. Implementing robust backup and redundancy systems to minimize the impact of glitches or failures in critical software systems [83128].
3. Investing in advanced monitoring and alerting systems to quickly identify and address any issues in the software before they escalate into major incidents [83128].
References
1. Samuel Engel, senior vice president and head of the aviation practice at the consulting firm ICF [83128]
2. Henry Harteveldt, founder of Atmosphere Research Group [83128]
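
Impact 5 attributes the slow recovery to the backlog that builds up while part of the system is down. The arithmetic behind that claim can be sketched with a simple constant-rate queue model. This is only an illustration: the function name and every rate and duration below are hypothetical and do not come from the article or from Sabre.

```python
def recovery_minutes(arrival_per_min: float,
                     capacity_per_min: float,
                     outage_min: float) -> float:
    """Minutes needed after an outage to clear the backlog.

    Assumes (hypothetically) that transactions keep arriving at a constant
    rate during the outage and that the restored system processes them at a
    constant, higher rate.
    """
    if capacity_per_min <= arrival_per_min:
        raise ValueError("capacity must exceed the arrival rate, "
                         "otherwise the backlog never drains")
    # Work that piled up while nothing was being processed.
    backlog = arrival_per_min * outage_min
    # Only the spare capacity above normal traffic reduces the backlog.
    spare_capacity = capacity_per_min - arrival_per_min
    return backlog / spare_capacity


if __name__ == "__main__":
    # Hypothetical figures: 1000 transactions/min arriving,
    # 1250 transactions/min of capacity, 30-minute outage.
    print(recovery_minutes(1000, 1250, 30))  # 120.0
```

Under these assumed numbers, a 30-minute outage takes two hours to work off, which is consistent with the article's point that disruptions outlast the initial problem.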

Software Taxonomy of Faults

Category | Option | Rationale
Recurring | one_organization, multiple_organization | (a) The software failure incident related to Sabre has happened again within the same organization. The article mentions that the same system experienced a failure in March, indicating a recurring issue within Sabre's software [83128]. (b) The software failure incident related to Sabre has also happened at multiple organizations. The article mentions that Sabre is not the only system of its kind, with major competitors like Amadeus, Travelport, and TravelSky. This suggests that similar incidents could potentially occur with these other systems as well [83128].
Phase (Design/Operation) | design, operation | (a) The software failure incident related to the design phase can be attributed to the technical glitch experienced by a number of airlines due to the Sabre system. The article mentions that Sabre, a software system used by airlines for various purposes like tracking bookings and calculating baggage weight, experienced a failure causing delays and backups across the country [83128]. (b) The software failure incident related to the operation phase can be linked to the challenges faced in recovering from a glitch in the Sabre system. The article highlights that Sabre is responsible for tracking hundreds of millions of data points, and when a part of the system goes down, it creates a backlog, making the recovery process challenging [83128].
Boundary (Internal/External) | within_system, outside_system | (a) The software failure incident involving Sabre, which caused delays for multiple airlines, was due to contributing factors that originated from within the system itself. Sabre is a crucial back-end system used by airlines for various functions like tracking bookings and scheduling flight crews. When the system experiences a glitch, it can have widespread effects on the global flight network, indicating that the failure was within the system [83128]. (b) Additionally, the article mentions that Sabre interfaces with various other systems, indicating that external factors could also contribute to the software failure incident. However, the primary cause of the delays and glitches was attributed to issues within the Sabre system itself [83128].
Nature (Human/Non-human) | non-human_actions | (a) The software failure incident with the Sabre system was due to non-human actions, specifically a technical glitch within the system itself. This glitch caused delays for multiple airlines and impacted various back-end functions like tracking bookings and calculating baggage weight [83128]. (b) The article does not mention any specific human actions contributing to the software failure incident.
Dimension (Hardware/Software) | software | (a) The software failure incident mentioned in the article was not attributed to hardware issues but rather to a technical glitch in the Sabre software system used by airlines [83128]. The glitch in the software caused delays and backups across the country, impacting the operations of multiple airlines. (b) The software failure incident was specifically related to a technical glitch in the Sabre software system, which is a critical back-end system used by airlines for various functions like tracking bookings and scheduling flight crews [83128]. The article highlights how even a small snag in the software system can have significant ripple effects across the global flight network, emphasizing the importance of software reliability in such systems.
Objective (Malicious/Non-malicious) | non-malicious | (a) The software failure incident related to the technical glitch in the Sabre system that caused delays for airlines was non-malicious. The glitch was not caused by any malicious intent but rather by a technical issue within the system itself, leading to backups across the country [83128].
Intent (Poor/Accidental Decisions) | unknown | (a) The software failure incident related to the Sabre system causing delays for airlines was not due to poor decisions but rather a technical glitch in the system itself. The incident was attributed to a snag in the software, which caused backups across the country, affecting airlines like Alaska, American, and JetBlue [83128]. (b) The software failure incident was not due to accidental decisions but rather a technical glitch in the Sabre system, which is a crucial back-end system used by airlines for various functions like tracking bookings and calculating baggage weight. The failure was not a result of mistakes or unintended decisions but rather a system issue that caused disruptions in airline operations [83128].
Capability (Incompetence/Accidental) | accidental | (a) The software failure incident related to development incompetence is not explicitly mentioned in the provided article. Therefore, it is unknown whether the failure was due to contributing factors introduced due to lack of professional competence by humans or the development organization. (b) The software failure incident related to an accidental cause is mentioned in the article. The article describes how a technical glitch in the Sabre system, a crucial software system used by airlines, caused delays and backups across the country. This glitch was not intentional but occurred accidentally, leading to disruptions in airline operations [83128].
Duration | temporary | (a) The software failure incident related to Sabre causing delays for airlines was temporary. The article mentions that the technical glitch in the Sabre system caused delays for airlines on Monday, but it was not a permanent failure. The effects of the glitch rippled across the global flight network, but the system was eventually fixed [83128].
Behaviour | crash, value, other | (a) crash: The software failure incident, in which the Sabre system experienced a technical glitch causing delays for airlines, can be categorized as a crash. The glitch led to backups across the country, impacting the airlines' operations and causing frustration among travelers [83128]. (b) omission: The article does not specifically mention the failure as an omission where the system omitted to perform its intended functions at an instance(s). (c) timing: The software failure incident related to the Sabre system does not align with a timing failure where the system performed its intended functions too late or too early. (d) value: The failure of the Sabre system can be attributed to a value failure as it did not perform its intended functions correctly, leading to delays and disruptions in airline operations [83128]. (e) byzantine: The software failure incident related to the Sabre system does not exhibit byzantine behavior where the system behaves erroneously with inconsistent responses and interactions. (f) other: The behavior of the software failure incident can be described as a glitch that caused disruptions in the airline operations, impacting the system's reliability and leading to delays across the global flight network [83128].

IoT System Layer

Layer | Option | Rationale
Perception | None | None
Communication | None | None
Application | None | None

Other Details

Category | Option | Rationale
Consequence | delay | (delay) The consequence of the software failure incident reported in the article was primarily related to delays experienced by a number of airlines. The technical glitch in the Sabre system caused backups across the country, leading to delays in flight operations for the airlines involved [83128].
Domain | transportation | (a) The Sabre system, which experienced a software failure incident, is used by airlines for various purposes such as tracking bookings and calculating baggage weight. It is crucial for back-end airline functions like taking reservations and scheduling flight crews [83128]. (b) The software failure incident affected the transportation industry, specifically airlines, causing delays and backups across the country [83128].
