Incident: United Airlines Grounds Flights Due to Faulty Computer Network Router

Published Date: 2015-07-08

Postmortem Analysis
Timeline 1. The software failure incident, in which a faulty computer network router disrupted United Airlines' passenger reservations system, occurred on July 8, 2015 [38007].
System 1. Network router [38007] 2. Computer system [38007]
Responsible Organization 1. No outside organization was identified for the United Airlines incident; it was attributed to a faulty computer network router at United Airlines [38007]. 2. Hackers were responsible for a similar incident at a Polish state-owned airline two weeks before the United Airlines incident [38007].
Impacted Organization 1. United Airlines - The software failure incident impacted United Airlines by grounding planes nationwide, causing flight cancellations, delays, and passenger inconvenience [38007].
Software Causes 1. The software cause of the failure incident was a faulty computer network router disrupting United Airlines' passenger reservations system, leading to flight cancellations and delays [38007].
Non-software Causes 1. The failure incident was caused by a faulty computer network router disrupting United Airlines' passenger reservations system, leading to flight cancellations and delays [38007]. 2. The concentration of air traffic among a few major carriers due to mega-mergers like the one between United and Continental Airlines in 2010 was mentioned as a factor that lessened redundancy for the airline system, making a greater percentage of the system at risk if a glitch occurs [38007].
Impacts 1. 61 flights were canceled and more than 1,100 other flights were delayed [38007]. 2. Passengers experienced difficulties in getting their boarding passes online, with delays reported even after United restarted its operations [38007]. 3. The software failure incident caused disruptions in United's domestic operations, affecting both United and United Express flights [38007]. 4. The incident highlighted the vulnerability of airlines to computer malfunctions, with previous incidents involving other airlines such as American Airlines and Polish LOT airline [38007].
Preventions 1. Implementing robust network redundancy and failover systems to prevent disruptions caused by faulty network equipment like the router issue experienced by United Airlines [38007]. 2. Conducting regular security audits and implementing strong cybersecurity measures to prevent potential security incidents that could lead to network disruptions [38007]. 3. Enhancing system testing and quality assurance processes to catch and address software bugs or glitches before they impact operations, as seen in the case of American Airlines' flight delays due to a software bug in iPad software [38007].
Fixes 1. Implementing more robust network infrastructure to prevent router failures [38007]. 2. Conducting thorough forensic investigations to identify the root cause of the issue and address any security incidents that may have contributed to the failure [38007]. 3. Enhancing cybersecurity measures to protect against potential hacking attempts that could disrupt the system [38007].
References 1. United Airlines spokeswoman, Jennifer Dohm [38007] 2. Ryan Ver Berkmoes, a travel writer [38007] 3. Security experts [38007] 4. Sebastian Mikosz, Polish LOT airline’s chief executive [38007] 5. George Hoffer, transportation economist at the University of Richmond [38007] 6. United’s chief executive, Jeff Smisek [38007]
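The first prevention listed above (network redundancy and failover) can be illustrated with a minimal sketch. It is not a description of United's actual infrastructure, which the article does not detail. The names `send_with_failover`, `faulty_router`, and `backup_router` are hypothetical, and the "routes" are plain Python callables standing in for real network paths.

```python
from typing import Callable, Sequence


class AllRoutesFailed(Exception):
    """Raised when every configured route has failed."""


def send_with_failover(request: str, routes: Sequence[Callable[[str], str]]) -> str:
    """Try each route in order and return the first successful response."""
    errors = []
    for route in routes:
        try:
            return route(request)
        except ConnectionError as exc:
            # Record the failure and fall through to the next (redundant) route.
            errors.append(exc)
    raise AllRoutesFailed(f"all {len(routes)} routes failed: {errors}")


def faulty_router(request: str) -> str:
    # Hypothetical stand-in for a failed primary network path.
    raise ConnectionError("primary router unreachable")


def backup_router(request: str) -> str:
    # Hypothetical stand-in for a healthy redundant path.
    return f"reservation system reply to {request!r}"


if __name__ == "__main__":
    print(send_with_failover("check-in request", [faulty_router, backup_router]))
```

With only a single route configured, the same call raises `AllRoutesFailed`, which is the situation redundancy is meant to avoid.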

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) The software failure incident recurred within the same organization: this was the second time in five weeks that United had to ground flights due to a computer issue [38007]. (b) The article describes the incident as one of a series of recent software failures in the airline industry. For example, a Polish state-owned airline faced a similar issue when hackers flooded its computer system with web traffic, grounding 1,400 travelers [38007]. Additionally, American Airlines experienced delays on some flights in April due to a bug in its iPad software [38007].
Phase (Design/Operation) design, operation (a) The software failure incident at United Airlines, where planes were grounded due to a faulty computer network router disrupting the passenger reservations system, can be attributed to a design-related issue. The article mentions that a network router problem was identified as the culprit for the disruption [38007]. This indicates that the failure was due to contributing factors introduced during the system development or system updates. (b) Additionally, the incident could also be linked to an operation-related issue. The delays and cancellations experienced by United Airlines, as well as the difficulties faced by passengers in obtaining boarding passes online even after the operations were restarted, point towards operational challenges [38007]. This suggests that contributing factors introduced by the operation or misuse of the system may have played a role in the software failure incident.
Boundary (Internal/External) within_system, outside_system (a) within_system: The software failure incident at United Airlines was attributed to a network router problem within the system. United Airlines grounded flights due to a faulty computer network router disrupting its passenger reservations system, causing cancellations and delays [38007]. The article mentions that United's chief executive, Jeff Smisek, stated that the computer issue prevented the airline from dispatching planes and took about 90 minutes for United's computer technicians to correct the problem [38007]. (b) outside_system: The article also mentions a previous incident where a Polish state-owned airline grounded travelers after hackers flooded the airline's computer system with web traffic. The cause was attributed to external hackers attacking the system, highlighting the vulnerability of airlines to such external attacks [38007].
Nature (Human/Non-human) non-human_actions (a) The software failure incident at United Airlines was attributed to a network router problem, which disrupted the passenger reservations system and led to flight cancellations and delays [38007]. The article also mentions a previous incident in which a Polish airline was grounded after hackers flooded its computer system with web traffic; that incident involved deliberate human action and is separate from the United failure [38007]. (b) The article does not provide specific information indicating that the software failure incident at United Airlines was directly caused by human actions.
Dimension (Hardware/Software) hardware, software (a) The software failure incident at United Airlines was attributed to a hardware issue, specifically a faulty computer network router. The article mentions that a network router problem was the culprit behind the disruption in the passenger reservations system, leading to flight cancellations and delays [38007]. (b) While the incident was primarily caused by a hardware issue with the network router, the article also mentions other software-related failures in the aviation industry. For example, it highlights incidents where pilots could not access digital flight plans due to software issues and where a bug in iPad software caused flight delays for American Airlines [38007].
Objective (Malicious/Non-malicious) malicious, non-malicious (a) The article mentions a previous incident where a Polish state-owned airline was grounded after hackers flooded the airline’s computer system with web traffic, indicating a malicious software failure incident [38007]. (b) The main software failure incident discussed in the article was related to a faulty computer network router disrupting United Airlines' passenger reservations system, causing flight cancellations and delays. This incident is categorized as a non-malicious software failure [38007].
Intent (Poor/Accidental Decisions) accidental_decisions The software failure incident involving United Airlines' grounded flights was primarily attributed to a network router problem [38007]. This incident falls under the category of accidental_decisions, as it was caused by a technical issue rather than poor decisions. The article mentions that United's chief executive, Jeff Smisek, stated that the computer issue prevented the dispatching of planes and took about 90 minutes for the technicians to correct [38007].
Capability (Incompetence/Accidental) accidental (a) The article does not explicitly attribute the software failure incident to development incompetence [38007]. (b) The article supports an accidental cause. The incident at United Airlines, in which planes were grounded because a faulty computer network router disrupted the passenger reservations system, was described as a technical problem caused by a network router issue [38007]. This indicates that the failure was accidental in nature, stemming from a technical glitch rather than intentional actions.
Duration temporary The software failure incident reported in article 38007 was temporary. United Airlines grounded planes for nearly two hours because a faulty computer network router disrupted its passenger reservations system. The issue caused 61 flights to be canceled and more than 1,100 other flights to be delayed [38007]. The ground stop was lifted for all of United’s domestic operations at 9:49 a.m., which confirms the temporary nature of the software failure incident [38007].
Behaviour crash, omission, value, other (a) crash: The software failure incident in the article is related to a crash as United Airlines had to ground planes nationwide for nearly two hours due to a faulty computer network router disrupting its passenger reservations system, causing flight cancellations and delays [38007]. (b) omission: The software failure incident could also be related to omission as the system omitted to perform its intended functions by not dispatching many domestic flights, leading to cancellations and delays [38007]. (c) timing: The software failure incident could be related to timing as the system performed its intended functions incorrectly at specific times, causing disruptions in flight operations [38007]. (d) value: The software failure incident could be related to value as the system performed its intended functions incorrectly, leading to delays and cancellations of flights [38007]. (e) byzantine: The software failure incident does not seem to be related to a byzantine behavior as there is no mention of inconsistent responses or interactions in the system's behavior [38007]. (f) other: The other behavior exhibited by the software failure incident could be related to network connectivity issues caused by a faulty network router, which disrupted the passenger reservations system, leading to flight cancellations, delays, and difficulties for passengers in obtaining boarding passes [38007].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence property, delay, non-human, theoretical_consequence, other (a) death: There is no mention of any deaths resulting from the software failure incident reported in the article [38007]. (b) harm: There is no mention of physical harm to individuals resulting from the software failure incident reported in the article [38007]. (c) basic: There is no mention of people's access to food or shelter being impacted by the software failure incident reported in the article [38007]. (d) property: The software failure incident caused flight cancellations, delays, and disruptions for passengers, impacting their travel plans and potentially leading to financial losses [38007]. (e) delay: The software failure incident led to the cancellation of 61 flights and delays for more than 1,100 other flights, affecting passengers' travel schedules [38007]. (f) non-human: The software failure incident affected the operations of United Airlines and its passenger reservations system, leading to flight cancellations and delays [38007]. (g) no_consequence: Not applicable; the software failure incident resulted in real consequences such as flight cancellations, delays, and disruptions for passengers [38007]. (h) theoretical_consequence: There was discussion of a potential security incident related to the software failure, but no concrete evidence was provided in the article [38007]. (i) other: The software failure incident highlighted the airline industry's vulnerability to cyber attacks, as seen in a previous incident in which a Polish state-owned airline was grounded after hackers flooded its computer system with web traffic [38007].
Domain transportation, finance (a) The failed system in the incident was related to the transportation industry, specifically affecting United Airlines' passenger reservations system [38007]. (h) The incident also had implications for the finance industry as trading was suspended on the New York Stock Exchange on the same day due to a technical problem, although it was confirmed that the malfunctions at United and the stock exchange were unrelated internal technical issues and not a security episode [38007]. (m) Additionally, the article mentions that the incident was part of a series of recent mishaps in the airline industry, such as the grounding of flights due to pilots not being able to access digital flight plans and delays caused by software bugs in iPad software at American Airlines [38007].
