Incident: Heathrow Airport IT Failure Disrupts Flights and Check-in Systems

Published Date: 2020-02-16

Postmortem Analysis
Timeline 1. The software failure incident at Heathrow Airport occurred on February 16, 2020 [95722]. 2. The incident date is stated directly in the article.
System 1. Heathrow Airport's IT system [95722, 95697] 2. Heathrow app [95697]
Responsible Organization 1. Heathrow Airport [95722, 95697] 2. British Airways [95722, 95697]
Impacted Organization 1. Passengers at Heathrow Airport [95722, 95697] 2. British Airways, the biggest airline at Heathrow [95722, 95697]
Software Causes 1. IT issues at Heathrow Airport affecting departure boards and check-in systems [95697, 95722] 2. Technical failures in Heathrow's IT system [95722] 3. For context only: an earlier IT glitch at British Airways involving its online check-in and flight departures systems, which the article cites as a previous incident rather than as a cause of this one [95722]
Non-software Causes 1. Storm Dennis causing disruption at Heathrow Airport [95722, 95697] 2. Overcrowding and chaos at the airport due to the half-term weekend [95722]
Impacts 1. Thousands of passengers had their flights cancelled or severely delayed at Heathrow Airport due to technical issues affecting departure boards and check-in systems [95697]. 2. British Airways cancelled 20 flights as a result of the IT issues at Heathrow Airport [95722]. 3. Passengers faced chaos, confusion, and inconvenience, with some being stranded and having to rely on handwritten whiteboards for flight information [95722, 95697]. 4. The IT failures led to disruptions in flight schedules, causing delays and cancellations, further compounded by bad weather conditions [95722, 95697]. 5. Passengers had to deal with lack of information, long queues at customer service desks, and challenges in finding staff for assistance [95722, 95697].
Preventions 1. Implementing robust IT system monitoring and maintenance protocols to detect and address potential issues before they escalate [95722]. 2. Conducting regular system testing and updates to ensure the smooth functioning of IT systems [95722]. 3. Enhancing communication channels between airport staff and passengers to provide timely and accurate information during disruptions [95722]. 4. Developing contingency plans and backup systems to mitigate the impact of technical failures on airport operations [95722]. 5. Improving coordination between airlines and airport authorities to streamline response efforts during IT incidents [95722].
Fixes 1. Implementing robust IT system monitoring and maintenance protocols to prevent future technical issues [95722, 95697] 2. Enhancing communication channels between airport staff and passengers to provide real-time updates during disruptions [95722] 3. Conducting a thorough review of the Heathrow app to address any potential issues causing inconvenience to passengers [95697]
References 1. Heathrow Airport officials [95722, 95697] 2. Passengers affected by the incident, such as Sam Mills and Caitlin Gould [95722, 95697] 3. British Airways [95722, 95697] 4. Heathrow spokesperson [95697] 5. Twitter users who reported issues with the Heathrow app [95697]
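
The monitoring and contingency items under Preventions can be illustrated with a minimal sketch. It assumes each airport information feed (departure boards, check-in) reports a last-update timestamp, flags any feed that has gone stale, and switches to a fallback mode when one has. The names `FeedStatus`, `stale_feeds`, `choose_display_mode`, and the five-minute threshold are hypothetical and do not describe Heathrow's actual systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FeedStatus:
    """Hypothetical record of when an information feed last refreshed."""
    name: str
    last_update: datetime


def stale_feeds(feeds, now, max_age=timedelta(minutes=5)):
    """Return the names of feeds whose last update is older than max_age."""
    return [f.name for f in feeds if now - f.last_update > max_age]


def choose_display_mode(feeds, now, max_age=timedelta(minutes=5)):
    """Pick 'fallback' (e.g. a backup information channel) if any feed is stale."""
    stale = stale_feeds(feeds, now, max_age)
    if stale:
        return ("fallback", stale)
    return ("normal", [])


if __name__ == "__main__":
    # Illustrative timestamps only; not taken from the articles.
    now = datetime(2020, 2, 16, 12, 0)
    feeds = [
        FeedStatus("departure_boards", now - timedelta(minutes=20)),
        FeedStatus("check_in", now - timedelta(minutes=1)),
    ]
    mode, stale = choose_display_mode(feeds, now)
    print(mode, stale)
```

A staleness check like this targets the failure mode passengers described, where boards stopped updating, and gives staff an explicit trigger for switching to a backup channel instead of discovering the problem from passenger complaints.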

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) one_organization: British Airways, the biggest airline at Heathrow Airport, cancelled flights because of Heathrow's IT issues combined with the disruption caused by Storm Dennis. The airline had previously suffered high-profile IT failures of its own, including an incident in August of the previous year in which an IT glitch involving online check-in and flight departures led to over 100 flights being cancelled and 200 delayed [95722]. (b) multiple_organization: The technical issues at Heathrow Airport affected a number of airlines operating at the airport, not just British Airways [95722].
Phase (Design/Operation) design, operation (a) design: The incident stemmed primarily from issues with the airport's IT system, which affected departure boards, check-in systems, and flight updates [95722, 95697]. This suggests a design-phase failure, in which factors introduced during system development or updates disrupted the airport's operations. (b) operation: Passengers had to rely on handwritten whiteboards for flight information, and the Heathrow app sent unrelated push notifications about the coronavirus [95697]. These aspects point to an operation-phase failure, in which the way the system was operated added to the inconvenience and chaos at the airport.
Boundary (Internal/External) within_system, outside_system (a) within_system: The software failure incident at Heathrow Airport was primarily caused by technical issues within the airport's IT system. The failure affected departure boards, check-in systems, and flight information updates, leading to flight cancellations and severe delays for thousands of passengers [95722, 95697]. British Airways, the largest airline at Heathrow, had to cancel 20 flights due to the IT issues [95722]. The airport's app also experienced problems, sending push notifications about the coronavirus, which may have been related to the IT issue affecting check-in systems [95697]. (b) outside_system: The software failure incident at Heathrow Airport was exacerbated by external factors such as Storm Dennis, which caused additional disruption to flights [95722, 95697]. Air traffic control was not affected by the technical failures, indicating that the issue was contained within the airport's IT systems [95722]. The airport aimed to resume normal service despite potential ongoing impacts on some airlines' operations due to the storm and the technical issues [95697].
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident at Heathrow Airport was primarily due to non-human actions. The incident was caused by technical issues affecting the airport's IT system, leading to disruptions in departure boards and check-in systems [95722, 95697]. The IT failures were exacerbated by the existing disruption caused by Storm Dennis, indicating that the primary contributing factors were technical in nature rather than human actions. (b) Human actions also played a role in the software failure incident at Heathrow Airport. Passengers reported confusion and chaos due to the lack of information on flight boards, leading to difficulties in finding gates and boarding flights. Additionally, British Airways had to cancel flights and provide refunds or re-bookings for affected passengers due to the IT issues. The airline also mentioned bringing in extra colleagues to assist customers and providing overnight accommodation if needed, indicating human intervention to manage the situation caused by the software failure [95722, 95697].
Dimension (Hardware/Software) software (a) The software failure incident at Heathrow Airport was primarily attributed to IT issues affecting its systems, including departure boards and check-in systems. The incident caused disruptions leading to flight cancellations and delays for thousands of passengers [95722, 95697]. (b) The software failure incident was specifically related to technical issues in the airport's IT systems, which impacted the departure boards and check-in systems. The IT failures led to chaos at the airport, with passengers having to rely on manual methods such as whiteboards to obtain flight information. British Airways also mentioned that the cancellations were a result of Heathrow's IT issues combined with disruptions caused by Storm Dennis [95722, 95697].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident at Heathrow Airport was non-malicious. The incident was primarily attributed to technical issues affecting the airport's IT systems, which led to disruptions in departure boards and check-in systems [95722, 95697]. The disruptions were exacerbated by the existing weather-related delays caused by Storm Dennis. The airport and airlines like British Airways worked to resolve the technical issues and assist affected passengers, offering refunds, re-bookings, and overnight accommodations as necessary. The incident was characterized by chaos, flight cancellations, delays, and reliance on manual methods like whiteboards to convey flight information, indicating a non-malicious failure scenario.
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident at Heathrow Airport was primarily due to poor decisions. The incident was a result of technical issues affecting the airport's IT system, leading to flight cancellations and delays. British Airways, the biggest airline at Heathrow, had to cancel 20 flights due to the IT issues combined with disruptions caused by Storm Dennis [95722, 95697]. The incident also caused chaos for passengers, with handwritten flight information displayed on whiteboards and passengers struggling to find accurate information about their flights [95722, 95697]. Additionally, the incident led to passengers being stranded and facing difficulties in rebooking or getting refunds for their flights [95722].
Capability (Incompetence/Accidental) accidental (a) The software failure incident at Heathrow Airport was not explicitly attributed to development incompetence. The articles mainly highlight technical issues affecting the airport's IT systems, resulting in flight cancellations and delays. The incident was exacerbated by existing disruptions caused by Storm Dennis [95722, 95697]. (b) The software failure incident at Heathrow Airport was described as causing "utter chaos" and passengers had to rely on handwritten whiteboards for flight information, indicating a significant disruption in the airport's operations. Additionally, there were reports of issues with the Heathrow app sending unrelated push notifications about the coronavirus, which could be considered an accidental factor contributing to the incident [95722, 95697].
Duration temporary The software failure incident at Heathrow Airport was temporary. The IT issues affecting the departure boards and check-in systems were resolved, and the airport aimed to resume normal service the following day [95722, 95697]. British Airways attributed its cancellations to Heathrow's IT issues combined with Storm Dennis, which is consistent with a temporary disruption rather than a permanent failure [95722, 95697].
Behaviour crash, omission, other (a) crash: The software failure incident at Heathrow Airport resulted in flight boards not updating, displaying 'Delayed' messages, and providing incorrect gate information, leading to passengers missing their flights and being stranded [95722]. (b) omission: Passengers at Heathrow Airport reported issues with the lack of information at the gates, reliance on whiteboards for flight details, and discrepancies between online information and the boards, causing confusion and difficulty in finding where they should be [95722, 95697]. (c) timing: The software failure incident caused delays in flights and disruptions to operations at Heathrow Airport, impacting passengers' travel plans during a busy period [95722, 95697]. (d) value: The cancellations of flights by British Airways were a result of the software failure incident at Heathrow Airport, combined with disruptions caused by Storm Dennis, leading to passengers being entitled to refunds or re-bookings [95722, 95697]. (e) byzantine: There is no specific mention of the software failure incident exhibiting byzantine behavior in the provided articles. (f) other: The software failure incident led to the Heathrow app sending a stream of push notifications about the coronavirus, which may have been unrelated to the IT issue affecting check-in systems, causing additional inconvenience to passengers [95697].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence delay The primary consequence of the software failure incident was delay (option e: people had to postpone an activity due to the software failure). Passengers experienced flight cancellations, severe delays, and chaos at Heathrow Airport due to the technical issues with the IT systems [95722, 95697]. The delays caused inconvenience, frustration, and disruption to the travel plans of those affected.
Domain transportation The software failure incident at Heathrow Airport affected the transportation industry. The technical issues with the airport's IT system disrupted the departure boards and check-in systems, leading to flight cancellations and severe delays for thousands of passengers [95722, 95697]. British Airways, the biggest airline at Heathrow, had to cancel 20 flights due to the IT issues combined with the disruption caused by Storm Dennis [95722]. Passengers did not receive updated flight information, had to rely on whiteboards for flight details, and faced confusion at the gates [95722]. The incident affected the operations of multiple airlines at the airport.

Sources
