Incident: Air Traffic Control System Failure Caused by U-2 Spy Plane

Published Date: 2014-05-11

Postmortem Analysis
Timeline
1. The software failure incident happened in April 2014.
   - Article 26710, published on 2014-05-11, states that the failure occurred on April 30, 2014.

System
1. En Route Automation Modernization (ERAM) system [Article 26710]
2. Computer system involved in the air traffic control for Los Angeles and surrounding area, part of a system known as ERAM [Article 26590]

Responsible Organization
1. The U-2 spy plane passing overhead caused the air traffic control system to crash, leading to the software failure incident [Article 26590].
2. A common design problem in the U.S. air traffic control system, specifically a vulnerability that allowed the U-2 spy plane to trigger the computer glitch, was also responsible for the software failure incident [Article 26710].

Impacted Organization
1. Air traffic control for Los Angeles and a wide surrounding area [Article 26590]
2. Hundreds of services being grounded [Article 26590]
3. Bob Hope airport in Burbank, California; John Wayne airport in Santa Ana, California; and McCarran International in Las Vegas [Article 26590]
4. Flights in other parts of the country bound for a wide swath of airspace in the southwestern US [Article 26590]
5. A broad swath of the southwestern United States, from the West Coast to western Arizona and from southern Nevada to the Mexico border [Article 26710]

Software Causes
1. The software failure incident was caused by a common design problem in the U.S. air traffic control system, a vulnerability that made it possible for a U-2 spy plane to spark a computer glitch [Article 26710].
2. The error was triggered by a lack of altitude information in the U-2's flight plan, which led to the system cycling off and on trying to fix the error, overwhelming the software [Article 26710].
3. The ERAM system failed because it limits how much data each plane can send it, and the complex flight plan of the U-2 operating at high altitude exceeded this limit [Article 26710].
4. The flight plan of the U-2 did not contain an altitude, leading the system to consider all altitudes between ground level and infinity, generating error messages and causing the system to cycle through restarts [Article 26710].

Non-software Causes
1. The U-2 spy plane was flying at high altitude under visual flight rules, but the system perceived it as a low-altitude operation, and the resulting computer glitch caused the air traffic control system to crash [Article 26590].
2. The design problem in the U.S. air traffic control system allowed the U-2 spy plane to trigger the computer glitch, leading to the grounding or delay of hundreds of flights in the Los Angeles area [Article 26710].

Impacts
1. The software failure incident caused the air traffic control for Los Angeles and a wide surrounding area to crash, leading to hundreds of services being grounded [Article 26590].
2. Bob Hope airport in Burbank, California; John Wayne airport in Santa Ana, California; and McCarran International in Las Vegas were among the facilities affected by the order to keep planes grounded [Article 26590].
3. The error blanked out a broad swath of the southwestern United States, from the West Coast to western Arizona and from southern Nevada to the Mexico border, leading to numerous flights being delayed or canceled [Article 26710].
4. The system failure required air traffic controllers to resort to emergency back-up procedures, manually calling each other to keep track of planes already flying in the busy airspace while the system was rebooted and fixed [Article 26590].
5. The incident prompted the FAA to change the way flight plans were logged to prevent a similar issue in the future, requiring specific altitude information for each flight plan and adding more memory to the system for processing flights [Article 26590].
6. The failure led to a temporary switch to a back-up system by air traffic controllers in the regional center, using paper slips and telephones to relay information about planes to other control centers [Article 26710].

Preventions
1. Properly testing the software system to identify vulnerabilities and design flaws before deployment could have prevented the software failure incident [Article 26710].
2. Implementing stricter validation checks for flight plans, including ensuring that altitude information is provided for each flight plan, could have prevented the system from being overwhelmed and crashing [Article 26590, Article 26710].
3. Enhancing the system's memory capacity to handle a larger volume of data processing could have prevented the interruption of other flight-processing functions and subsequent failure [Article 26590, Article 26710].
4. Regularly updating and improving the software system based on lessons learned from incidents like this could have enhanced its resilience and helped prevent similar failures [Article 26710].

Fixes
1. Implementing a fix to require specific altitude information for each flight plan and adding more memory for processing flights [Article 26590].
2. Conducting robust testing on the En Route Automation Modernization (ERAM) system to identify and address vulnerabilities [Article 26710].
3. Setting limits on the amount of data each plane can send to the system to prevent exceeding system capabilities [Article 26710].
4. Enhancing the system's design to handle complex flight plans without exceeding data limits [Article 26710].
5. Continuously monitoring and addressing potential weaknesses in the system, such as flight plans that could cause failures [Article 26710].

References
1. Federal Aviation Administration (FAA) [Article 26590, Article 26710]
2. Pentagon [Article 26590]
3. National Air Traffic Controllers Association [Article 26590, Article 26710]
4. Lockheed Martin Corp [Article 26710]
5. Security experts [Article 26710]
6. Former military and commercial pilots [Article 26710]
7. Laura Brown, FAA spokeswoman [Article 26710]
8. Nate Pair, president of the Los Angeles Center for the National Air Traffic Controllers Association [Article 26710]
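
The validation check listed under Preventions (require an altitude in every flight plan) can be illustrated with a minimal Python sketch. This is not ERAM code and does not reflect how the FAA implemented its fix. The names `FlightPlan`, `FlightPlanError`, and `validate_flight_plan`, and the altitude bounds, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bounds for illustration; real limits would come from system requirements.
MIN_ALTITUDE_FT = 0
MAX_ALTITUDE_FT = 100_000


class FlightPlanError(ValueError):
    """Raised when a flight plan fails validation before any processing starts."""


@dataclass(frozen=True)
class FlightPlan:
    callsign: str
    altitude_ft: Optional[int]  # None models a plan filed without an altitude


def validate_flight_plan(plan: FlightPlan) -> FlightPlan:
    """Reject a plan whose altitude is missing or outside the assumed bounds."""
    if plan.altitude_ft is None:
        raise FlightPlanError(f"{plan.callsign}: altitude is required")
    if not MIN_ALTITUDE_FT <= plan.altitude_ft <= MAX_ALTITUDE_FT:
        raise FlightPlanError(
            f"{plan.callsign}: altitude {plan.altitude_ft} ft is out of range"
        )
    return plan
```

Rejecting the plan at entry means a missing altitude surfaces as a single, explicit error for one flight instead of being interpreted downstream as an unbounded range.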

Software Taxonomy of Faults

Category Option Rationale
Recurring unknown (a) The software failure incident related to the U-2 spy plane causing air traffic control issues in Los Angeles was not specifically mentioned to have happened again within the same organization or with its products and services. The incident was attributed to a design problem in the U.S. air traffic control system and a vulnerability that could have been exploited by an attacker [Article 26590, Article 26710]. (b) The articles did not mention any specific instances of a similar software failure incident happening at other organizations or with their products and services. The focus was primarily on the specific incident involving the U-2 spy plane and the air traffic control system in the southwestern United States [Article 26590, Article 26710].
Phase (Design/Operation) design, operation (a) The software failure incident was primarily related to a design issue in the U.S. air traffic control system that allowed a U-2 spy plane to trigger a computer glitch, leading to the grounding or delay of hundreds of flights in the Los Angeles area [26710]. The vulnerability in the system was exploited by the lack of altitude information in the U-2's flight plan, causing the system to cycle off and on in an attempt to fix the error. This design flaw in the system, specifically in the En Route Automation Modernization (ERAM) system made by Lockheed Martin Corp, led to the disruption in air traffic control operations [26710]. (b) The operation of the system during the incident involved air traffic controllers switching to a backup system to continue monitoring planes, using paper slips and telephones to relay information about flights to other control centers [26710]. This operational response was necessary to maintain control and communication during the software failure incident caused by the design flaw in the system. Additionally, the FAA later adjusted the system to require altitudes for every flight plan and added memory to prevent similar problems in the future, indicating operational changes made post-incident to enhance system reliability [26710].
Boundary (Internal/External) within_system (a) within_system: The software failure incident was primarily caused by factors originating from within the system itself. The incident was triggered by a lack of altitude information in the U-2 spy plane's flight plan, which led to the system cycling off and on trying to fix the error [Article 26710]. The system's limitation on how much data each plane could send it, coupled with the complex flight plan of the U-2 operating at high altitude, contributed to the failure [Article 26710]. The FAA later adjusted the system to require altitudes for every flight plan and added memory to prevent such problems in the future [Article 26710]. (b) outside_system: There is no explicit mention in the articles of contributing factors originating from outside the system that led to the software failure incident.
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident occurring due to non-human actions: - The software failure incident was triggered by a U-2 spy plane passing overhead, causing the air traffic control system to crash [Article 26590]. - The computer glitch was sparked by a lack of altitude information in the U-2's flight plan, overwhelming the system and leading to the failure [Article 26710]. - The error affected a broad swath of the southwestern United States, from the West Coast to western Arizona and from southern Nevada to the Mexico border [Article 26710]. (b) The software failure incident occurring due to human actions: - The error in the system was due to a common design problem in the U.S. air traffic control system, which made it possible for the U-2 spy plane to trigger the computer glitch [Article 26710]. - The flight plan for the U-2 plane did not contain an altitude. Although a controller entered an altitude, the system still considered all altitudes between ground level and infinity, causing the failure [Article 26710]. - Former military and commercial pilots mentioned that flight plans are generally carefully checked and manually entered into the air traffic control computers, indicating a potential human error in entering the flight plan data [Article 26710].
Dimension (Hardware/Software) hardware, software (a) The software failure incident occurring due to hardware: - The incident involving the U-2 spy plane causing the air traffic control system to crash was attributed to a common design problem in the U.S. air traffic control system, which made it possible for the U-2 to trigger a computer glitch [Article 26710]. - The error was triggered by a lack of altitude information in the U-2's flight plan, which caused the system to cycle off and on trying to fix the error [Article 26710]. (b) The software failure incident occurring due to software: - The software failure incident was primarily attributed to a software issue where the computer perceived the U-2 spy plane as a low-altitude operation and began rerouting it down to 10,000 feet, overwhelming the system with adjustments to other planes' routes [Article 26590]. - The FAA mentioned that the system used a large amount of available memory and interrupted the computer's other flight-processing functions, indicating a software-related issue [Article 26590]. - The FAA later adjusted the system to require specific altitude information for each flight plan and added more memory to prevent such problems in the future, highlighting a software-related solution to the issue [Article 26710].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident related to the U-2 spy plane causing air traffic control systems to crash was non-malicious. The incident was triggered by a design problem in the U.S. air traffic control system, specifically a vulnerability that allowed the U-2's flight plan lacking altitude information to overwhelm the system, leading to a computer glitch and subsequent grounding or delays of hundreds of flights in the Los Angeles area [Article 26590, Article 26710]. (b) The software failure incident was not caused by a malicious attack but rather by a routine programming mistake and a complex flight plan that exceeded the system's data processing capabilities. The incident was not intentional but rather a result of the system's limitations being exceeded due to the unique circumstances of the U-2's flight plan and the design flaw in the air traffic control system [Article 26710].
Intent (Poor/Accidental Decisions) accidental_decisions (a) The software failure incident related to the U-2 spy plane causing air traffic control systems to crash in Los Angeles and surrounding areas was primarily due to poor decisions. The incident was triggered by a design problem in the U.S. air traffic control system that made it possible for the U-2 spy plane to spark a computer glitch [Article 26710]. The vulnerability in the system allowed the error to occur, leading to the grounding or delay of hundreds of flights. The error was caused by a lack of altitude information in the U-2's flight plan, which overwhelmed the software and led to the system cycling off and on in an attempt to fix the issue. The incident highlighted a basic limitation of the system and the need for better testing and identification of such vulnerabilities before deployment. Additionally, the incident raised concerns about potential cyber-attacks on aviation systems, indicating that the failure was made possible by a routine programming mistake that should have been identified in testing [Article 26710]. Security experts emphasized the importance of addressing such vulnerabilities to prevent similar failures in the future. The incident demonstrated that the flight plan itself could be considered an 'attack surface' if it could cause the automated system to fail, indicating a need for improved system resilience against such scenarios.
Capability (Incompetence/Accidental) development_incompetence (a) The software failure incident was related to development incompetence as it was caused by a common design problem in the U.S. air traffic control system that made it possible for a U-2 spy plane to spark a computer glitch [Article 26710]. The error was triggered by a lack of altitude information in the U-2's flight plan, which overwhelmed the software and caused it to cycle off and on trying to fix the error. The system failed because it limits how much data each plane can send it, and the complex flight plan of the U-2 exceeded that limit, leading to the failure [Article 26710]. (b) The software failure incident was accidental as it was not a deliberate shut-down but rather a result of a vulnerability in the air traffic control system that was exploited by the U-2 spy plane's flight plan lacking altitude information [Article 26710]. The incident was not caused by any signal from the plane's equipment but rather by the system's inability to handle the unexpected data overload caused by the U-2's flight plan [Article 26590].
Duration temporary The software failure incident related to the U-2 spy plane causing air traffic control issues in the Los Angeles area was temporary [Article 26590, Article 26710]. The system failed under specific circumstances: the U-2's flight plan lacked altitude information, which overwhelmed the software and triggered the glitch. The FAA was able to resolve the issue within an hour and implemented changes to prevent similar problems in the future, such as requiring specific altitude information for each flight plan and adding more memory to the system [Article 26590, Article 26710].
Behaviour crash, omission, value (a) crash: The software failure incident can be categorized as a crash. The incident caused the air traffic control system for Los Angeles and surrounding areas to crash, grounding hundreds of services [Article 26590]. The system had to be rebooted and fixed, and controllers had to resort to emergency back-up procedures while the software was being addressed [Article 26590]. (b) omission: The incident can also be categorized as an omission. The error was triggered by a lack of altitude information in the U-2 spy plane's flight plan, which led to the system cycling off and on trying to fix the error [Article 26710]. Because the flight plan did not contain an altitude for the flight, the system considered all altitudes between ground level and infinity, leading to error messages and system restarts [Article 26710]. (c) timing: The incident does not appear to be related to timing. The system did not fail by performing its intended functions too late or too early. (d) value: The incident can be associated with a value issue. The system failed because the U-2's complex flight plan exceeded the limit on how much data each plane could send it [Article 26710]. The system was unable to handle the complexity of the flight plan, leading to the failure. (e) byzantine: The incident does not align with byzantine behavior. The system did not exhibit inconsistent responses or interactions that would classify it as a byzantine failure. (f) other: No behavior beyond the crash, omission, and value categories described above is reported.
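
The crash and value behaviours above both trace back to one flight plan consuming resources that other flights needed. One general way to contain that kind of failure is a per-plan work budget: a plan that exceeds its budget is set aside while the rest continue. The sketch below is an assumption-laden illustration, not a description of ERAM internals. `WorkBudget`, `check_altitudes`, `process_all`, and the budget of 100 units are hypothetical, and an unbounded iterator stands in for "all altitudes between ground level and infinity".

```python
import itertools
from typing import Dict, Iterable, List, Tuple


class BudgetExceeded(Exception):
    """Raised when processing one flight plan exceeds its work budget."""


class WorkBudget:
    """Counts units of work spent on a single flight plan."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def spend(self, units: int = 1) -> None:
        self.used += units
        if self.used > self.limit:
            raise BudgetExceeded(f"used {self.used} of {self.limit} units")


def check_altitudes(altitudes_ft: Iterable[int], budget: WorkBudget) -> int:
    """Hypothetical per-plan step: spend one unit per altitude considered."""
    checked = 0
    for _ in altitudes_ft:
        budget.spend()
        checked += 1
    return checked


def process_all(
    plans: Dict[str, Iterable[int]], limit: int
) -> Tuple[List[str], List[str]]:
    """Process every plan; set aside any plan that exhausts its own budget."""
    processed: List[str] = []
    set_aside: List[str] = []
    for callsign, altitudes in plans.items():
        try:
            check_altitudes(altitudes, WorkBudget(limit))
            processed.append(callsign)
        except BudgetExceeded:
            set_aside.append(callsign)  # e.g. hand off for manual review
    return processed, set_aside


plans = {
    "PLAN_A": [35_000],
    "PLAN_B": itertools.count(0, 1_000),  # unbounded: every altitude from the ground up
    "PLAN_C": [28_000, 30_000],
}
print(process_all(plans, limit=100))
```

Running this prints `(['PLAN_A', 'PLAN_C'], ['PLAN_B'])`: the unbounded plan is isolated after 100 units of work, and the plans before and after it are still processed.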

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence property, delay, non-human, theoretical_consequence (a) death: People lost their lives due to the software failure - No mention of any deaths resulting from the software failure incident in the articles [26590, 26710]. (b) harm: People were physically harmed due to the software failure - No mention of any physical harm to individuals due to the software failure incident in the articles [26590, 26710]. (c) basic: People's access to food or shelter was impacted because of the software failure - No mention of people's access to food or shelter being impacted by the software failure incident in the articles [26590, 26710]. (d) property: People's material goods, money, or data was impacted due to the software failure - The software failure incident led to the grounding and delay of hundreds of flights in the Los Angeles area, affecting travelers and causing inconvenience [26590, 26710]. (e) delay: People had to postpone an activity due to the software failure - The software failure incident resulted in numerous flights being delayed or canceled, causing disruptions to air travel schedules [26710]. (f) non-human: Non-human entities were impacted due to the software failure - The software failure incident affected the air traffic control system, leading to the grounding of flights and the need for manual coordination between controllers [26590, 26710]. (g) no_consequence: There were no real observed consequences of the software failure - The software failure incident had real consequences, including flight delays and cancellations, as well as the need for manual coordination and system adjustments [26590, 26710]. (h) theoretical_consequence: There were potential consequences discussed of the software failure that did not occur - The articles discuss the potential for a deliberate shut-down using the same vulnerability that caused the U-2 incident, but it was noted to be difficult to replicate the exact conditions for such an attack [26710]. (i) other: Were there consequences of the software failure not described in options (a) to (h), and if so, what were they? - No other specific consequences of the software failure incident were mentioned in the articles [26590, 26710].
Domain transportation The software failure incident reported in the articles is related to the transportation industry. The incident specifically affected the U.S. air traffic control system, causing hundreds of flights in the Los Angeles area to be grounded or delayed [Article 26590, Article 26710]. The system involved in the failure was the En Route Automation Modernization (ERAM) system, which is a critical component of the air traffic control system used to manage and monitor air traffic in the U.S. [Article 26710]. The failure of the ERAM system due to a computer glitch caused by a U-2 spy plane flying at high altitude without proper altitude information in its flight plan disrupted air traffic control operations in a broad swath of the southwestern United States [Article 26710]. The incident highlighted a vulnerability in the air traffic control system that could potentially be exploited by attackers to cause deliberate shutdowns, although replicating the exact conditions of the incident would be challenging [Article 26710]. Overall, the software failure incident directly impacted the transportation industry by disrupting air travel operations in the affected region.
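
The observation that a flight plan can act as an attack surface suggests testing input handling with malformed data before deployment. The following minimal Python sketch shows one such randomized test under stated assumptions. `parse_altitude_ft`, `fuzz_parse_altitude`, and the 100,000 ft bound are hypothetical and stand in for whatever field parser a real system would use.

```python
import random

MAX_ALTITUDE_FT = 100_000  # hypothetical upper bound, for illustration only


def parse_altitude_ft(field: str) -> int:
    """Parse an altitude field; raise ValueError on anything unusable."""
    text = field.strip()
    if not text.isdigit():  # rejects empty, signed, decimal, and non-numeric input
        raise ValueError(f"altitude must be a whole number of feet: {field!r}")
    altitude = int(text)
    if altitude > MAX_ALTITUDE_FT:
        raise ValueError(f"altitude {altitude} ft exceeds {MAX_ALTITUDE_FT} ft")
    return altitude


def fuzz_parse_altitude(trials: int, seed: int = 0) -> int:
    """Feed random strings to the parser; return how many were accepted."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    alphabet = "0123456789 -+.eE"
    accepted = 0
    for _ in range(trials):
        field = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))
        try:
            altitude = parse_altitude_ft(field)
        except ValueError:
            continue  # a controlled rejection is the expected failure mode
        # Anything accepted must lie inside the assumed bounds.
        assert 0 <= altitude <= MAX_ALTITUDE_FT, field
        accepted += 1
    return accepted
```

The property being checked is narrow: every input is either rejected with a `ValueError` or yields an altitude inside the assumed bounds, so no input leaves the parser in an undefined state.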
