Incident: Air Traffic Control Shutdown in Northeast Due to 1960s Software Glitch

Published Date: 2010-01-05

Postmortem Analysis
Timeline 1. The software failure incident happened in January 2000 [3].
System 1. 1960s software at the air traffic control center in Washington, D.C. [3]
Responsible Organization 1. No organization is identified; the incident is attributed to the glitch in the 1960s computer software at the air traffic control center in Washington, D.C. [3].
Impacted Organization 1. Airlines in the Northeast [3] 2. Air traffic controllers [3] 3. Flights from Boston, New York, Philadelphia, and Washington, D.C. [3] 4. Flights around the country experiencing related delays [3]
Software Causes 1. The software glitch in a 1960s computer at the air traffic control center in Washington, D.C. that prevented the deletion of flight plans, leading to system overload and shutdown [3].
Non-software Causes 1. Possible hardware contribution: the glitch that slowed and then shut down air traffic in the Northeast occurred in a 1960s computer at the air traffic control center in Washington, D.C., although the failure described (flight plans not being deleted) is a software behaviour [3].
Impacts 1. The software failure incident led to a virtual shutdown of air traffic on the East Coast in January 2000, affecting hundreds of flights from Boston, New York, Philadelphia, and Washington, D.C. [3]. 2. Flights around the country experienced related delays as a ripple effect of the air traffic gridlock caused by the software glitch [3]. 3. Air traffic controllers had to fall back on typing flight information onto paper slips and carrying them by hand from one controller to another, which significantly slowed the system [3].
Preventions 1. Regular software maintenance and updates: Regularly updating and maintaining the 1960s software could have potentially prevented the glitch that caused the air traffic control shutdown [3]. 2. Implementation of modern software systems: Upgrading to modern software systems with better error handling and capacity management could have prevented the overload and shutdown of the outdated system [3]. 3. Robust testing procedures: Implementing robust testing procedures to identify and fix software bugs before they cause major disruptions could have prevented the incident [3]. 4. Redundant systems and backup plans: Having redundant systems in place and effective backup plans, such as the use of paper slips as a temporary workaround, could have minimized the impact of the software failure incident [3].
Fixes 1. Implementing software updates or patches to fix the glitch in the 1960s computer software at the air traffic control center in Washington, D.C. [3] 2. Enhancing the system's capacity to handle flight plans more efficiently to prevent overloading and shutdowns. This could involve upgrading hardware or optimizing software algorithms. [3] 3. Implementing redundant systems or backup solutions to ensure continuity of air traffic control operations in case of software failures. For example, maintaining the use of manual paper-based systems as a backup. [3]
References 1. Air traffic controllers at the air traffic control center in Washington, D.C. [3] 2. Washington Center, one of the Air Route Traffic Control Centers [3] 3. Historical background on the development of air traffic control systems in the United States [3]
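
The failure mode listed under Software Causes (flight plans that are never deleted, so the system fills up and stops accepting new ones) can be illustrated with a minimal sketch. Everything in it is hypothetical: the `FlightPlanStore` class, its `capacity` and `purge_enabled` parameters, and the `StoreFull` exception are invented for illustration and do not describe the actual 1960s system. The sketch also shows one possible reading of the "capacity management" and "backup plan" items under Preventions and Fixes: rejecting a plan and handing it off to manual processing instead of shutting down.

```python
from dataclasses import dataclass, field


class StoreFull(Exception):
    """Raised when the store cannot accept another flight plan."""


@dataclass
class FlightPlanStore:
    """Hypothetical fixed-capacity store; not the real system's design."""
    capacity: int
    purge_enabled: bool = True  # False models the glitch: deletion never runs
    plans: dict = field(default_factory=dict)  # flight id -> completed flag

    def complete(self, flight_id: str) -> None:
        """Mark a flight as finished so its plan becomes eligible for deletion."""
        if flight_id in self.plans:
            self.plans[flight_id] = True

    def purge(self) -> int:
        """Delete completed plans; does nothing when the purge step is broken."""
        if not self.purge_enabled:
            return 0
        done = [fid for fid, finished in self.plans.items() if finished]
        for fid in done:
            del self.plans[fid]
        return len(done)

    def add(self, flight_id: str) -> None:
        """Accept a new plan, purging first; raise StoreFull when out of room."""
        self.purge()
        if len(self.plans) >= self.capacity:
            raise StoreFull(f"no room for {flight_id}")
        self.plans[flight_id] = False


def run_shift(store: FlightPlanStore, n_flights: int) -> tuple[int, int]:
    """Feed n_flights through the store, each completing right after filing.

    Returns (accepted, handed_off_manually). Catching StoreFull models a
    graceful fallback to manual handling rather than a full shutdown.
    """
    accepted = manual = 0
    for i in range(n_flights):
        fid = f"FL{i:03d}"
        try:
            store.add(fid)
            store.complete(fid)
            accepted += 1
        except StoreFull:
            manual += 1
    return accepted, manual


healthy = run_shift(FlightPlanStore(capacity=5), 20)
broken = run_shift(FlightPlanStore(capacity=5, purge_enabled=False), 20)
print(healthy)  # (20, 0)
print(broken)   # (5, 15)
```

With deletion working, the store never holds more than one plan at a time and all 20 flights are accepted. With deletion disabled, the store fills after 5 plans and the remaining 15 have to be handled manually, which mirrors, in a very simplified way, the controllers' fallback to paper slips.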

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) The software failure incident related to air traffic control disruptions due to a glitch in a 1960s computer at the Washington Center in 2000 is an example of a failure that happened within the same organization or system. This incident led to delays and disruptions in air traffic control operations in the Northeast [3]. (b) The article mentions that earlier in the same first week of 2000, a similar kind of computer problem shut down air traffic control in New England, indicating that such incidents have occurred at different locations or organizations within the air traffic control system [3].
Phase (Design/Operation) design (a) The software failure incident is related to the design phase. The glitch in the 1960s software at the air traffic control center in Washington, D.C., which led to the virtual shutdown of air traffic in the Northeast, was caused by a flaw in the system's design. The article mentions that the computer was not deleting flight plans as it should have been, so the system became overloaded and shut down as new flight plans came in. This design flaw forced air traffic controllers to resort to manual, paper-based methods to transfer flight information [3]. (b) The software failure incident is not related to the operation phase or misuse of the system. The article does not mention any operation or misuse of the system contributing to the incident; the focus is on the design flaw in the 1960s software that caused the system to slow down and eventually shut down, delaying air traffic control operations [3].
Boundary (Internal/External) within_system (a) The software failure incident described in the article is within_system. The glitch in the 1960s computer software at the air traffic control center in Washington, D.C., led to the system being overloaded and shutting down. This glitch prevented the deletion of flight plans, causing a virtual shutdown of air traffic on the East Coast [3]. The issue originated from within the system itself, specifically from the outdated software that was unable to handle the incoming flight plans properly.
Nature (Human/Non-human) non-human_actions (a) The software failure incident described in the article was due to a non-human action, specifically a glitch in a 1960s computer software at the air traffic control center in Washington, D.C. This glitch caused the system to be overloaded and shut down, leading to significant delays in air traffic operations [3]. (b) The software failure incident was not attributed to human actions but rather to a glitch in the 1960s computer software that led to the system overload and shutdown [3].
Dimension (Hardware/Software) hardware, software (a) The software failure incident described in the article was primarily due to a glitch in a 1960s computer system at the air traffic control center in Washington, D.C. This glitch in the hardware caused the computer to stop deleting flight plans as it normally should, leading to an overload and subsequent shutdown of the system [3]. (b) The software failure incident was also attributed to a software problem. The glitch in the 1960s software caused the computer system to become overloaded and shut down when new flight plans were received, disrupting air traffic control operations in the Northeast and leading to significant delays for hundreds of flights [3].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident described in the article is non-malicious. The incident was caused by a glitch in a 1960s computer at the air traffic control center in Washington, D.C., which led to the system being overloaded and shutting down. This glitch prevented the computer from deleting flight plans as new ones came in, causing delays and disruptions in air traffic control operations [3].
Intent (Poor/Accidental Decisions) accidental_decisions (a) The software failure incident described in the article was not due to poor decisions but rather an accidental glitch in a 1960s computer system at the air traffic control center in Washington, D.C. The glitch caused the system to be overloaded and shut down, leading to significant delays in air traffic operations [3]. This incident was not a result of deliberate poor decisions but rather an unintended consequence of a software flaw in the aging system.
Capability (Incompetence/Accidental) accidental (a) The software failure incident described in the article was not due to development incompetence but rather a glitch in a 1960s computer system at the air traffic control center in Washington, D.C. The glitch caused the system to be overloaded and shut down, leading to delays in air traffic operations [3]. (b) The software failure incident was accidental in nature, as it was caused by a glitch in the 1960s software that was not intentionally introduced but rather occurred unexpectedly, impacting the air traffic control operations in the Northeast [3].
Duration temporary The software failure incident described in the article was temporary. The glitch in the 1960s software at the air traffic control center in Washington, D.C., caused the system to be overloaded and shut down, leading to delays in air traffic operations [3]. This incident was not a permanent failure but rather a temporary disruption caused by specific circumstances related to the software glitch.
Behaviour crash, omission (a) crash: The software failure incident described in the article can be categorized as a crash. The glitch in the 1960s computer at the air traffic control center in Washington, D.C., led to the system being overloaded and ultimately shutting down, causing a virtual shutdown of air traffic on the East Coast [3]. (b) omission: The software failure incident also involved an omission. The glitch in the 1960s software prevented the computer from deleting flight plans as it normally would, leading to an accumulation of flight plans and the system being overwhelmed [3]. (c) timing: The incident is not classified as a timing failure. The article notes only that it occurred shortly after the Y2K scare, when fears were focused on the ability of air traffic control computers to handle the year 2000; the actual problem that caused the shutdown was a software glitch rather than the Y2K issue [3]. (d) value: The incident is not described as the system performing its intended functions incorrectly. (e) byzantine: The incident did not exhibit behavior indicative of a byzantine failure. (f) other: The behavior of the incident can be described as a combination of a crash and an omission, where the system lost its state and failed to perform its intended functions due to the glitch in the 1960s software at the air traffic control center [3].

IoT System Layer

Layer Option Rationale
Perception processing_unit, embedded_software (a) sensor: The article does not mention any sensor-related failures. (b) actuator: The article does not mention any actuator-related failures. (c) processing_unit: The software failure incident described in the article is related to a glitch in a 1960s computer at the air traffic control center in Washington, D.C. This glitch in the processing unit's software caused the computer to stop deleting flight plans, leading to an overload and subsequent shutdown of the system [3]. (d) network_communication: The article does not mention any network communication-related failures. (e) embedded_software: The software failure incident is attributed to a glitch in the 1960s software used in the air traffic control center, indicating a failure related to embedded software [3].
Communication unknown The software failure incident described in the article does not directly relate to the communication layer of the cyber-physical system. The incident was caused by a glitch in a 1960s computer software at the air traffic control center in Washington, D.C., which led to the system being overloaded and shutting down. This glitch affected the functionality of the software in handling flight plans, leading to delays in air traffic control operations [3]. The failure was more related to the software processing and management rather than issues at the communication layer of the system.
Application TRUE The software failure incident described in the article [3] was related to the application layer of the air traffic control system. The glitch in the 1960s software caused the computer to stop deleting flight plans, leading to an overload and subsequent shutdown of the system. This issue was not due to hardware failure but rather a software problem introduced by a bug in the application layer of the system. The workaround of using paper slips to transfer flight information highlights the impact of the software failure on the operational efficiency of the air traffic control system.

Other Details

Category Option Rationale
Consequence delay (e) delay: People had to postpone an activity due to the software failure. The consequence of the software failure incident described in the article was significant delays in air travel. Due to the glitch in the 1960s computer software at the air traffic control center in Washington, D.C., the system was overloaded and shut down, leading to a virtual shutdown of air traffic on the East Coast. This resulted in hundreds of flights from major cities like Boston, New York, Philadelphia, and Washington, D.C., being directly affected, causing related delays for flights around the country as well [3].
Domain transportation (a) The failed system was intended to support the transportation industry, specifically air traffic control. The incident described in the article pertains to a glitch in a 1960s computer at the air traffic control center in Washington, D.C., which led to significant delays and disruptions in air travel in the Northeast region [3]. The air traffic control system is crucial for managing and directing the movement of aircraft in the skies, highlighting its connection to the transportation industry.

Sources
