Incident: Computer Error Leaves Astronaut Perched on ISS Ledge

Published Date: 2011-03-01

Postmortem Analysis
Timeline 1. The software failure incident happened on March 1, 2011. [Article 4283]
System The failure occurred in the work station that controls the robotic arm used during spacewalks outside the International Space Station (ISS). NASA officials attributed it to a computer software glitch. The work station shut down, leaving astronaut Stephen Bowen stranded with a broken cooling pump in his hands for nearly half an hour. The astronauts operating the arm had to rush to another computer station to regain control of it. 1. Work station controlling the robotic arm [Article 4283]
Responsible Organization 1. The article does not name a responsible organization; NASA officials attributed the incident to a computer software glitch [4283].
Impacted Organization 1. Astronaut Stephen Bowen was impacted by the software failure incident as he was left perched on a ledge at the International Space Station due to the technical glitch [Article 4283].
Software Causes 1. The software cause of the failure incident was a computer software glitch that led to a work station controlling the robot arm shutting down, leaving astronaut Stephen Bowen stranded with a broken cooling pump for nearly half an hour [4283].
Non-software Causes 1. The article does not identify any non-software cause. The shutdown of the work station controlling the robot arm, which left astronaut Stephen Bowen perched on a ledge at the International Space Station and delayed operations, was attributed to a computer software glitch [Article 4283].
Impacts
1. The software failure incident caused astronaut Stephen Bowen to be stuck on a ledge at the International Space Station for nearly half an hour during a spacewalk, impacting the progress of the mission [Article 4283].
2. The glitch on one of the station's robotic arms left Bowen gripping a broken cooling pump, leading to a delay in the spacewalk operations [Article 4283].
3. The astronauts had to rush to another computer station with manuals, notes, and laptops to get the robotic arm working again, causing a disruption in the spacewalk activities [Article 4283].
4. NASA officials attributed the incident to a computer software glitch, which was later corrected, highlighting the importance of software reliability in space missions [Article 4283].
Preventions 1. More thorough testing and validation of the software controlling the robotic arm before the spacewalk might have prevented the incident [4283]. 2. Redundant systems or rehearsed backup procedures for software glitches could have shortened the time the astronaut was left stranded during the spacewalk [4283].
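The second prevention can be illustrated with a minimal failover sketch. It is not how the ISS arm-control software works: the `Workstation` and `FailoverController` classes, the station names, and the `move_arm` command are all hypothetical, chosen only to show a command being rerouted automatically to a standby station when the primary stops responding.

```python
from dataclasses import dataclass, field


@dataclass
class Workstation:
    """Hypothetical stand-in for an arm-control work station."""
    name: str
    healthy: bool = True

    def send(self, command: str) -> str:
        # A station that has shut down rejects every command.
        if not self.healthy:
            raise RuntimeError(f"{self.name} is not responding")
        return f"{self.name} executed {command}"


@dataclass
class FailoverController:
    """Routes each command to the first station that accepts it."""
    stations: list
    log: list = field(default_factory=list)

    def send(self, command: str) -> str:
        for station in self.stations:
            try:
                return station.send(command)
            except RuntimeError as err:
                # Record the failure and try the next station in priority order.
                self.log.append(str(err))
        raise RuntimeError("no work station available")


primary = Workstation("primary")
backup = Workstation("backup")
controller = FailoverController([primary, backup])

first = controller.send("move_arm")
primary.healthy = False  # simulate the primary work station shutting down
second = controller.send("move_arm")
```

In this sketch the switch to the backup happens inside `send`, so the operator issues the same command either way. In the reported incident the switch was manual and took nearly half an hour.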
Fixes 1. NASA officials said the computer software glitch that shut down the robotic arm control station was later corrected. The article does not describe the fix, which was presumably a software update or patch [4283].
References 1. NASA officials [4283]

Software Taxonomy of Faults

Category Option Rationale
Recurring unknown (a) The software failure incident having happened again at one_organization: The article does not mention any previous similar incidents within the same organization (NASA) or with its products and services. Therefore, there is no evidence of this specific software failure incident happening again at NASA or with its products and services. (b) The software failure incident having happened again at multiple_organization: The article does not provide information about similar incidents happening at other organizations or with their products and services. Hence, there is no indication of this specific software failure incident occurring again at multiple organizations.
Phase (Design/Operation) design (a) The software failure incident in the article was related to the design phase. A computer software glitch shut down the work station controlling one of the station's robotic arms, leaving astronaut Stephen Bowen stuck with a cooling pump in his hands for nearly half an hour [4283]. Because NASA officials blamed a software glitch, the fault is taken to have been introduced during system development or system updates, although the article does not say so explicitly. (b) There is no information in the article indicating that the software failure incident was related to the operation phase.
Boundary (Internal/External) within_system (a) within_system: The software failure incident in the article was due to a computer software glitch within the system. NASA officials later confirmed that a computer software glitch caused the work station controlling the robotic arm to shut down, leaving astronaut Stephen Bowen stranded with a broken cooling pump for nearly half an hour [4283].
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident in the article was attributed to a computer software glitch, which was a non-human action. NASA officials later confirmed that the glitch had been corrected [4283]. (b) Human actions were involved in responding to the software failure incident. The astronauts operating the robotic arm inside the space station had to rush with all their manuals, notes, and laptops to another computer station in another room when the work station controlling the robot arm shut down. It took them a while to get the second station working, indicating human intervention in addressing the issue [4283].
Dimension (Hardware/Software) software (a) The article does not mention any contributing factor originating in hardware [4283]. (b) The software failure incident in the article was attributed to a computer software glitch, so the contributing factor originated in software. NASA officials later confirmed that the glitch had been corrected [4283].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident in the article was non-malicious. It was attributed to a computer software glitch that left astronaut Stephen Bowen stuck with a broken cooling pump for nearly half an hour during a spacewalk at the International Space Station [4283]. The glitch was not intentional but the result of a technical issue in the software controlling the robotic arm.
Intent (Poor/Accidental Decisions) unknown (a) The article gives no indication that the software failure incident resulted from poor decisions. NASA officials attributed it only to a computer software glitch, which left astronaut Stephen Bowen stuck with a cooling pump in his hands for nearly half an hour [4283]. (b) The article likewise gives no indication that the incident resulted from accidental decisions. It reports only that the glitch occurred during the spacewalk outside the International Space Station and was later corrected, according to NASA officials [4283].
Capability (Incompetence/Accidental) accidental (a) The article gives no indication that the incident resulted from a lack of professional competence. It was attributed to a computer software glitch that caused the work station controlling the robotic arm to shut down unexpectedly, leaving astronaut Stephen Bowen stranded on a ledge at the International Space Station for nearly half an hour [4283]. (b) The software failure incident in the article was described as accidental: it was not intentional but the result of an unexpected technical glitch that occurred during the spacewalk operation [4283].
Duration temporary The software failure incident described in Article 4283 was temporary. The glitch on one of the station's robotic arms left astronaut Stephen Bowen stuck with a cooling pump in his hands for nearly half an hour. NASA officials later blamed a computer software glitch for the issue, which was corrected. The astronauts had to rush to another computer station in another room to get the arm working again, indicating a temporary failure [4283].
Behaviour crash (a) crash: The software failure incident in the article can be categorized as a crash. The article mentions that a technical glitch caused the work station controlling the robotic arm to shut down, leaving astronaut Stephen Bowen stuck with a broken cooling pump for nearly half an hour. The system lost its state and stopped performing its intended function of controlling the robotic arm, which left Bowen stranded on a ledge at the International Space Station [Article 4283]. (b) omission: The article does not specifically mention the system omitting to perform its intended functions at an instance(s). (c) timing: The software failure incident is not related to the system performing its intended functions too late or too early. (d) value: The software failure incident is not described as the system performing its intended functions incorrectly. (e) byzantine: The software failure incident is not characterized by the system behaving erroneously with inconsistent responses and interactions. (f) other: No other behaviour applies; the incident is categorized as a crash, as described in (a).

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence delay, non-human The consequence of the software failure incident in Article 4283 was a delay. The software glitch in the computer controlling the robotic arm at the International Space Station left astronaut Stephen Bowen stuck with a cooling pump in his hands for nearly half an hour, causing a delay in the spacewalk operation [4283].
Domain knowledge (a) The software failure incident reported in Article 4283 was related to the space exploration industry. The incident occurred during a spacewalk outside the International Space Station (ISS), when a computer software glitch left astronaut Stephen Bowen perched on a ledge [4283].
