Incident: Starliner Capsule Experienced Thruster Malfunction During Test Flight

Published Date: 2022-05-25

Postmortem Analysis
Timeline 1. The software failure incident involving the malfunctioning thrusters on the Starliner spacecraft happened during the test flight that took place on Wednesday [127573]. 2. The article was published on 2022-05-25. 3. The incident with the malfunctioning thrusters on the Starliner spacecraft occurred on May 25, 2022.
System 1. Global positioning satellites communication system 2. Capsule's thrusters 3. Service module's thrusters 4. Cooling system [127573]
Responsible Organization 1. Boeing, whose Starliner spacecraft experienced software failures during its test flight, including glitches during landing and malfunctioning thrusters [127573].
Impacted Organization 1. Boeing's Starliner spacecraft [127573]
Software Causes 1. Software flaws during a previous test flight in December 2019 caused the mission to be cut short without Starliner docking at the space station [127573].
Non-software Causes 1. Malfunctioning thrusters on the Starliner spacecraft during landing [127573]. 2. Stuck valves on the spacecraft that scuttled the countdown for a previous launch attempt [127573].
Impacts 1. The software failure incident during the Starliner spacecraft's landing caused a drop in communication with global positioning satellites and a potential malfunction in one of the capsule's thrusters, impacting the spacecraft's navigation and control systems [127573]. 2. The malfunctioning thrusters were on the service module, which burned up upon re-entering the atmosphere, so engineers could not examine them directly, potentially hindering the root cause analysis and resolution process [127573]. 3. The software flaws experienced during a previous test flight in December 2019 led to the mission being cut short without docking at the space station, highlighting the critical impact of software failures on mission success and objectives [127573].
Preventions 1. Thorough testing and validation of the software before the mission could have potentially prevented the software failure incident [127573]. 2. Implementing robust software quality assurance processes to identify and address any potential software flaws or bugs prior to launch [127573]. 3. Conducting more extensive simulations and scenario testing to uncover any potential issues with the software under various conditions [127573].
Fixes 1. Conduct a thorough investigation to identify the root cause of the software failure incident, particularly focusing on the malfunctioning thrusters and the sluggish cooling system [127573]. 2. Develop and implement software patches or updates to address the identified software flaws that led to the malfunctioning thrusters and cooling system operation issues [127573]. 3. Perform rigorous testing and validation of the software fixes to ensure that the issues with the thrusters and cooling system have been effectively resolved before proceeding with the crewed flight test [127573].
References 1. Steve Stich, manager of the commercial crew program at NASA [127573] 2. Bob Hines, a NASA astronaut currently aboard the space station [127573]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization The software failure incident related to the Starliner spacecraft by Boeing had occurred before within the same organization. During a previous test flight in December 2019, software flaws caused the mission to be cut short without Starliner docking at the space station, leading to the need for a second uncrewed test flight [127573].
Phase (Design/Operation) design, operation (a) The software failure incident related to the design phase can be seen in the article where it mentions that during the troubleshooting on Wednesday, two of the spacecraft’s thrusters failed to provide the expected thrust, and engineers will have to work on that after the flight [127573]. This indicates a design flaw or issue with the thrusters that was present during the development phase. (b) The software failure incident related to the operation phase is evident in the article where it describes that during the approach to the space station, two smaller thrusters failed on Sunday, but they worked as expected when tested on Wednesday [127573]. This highlights a failure related to the operation or functioning of the thrusters during the mission.
Boundary (Internal/External) within_system (a) within_system: The software failure incident related to the Starliner spacecraft included glitches during landing, such as dropping communication with global positioning satellites and potential malfunctioning of one of the capsule's thrusters [127573]. Additionally, during launch, two of the spacecraft's thrusters failed, and during troubleshooting, these thrusters exhibited problems, putting out only about one-quarter of the expected thrust [127573]. These issues were internal to the spacecraft's system and software. (b) outside_system: The article does not mention any contributing factors originating from outside the system that led to the software failure incident.
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident related to non-human actions in the article is the malfunctioning of the Starliner spacecraft's thrusters during landing. The article mentions that during the troubleshooting, the two thrusters exhibited problems, putting out only about one-quarter of the expected thrust, and the smaller thrusters used during the approach to the space station also failed on Sunday [127573]. (b) The software failure incident related to human actions in the article is the software flaws that occurred during a previous test flight in December 2019, causing the mission to be cut short without Starliner docking at the space station. This led to NASA requiring Boeing to undertake a second uncrewed test flight. Additionally, the article mentions that Boeing had to spend months investigating and remediating stuck valves on the spacecraft, which scuttled the countdown for a planned launch in August 2021 [127573].
Dimension (Hardware/Software) hardware, software (a) The software failure incident related to hardware: - The article mentions that two of the spacecraft’s thrusters failed during launch, just before Starliner entered orbit, and that they exhibited problems again during troubleshooting on Wednesday [127573]. - It is also noted that the malfunctioning thrusters were on the service module, which burned up upon re-entering the atmosphere, making it impossible for engineers to directly examine them [127573]. (b) The software failure incident related to software: - The article highlights that software flaws during a previous test flight in December 2019 caused the mission to be cut short without Starliner docking at the space station, leading to the requirement for a second uncrewed test flight [127573]. - Additionally, the article mentions that the cooling system of the Starliner operated a bit sluggishly during the recent test flight, indicating a software-related issue [127573].
Objective (Malicious/Non-malicious) non-malicious (a) The article does not mention any malicious software failure incident related to the Starliner spacecraft. (b) The software failure incidents mentioned in the article were non-malicious. The article discusses glitches experienced during landing, communication issues with global positioning satellites, malfunctioning thrusters, and failures of certain thrusters during different stages of the mission. These issues were identified as part of the test flight process to uncover problems and improve the spacecraft's performance for future missions [127573].
Intent (Poor/Accidental Decisions) unknown The article does not mention any software failure incident related to poor_decisions or accidental_decisions.
Capability (Incompetence/Accidental) development_incompetence (a) The software failure incident related to development incompetence is evident in the article as it mentions software flaws during a previous test flight in December 2019 that caused the mission to be cut short without Starliner docking at the space station. This led to NASA requiring Boeing to undertake a second uncrewed test flight [127573]. (b) The software failure incident related to accidental factors is seen in the article when it mentions that during the troubleshooting on Wednesday, two of the spacecraft's thrusters failed to provide the expected thrust, and other thrusters exhibited problems. This issue arose during the launch when two thrusters failed just before Starliner entered orbit, and other thrusters had to compensate automatically [127573].
Duration permanent, temporary The software failure incident related to the Starliner spacecraft's landing included both temporary and permanent aspects: (a) Permanent: The article mentions that during the troubleshooting on Wednesday, two of the spacecraft's thrusters exhibited problems, putting out only about one-quarter of the expected thrust. These malfunctioning thrusters were on the service module, which burned up upon re-entering the atmosphere, making it impossible for engineers to directly examine them [127573]. (b) Temporary: The article also notes that during the approach to the space station, two smaller thrusters failed on Sunday but worked as expected when tested on Wednesday. Additionally, four other smaller thrusters were tested without any problems [127573].
Behaviour other (a) crash: The software failure incident in the article is not described as a crash where the system loses state and does not perform any of its intended functions [127573]. (b) omission: The software failure incident in the article is not described as an omission where the system omits to perform its intended functions at an instance(s) [127573]. (c) timing: The software failure incident in the article is not described as a timing issue where the system performs its intended functions correctly, but too late or too early [127573]. (d) value: The software failure incident in the article is not described as a failure due to the system performing its intended functions incorrectly [127573]. (e) byzantine: The software failure incident in the article is not described as a byzantine failure where the system behaves erroneously with inconsistent responses and interactions [127573]. (f) other: The software failure incident in the article is described as glitches during landing, communication drop with GPS satellites, malfunctioning thrusters, and issues with the service module, which could be categorized as other unexpected behaviors [127573].

IoT System Layer

Layer Option Rationale
Perception embedded_software The software failure incident related to the perception layer of the cyber physical system in the Starliner spacecraft was primarily associated with the embedded software. The incident involved glitches during landing, including communication issues with global positioning satellites and potential malfunctions in one of the capsule's thrusters [127573]. Additionally, the article mentions that during the troubleshooting process, some thrusters exhibited problems, with two of them failing during the approach to the space station [127573]. This indicates that the failure was not solely related to the actuator or processing unit but rather involved issues with the embedded software controlling the thrusters. Therefore, the software failure incident in the Starliner spacecraft was mainly attributed to contributing factors introduced by embedded software errors.
Communication connectivity_level The software failure incident related to the communication layer of the cyber physical system that failed was at the connectivity_level. The incident involved the Starliner spacecraft experiencing glitches during landing, including dropping communication with global positioning satellites for a while before reconnecting, and a malfunction in one of the capsule's thrusters [127573]. These issues point towards problems introduced at the network or transport layer of the communication system.
Application FALSE The software failure incident mentioned in the article was not specifically related to the application layer of the cyber physical system. The issues mentioned were primarily related to glitches during landing, malfunctioning thrusters, and a sluggish cooling system, rather than being explicitly attributed to bugs, operating system errors, unhandled exceptions, or incorrect usage typically associated with the application layer [127573].

Other Details

Category Option Rationale
Consequence delay, other (a) death: People lost their lives due to the software failure - There is no mention of any deaths resulting from the software failure incident reported in the article [127573]. (b) harm: People were physically harmed due to the software failure - There is no mention of any physical harm to individuals due to the software failure incident reported in the article [127573]. (c) basic: People's access to food or shelter was impacted because of the software failure - There is no mention of people's access to food or shelter being impacted by the software failure incident reported in the article [127573]. (d) property: People's material goods, money, or data was impacted due to the software failure - The software failure incident involving Boeing's Starliner spacecraft did not result in any direct impact on people's material goods, money, or data [127573]. (e) delay: People had to postpone an activity due to the software failure - The software failure incident did cause delays in the Starliner mission, as mentioned in the article. For example, the mission had to be cut short during a previous test flight in December 2019 due to software flaws, leading to the need for a second uncrewed test flight [127573]. (f) non-human: Non-human entities were impacted due to the software failure - The software failure incident primarily affected the Starliner spacecraft and its systems, such as the malfunctioning thrusters and cooling system. There is no mention of non-human entities being directly impacted [127573]. (g) no_consequence: There were no real observed consequences of the software failure - The software failure incident did have consequences, such as glitches during landing, malfunctioning thrusters, and a sluggish cooling system. These issues required troubleshooting and could impact the timeline for future crewed flights [127573]. (h) theoretical_consequence: There were potential consequences discussed of the software failure that did not occur - The article does not mention any potential consequences discussed that did not actually occur as a result of the software failure incident [127573]. (i) other: Was there consequence(s) of the software failure not described in the (a to h) options? What is the other consequence(s)? - The software failure incident led to glitches during landing, communication issues with global positioning satellites, malfunctioning thrusters, and a sluggish cooling system. These issues required troubleshooting and could impact the timeline for future crewed flights, as mentioned in the article [127573].
Domain transportation, knowledge (a) The failed system was intended to support the transportation industry. The software failure incident occurred during the landing of Boeing's Starliner spacecraft, which is a transportation system developed for NASA to take astronauts to and from the International Space Station [127573].

