Recurring |
one_organization, multiple_organization |
(a) The software failure incident related to the Distributed Common Ground System-Army (DCGS-A) has been ongoing within the U.S. Army. The system has been described as difficult to operate, prone to crashes, extremely hackable, and not survivable in its current state [13921].
(b) The incident involving the DCGS-A system has also sparked controversy within the military community, leading a network of Marines, Special Operations Forces, and intelligence officers to attempt to replace the system with rival software from the Silicon Valley start-up Palantir. This indicates that dissatisfaction with the system extended across multiple organizations within the military sector [13921]. |
Phase (Design/Operation) |
design, operation |
(a) The software failure incident related to the design phase is evident in the article. The Distributed Common Ground System-Army (DCGS-A) was criticized for being difficult to operate, prone to crashes, extremely hackable, and having poor reliability in the low operational tempo environment. The system's hardware and software characteristics negatively impacted operator confidence and increased frustration, with issues such as multiple open screens required to complete a task, workstation freeze-ups, and data conversion errors [13921].
(b) The software failure incident related to the operation phase is also highlighted in the article. The Army Test and Evaluation Command identified vulnerabilities in the DCGS-A that were exploited by the Threat Computer Network Operations Team. The testers recommended issuing a warning to all units using DCGS-A about one vulnerability and putting together a timeline for mitigating the rest as soon as possible to improve the system's survivability [13921]. |
Boundary (Internal/External) |
within_system, outside_system |
(a) within_system: The software failure incident related to the Distributed Common Ground System-Army (DCGS-A) was primarily due to factors originating from within the system itself. The Army's internal testers found the system to be difficult to operate, prone to crashes, extremely hackable, and with poor reliability in low operational tempo environments [13921]. Additionally, the system had issues with hardware and software ease of use, such as requiring multiple open screens to complete a task, workstation freeze-ups, and data conversion errors, which negatively impacted operator confidence and increased frustration [13921].
(b) outside_system: The software failure incident did not primarily involve contributing factors originating from outside the system. However, there were mentions of a network of Marines, Special Operations Forces, and intelligence officers attempting to supplant the DCGS-A system with rival software from a Silicon Valley start-up, Palantir. This external competition and push for an alternative system added to the controversy surrounding DCGS-A [13921]. |
Nature (Human/Non-human) |
non-human_actions, human_actions |
(a) The software failure incident in the U.S. Army's Distributed Common Ground System-Army (DCGS-A) was attributed to non-human actions such as poor reliability, server failures, hardware and software ease of use issues, and vulnerabilities that were exploited by the Threat Computer Network Operations Team [13921].
(b) Human actions also played a role in the software failure incident as there were controversies within the military community regarding the DCGS-A system and efforts by certain officers to replace it with rival software from Palantir. Bureaucrats from the Army's planning directorate were involved in blasting those officers as Palantir stooges, rescinding reports recommending Palantir servers, and ordering the destruction of all copies. Additionally, the Army had signed a cooperative research agreement with Palantir to potentially improve DCGS-A based on Palantir's simplicity [13921]. |
Dimension (Hardware/Software) |
hardware, software |
(a) The software failure incident related to hardware:
- The article mentions that poor reliability was observed in the low operational tempo environment, with server failures that resulted in reboots/restarts every 5.5 hours of testing [13921].
- The hardware and software 'ease of use' characteristics negatively impacted operator confidence and increased their frustration, with issues like multiple open screens required to complete a single task, workstation freeze-ups, and data conversion errors [13921].
(b) The software failure incident related to software:
- The Distributed Common Ground System-Army (DCGS-A) was described as difficult to operate, prone to crashes, and extremely hackable [13921].
- The Army Test and Evaluation Command survey found Palantir's software to be more stable and more intuitive to operate compared to the "overcomplicated" DCGS-A [13921]. |
Objective (Malicious/Non-malicious) |
malicious |
(a) The software failure incident is related to malicious factors in that the system was vulnerable to human actors intending to harm it. The article describes the Distributed Common Ground System-Army (DCGS-A) as extremely hackable, and the Threat Computer Network Operations Team, acting as a simulated adversary during testing, was able to identify and exploit several vulnerabilities in DCGS-A [13921]. In light of these security flaws, the testers recommended issuing a warning to all units about one vulnerability and mitigating the rest as soon as possible to make the system survivable with limitations [13921]. |
Intent (Poor/Accidental Decisions) |
poor_decisions, accidental_decisions |
(a) The intent of the software failure incident related to poor decisions can be seen in the article. The Distributed Common Ground System-Army (DCGS-A) was described by the Army's own internal testers as difficult to operate, prone to crashes, extremely hackable, and not survivable. Despite these findings, bureaucrats from the Army's planning directorate resisted efforts within the military community to replace DCGS-A with rival software from Palantir, a Silicon Valley start-up. Persisting with a system deemed ineffective and vulnerable can be considered a poor decision that contributed to the software failure incident [13921].
(b) The software failure incident can also be attributed to accidental decisions or unintended consequences. The Army Test and Evaluation Command chief, Maj. Gen. Genaro Dellarocco, highlighted various issues with DCGS-A, such as poor reliability, server failures, operator frustration due to complex operations, and potential security vulnerabilities. These issues were not intentional but rather a result of the system's design and implementation, leading to unintended consequences that contributed to the failure of the software [13921]. |
Capability (Incompetence/Accidental) |
development_incompetence |
(a) The software failure incident related to development incompetence is evident in the case of the Distributed Common Ground System-Army (DCGS-A) described in Article 13921. The Army's internal testers found the system difficult to operate, prone to crashes, extremely hackable, and not survivable; it was described as "a piece of junk." Reliability was poor, with server failures requiring reboots/restarts every 5.5 hours of testing. The system's hardware and software characteristics also negatively impacted operator confidence and increased frustration: multiple open screens were required to complete a single task, workstations froze up, and converting data into different formats introduced errors. The Army Test and Evaluation Command survey found Palantir's software to be more stable and more intuitive to operate than the "overcomplicated" DCGS-A [13921].
(b) The software failure incident related to accidental factors is not explicitly mentioned in the article. |
Duration |
temporary |
The software failure incident described in the article aligns more with a temporary failure than a permanent one. The article identifies specific contributing factors that led to the failure, such as the system being difficult to operate, prone to crashes, extremely hackable, and having poor reliability in certain operational conditions [13921]. Additionally, the testers recommended issuing warnings about vulnerabilities and establishing timelines for mitigating them, indicating that steps could be taken to address the problems and potentially improve the system's survivability with limitations [13921]. |
Behaviour |
crash, other |
(a) crash: The software failure incident mentioned in the articles includes crashes as a significant issue. The Distributed Common Ground System-Army (DCGS-A) experienced server failures that resulted in reboots/restarts every 5.5 hours of testing, indicating a high frequency of crashes [13921].
(b) omission: The software failure incident does not specifically mention instances of the system omitting to perform its intended functions at a particular instance. However, the system's overall difficulty of operation, the multiple open screens required to complete a single task, and potential data transfer errors from converting data into different formats could lead to instances of omission [13921].
(c) timing: The software failure incident does not directly relate to timing issues where the system performs its intended functions too late or too early. The focus of the incident is more on the reliability, ease of use, and security vulnerabilities of the DCGS-A system [13921].
(d) value: The software failure incident does not explicitly mention the system performing its intended functions incorrectly in terms of providing incorrect outputs or results. The primary concerns highlighted in the incident are related to reliability, ease of use, and security vulnerabilities of the DCGS-A system [13921].
(e) byzantine: The software failure incident does not describe the system exhibiting the inconsistent responses and interactions that would align with byzantine behavior. The main issues highlighted are crashes, difficulty of operation, and potential security vulnerabilities of the DCGS-A system [13921].
(f) other: The other behavior observed in the software failure incident is the system requiring multiple open screens to complete a single task, experiencing workstation freeze-ups due to multiple windows being open, and the need to convert data into different formats which added steps and may have introduced data transfer errors. This behavior falls under the category of operational inefficiencies and usability issues [13921]. |