Incident: Vulnerabilities in Treck's TCP-IP Stack Lead to Ripple20 Incident

Published Date: 2020-06-16

Postmortem Analysis
Timeline 1. The article reporting the software failure incident, known as Ripple20, was published on June 16, 2020 [101208]. The incident therefore occurred on or before June 16, 2020; the article gives no exact date.
System 1. Treck's TCP-IP stack - contained hackable vulnerabilities [101208] 2. Various internet-of-things devices from over 10 manufacturers including HP, Intel, Rockwell Automation, Caterpillar, and Schneider Electric - affected by the Ripple20 vulnerabilities [101208]
Responsible Organization 1. Treck, the Ohio-based software company, was responsible for the software failure incident because its code was riddled with hackable vulnerabilities [101208].
Impacted Organization 1. Hundreds of millions of gadgets across the globe, including medical devices, printers, power grid equipment, and railway equipment, were impacted by the software failure incident [101208].
Software Causes 1. The software failure incident was caused by a collection of vulnerabilities in the TCP-IP stack code sold by Treck, a software company, which were identified by Israeli security firm JSOF [101208].
Non-software Causes 1. Lack of standardization and safe coding guidelines in the industry [101208] 2. Insecure coding practices that made the vulnerabilities exploitable [101208]
Impacts 1. The software failure incident known as Ripple20 comprises 19 hackable bugs in software code sold by Treck, affecting hundreds of millions of gadgets globally, including medical devices, printers, power grid equipment, and railway equipment [101208]. 2. The vulnerabilities in the software code allowed hackers to potentially take complete control of affected devices, run their own commands remotely, or leak sensitive information, posing severe security risks [101208]. 3. The Cybersecurity and Infrastructure Security Agency (CISA) rated six of the 19 bugs as severe (between 7 and 10 on the CVSS scale), with two bugs scoring a 10 out of 10, indicating the critical nature of the vulnerabilities [101208]. 4. The impacted devices ranged from power supply systems in data centers to programmable logic controllers in power grids and manufacturing, as well as medical infusion pumps, highlighting the diverse range of devices at risk due to the software failure incident [101208]. 5. The incident raised concerns about the potential for sabotage in critical infrastructure sectors like power utilities, railway, manufacturing, and medical environments, as hackers could exploit the vulnerabilities for espionage or malicious activities once inside the network [101208].
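As a rough illustration of the CVSS figures in Impacts item 3, the sketch below maps a base score to a qualitative severity band. It assumes the CVSS v3.x bands (None, Low, Medium, High, Critical); the article does not state which CVSS version CISA used. The sample scores are invented for illustration and are not the actual Ripple20 scores.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS base score to its qualitative band (CVSS v3.x assumed)."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Hypothetical scores, for illustration only.
example_scores = [10.0, 10.0, 9.8, 7.3, 5.3, 3.1]

# Keep only the scores that fall in the 7-to-10 range described as severe.
high_or_worse = [s for s in example_scores
                 if cvss_severity(s) in ("High", "Critical")]
```

Under these assumed bands, a score of 10 falls in the Critical band and anything from 7.0 upward is at least High.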
Preventions 1. Conducting thorough vulnerability analysis and security testing of the software code before deployment could have prevented the software failure incident [101208]. 2. Following safe coding guidelines and standards recommended by cybersecurity authorities like the US Computer Emergency Response Team and the Department of Defense could have helped in identifying and fixing the vulnerabilities in the software code [101208]. 3. Regularly updating and patching software to address known vulnerabilities could have mitigated the risk of exploitation by hackers [101208].
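Preventions item 2 refers to safe coding guidelines without naming a specific rule. The article does not describe the individual bugs, so the sketch below is not a reconstruction of any Ripple20 flaw. It only illustrates one widely recommended rule for network parsers: never trust a length field supplied by the sender. The record layout (1-byte type, 2-byte big-endian payload length) and the name `parse_record` are hypothetical.

```python
import struct
from typing import Tuple

# Hypothetical header: 1-byte record type, 2-byte big-endian payload length.
HEADER = struct.Struct("!BH")


def parse_record(packet: bytes) -> Tuple[int, bytes]:
    """Parse one record, rejecting it if the claimed length cannot be trusted."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than header")
    rec_type, claimed_len = HEADER.unpack_from(packet, 0)

    # Compare the sender-supplied length with the bytes actually received.
    available = len(packet) - HEADER.size
    if claimed_len > available:
        raise ValueError(
            f"claimed length {claimed_len} exceeds {available} bytes received"
        )

    payload = packet[HEADER.size:HEADER.size + claimed_len]
    return rec_type, payload
```

In a memory-unsafe language the same check would sit in front of any copy or pointer arithmetic that uses the claimed length.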
Fixes 1. Patching vulnerable devices with software updates to protect them from attacks [101208]. 2. Implementing defensive measures such as firewalls and removing connections to the public internet to minimize the risk of exploitation of vulnerabilities [101208]. 3. Conducting thorough vulnerability analysis and following safe coding guidelines to prevent similar incidents in the future [101208].
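Fixes item 2 recommends removing connections to the public internet. A first-pass triage of a device inventory could look like the sketch below, which flags devices whose address is globally routable using Python's standard `ipaddress` module. The inventory, the device names, and the function name `publicly_routable` are hypothetical, and 8.8.8.8 is only a stand-in for some public address. A device with a private address can still be exposed through NAT or port forwarding, so this check is a starting point rather than proof of isolation.

```python
import ipaddress
from typing import Dict, List


def publicly_routable(inventory: Dict[str, str]) -> List[str]:
    """Return the names of devices whose IP address is globally routable."""
    exposed = []
    for name, addr in inventory.items():
        if ipaddress.ip_address(addr).is_global:
            exposed.append(name)
    return sorted(exposed)


# Hypothetical inventory, for illustration only.
inventory = {
    "infusion-pump-01": "10.20.0.15",   # private range
    "ups-dc2": "192.168.4.9",           # private range
    "plc-line3": "8.8.8.8",             # stand-in for a public address
}

exposed_devices = publicly_routable(inventory)
```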
References 1. Security experts 2. Israeli security firm JSOF 3. Treck (Ohio-based software company) 4. Cybersecurity and Infrastructure Security Agency (CISA) 5. Red Balloon Security 6. Shodan (search engine for internet-connected devices) 7. Digi (embedded device firm) 8. Intel 9. HP 10. US Computer Emergency Response Team 11. Department of Defense [101208]

Software Taxonomy of Faults

Category Option Rationale
Recurring multiple_organization (a) The software failure incident related to the Ripple20 vulnerabilities has affected multiple organizations. The vulnerabilities were found in software code sold by Treck, a provider of software used in internet-of-things devices, and were present in devices from more than 10 manufacturers, including HP, Intel, Rockwell Automation, Caterpillar, and Schneider Electric [101208]. (b) The incident has not been specifically mentioned to have happened again within the same organization (Treck) or with its products and services.
Phase (Design/Operation) design, operation (a) The software failure incident in the article is primarily related to the design phase. The incident occurred due to vulnerabilities in the software code developed by a company called Treck, which is a provider of software used in internet-of-things devices. The vulnerabilities in the TCP-IP stack developed by Treck were found in devices from more than 10 manufacturers, affecting hundreds of millions of gadgets globally [101208]. (b) The software failure incident also has implications for the operation phase. The vulnerabilities discovered in the software code could allow hackers to take complete control of affected devices, leading to potential paralysis or the execution of malicious code. This poses a significant risk, especially for devices in critical sectors such as power utilities, railway systems, manufacturing, and medical environments. The need for software updates and patches to protect vulnerable devices highlights the importance of proper operation and maintenance practices to mitigate the risks posed by such vulnerabilities [101208].
Boundary (Internal/External) within_system (a) The software failure incident related to the Ripple20 vulnerabilities can be categorized as within_system. The vulnerabilities were found in the TCP-IP stack code provided by Treck, a software company, which was integrated into various devices from different manufacturers [101208]. The vulnerabilities were inherent to the code itself and were present in a wide range of devices, indicating that the failure originated from within the system.
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident in this case was primarily due to non-human actions, specifically vulnerabilities in the software code provided by the Ohio-based software company Treck. These vulnerabilities, known as Ripple20, were identified by security researchers from JSOF and were found in the TCP-IP protocol stack used in a wide range of internet-of-things devices from various manufacturers [101208]. (b) Human actions also played a role in this software failure incident as the vulnerabilities in the software code were introduced by the developers at Treck. The insecure coding practices and lack of proper vulnerability analysis allowed these vulnerabilities to exist in the software used in hundreds of millions of devices across different industries [101208].
Dimension (Hardware/Software) software (a) The software failure incident reported in the articles is primarily due to contributing factors that originate in software. The incident involves a collection of vulnerabilities named Ripple20 found in software designed to enable internet connections, specifically in the TCP-IP stack provided by a company called Treck. These vulnerabilities have affected hundreds of millions of devices globally, from medical devices to power grid equipment, due to flaws in the software code [101208]. The vulnerabilities in the software have allowed hackers to potentially take complete control of the affected devices, leading to serious security risks [101208]. (b) The software failure incident is not directly attributed to hardware issues but rather to vulnerabilities in the software code provided by Treck. The vulnerabilities in the TCP-IP stack have led to a situation where a significant number of devices across various industries are at risk of being hacked or controlled by malicious actors [101208]. The incident underscores the importance of addressing software vulnerabilities to prevent widespread security breaches in connected devices.
Objective (Malicious/Non-malicious) malicious (a) The software failure incident related to the Ripple20 vulnerabilities can be categorized as malicious. Security experts have warned about the vulnerabilities in the software designed to enable internet connections, which have ended up in hundreds of millions of gadgets across the globe, making them vulnerable to attacks by hackers [101208]. The vulnerabilities in the code sold by Treck, a provider of software used in internet-of-things devices, were identified by researchers, allowing hackers to potentially take complete control of affected devices, run their own commands remotely, or leak sensitive information [101208]. The Cybersecurity and Infrastructure Security Agency rated some of the vulnerabilities as severe, with two bugs scoring a 10 out of 10 on the CVSS score [101208]. The potential for sabotage or espionage through exploiting these vulnerabilities highlights the malicious nature of the software failure incident [101208]. (b) The software failure incident related to the Ripple20 vulnerabilities can also be categorized as non-malicious. The vulnerabilities in the TCP-IP stack code sold by Treck were discovered during a security analysis of a single device, indicating that the bugs were unintentionally present in a wide range of connected devices due to the complex supply chain involved in IoT devices [101208]. While the vulnerabilities were not intentionally introduced to harm the systems, they still posed a significant risk to the security of the affected devices, requiring software updates to patch the flaws and protect them from potential attacks [101208]. The prevalence of these vulnerabilities across various devices for years underscores the challenges in ensuring the security of interconnected systems in the internet of things ecosystem [101208].
Intent (Poor/Accidental Decisions) poor_decisions, accidental_decisions From the provided article [101208], the software failure incident related to the Ripple20 vulnerabilities can be attributed to both poor decisions and accidental decisions: (a) poor_decisions: The incident can be linked to poor decisions in terms of insecure coding practices that made the Ripple20 bugs exploitable. The vulnerabilities in the software were a result of inadequate security measures and lack of adherence to safe coding guidelines, which should have been followed to prevent such issues [101208]. (b) accidental_decisions: The incident also involves accidental decisions or unintended consequences, as the vulnerabilities were not intentionally introduced but rather stemmed from mistakes in the coding and security practices of the software. The bugs were not deliberately included but were a byproduct of the software development process [101208].
Capability (Incompetence/Accidental) development_incompetence (a) The software failure incident reported in the article is related to development incompetence. The incident involved a collection of vulnerabilities named Ripple20, which were identified in code sold by a software company called Treck. These vulnerabilities were found in the devices of more than 10 manufacturers, affecting hundreds of millions of gadgets globally, from medical devices to printers to power grid and railway equipment [101208]. The vulnerabilities in the software code were serious, allowing hackers to run their own commands on target devices (remote code execution) or leak sensitive information. The Cybersecurity and Infrastructure Security Agency rated six of the 19 bugs between 7 and 10 on the CVSS score, with two bugs scoring a 10 out of 10, indicating severe vulnerabilities [101208]. The incident highlighted the lack of secure coding practices and vulnerability analysis in the development process, as the vulnerabilities in the TCP-IP stack provided by Treck were not identified and addressed before being integrated into a wide range of connected devices. This failure in professional competence led to the widespread presence of hackable vulnerabilities in IoT devices, emphasizing the need for improved security practices in software development [101208]. (b) The article does not attribute the incident to accidental factors. The vulnerabilities were not deliberately introduced, but they are attributed to insecure coding practices and a lack of vulnerability analysis rather than to chance. Once present, they could be exploited by hackers to take control of devices or execute malicious code [101208].
Duration permanent, temporary (a) The software failure incident described in the articles seems to fall under the category of a permanent failure. The vulnerabilities in the software code, known as Ripple20, were identified in devices from various manufacturers, including HP, Intel, Rockwell Automation, Caterpillar, and Schneider Electric. These vulnerabilities have likely been present in the devices for years, making them hackable and in need of patching to protect them from potential attacks [101208]. The article also mentions that some internet-of-things devices, especially those in industrial settings with little downtime, often go unpatched for years, indicating a long-term vulnerability [101208]. (b) The software failure incident can also be considered as a temporary failure in some aspects. While the vulnerabilities have been present in the devices for years, efforts have been made to address the issue. JSOF, the Israeli security firm that discovered the vulnerabilities, has contacted vendors of affected devices and many companies have released software updates to address the vulnerabilities [101208]. Additionally, some companies like Digi and Intel have already fixed some of the vulnerabilities in their products through updates or patches, indicating a temporary nature of the failure as it is being actively mitigated [101208].
Behaviour value (a) crash: The software failure incident in the article is not described as a crash where the system loses state and does not perform any of its intended functions. (b) omission: The software failure incident in the article is not described as an omission where the system omits to perform its intended functions at an instance(s). (c) timing: The software failure incident in the article is not described as a timing issue where the system performs its intended functions correctly but too late or too early. (d) value: The software failure incident in the article is described as a value issue where the system performs its intended functions incorrectly. The vulnerabilities in the software allowed hackers to run their own commands on target devices, leading to remote code execution and leakage of sensitive information [101208]. (e) byzantine: The software failure incident in the article is not described as a byzantine failure where the system behaves erroneously with inconsistent responses and interactions. (f) other: The software failure incident in the article is not described as a crash, omission, timing, or byzantine failure. The behavior of the software failure incident is related to the system performing its intended functions incorrectly due to hackable vulnerabilities in the software, allowing hackers to take control of devices and run malicious code [101208].

IoT System Layer

Layer Option Rationale
Perception network_communication, embedded_software (a) sensor: The software failure incident related to the Ripple20 vulnerabilities was not specifically attributed to sensor errors. The vulnerabilities were found in the TCP-IP stack code provided by Treck, which is used in various internet-of-things devices, affecting the network communication layer rather than the sensor layer [101208]. (b) actuator: The articles did not mention any specific failures related to actuator errors. The focus of the software vulnerability was on the TCP-IP stack code provided by Treck, impacting the network communication layer of the affected devices [101208]. (c) processing_unit: The software failure incident was not directly linked to processing errors in the processing unit. The vulnerabilities identified in the Treck's TCP-IP stack code affected the network communication functionality of the devices rather than the processing unit itself [101208]. (d) network_communication: The software failure incident was primarily related to network communication errors introduced by vulnerabilities in the TCP-IP stack code provided by Treck. These vulnerabilities allowed hackers to exploit the network communication layer of various internet-of-things devices, potentially leading to remote code execution and other malicious activities [101208]. (e) embedded_software: The failure incident was specifically related to vulnerabilities found in the embedded software provided by Treck, which is used in internet-of-things devices. The vulnerabilities in the TCP-IP stack code impacted the functionality of a wide range of connected devices, highlighting the importance of securing embedded software in such systems [101208].
Communication connectivity_level The software failure incident described in the article [101208] is related to the communication layer of the cyber physical system that failed at the connectivity_level. The vulnerabilities identified in the software stack provided by Treck, which is used in internet-of-things devices, were related to the TCP-IP protocol that connects devices to networks and the internet. These vulnerabilities allowed hackers to exploit network or transport layer weaknesses to paralyze or take control of the affected devices. The bugs in the software stack were found in devices from various manufacturers, including HP, Intel, Rockwell Automation, Caterpillar, and Schneider Electric, indicating a failure at the connectivity_level of the cyber physical system.
Application TRUE The software failure incident described in the articles is related to vulnerabilities found in the TCP-IP stack code provided by Treck, a software company. These vulnerabilities, known as Ripple20, were identified by security researchers from JSOF. The vulnerabilities in the TCP-IP stack code could allow hackers to exploit devices connected to networks and the internet, leading to potential remote code execution, sensitive information leakage, and complete control of affected devices [101208]. This failure can be attributed to the application layer of the cyber physical system, as it involves bugs in the TCP-IP stack code that could be exploited by hackers to run their own commands on target devices, demonstrating a failure due to contributing factors introduced by bugs and incorrect usage [101208].

Other Details

Category Option Rationale
Consequence property, non-human, theoretical_consequence (a) unknown (b) unknown (c) unknown (d) Property: The software failure incident resulted in hundreds of millions of gadgets across the globe being vulnerable to attacks, including medical devices, printers, power grid equipment, and railway equipment [101208]. (e) unknown (f) Non-human: The software failure incident impacted a wide range of devices, from power supply systems in data centers to programmable logic controllers used in power grids and manufacturing to medical infusion pumps [101208]. (g) unknown (h) Theoretical_consequence: There were discussions about potential consequences of the software failure incident, such as the possibility of hackers being able to paralyze target devices, run malicious code, or take control of devices [101208]. (i) unknown
Domain manufacturing, utilities, health, government (a) The software failure incident related to the Ripple20 vulnerabilities affected a wide range of industries, including the healthcare industry with medical infusion pumps being vulnerable [101208]. Additionally, the incident impacted the manufacturing industry with programmable logic controllers used in power grids and manufacturing also being at risk [101208]. (g) The utilities industry was also affected by the software failure incident, as power supply systems in data centers were among the devices vulnerable to the Ripple20 bugs [101208]. (l) The government sector was indirectly impacted by the software failure incident, as the vulnerabilities in the affected devices could have implications for critical infrastructure and public services [101208].

