Incident: Inadequate IoT Device Monitoring Leads to Network Vulnerabilities

Published Date: 2017-09-22

Postmortem Analysis
Timeline 1. The software failure incident happened in September 2017 [63159].
System 1. Unsecured devices connected to the network, including security cameras and uninterruptible power supplies (UPSs) [63159]
Responsible Organization 1. Lack of industry-wide standards for IoT devices [63159] 2. Hackers exploiting unidentified devices on the network [63159]
Impacted Organization 1. RWJBarnabas Health [63159]
Software Causes 1. Lack of awareness of unsecured devices connected to the network, leading to potential vulnerabilities and security risks [63159] 2. Failure to register systems with IT and ensure they meet security standards, allowing for unidentified devices on the network [63159] 3. Insufficient monitoring and identification of devices accessing the network, creating opportunities for hackers to exploit vulnerabilities [63159]
Non-software Causes 1. Lack of awareness of the number of unsecured devices connected to the network [63159] 2. Complexity of IT systems due to company mergers [63159] 3. Lack of industry-wide standards for IoT devices [63159]
Impacts 1. The software failure incident led to the discovery of around 70,000 internet-enabled devices accessing the health firm's network, which was far more than expected, posing a major security threat [63159]. 2. Unidentified devices, such as security cameras and uninterruptible power supplies (UPSs), were found connected to the network, potentially serving as access points for hackers to compromise the network's security [63159]. 3. The incident highlighted the risk of hackers being able to switch off life-critical machines or steal sensitive patient data, emphasizing the importance of network security and monitoring [63159].
Preventions 1. Conducting regular audits and assessments of the network to identify all connected devices and ensure they meet security standards could have prevented the software failure incident [63159]. 2. Implementing specialized IoT cybersecurity programs that identify every device accessing the network, not just by IP address but also by other attributes and fingerprints, could have helped prevent the incident [63159]. 3. Using network monitoring services from companies such as ForeScout, SolarWinds, IBM, SecureWorks, and Gigamon to continuously monitor the network for aberrant behavior and automatically disconnect rogue devices could have prevented the incident [63159].
Fixes 1. Conducting a full audit using a specialist IoT cyber-security program to identify all devices connected to the network, ensuring they meet security standards and are registered with IT [63159]. 2. Implementing network monitoring software like ForeScout that can identify every device trying to access the network and analyze its behavior to spot aberrant activity [63159]. 3. Using services from network monitoring firms such as ForeScout, SolarWinds, IBM, SecureWorks, and Gigamon to enhance network security as IoT devices proliferate [63159]. 4. Establishing industry-wide standards for IoT devices to address major security concerns related to the lack of uniform security protocols [63159]. 5. Implementing IoT security credentialing services like the one launched by Verizon to allow only trusted and verified devices to access the company's network [63159]. 6. Forming strategic partnerships with technology companies like Intel to authenticate IoT devices and create end-to-end platforms for IoT security [63159]. 7. Investing in behavioral network monitoring as a security measure to detect and disconnect rogue devices automatically, strengthening overall network defenses [63159].
References 1. Hussein Syed, chief information security officer for RWJBarnabas Health [63159] 2. Mike DeCesare, chief executive of ForeScout [63159] 3. Darren Thomson, chief technology officer and vice president, technology services at Symantec [63159] 4. Tom Reilly, chief executive of Cloudera [63159] 5. Bill Ruh, GE Digital chief executive [63159]
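
The audit described under Preventions and Fixes, which compares what is actually on the network with what IT has registered, can be sketched roughly as follows. This is a minimal illustration and not the tooling used at RWJBarnabas Health or sold by ForeScout. The `Device` record, the `audit_network` function, the registry layout, and all sample addresses are hypothetical. Keying on the MAC address is an assumption that stands in for the richer "attributes and fingerprints" the article mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A device seen on the network (hypothetical record layout)."""
    mac: str   # hardware address, used as identity instead of the IP address
    ip: str
    kind: str  # e.g. "camera", "ups", "workstation"


def audit_network(observed, registry):
    """Sort observed devices into compliant, non-compliant and unregistered.

    `registry` maps a lowercase MAC address to True when IT has confirmed
    the device meets security standards, or False when it is registered
    but does not meet them (an assumed, simplified registry format).
    """
    report = {"compliant": [], "non_compliant": [], "unregistered": []}
    for device in observed:
        key = device.mac.lower()
        if key not in registry:
            report["unregistered"].append(device)
        elif registry[key]:
            report["compliant"].append(device)
        else:
            report["non_compliant"].append(device)
    return report


# Illustrative data only; none of it comes from the incident.
registry = {"aa:bb:cc:00:00:01": True, "aa:bb:cc:00:00:02": False}
observed = [
    Device("AA:BB:CC:00:00:01", "10.0.0.5", "workstation"),
    Device("aa:bb:cc:00:00:02", "10.0.0.6", "camera"),
    Device("aa:bb:cc:00:00:03", "10.0.0.7", "ups"),
]
report = audit_network(observed, registry)
for status, devices in report.items():
    print(status, [d.kind for d in devices])
```

In this toy data the UPS lands in the `unregistered` bucket, which mirrors the kind of finding the audit produced: devices on the network that IT did not know about. A real audit would also need a discovery step to build `observed`, which is out of scope here.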

Software Taxonomy of Faults

Category Option Rationale
Recurring multiple_organization (a) In the provided articles, there is no specific mention of a similar software failure incident happening again at the same organization (RWJBarnabas Health) or with its products and services. Therefore, there is no information available to indicate a repeated incident within the same organization. (b) The article discusses the general cybersecurity challenges faced by organizations due to the increasing number of connected devices and the vulnerabilities they pose. It highlights the risks associated with IoT devices and the lack of industry-wide standards for their security, which is a concern for multiple organizations beyond RWJBarnabas Health. The article mentions the need for improved security measures and monitoring solutions to address the growing threat landscape posed by IoT devices across various industries.
Phase (Design/Operation) design, operation (a) The article discusses a software failure incident related to the design phase where the failure was due to contributing factors introduced by system development and system updates. The incident involved a hospital's IT network where the chief information security officer discovered that there were around 70,000 internet-enabled devices accessing the network, far more than expected. These devices included security cameras and uninterruptible power supplies (UPSs) that were not registered with IT and did not meet security standards, posing a significant security risk [63159]. (b) The article also touches upon a software failure incident related to the operation phase, specifically due to contributing factors introduced by the operation or misuse of the system. It mentions the potential consequences of hackers gaining access to the network through unidentified devices like UPSs, which could lead to switching off life-critical machines or stealing patient data for ransom. This highlights the importance of monitoring and analyzing the behavior of devices on the network to prevent unauthorized access and potential security breaches [63159].
Boundary (Internal/External) within_system (a) within_system: The software failure incident discussed in the article is primarily within the system. The failure was due to the presence of numerous unsecured devices connected to the network of RWJBarnabas Health, which were not registered with IT and did not meet security standards. This internal vulnerability allowed for potential access points for hackers to exploit, leading to risks such as switching off life-critical machines or stealing patient data [63159]. (b) outside_system: The article does not provide information indicating that the software failure incident was primarily due to contributing factors originating from outside the system.
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident occurring due to non-human actions: The article discusses a potential software failure incident where a hacker could remotely turn off a life support machine in a hospital or shut down a power station due to the presence of unsecured devices connected to the network. The incident highlights the vulnerability of the network to non-human actions such as unauthorized access by hackers [63159]. (b) The software failure incident occurring due to human actions: The article mentions that the chief information security officer of a health provider in New Jersey discovered numerous unregistered systems and devices on the network that did not meet security standards. These human actions of not registering devices properly and not ensuring security compliance could have potentially led to a software failure incident if hackers had exploited these vulnerabilities [63159].
Dimension (Hardware/Software) hardware, software (a) The article discusses a potential software failure incident related to hardware. It mentions the scenario where hackers could potentially switch off life-critical machines by hacking into uninterruptible power supplies (UPSs) connected to the network [63159]. This highlights a hardware-related vulnerability that could lead to a software failure incident. (b) The article also addresses a software failure incident related to software itself. It talks about the complexity of IT systems in hospitals, with a large number of internet-enabled devices accessing the network. The discovery of unidentified devices that did not meet security standards poses a risk of being access points for hackers to exploit the network [63159]. This indicates a software failure incident originating from vulnerabilities in the software and lack of proper security measures.
Objective (Malicious/Non-malicious) malicious (a) The software failure incident discussed in the article is related to a malicious objective. The incident involves the potential threat of hackers gaining access to the hospital's network through unidentified devices like security cameras and uninterruptible power supplies (UPSs) [63159]. The article highlights the risks associated with hackers potentially switching off life-critical machines or stealing patient data for ransom, indicating a malicious intent behind the failure incident.
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident discussed in the article is related to poor decisions made regarding the lack of awareness and control over the number of unsecured devices connected to the network. The incident at RWJBarnabas Health revealed that there were around 70,000 internet-enabled devices accessing the network, which was far more than expected. This lack of knowledge and oversight regarding the connected devices posed a significant security threat, potentially allowing hackers to exploit these devices as access points to critical assets on the network [63159].
Capability (Incompetence/Accidental) development_incompetence (a) The article discusses a software failure incident related to development incompetence. The Chief Information Security Officer, Hussein Syed, discovered that there were around 70,000 internet-enabled devices accessing the health firm's network, which was far more than he had expected. Many of these devices were not registered with IT and did not meet security standards, indicating a lack of awareness and control over the network's devices [63159]. (b) The article also mentions the potential for accidental software failure incidents due to the proliferation of IoT devices, which has increased the attack surface for hackers. Businesses often underestimate the number of devices linked to their network, leading to shocks when they find out the actual numbers. Accidental failures can occur when unidentified devices become access points for hackers, compromising the network's security [63159].
Duration unknown The article does not say whether the software failure incident was permanent or temporary.
Behaviour omission, value, other (a) crash: The article does not specifically mention a software crash incident where the system loses state and fails to perform its intended functions. (b) omission: The article mentions a scenario where unidentified devices connected to a hospital's network could potentially omit to perform their intended functions, leading to security vulnerabilities. For example, security cameras and uninterruptible power supplies (UPSs) were among the devices that were not registered with IT and did not meet security standards, posing risks of omission in terms of security functions [63159]. (c) timing: The article does not discuss a software failure incident related to timing issues where the system performs its intended functions but at incorrect times. (d) value: The article highlights the potential risk of a software failure incident where the system performs its intended functions incorrectly, such as hackers gaining access to critical machines or patient data due to security vulnerabilities in the network [63159]. (e) byzantine: The article does not explicitly mention a byzantine software failure incident where the system behaves erroneously with inconsistent responses and interactions. (f) other: The other behavior described in the article is related to the discovery of numerous unsecured devices connected to the network, which were not known to the IT team and did not meet security standards. This behavior poses a significant security threat and highlights the importance of comprehensive network monitoring and security measures to prevent potential cyber-attacks [63159].
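
The behaviour-based monitoring that the Phase rationale points to, watching how a device acts and cutting it off when it acts out of character, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `BehaviourMonitor` class, the idea of a learned set of normal destinations per device, and the device names and addresses. Commercial monitoring products use far richer signals than a destination allow-list.

```python
from collections import defaultdict


class BehaviourMonitor:
    """Quarantines devices that contact destinations outside a learned baseline."""

    def __init__(self):
        # device id -> destinations seen during a trusted learning period
        self.baseline = defaultdict(set)
        self.quarantined = set()

    def learn(self, device_id, destination):
        """Record traffic that is considered normal for this device."""
        self.baseline[device_id].add(destination)

    def observe(self, device_id, destination):
        """Return True if traffic is allowed, False if the device is cut off."""
        if device_id in self.quarantined:
            return False
        if destination not in self.baseline.get(device_id, set()):
            # Unknown device or unexpected destination: treat as aberrant
            # and disconnect automatically.
            self.quarantined.add(device_id)
            return False
        return True


# Illustrative usage with made-up identifiers.
monitor = BehaviourMonitor()
monitor.learn("ups-01", "10.0.0.20")

print(monitor.observe("ups-01", "10.0.0.20"))    # expected destination
print(monitor.observe("ups-01", "203.0.113.9"))  # unexpected destination
print(monitor.observe("ups-01", "10.0.0.20"))    # device is now quarantined
```

The three calls print `True`, `False`, `False`: once the UPS talks to an address outside its baseline it stays disconnected until someone reviews it. Quarantining on the first deviation is a deliberately blunt choice for the sketch. A production system would likely weigh several signals before disconnecting a device that may be life-critical.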

IoT System Layer

Layer Option Rationale
Perception sensor, actuator, network_communication, embedded_software (a) The failure was related to the perception layer of the cyber physical system that failed due to contributing factors introduced by sensor error. The article mentions a scenario where a hacker could potentially switch off life-critical machines by hacking into uninterruptible power supplies (UPSs), which are units that provide back-up battery power in the event of a power cut. This indicates a vulnerability in the sensor layer of the system where the UPSs are not functioning as intended and can be manipulated by hackers [63159]. (b) The failure was related to the perception layer of the cyber physical system that failed due to contributing factors introduced by actuator error. The article discusses the potential scenario where hackers could switch off life-critical machines by hacking into uninterruptible power supplies (UPSs), which are actuator devices providing back-up battery power. This indicates a vulnerability in the actuator layer of the system where the UPSs are not functioning correctly and can be controlled externally [63159]. (c) The failure was related to the perception layer of the cyber physical system that failed due to contributing factors introduced by processing error. The article does not provide specific information indicating a failure at the processing unit level. (d) The failure was related to the perception layer of the cyber physical system that failed due to contributing factors introduced by network communication error. The article highlights the issue of insecure devices such as security cameras and uninterruptible power supplies (UPSs) being potential access points for hackers to infiltrate the network. This vulnerability in network communication allows hackers to potentially switch off life-critical machines or steal sensitive data, indicating a failure in the network communication layer of the system [63159]. (e) The failure was related to the perception layer of the cyber physical system that failed due to contributing factors introduced by embedded software error. The article mentions the use of a specialist IoT cyber-security program to carry out a full audit, which revealed numerous unidentified devices accessing the network. These devices, including security cameras and uninterruptible power supplies (UPSs), were not registered with IT and did not meet security standards, indicating a failure in the embedded software layer where these devices were not properly managed or secured [63159].
Communication unknown The article does not indicate whether the failure involved the communication layer of the cyber physical system, so it is unknown whether the failure occurred at the link level or the connectivity level.
Application FALSE The article does not indicate that the failure was related to the application layer of the cyber physical system, that is, to bugs, operating system errors, unhandled exceptions, or incorrect usage.

Other Details

Category Option Rationale
Consequence death, harm, non-human, theoretical_consequence (a) death (people lost their lives due to the software failure): The article discusses a hypothetical scenario in which a hacker could switch off a life support machine in a hospital, leading to fatal consequences [63159].
Domain health (a) The failed system in the article is related to the healthcare industry. The incident involves a major security threat in a hospital setting due to the presence of unsecured devices connected to the network, including medical devices, computers, and software applications [63159].

