Incident: Smart Motorway CCTV System Failure at National Highways.

Published Date: 2021-09-26

Postmortem Analysis
Timeline
1. The software failure incident happened in September 2021 [118563].

System
1. CCTV system
2. ControlWorks system
3. COBS (Control Office Base System) screen [118563]

Responsible Organization
1. National Highways [118563]

Impacted Organization
1. National Highways [118563]

Software Causes
1. Software failures in the CCTV system, including cameras being faulty, unusable, or pointing in the wrong direction, leading to critical issues in monitoring and responding to incidents on smart motorways [118563].

Non-software Causes
1. Antiquated computers grinding to a halt for more than 30 minutes [118563]
2. Entire communications system going down, leaving operators with limited means of communication [118563]
3. CCTV system breaking, leading to the inability to monitor roads or respond to alerts [118563]
4. Faulty CCTV cameras, with some facing the wrong direction, obscured by condensation, or broken [118563]
5. Outdated control boxes not functioning properly [118563]
6. Staff reporting faults in cameras meant to keep smart motorways safe multiple times [118563]

Impacts
1. The software failure incident led to critical faults in the CCTV cameras meant to keep smart motorways safe, with operators reporting failures 218 times in a year, including 29 in just one month [118563].
2. The broken devices left staff unable to check reports of stranded cars on the M25, leading to vital minutes being wasted before action could be taken [118563].
3. An audit of over 800 cameras showed that 112 were faulty, unusable, or pointing in the wrong direction, impacting the ability to close lanes or reduce speed limits promptly in case of incidents [118563].
4. On a busy stretch of the M25 where three people died, eight out of 19 cameras were broken, obscured by condensation, or facing the wrong way, highlighting the severity of the impact of the software failure incident [118563].
5. The software failure incident resulted in delays in confirming incidents on CCTV feeds, leading to critical time being wasted waiting for traffic officers to arrive, potentially endangering motorists [118563].

Preventions
1. Regular maintenance and monitoring of the CCTV cameras and communication systems could have prevented the software failure incident [118563].
2. Implementing a more robust and reliable technology infrastructure for smart motorways, including the CCTV systems and communication networks, could have prevented the software failure incident [118563].
3. Adequate training and support for staff to handle and troubleshoot technology failures could have prevented the software failure incident [118563].

Fixes
1. Implementing a comprehensive maintenance and monitoring program for the CCTV cameras to ensure they are functioning properly [118563].
2. Upgrading the outdated technology and equipment used in the control centre, such as the CCTV cameras and communication systems [118563].
3. Conducting regular checks and audits on the CCTV cameras to identify and address any faults promptly [118563].
4. Enhancing the training and support provided to staff to effectively handle incidents and utilize the control systems [118563].

References
1. Undercover reporter working in a National Highways control centre [118563]
2. Highways staff reporting faults in the cameras meant to keep smart motorways safe [118563]
3. Mail investigation [118563]
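
The counts quoted under Impacts are easier to compare once they are expressed as rates. The following is a minimal sketch of that arithmetic and is not part of the source analysis: the constant and function names are illustrative, and because the article says "over 800" cameras were audited, 800 is used as a lower bound, which makes the computed audit share an upper bound.

```python
# Figures quoted under Impacts [118563]; all names here are illustrative.
FAULT_REPORTS_PER_YEAR = 218
FAULT_REPORTS_PEAK_MONTH = 29
AUDITED_CAMERAS_MIN = 800   # article says "over 800", so this is a lower bound
AUDITED_CAMERAS_FAULTY = 112
M25_STRETCH_CAMERAS = 19
M25_STRETCH_FAULTY = 8


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Average monthly fault reports over the year, and how the worst month compares.
avg_reports_per_month = FAULT_REPORTS_PER_YEAR / 12
peak_vs_average = FAULT_REPORTS_PEAK_MONTH / avg_reports_per_month

# Share of faulty cameras: upper bound for the audit, exact for the M25 stretch.
audit_faulty_share_max = share(AUDITED_CAMERAS_FAULTY, AUDITED_CAMERAS_MIN)
m25_faulty_share = share(M25_STRETCH_FAULTY, M25_STRETCH_CAMERAS)

print(f"Average fault reports per month: {avg_reports_per_month:.1f}")
print(f"Peak month relative to average: {peak_vs_average:.1f}x")
print(f"Faulty share in audit (upper bound): {audit_faulty_share_max:.1f}%")
print(f"Faulty share on the M25 stretch: {m25_faulty_share:.1f}%")
```

On these figures the faulty share on the M25 stretch is at least three times the share found across the wider audit.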

Software Taxonomy of Faults

Category | Option | Rationale
Recurring | one_organization | (a) The software failure incident has happened again at one_organization: The article reports that staff at National Highways frequently reported faults in the cameras meant to keep smart motorways safe. Operators flagged failures in the cameras 218 times in a year, including 29 in just one month. The broken devices left them unable to check reports of stranded cars on the M25, leading to vital minutes being wasted before action could be taken. The article also mentions that during an undercover investigation, the entire CCTV system at a National Highways control center crashed several times, with the team manager expressing frustration about the frequent breakdowns [118563]. (b) The software failure incident has happened again at multiple_organization: The article does not provide specific information about similar incidents happening at other organizations or with their products and services.
Phase (Design/Operation) | design, operation | (a) The software failure incident related to the design phase can be seen in the article where it is mentioned that the CCTV system frequently experienced faults, with operators flagging failures multiple times, leaving them unable to check reports of stranded cars on the M25 [118563]. (b) The software failure incident related to the operation phase is evident in the article where it is reported that the entire communications system went down, leaving operators struggling to communicate using limited desk phones and unable to monitor roads or respond to alerts from the radar system detecting stopped cars [118563].
Boundary (Internal/External) | within_system | (a) The software failure incident reported in the article is primarily within the system. The failure was due to various internal factors such as faulty cameras, broken devices, and outdated technology within the National Highways control center. Operators reported numerous faults in the cameras used to monitor smart motorways, with a significant number of cameras being faulty, unusable, or pointing in the wrong direction [118563]. The incident involved failures in the CCTV system, which left staff unable to close lanes or reduce speed limits until an accident was confirmed by a camera or patrol car, leading to critical delays in response times [118563]. Additionally, internal reports revealed that staff reported multiple instances of CCTV failures, with cameras being broken or not functioning properly. There were instances where staff could not find incidents on smart motorway sections due to faulty cameras, leading to potential harm to motorists [118563]. The software failure incident was exacerbated by issues such as slow and outdated technology, with operators expressing frustration over the system's inefficiencies [118563]. These factors point to a software failure incident primarily originating from within the system itself.
Nature (Human/Non-human) | non-human_actions, human_actions | (a) The software failure incident occurring due to non-human actions: The article describes multiple instances of software failures in the National Highways control center, particularly related to the CCTV system used for monitoring smart motorways. These failures include broken cameras, cameras facing the wrong direction, cameras obscured by condensation, and cameras not functioning properly. The failures in the CCTV system led to critical issues such as the inability to confirm incidents, delays in responding to emergencies, and challenges in closing lanes or reducing speed limits promptly [118563]. (b) The software failure incident occurring due to human actions: The article also highlights human actions contributing to the software failures. For example, there are mentions of staff reporting faults in the cameras, operators struggling with outdated and slow systems, and instances where operators had difficulty accessing or using the CCTV cameras effectively. Additionally, there are references to staff members raising concerns about the technology used by the organization, including issues with radios, cameras, and communication systems. These human-related factors have also played a role in the software failures experienced in the control center [118563].
Dimension (Hardware/Software) | hardware | (a) The software failure incident occurring due to hardware: - The news article mentions that the control room operators faced issues with the CCTV system, where cameras were reported as faulty, unusable, or pointing in the wrong direction [118563]. - Operators reported faults in the cameras used to keep smart motorways safe, with failures being flagged multiple times, leaving them unable to check reports of stranded vehicles on the roads [118563]. - The article highlights that there were problems with a significant number of cameras on various motorway sections, including some facing condensation issues or being broken [118563]. (b) The software failure incident occurring due to software: - The article does not explicitly mention any software-related contributing factors to the failure incident.
Objective (Malicious/Non-malicious) | non-malicious | (a) The article does not mention any malicious intent behind the software failure incident reported in the National Highways control center. There is no indication of any deliberate actions taken to harm the system [118563]. (b) The software failure incident reported in the National Highways control center appears to be non-malicious in nature. The incident is described as a result of faults in the cameras used to keep smart motorways safe, with operators frequently reporting failures in the system. The broken devices left staff unable to check reports of stranded cars, and the CCTV system experienced multiple crashes, impacting the ability to monitor and respond to incidents effectively. The article highlights issues with faulty, unusable, or misaligned cameras, indicating a non-malicious failure due to technical shortcomings rather than intentional harm to the system [118563].
Intent (Poor/Accidental Decisions) | accidental_decisions | (a) The software failure incident reported in the article seems to be related to poor_decisions. The incident highlights numerous faults in the cameras meant to keep smart motorways safe, with operators frequently reporting failures in the system. The broken devices left staff unable to check reports of stranded cars on the roads, leading to critical delays in taking necessary actions [118563]. Additionally, internal reports revealed that staff reported CCTV failures multiple times, impacting their ability to monitor and respond to incidents effectively [118563]. (b) The software failure incident also involves accidental_decisions, as the failures in the system were not intentional but rather a result of the shortcomings in the technology and infrastructure. Staff members expressed frustration over the outdated and faulty CCTV system, which hindered their ability to carry out their duties efficiently and compromised the safety of motorists on smart motorways [118563]. The incidents of broken cameras, black screens, and slow system performance were unintended consequences of the system's deficiencies rather than deliberate actions.
Capability (Incompetence/Accidental) | development_incompetence, accidental | (a) The software failure incident occurring due to development incompetence: - The article highlights various instances where the software failures were attributed to the shortcomings in the technology used by National Highways, indicating a lack of professional competence in maintaining the systems [118563]. - Operators reported faults in the cameras used for smart motorways safety multiple times, with a significant number of cameras being faulty, unusable, or pointing in the wrong direction [118563]. - Staff members expressed concerns about the outdated and faulty technology, such as slow and unreliable CCTV systems, which hindered their ability to effectively monitor and respond to incidents on the roads [118563]. (b) The software failure incident occurring accidentally: - The article mentions instances where the software failures were described as happening frequently and almost predictably, indicating a pattern of accidental failures rather than intentional actions [118563]. - The failures in the CCTV systems, which are crucial for monitoring and responding to incidents on smart motorways, were reported to occur regularly and were not isolated incidents, suggesting accidental shortcomings in the technology [118563].
Duration | temporary | The software failure incident reported in the article was temporary. The incident involved frequent faults in the cameras used to keep smart motorways safe, with operators reporting failures multiple times, including instances where the entire CCTV system crashed several times [118563]. Additionally, there were issues with a significant number of cameras being faulty, unusable, or pointing in the wrong direction, impacting the ability of staff to close lanes or reduce speed limits until an accident was confirmed by a camera or patrol car [118563]. Staff reported CCTV failures multiple times, indicating that the software failure was temporary and not a permanent issue [118563].
Behaviour | crash, omission, value, other | (a) crash: The software failure incident described in the article can be categorized as a crash. The incident involved the system used to set lane closures and speed limits going down for more than 30 minutes, leading to a halt in operations and causing significant disruptions [118563]. (b) omission: The software failure incident also involved omission as a behavior. The broken devices, including faulty CCTV cameras, left operators unable to check reports of stranded cars on the roads, resulting in delays in taking necessary actions [118563]. (c) timing: The timing of the software failure incident can be considered a factor as well. The system failures occurred at critical times when operators needed to monitor the roads, set signals, and respond to alerts promptly. The delays caused by the system failures could have had serious consequences in terms of safety and traffic management [118563]. (d) value: The software failure incident also exhibited failures related to value. The system was not performing its intended functions correctly, leading to issues such as broken cameras, obscured views, and faulty technology that compromised the ability to ensure road safety and respond effectively to incidents [118563]. (e) byzantine: The behavior of the software failure incident did not specifically exhibit characteristics of a byzantine failure, which involves inconsistent responses and interactions. The incident primarily involved system crashes, omissions, timing issues, and failures in delivering value [118563]. (f) other: The software failure incident also showcased other behaviors such as system instability, frequent breakdowns, and overall unreliability. The incident highlighted a range of issues including outdated technology, faulty equipment, and a lack of proper maintenance, contributing to a chaotic and unreliable operational environment [118563].

IoT System Layer

Layer | Option | Rationale
Perception | sensor, network_communication | (a) sensor: The article mentions faults in the cameras used to keep smart motorways safe, with operators reporting failures in the cameras multiple times. The broken devices left them unable to check reports of stranded cars, and during outages, the entire CCTV system crashed several times, impacting the ability to monitor and respond to incidents [118563]. (b) actuator: The article does not specifically mention failures related to actuators. (c) processing_unit: The article does not specifically mention failures related to the processing unit. (d) network_communication: The article mentions failures in the communications system, leaving operators with limited desk phones to communicate with traffic and police patrols, as well as fielding calls from emergency SOS phones. Additionally, there were issues with the radios used to contact on-road traffic officers, which once went down for an entire day [118563]. (e) embedded_software: The article does not specifically mention failures related to embedded software.
Communication | connectivity_level | The software failure incident reported in the article is related to the connectivity level of the cyber-physical system. The failure was due to contributing factors introduced by the network or transport layer. The incident involved the breakdown of the entire communications system, leading to operators struggling to communicate with traffic and police patrols, as well as facing challenges in responding to alerts from the radar system detecting stopped cars. Additionally, the CCTV system broke down, preventing staff from monitoring the roads effectively and responding to incidents promptly. The failure at the connectivity level impacted the operational efficiency and safety of the smart motorway sections [118563].
Application | FALSE | The software failure incident described in the article does not seem to be related to the application layer of the cyber-physical system. The failure incidents primarily revolve around the malfunctioning of CCTV cameras, communication systems, and technology infrastructure within the control center, rather than issues directly related to bugs, operating system errors, unhandled exceptions, or incorrect usage at the application layer [118563].

Other Details

Category | Option | Rationale
Consequence | death, harm, property, delay | (a) death: People lost their lives due to the software failure - The article mentions that on a busy stretch of the M25 where three people have died, eight out of 19 cameras were broken, obscured by condensation, or facing the wrong way, impacting the ability to respond to incidents promptly [118563]. (b) harm: People were physically harmed due to the software failure - The article discusses how broken devices and faulty cameras left operators unable to respond promptly to incidents, potentially leading to harm or accidents on the roads [118563]. (d) property: People's material goods, money, or data were impacted due to the software failure - The software failure incident impacted the ability to monitor and respond to incidents effectively, potentially leading to property damage in case of accidents or delays in providing assistance [118563]. (e) delay: People had to postpone an activity due to the software failure - The article mentions delays in responding to incidents and setting appropriate signals and closures due to the malfunctioning CCTV system and communication breakdowns caused by the software failure [118563].
Domain | transportation, government | The software failure incident reported in the news article is related to the transportation industry. The incident occurred in a control center responsible for monitoring major roads in the East of England, including smart motorway sections on the M25, M1, and M4 [118563]. The failure of the system used to set lane closures, speed limits, and monitor the roads led to significant safety concerns for motorists using these smart motorway sections. The malfunctioning technology, including faulty CCTV cameras and communication systems, compromised the ability of operators to effectively manage traffic incidents and ensure the safety of drivers on the roads [118563].
