Incident: Smart Motorways CCTV System Failure Endangers Lives

Published Date: 2021-09-27

Postmortem Analysis
Timeline
1. The software failure incident happened on September 17, as mentioned in the article [118479].
System
The incident primarily involves failures in the CCTV camera system used on smart motorways:
1. CCTV cameras were broken, misted up, facing the wrong way, obscured by condensation, or not working properly [118479].
2. Outdated hardware was in use, including faulty CCTV control boxes from 2004 [118479].
3. The software used to close lanes went down multiple times [118479].
4. The CCTV system crashed several times, leaving staff unable to monitor the roads or respond to alerts [118479].
5. Internal reports indicated frequent faults in the cameras, with operators flagging failures multiple times [118479].
6. Staff reported slow and faulty cameras and difficulty locating incidents because of broken cameras [118479].
These failures in the CCTV camera system have contributed to safety risks and incidents on smart motorways.
Responsible Organization
1. National Highways [118479]
2. Department for Transport [118479]
Impacted Organization
1. Motorists on smart motorways [118479]
2. National Highways (formerly Highways England) [118479]
Software Causes
1. Faulty and outdated hardware, including CCTV boxes from 2004, made it hard for operators to locate stranded vehicles [118479].
2. The software used to close lanes went down several times in the six weeks the reporter worked at one of the regional control rooms [118479].
3. The CCTV system crashed several times during the undercover reporter's time at the control center [118479].
4. Internal reports show staff reporting CCTV failures multiple times, which impaired their ability to monitor and respond to incidents [118479].
Non-software Causes
1. Outdated and faulty hardware, including CCTV boxes from 2004, made it hard for operators to locate stranded vehicles [118479].
2. Safe stopping points were not provided every 600 meters; some refuges are now 2.5 miles apart, increasing the risk for stranded drivers [118479].
3. Many CCTV cameras were broken, faulty, or not monitored, leaving motorists stranded in high-speed traffic [118479].
4. The CCTV cameras were not properly monitored or maintained, so critical incidents were not captured or responded to in a timely manner [118479].
Impacts
1. A significant number of safety cameras (more than one in ten) were broken, misted up, or facing the wrong way, which hampered monitoring and incident response and potentially put lives at risk [118479].
2. The faulty cameras and outdated hardware, including CCTV boxes from 2004, made it hard for operators to locate stranded vehicles, delaying responses and potentially endangering motorists [118479].
3. The software used to close lanes went down multiple times in the six weeks the reporter worked at one of the control rooms, undermining the ability to manage traffic flow and safety measures [118479].
4. Control room staff reported an average of almost two CCTV and technological failures every day in 2020, indicating a systemic problem with the software and hardware infrastructure [118479].
5. Critical incidents were not captured by CCTV cameras, including a fatal crash on the M25 that left four people dead, which shows the potential consequences of inadequate monitoring [118479].
6. Vital lane closures and changes to mandatory speed limits were delayed during incidents, posing a risk to motorists and potentially contributing to accidents [118479].
7. When systems were down, staff had to fall back on manual methods such as keeping paper records of incidents, indicating a lack of reliable backup systems [118479].
Preventions
Each of the following could have helped prevent the incident:
1. Regular maintenance and monitoring of the CCTV cameras to ensure they are functioning properly and facing the correct direction [118479].
2. Replacing the outdated hardware, such as the CCTV control boxes from 2004, with more modern and reliable equipment [118479].
3. A more robust and reliable software system for setting lane closures and speed limits and for monitoring incidents [118479].
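The regular camera monitoring suggested in the first prevention could be supported by a routine status audit. The following is only a minimal sketch under stated assumptions: the status categories are loosely based on the faults described in the article, while `CameraStatus`, `CameraCheck`, `summarize_audit`, and the camera IDs are hypothetical names and made-up sample data, not anything used by National Highways.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class CameraStatus(Enum):
    """Hypothetical status categories, loosely based on the faults in the article."""
    OK = "ok"
    BROKEN = "broken"
    MISTED = "misted"
    WRONG_WAY = "wrong_way"


@dataclass(frozen=True)
class CameraCheck:
    """One manual or automated check of a single camera (hypothetical record)."""
    camera_id: str
    status: CameraStatus


def summarize_audit(checks):
    """Return (count per status, sorted IDs of faulty cameras, faulty fraction)."""
    counts = Counter(check.status for check in checks)
    faulty_ids = sorted(
        check.camera_id for check in checks if check.status is not CameraStatus.OK
    )
    # Guard against an empty audit so the fraction is always defined.
    faulty_fraction = len(faulty_ids) / len(checks) if checks else 0.0
    return counts, faulty_ids, faulty_fraction


# Made-up sample data for illustration only.
sample_checks = [
    CameraCheck("CAM-001", CameraStatus.OK),
    CameraCheck("CAM-002", CameraStatus.MISTED),
    CameraCheck("CAM-003", CameraStatus.OK),
    CameraCheck("CAM-004", CameraStatus.WRONG_WAY),
    CameraCheck("CAM-005", CameraStatus.OK),
]

counts, faulty_ids, faulty_fraction = summarize_audit(sample_checks)
print(f"{len(faulty_ids)} of {len(sample_checks)} cameras need attention: {faulty_ids}")
```

A summary like this would only flag cameras for repair; it does not replace the maintenance itself, and a real audit would need to draw its statuses from actual inspections or camera telemetry.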
Fixes
1. Conduct a thorough investigation into the failures of the smart motorway system, focusing on the faulty cameras and the outdated technology used in control rooms [118479].
2. Repair the CCTV cameras so that they are fully functional, monitored, and positioned correctly to provide real-time monitoring of incidents on smart motorways [118479].
3. Improve the technology infrastructure in control rooms, including updating hardware and software, to prevent system malfunctions and delays in responding to incidents [118479].
4. Enhance training and support for control room operators so they can handle incidents effectively and use the available technology to keep motorists safe [118479].
5. Consider reinstating the hard shoulder on smart motorways to give drivers a dedicated space to stop in emergencies, reducing the risks associated with breakdowns and accidents [118479].
References
1. Undercover investigation by a Daily Mail reporter at a National Highways control room [118479]
2. Audit of more than 800 CCTV cameras by an undercover reporter working at the South Mimms control room [118479]
3. Internal reports and emails from National Highways staff [118479]

Software Taxonomy of Faults

Category Option Rationale
Category: Recurring
Option: one_organization
Rationale:
(a) The software failure incident having happened again at one_organization:
- The article reports ongoing software failures within National Highways, formerly known as Highways England, related to the smart motorways system [118479].
- The incidents include CCTV cameras being faulty, broken, or facing the wrong direction, leading to delays in responding to emergencies and accidents on smart motorways [118479].
- Operators in the control rooms reported numerous camera faults, with some cameras not working or showing irrelevant views such as clouds or the ground [118479].
- The software used to monitor and control lane closures and speed limits also failed, impairing the ability to manage traffic effectively [118479].
- Control room staff were hampered by outdated hardware, including faulty CCTV boxes from 2004, which made it difficult to locate stranded vehicles and respond promptly to incidents [118479].
(b) The software failure incident having happened again at multiple_organization:
- The article does not mention similar incidents at other organizations or with their products and services.
Category: Phase (Design/Operation)
Option: design, operation
Rationale:
(a) Design: The article highlights failures related to the design and development of the smart motorway system, including faulty safety cameras, broken or obscured CCTV cameras, outdated hardware, and inadequate monitoring capabilities. The control rooms experienced frequent software failures: the software used to close lanes went down multiple times and the CCTV system crashed. The outdated and faulty hardware, including CCTV boxes from 2004, made it difficult for operators to locate stranded vehicles and to monitor and respond to incidents on the smart motorways [118479].
(b) Operation: The operation of the smart motorway system also contributed to the incident. Control room staff reported an average of almost two CCTV and technological failures every day in 2020. Operational problems included the inability to implement vital lane closures or change speed limits promptly, delays in spotting stranded vehicles, and difficulty responding to alerts from the radar system because of broken cameras. During critical incidents the control rooms faced communication breakdowns, system malfunctions, and inadequate monitoring capabilities, leading to delayed responses and potential safety risks for motorists [118479].
Category: Boundary (Internal/External)
Option: within_system
Rationale:
(a) The incident can be categorized as within_system. The failure was due to factors internal to the system, such as faulty and outdated hardware, malfunctioning CCTV cameras, the software used to close lanes going down multiple times, CCTV blackspots, slow and faulty technology, and overall system malfunctions [118479].
(b) The incident was not primarily caused by contributing factors originating outside the system. The issues mentioned in the article, such as broken cameras, faulty hardware, and system malfunctions, were internal to the smart motorway control system itself [118479].
Category: Nature (Human/Non-human)
Option: non-human_actions, human_actions
Rationale:
(a) Non-human actions: The article reports failures on smart motorways caused by technical issues and faulty equipment. More than one in ten safety cameras were broken, misted up, or facing the wrong way, leading to critical monitoring failures [118479]. The software used to close lanes went down multiple times in the control rooms, and faulty and outdated hardware caused CCTV blackspots on the M25, making it hard for operators to locate stranded vehicles [118479].
(b) Human actions: The article also points to failures attributable to human actions, such as inadequate maintenance and continued reliance on outdated technology. Control room staff reported an average of almost two CCTV and technological failures every day in 2020, which suggests possible neglect of maintenance and monitoring [118479]. Operators expressed frustration with the slow and faulty CCTV system, and one said the outdated technology was unreliable and difficult to operate efficiently [118479].
Category: Dimension (Hardware/Software)
Option: hardware, software
Rationale:
(a) Hardware failure:
- Faulty and outdated hardware was in use, including CCTV boxes from 2004, which made it hard for operators to locate stranded vehicles [118479].
- Cameras were reported broken, obscured by condensation, or facing the wrong way, indicating hardware issues with the CCTV cameras [118479].
(b) Software failure:
- The software used to close lanes went down several times in the six weeks the reporter worked at one of the regional control rooms [118479].
- The entire communications system and the CCTV system crashed on several occasions, leaving operators unable to monitor the roads or respond to alerts [118479].
The incident therefore involves both hardware and software issues.
Category: Objective (Malicious/Non-malicious)
Option: non-malicious
Rationale:
The incident can be categorized as non-malicious. The failures in the software and hardware used in the National Highways control rooms were not intentional acts to harm the system; they resulted from technical issues and outdated equipment. They affected CCTV cameras, communication systems, lane closure software, and other technology that is crucial for monitoring and managing incidents on the smart motorways.
- CCTV cameras were faulty, facing the wrong direction, or obscured by condensation, making it difficult to monitor the roads and respond to incidents [118479].
- The software used to close lanes went down multiple times, delaying necessary actions [118479].
- Control room operators reported frequent technology failures, such as broken cameras, communication systems going down, and outdated hardware that slowed the response to incidents [118479].
Category: Intent (Poor/Accidental Decisions)
Option: poor_decisions
Rationale:
The incident can be categorized under poor_decisions because contributing factors were introduced by poor decisions made in the implementation and maintenance of the smart motorway system.
1. Smart motorways were implemented as a cost-effective way to ease congestion without providing dedicated, protected spaces for drivers to shelter from hazards, which increased the risks on these roads [118479].
2. The decision to convert hard shoulders into live lanes was highlighted as a significant contributing factor to the deaths and accidents on these roads, indicating poor decision-making in the design and operation of the system [118479].
3. The inquest into the deaths on smart motorways concluded that the lack of hard shoulders contributed to the tragedies and presented an ongoing risk of future deaths [118479].
4. The article by Claire Mercer, who lost her husband on a smart motorway, attributes his death to the removal of the hard shoulder, a decision made in the design and operation of the system [118479].
Category: Capability (Incompetence/Accidental)
Option: development_incompetence, accidental
Rationale:
(a) development_incompetence: The failures reported in the article are primarily attributed to development incompetence. The control rooms had faulty and outdated hardware, including faulty CCTV cameras, outdated CCTV control boxes, and slow technology. Control room staff reported an average of almost two CCTV and technological failures every day in 2020. There were also problems with the software used to close lanes, CCTV blackspots, and broader shortcomings in the technology used by National Highways [118479].
(b) accidental: The article also describes failures that occurred accidentally. During the undercover reporter's first shift at a control room, a systems failure left staff unable to implement vital lane closures or change mandatory speed limits for more than 30 minutes. This was not intentional but the result of a system malfunction [118479].
Category: Duration
Option: temporary
Rationale:
The incident appears to be temporary rather than permanent. The article describes the CCTV cameras and control systems failing temporarily under particular circumstances:
1. CCTV cameras were faulty, broken, obscured, or facing the wrong direction on specific dates, such as September 17, at locations including the M25, M1, M3, and M62 [118479].
2. Operators reported multiple instances of the entire communications system, the CCTV system, and other technology going down or malfunctioning for periods of time, significantly disrupting monitoring and response [118479].
3. Staff were at times unable to check alerts from radar systems, find incidents on the roads, or communicate effectively because of broken cameras and technology failures [118479].
4. The CCTV system crashed several times during the undercover reporter's time at the control center, indicating recurring temporary failures [118479].
5. Control room operators expressed frustration and concern over frequent problems with the outdated and faulty technology, indicating ongoing problems rather than permanent failures [118479].
Overall, the failures were temporary and arose under specific circumstances such as faulty hardware, outdated technology, and system malfunctions, rather than being permanent failures present under all circumstances.
Category: Behaviour
Option: crash, omission, timing, value, byzantine, other
Rationale:
(a) crash: Failure due to system losing state and not performing any of its intended functions
- The software used to close lanes went down several times during the six weeks the reporter worked at one of the regional control rooms, causing delays in implementing vital lane closures or changing speed limits [118479].
- The entire communications system in one control room crashed, leaving operators unable to communicate effectively with traffic and police patrols [118479].
- The CCTV system in the control room crashed several times, leaving staff unable to monitor the roads or respond to alerts from the radar system that detects stopped cars [118479].
(b) omission: Failure due to system omitting to perform its intended functions at an instance(s)
- Faulty and outdated hardware, including CCTV boxes from 2004, made it difficult for operators to locate stranded vehicles, and staff sometimes could not find incidents because of broken cameras [118479].
- Operators were unable to check reports of broken-down vehicles because of faulty cameras, so stranded vehicles were left unattended for vital minutes before action could be taken [118479].
(c) timing: Failure due to system performing its intended functions correctly, but too late or too early
- A systems failure during the reporter's first shift left staff unable to implement vital lane closures or change mandatory speed limits until more than 30 minutes had passed [118479].
(d) value: Failure due to system performing its intended functions incorrectly
- The software used to close lanes went down several times, delaying lane closures and speed limit changes, which the analysis treats as incorrect system performance [118479].
(e) byzantine: Failure due to system behaving erroneously with inconsistent responses and interactions
- The CCTV system in the control room crashed several times, producing inconsistent responses and interactions because staff were unable to monitor the roads or respond to alerts from the radar system that detects stopped cars [118479].
(f) other: Failure due to system behaving in a way not described in the (a to e) options
- The software used to close lanes went down several times during the six weeks the reporter worked at one of the regional control rooms, causing delays in implementing vital lane closures or changing speed limits [118479].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Category: Consequence
Option: death, harm
Rationale:
(a) death: People lost their lives due to the software failure
The article links the incident to deaths on smart motorways and describes several tragic cases. Those named as having died in accidents on smart motorways, due to factors such as the lack of hard shoulders, faulty cameras, and unsafe conditions, include Nathan Reeves, Tom Aldridge, Allan Evans, Laura Cooper, Anthony Marston, Jamil Ahmed, Sevim and Ayse Ustan, Dev Naran, Nargis Begum, Peter Lee, Derek Jacobs, Charles Scripps, Jason Mercer, Alexandru Murgeanu, Costel Stancu, Zahid Ahmed, Martin Davies, and others [118479].
Category: Domain
Option: transportation
Rationale:
(a) The failed system was intended to support the transportation industry. The incident centers on the smart motorways system, which converts hard shoulders into live lanes [118479]. The system comprises CCTV cameras, lane closure software, and other technology for managing traffic flow and incidents on smart motorways, which are part of the transportation infrastructure. The failure of these systems posed significant risks to motorists and contributed to safety concerns on smart motorways.

