Published Date: 2021-06-11
Postmortem Analysis | |
---|---|
Timeline | 1. The software failure incident affecting calls to French emergency services occurred on June 2, 2021 [115475]. |
System | 1. Call server software - A bug in the call server software caused the disturbance in the emergency calls [115475]. |
Responsible Organization | 1. Orange (ORAN.PA) - France's biggest telecoms operator, responsible for the network outage affecting calls to French emergency services [Article 115475]. |
Impacted Organization | 1. French emergency services were impacted by the software failure incident [115475]. |
Software Causes | 1. The software failure incident was caused by a bug in the call server software responsible for dispatching emergency calls [115475]. |
Non-software Causes | 1. Late communication of the incident to authorities, emergency services, media, and competitors due to a delay in setting up an internal crisis unit [115475]. 2. Delay in issuing alerts and setting up the crisis management unit involving top management and frontline personnel for major incidents [115475]. |
Impacts | 1. Approximately 11,800 calls, or 11% of the total, could not go through to the emergency services during the outage period, putting lives at risk [Article 115475]. 2. The late communication of the incident to authorities, emergency services, media, and competitors due to a delay in setting up an internal crisis unit caused additional challenges in managing the situation [Article 115475]. |
Preventions | 1. Conducting thorough testing and quality assurance procedures before implementing software upgrades to ensure that any bugs or issues are identified and resolved prior to deployment [115475]. 2. Implementing a more efficient and timely communication protocol for notifying authorities, emergency services, media, and competitors in the event of a software failure incident to enable quicker response and resolution [115475]. 3. Establishing a proactive crisis management plan that includes the immediate activation of a crisis management unit involving top management and relevant personnel to swiftly address major incidents like software failures affecting critical services [115475]. 4. Collaborating closely with the equipment supplier to ensure prompt identification and resolution of software glitches to minimize the impact on services and prevent future occurrences [115475]. |
Fixes | 1. Implement a mechanism for mass distribution of text messages in the event of a future breakdown affecting emergency services to ensure timely communication with authorities, emergency services, media, and competitors [Article 115475]. |
References | 1. Orange (ORAN.PA) - France's biggest telecoms operator [Article 115475] 2. Orange's internal investigators 3. Chief Executive Stephane Richard 4. Authorities, emergency services, and the media 5. Rivals Bouygues Telecom, SFR, and Free 6. Orange's equipment supplier 7. France's cybersecurity agency ANSSI |
Category | Option | Rationale |
---|---|---|
Recurring | unknown | (a) The software failure incident at Orange was a unique event specific to the company. There is no mention in the article of a similar incident happening before within Orange or with its products and services [115475]. (b) The software failure incident at Orange did not indicate a similar incident happening at other organizations or with their products and services. The focus of the article was on Orange's internal investigation and the impact on emergency services in France [115475]. |
Phase (Design/Operation) | design, operation | (a) The software failure incident was attributed to a bug in the call server software, which was a result of an upgrade initiated in early May to increase the network's capacity [115475]. (b) The operation phase also played a role in the failure as there was a delay in setting up an internal crisis unit, leading to late communication of the incident to authorities, emergency services, media, and competitors [115475]. |
Boundary (Internal/External) | within_system | (a) The software failure incident involving Orange's network outage affecting calls to French emergency services was primarily within the system. The failure was caused by a bug in the call server software, which was a result of an upgrade initiated by Orange to increase the network's capacity [115475]. The incident was not attributed to a cyberattack, indicating that the failure originated from within the system itself. Additionally, the internal inquiry conducted by Orange's investigators focused on internal factors such as the late communication of the incident and the delay in setting up an internal crisis unit [115475]. |
Nature (Human/Non-human) | non-human_actions | (a) The software failure incident was primarily attributed to a bug in the call server software, which was a non-human action. Orange's internal inquiry found that the network outage affecting calls to French emergency services was caused by a software failure in the platform of servers responsible for dispatching calls [Article 115475]. The bug in the call server software disrupted the emergency calls, leading to about 11,800 calls not being able to go through to the emergency services during the outage period. Additionally, the cause of the bug was identified as stemming from an upgrade initiated in early May to increase the network's capacity, indicating a non-human action as the root cause of the failure. |
Dimension (Hardware/Software) | software | (a) The software failure incident reported in Article 115475 was attributed to a bug in the call server software, which is a software-related issue. The bug in the software caused severe disturbances in the platform responsible for dispatching emergency calls, leading to the network outage affecting calls to French emergency services [115475]. |
Objective (Malicious/Non-malicious) | non-malicious | (a) The software failure incident at Orange, which caused a network outage affecting calls to French emergency services, was classified as non-malicious. Orange's internal inquiry found that the outage was caused by a bug in the call server software, which was a result of an upgrade initiated in early May to increase the network's capacity. The glitch was not attributed to a cyberattack, indicating that the failure was not malicious [Article 115475]. (b) The incident was not reported to be a result of any malicious intent or actions by individuals seeking to harm the system. Instead, it was a non-malicious failure stemming from technical issues related to software bugs and network upgrades [Article 115475]. |
Intent (Poor/Accidental Decisions) | poor_decisions | (a) The software failure incident at Orange, France's biggest telecom operator, was primarily attributed to poor decisions. The failure was caused by a bug in the call server software, which disrupted emergency calls and put lives at risk. The bug stemmed from an upgrade initiated in early May to increase the network's capacity, indicating a decision that led to unintended consequences [Article 115475]. Additionally, there were delays in communication to authorities, emergency services, and the media, as well as in setting up an internal crisis unit, which further exacerbated the impact of the failure. |
Capability (Incompetence/Accidental) | accidental | (a) The software failure incident in Article 115475 was not attributed to development incompetence. Instead, it was caused by a bug in the call server software, stemming from an upgrade started in early May to increase the network's capacity, which disrupted emergency calls [115475]. (b) The software failure incident in Article 115475 was accidental: the bug stemmed from that capacity upgrade rather than from a deliberate action [115475]. |
Duration | temporary | The software failure incident reported in Article 115475 was temporary. The outage affecting calls to French emergency services lasted from 1445 GMT to 2200 GMT on June 2, which indicates a specific timeframe of disruption rather than a permanent failure [115475]. Additionally, the article mentions that the bug causing the disruption was a result of an upgrade started in early May, indicating a specific event leading to the failure rather than a permanent issue [115475]. |
Behaviour | crash, omission, value, other | (a) crash: The software failure incident in the Orange network outage was due to a bug in the call server software, which severely disturbed the emergency calls, leading to a network outage affecting calls to French emergency services for several hours [115475]. (b) omission: The software failure incident resulted in about 11,800 calls, or 11% of the total, not being able to go through to the emergency services during the outage period [115475]. (c) timing: The delay in communication of the incident to authorities, emergency services, media, and competitors was highlighted as a factor in the software failure incident. It took about two hours to issue alerts and set up the crisis management unit after the incident occurred [115475]. (d) value: The software failure incident caused the system to perform its intended functions incorrectly, leading to a disruption in emergency call services and putting lives at risk [115475]. (e) byzantine: The software failure incident was not attributed to a cyberattack, indicating that the system did not exhibit inconsistent responses or interactions as a result of external malicious activity [115475]. (f) other: The software failure incident also involved a late communication of the incident to various stakeholders, including authorities, emergency services, media, and competitors, due to a delay in setting up an internal crisis unit [115475]. |
Layer | Option | Rationale |
---|---|---|
Perception | processing_unit, network_communication | (a) sensor: The software failure incident reported in Article 115475 was not related to a sensor error. The failure was specifically attributed to a bug in the call server software that disrupted the emergency calls, indicating that the issue was not caused by a sensor error [Article 115475]. (b) actuator: The software failure incident reported in Article 115475 did not involve an actuator error. The issue was traced back to a bug in the call server software that affected the dispatching of emergency calls, suggesting that the failure was not due to an actuator error [Article 115475]. (c) processing_unit: The software failure incident in Article 115475 was linked to a processing error. Orange's internal inquiry revealed that a software failure caused the network outage affecting calls to French emergency services. The disturbance in emergency calls was attributed to a bug in the call server software, indicating a processing error [Article 115475]. (d) network_communication: The software failure incident in Article 115475 was related to a network communication error. Orange stated that the emergency calls were severely disturbed due to a bug in the call server software, which impacted the network's capacity and access to some emergency services. The delay in communication to authorities, emergency services, and rivals was also highlighted, pointing to network communication issues [Article 115475]. (e) embedded_software: The software failure incident in Article 115475 was not explicitly linked to an embedded software error. The failure was attributed to a bug in the call server software that disrupted emergency calls, indicating that the issue was not directly related to embedded software errors [Article 115475]. |
Communication | connectivity_level | The software failure incident reported in Article 115475 was related to the connectivity level of the cyber physical system that failed. The failure was caused by a bug in the call server software, which severely disturbed the emergency calls that rely on a platform of servers responsible for dispatching calls [115475]. Additionally, the failure was attributed to an upgrade started in early May to increase the network's capacity, indicating issues at the network or transport layer [115475]. |
Application | TRUE | The software failure incident reported in Article 115475 was related to the application layer of the cyber physical system. Orange, France's biggest telecoms operator, experienced a network outage affecting calls to French emergency services due to a bug in the call server software, which disrupted the emergency calls that rely on a platform of servers responsible for dispatching calls [115475]. This bug was a contributing factor introduced during an upgrade in early May to increase the network's capacity, indicating an issue at the application layer of the system [115475]. |
Category | Option | Rationale |
---|---|---|
Consequence | delay, theoretical_consequence | The software failure incident at Orange had a significant impact on people's lives. Approximately 11,800 calls, representing 11% of the total, could not go through to the emergency services during the outage period, potentially putting lives at risk [115475]. The delay in communicating the incident to authorities, emergency services, and the media, as well as the late setting up of an internal crisis unit, were highlighted as issues that could have serious consequences in emergency situations [115475]. The incident also raised pressure on Chief Executive Stephane Richard, indicating potential organizational consequences [115475]. |
Domain | unknown | (a) The software failure incident reported in Article 115475 was related to the telecommunications industry. Orange, France's biggest telecoms operator, experienced a network outage affecting calls to French emergency services due to a software failure in the call server system [Article 115475]. The incident impacted voice services and access to emergency services, highlighting the critical role of telecommunications in emergency communication. |
Article ID: 115475
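
The two quantitative figures in this record (the outage window of 1445 GMT to 2200 GMT on June 2, and the 11,800 failed calls said to be 11% of the total) can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and because both input figures are rounded in the article, the implied total call volume is an estimate rather than a reported number.

```python
from datetime import datetime, timezone

# Figures quoted in the record above; both are approximate in the source.
OUTAGE_START = datetime(2021, 6, 2, 14, 45, tzinfo=timezone.utc)  # 1445 GMT
OUTAGE_END = datetime(2021, 6, 2, 22, 0, tzinfo=timezone.utc)     # 2200 GMT
FAILED_CALLS = 11_800  # "approximately 11,800 calls"
FAILED_SHARE = 0.11    # "11% of the total"


def outage_duration_minutes(start: datetime, end: datetime) -> int:
    """Length of the outage window in whole minutes."""
    return int((end - start).total_seconds() // 60)


def implied_total_calls(failed: int, share: float) -> int:
    """Total call attempts implied by a failed count and its share (estimate)."""
    return round(failed / share)


duration = outage_duration_minutes(OUTAGE_START, OUTAGE_END)
total = implied_total_calls(FAILED_CALLS, FAILED_SHARE)

print(f"Outage duration: {duration // 60} h {duration % 60} min")
print(f"Implied total call attempts during the outage: about {total:,}")
```

Under these assumptions the outage window is 7 h 15 min, and the failed-call figures imply roughly 107,000 emergency call attempts over that period.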