Incident: Vodafone BlackBerry Outage: Data Services Disrupted in Europe, MEA

Published Date: 2013-01-11

Postmortem Analysis
Timeline 1. The BlackBerry service problems in Europe, the Middle East, and Africa occurred in January 2013 [16579].
System 1. Vodafone's network, which suffered a router error [16579].
Responsible Organization 1. Vodafone, on whose network the router error occurred [16579].
Impacted Organization 1. BlackBerry users in Europe, the Middle East, and Africa [16579]
Software Causes 1. The software cause of the failure incident was a router error on Vodafone's network, which led to issues with data services for BlackBerry customers in Europe, the Middle East, and Africa [16579].
Non-software Causes 1. The failure incident was caused by a router error on Vodafone's network, as confirmed by Vodafone in their statement [16579].
Impacts 1. Many BlackBerry customers in Europe, the Middle East, and Africa were unable to access data services, including receiving emails and using instant messaging [16579]. 2. Around 6 percent of BlackBerry users in the region were affected by the outage, unable to send or receive messages from their devices [16579]. 3. The outage occurred just before the long-awaited BlackBerry 10 launch, potentially impacting the company's reputation and customer satisfaction [16579].
Preventions 1. Improved network monitoring and redundancy systems could have prevented the software failure incident. By having better monitoring in place, Vodafone could have detected the router error earlier and taken preventive actions to avoid widespread service disruptions [16579]. 2. Enhanced communication and coordination between RIM and Vodafone could have helped prevent the incident. Clearer lines of communication and collaboration between the two companies might have led to quicker identification and resolution of the issue, minimizing the impact on BlackBerry users [16579].
Fixes 1. Fixing the router error causing the data services issue on the Vodafone network could resolve the software failure incident affecting BlackBerry users in Europe, the Middle East, and Africa [16579].
References 1. Vodafone [16579] 2. Research In Motion (RIM) [16579] 3. Reuters [16579]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization (a) The software failure incident having happened again at one_organization: The article [16579] mentions that this is not the first time RIM (Research In Motion) has suffered an outage on the Vodafone network. In September 2012, RIM restored BlackBerry services on the Vodafone network after a brief outage led to data services on devices being rendered useless. This indicates a recurring issue with RIM's services on the Vodafone network. (b) The software failure incident having happened again at multiple_organization: The article [16579] does not provide information about similar incidents happening at other organizations.
Phase (Design/Operation) operation (a) The software failure incident in this case seems to be related to the operation phase rather than the design phase. The issue was caused by a router error on Vodafone's network, which impacted some BlackBerry customers in Europe, the Middle East, and Africa [16579]. The statement from Vodafone mentioned that the problem was due to a router error and that services were being restored [16579]. (b) The failure was not attributed to the development phase but rather to the operation phase, specifically due to a router error on Vodafone's network [16579].
Boundary (Internal/External) within_system, outside_system (a) within_system: Viewed as the end-to-end service that delivers BlackBerry data to Vodafone subscribers, the failure originated inside the system. Vodafone confirmed that a router error on its own network caused the outage [16579]. (b) outside_system: Viewed from RIM's BlackBerry infrastructure, the failure originated outside the system. RIM stated that all BlackBerry services were operating normally and that a wider Vodafone service issue was impacting some BlackBerry customers in Europe, the Middle East, and Africa [16579].
Nature (Human/Non-human) non-human_actions, human_actions (a) non-human_actions: The failure was attributed to a router error on Vodafone's network, which disrupted data services for BlackBerry customers in Europe, the Middle East, and Africa [16579]. (b) human_actions: No human action is identified as a cause of the router error. Human involvement is described only in the response: both RIM and Vodafone worked to restore service and communicated with customers about the problem, and RIM answered inquiries by attributing the issue to Vodafone's network [16579].
Dimension (Hardware/Software) hardware (a) hardware: Vodafone confirmed that a router error caused the BlackBerry outage [16579]. This points to the network infrastructure (the router) rather than BlackBerry software as the origin of the failure. (b) software: The failure was not attributed to BlackBerry software. RIM stated that all BlackBerry services were operating normally and attributed the problem to a wider Vodafone service issue impacting some BlackBerry customers in Europe, the Middle East, and Africa [16579]. This suggests that RIM's services and the software on BlackBerry devices were functioning correctly and that the failure lay in Vodafone's network infrastructure.
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident in this case was non-malicious. The issue was caused by a router error on Vodafone's network, as confirmed by Vodafone in their statement. RIM also responded to lay the blame on Vodafone's network, indicating that it was not a deliberate act to harm the system but rather an unintentional technical problem [16579].
Intent (Poor/Accidental Decisions) unknown (a) The reported cause is a router error on Vodafone's network; no decision, poor or accidental, is identified as contributing to it, so the intent is unknown. RIM stated that all BlackBerry services were operating normally and that a wider Vodafone service issue was impacting some BlackBerry customers in Europe, the Middle East, and Africa [16579]. Vodafone confirmed that the issue was caused by a router error and that services were being restored [16579].
Capability (Incompetence/Accidental) accidental (a) The failure was not attributed to development incompetence. The article does not mention any lack of professional competence, by individuals or by the development organization, as a cause of the failure [16579]. (b) The failure was attributed to an accidental factor. Vodafone confirmed that a router error caused the BlackBerry outage, which indicates an accidental fault rather than development incompetence [16579].
Duration temporary The failure was temporary. The BlackBerry service outage experienced by Vodafone customers in Europe, the Middle East, and Africa was caused by a router error on Vodafone's network [16579]. Vodafone and RIM worked to restore services as quickly as possible. RIM stated that all BlackBerry services were operating normally but acknowledged that a wider Vodafone service issue was impacting some BlackBerry customers in those regions. Vodafone provided updates on the restoration and apologized for any inconvenience caused to customers [16579].
Behaviour crash, other (a) crash: The incident can be categorized as a crash. Many BlackBerry users in Europe, the Middle East, and Africa could not access data services on their devices, so they could not receive emails or communicate over instant messaging; the system failed to perform its intended functions [16579]. (b) omission: The article does not specifically describe the system omitting to perform its intended functions at particular instances. (c) timing: The article does not indicate that the system performed its intended functions correctly but too late or too early. (d) value: The article does not suggest that the system performed its intended functions incorrectly. (e) byzantine: The article does not point to erroneous behaviour with inconsistent responses and interactions. (f) other: The observable behaviour was a loss of service: users were unable to access data services on their devices until Vodafone restored the network [16579].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence delay (e) delay: The article does not mention any consequences involving death, physical harm, basic needs, property loss, or non-human entities. The main consequence was inconvenience to BlackBerry customers in Europe, the Middle East, and Africa, who were unable to access data services on their devices [16579]. Users could not receive emails, communicate over instant messaging, or send and receive messages, which delayed their communication activities. The outage was attributed to a router error on Vodafone's network, and both Vodafone and RIM worked to restore services as quickly as possible [16579].
Domain information (a) The failed system in this incident was related to the information industry as it affected BlackBerry users' ability to access data services, e-mail, and instant messaging [16579]. The outage impacted users in Europe, the Middle East, and Africa, highlighting the importance of data communication services in the information industry.

Sources
