| Recurring |
one_organization |
(a) The software failure incident having happened again at one_organization:
The article mentions that Telstra experienced a massive outage for the second time in as many months. The first outage occurred on February 8, caused by a human error that knocked one of the company's mobile nodes offline. The second outage, which affected roughly 8 million people, was due to a problem that triggered a significant number of customers to be disconnected from the network, causing congestion [42115].
(b) The software failure incident having happened again at multiple_organization:
There is no mention in the article of similar incidents happening at other organizations or with their products and services. Therefore, it is unknown if similar software failure incidents have occurred at multiple organizations. |
| Phase (Design/Operation) |
design, operation |
(a) The software failure incident related to the design phase can be seen in the article where Telstra experienced a network outage affecting millions of customers. The February 8 outage was caused by a human error that knocked one of the company's mobile nodes offline, leading to congestion when users tried to reconnect [42115].
(b) The software failure incident related to the operation phase is evident in the same article where Telstra mentioned that a problem triggered a significant number of customers to be disconnected from the network, causing congestion as they all tried to automatically reconnect at the same time [42115]. |
| Boundary (Internal/External) |
within_system |
(a) The software failure incident reported in Article 42115 was primarily within_system. The outage on Telstra's network was caused by a problem within the system that disconnected a significant number of customers, who then all automatically tried to reconnect at the same time, congesting the network [42115]. The CEO said the cause of the outage was "unrelated" to a previous incident caused by a human error, which suggests that the issue originated within the system itself. |
| Nature (Human/Non-human) |
non-human_actions, human_actions |
(a) The software failure incident occurring due to non-human actions:
- The article mentions that the outage was caused by a problem that triggered a significant number of customers to be disconnected from the network, leading to congestion as they all automatically reconnected at the same time. This indicates a failure due to contributing factors introduced without human participation [42115].
(b) The software failure incident occurring due to human actions:
- The article reports a previous major outage on Telstra's network caused by an "embarrassing human error" that knocked one of the company's mobile nodes offline, resulting in massive congestion when users attempted to reconnect. This incident highlights a failure due to contributing factors introduced by human actions [42115]. |
| Dimension (Hardware/Software) |
hardware, software |
(a) The software failure incident related to hardware:
- The February 8 outage was caused by an "embarrassing human error" that knocked one of Telstra's mobile nodes offline, leading to massive congestion when users attempted to reconnect [42115].
(b) The software failure incident related to software:
- The recent outage affecting 8 million people was due to a problem that triggered a significant number of customers to be disconnected from the network, causing congestion as they all automatically tried to reconnect at the same time [42115]. |
| Objective (Malicious/Non-malicious) |
non-malicious |
(a) The software failure incident related to the Telstra outage mentioned in Article 42115 was non-malicious. The outage was caused by a problem that triggered a significant number of customers to be disconnected from the network, leading to congestion when they all attempted to reconnect simultaneously. Telstra CEO Andy Penn stated that the cause of the outage was unrelated to a previous incident caused by a human error [42115]. |
| Intent (Poor/Accidental Decisions) |
accidental_decisions |
(a) The software failure incident related to Telstra's network outage was not explicitly attributed to poor decisions. The February 8 outage was caused by an "embarrassing human error" that knocked one of the company's mobile nodes offline, leading to massive congestion when users attempted to reconnect [42115]. Telstra CEO Andy Penn mentioned that the cause of the recent outage was "unrelated" to the previous problems, indicating that poor decisions were not the primary contributing factor to the incident. |
| Capability (Incompetence/Accidental) |
development_incompetence, accidental |
(a) The software failure incident related to development incompetence is evident in the article as Telstra experienced a major outage affecting millions of customers due to what the company called an "embarrassing human error" that caused one of the mobile nodes to go offline, leading to network congestion and disconnections [42115]. This incident highlights a failure caused by a lack of professional competence in managing the network infrastructure.
(b) The software failure incident related to accidental factors is also present in the article, where Telstra faced another outage that was described as "unrelated" to the previous one caused by a human error. The company is still investigating how the service disruption occurred, indicating that the incident was accidental in nature rather than intentional [42115]. |
| Duration |
temporary |
(a) The software failure incident described in the article was temporary. The problem, which affected roughly 8 million people (half of Telstra's mobile customers), was due to congestion on the network. It was first identified at 6 p.m.; customers started to be reconnected within 2 hours, and full service returned by 10 p.m. The failure was therefore a temporary disruption in service rather than a permanent one [42115]. |
| Behaviour |
crash |
(a) crash: The article mentions a network failure incident at Telstra that caused congestion on the network, leading to roughly 8 million people being unable to make calls. The problem was first identified at 6 p.m., and customers started to be reconnected within 2 hours, with full service returning by 10 p.m. This indicates a crash where the system lost its state and was not performing its intended functions [42115].
(b) omission: The article does not specifically mention any instance of the system omitting to perform its intended functions at a particular instance.
(c) timing: The article does not indicate any timing-related failures where the system performed its intended functions too late or too early.
(d) value: The article does not mention any instances of the system performing its intended functions incorrectly.
(e) byzantine: The article does not describe any behavior of the system with inconsistent responses and interactions.
(f) other: The article does not describe any behavior outside the categories above. The incident is best characterized as a crash, as noted in (a): customers were disconnected from the network and the system did not perform its intended functions because of congestion [42115]. |