Recurring |
one_organization |
(a) The incident, in which a single packet of data from Yealink SIP-T22P phones crashed servers, was attributed to the firmware on motherboards made by Lex CompuTech, as confirmed by Intel [17085]. It was isolated to the specific type of motherboard that Kristian Kielhofner blogged about; it was not a widespread flaw in Intel's networking controller, which had been shipping for five years.
(b) The articles do not provide information about similar incidents happening at other organizations or with their products and services. |
Phase (Design/Operation) |
design, operation |
(a) The software failure incident described in the article was primarily due to a design issue. The root cause of the problem was traced back to a single packet of data being sent by a specific model of phone, the Yealink SIP-T22P, which was causing the VOIP servers to crash. This issue was further attributed to a firmware bug in the Electrically Erasable Programmable Read-Only Memory (EEPROM) software used by the networking hardware built by Intel but shipped with motherboards manufactured by Lex CompuTech [17085].
(b) Operation of the system also exacerbated the incident. After each crash caused by the packet, the servers had to be unplugged, plugged back in, and restarted by hand, because simply restarting the server or turning it off did not resolve the issue. The bug could also be exploited to take down entire racks of servers running the buggy firmware [17085]. |
Boundary (Internal/External) |
within_system |
(a) The software failure incident described in the article is primarily within_system. The issue stemmed from a single packet of data sent by a specific model of phone, the Yealink SIP-T22P, which caused the VOIP servers to crash. The problem was traced back to a firmware bug in the Intel 82574L network controller, which was due to an incorrect EEPROM image programmed by the motherboard manufacturer, Lex CompuTech [17085]. |
Nature (Human/Non-human) |
non-human_actions, human_actions |
(a) The software failure incident in this case was primarily due to non-human actions. The root cause of the issue was traced back to a single packet of data being sent by a specific model of phone (Yealink SIP-T22P) to the VOIP servers, which caused the servers to crash. This packet of data affected the Intel 82574L network controller on the servers, leading to the crashes [17085].
(b) Human actions were also involved in the software failure incident. The firmware issue that caused the bug was attributed to the company that made the motherboards, Lex CompuTech. They used the wrong version of the Electrically Erasable Programmable Read-Only Memory (EEPROM) software for the controller setup, which ultimately led to the problem. Additionally, the detective work conducted by Kristian Kielhofner to identify the root cause of the crashes involved human effort and expertise [17085]. |
Dimension (Hardware/Software) |
hardware |
(a) The software failure incident in this case was primarily due to hardware issues. It was traced to the Intel 82574L network controller, whose firmware misbehaved because of an incorrect EEPROM image programmed by the motherboard manufacturer, Lex CompuTech [17085]. The faulty controller caused the servers to crash when they received a specific packet from the Yealink SIP-T22P phone, cutting off network connectivity and requiring manual intervention to restart the servers.
(b) The incident was not attributed to the software running on the servers but to the bug in the network controller's firmware. The Linux-based VOIP servers functioned as intended until they received the problematic packet that triggered the controller failure [17085]. |
Objective (Malicious/Non-malicious) |
non-malicious |
(a) The software failure incident described in the article is non-malicious. The failure was caused by a bug in the firmware of the networking hardware, specifically the Intel 82574L network controller, due to an incorrect EEPROM image programmed during manufacturing by the motherboard manufacturer, Lex CompuTech [17085]. The incident was not a result of malicious intent but rather a technical flaw in the system. |
Intent (Poor/Accidental Decisions) |
accidental_decisions |
(a) The software failure incident described in the article was not due to poor decisions but rather an accidental decision or mistake. The root cause of the issue was traced back to a firmware bug in the Electrically Erasable Programmable Read-Only Memory (EEPROM) software that was programmed incorrectly during manufacturing by the motherboard manufacturer, Lex CompuTech. Intel confirmed that the bug was not its fault but was in the firmware that Lex shipped with its motherboards [17085]. |
Capability (Incompetence/Accidental) |
accidental |
(a) The software failure incident described in the article was not due to development incompetence. Instead, it was attributed to a specific bug in the firmware that was shipped with the motherboards by the company Lex CompuTech. Intel confirmed that the issue was related to an incorrect EEPROM image programmed during manufacturing by the motherboard vendor, not a design problem with the Intel networking controller [17085].
(b) The software failure incident was accidental in nature. It was caused by a single packet of data sent by a specific model of phone (Yealink SIP-T22P) that was crashing the VOIP servers. This packet of data was identified as the root cause of the crashes, and it was not intentionally created to cause harm but rather had an unintended impact on the server hardware [17085]. |
Duration |
temporary |
(a) The software failure incident described in the article was temporary. The servers crashed when a specific packet sent by one model of phone, the Yealink SIP-T22P, knocked out the Intel 82574L network controller. Restarting the server or turning it off did not fix the issue; the only remedy was to unplug the server, plug it back in, and restart the machine to get it back on the network [17085]. Each outage therefore ended once the server was power-cycled, although this restored service rather than addressing the root cause. |
Behaviour |
crash, other |
(a) crash: The software failure incident described in the article resulted in crashes of the servers running the VOIP systems. The crashes were caused by a specific packet of data sent by a particular model of phone, which led to the system's Intel network controller being knocked offline and requiring manual intervention to restart the servers [17085].
(b) omission: The article does not describe the system omitting to perform its intended functions.
(c) timing: The software failure incident did not involve the system performing its intended functions too late or too early; rather, it led to unexpected crashes of the servers [17085].
(d) value: The failure was not related to the system performing its intended functions incorrectly; instead, it was about the system crashing due to a specific packet of data causing issues with the network controller [17085].
(e) byzantine: The system did not behave erroneously with inconsistent responses and interactions; the failure was a crash traced to the network controller [17085].
(f) other: The bug behaved in an unusual way. Depending on the value of one byte in the packet sent by the phone, the packet would either crash the controller or inoculate it against further packets of death until the server was powered off. This behavior was described as unusual and not commonly encountered in network troubleshooting [17085]. |