| Recurring |
one_organization, multiple_organization |
(a) The software failure incident having happened again at one_organization:
The article mentions that Lloyds Banking Group experienced payment problems due to a server failure, leading to debit card transactions being declined and ATMs not dispensing cash [23587]. This incident is similar to a previous "glitch" at RBS's banking brands that caused customers to go weeks without proper access to their accounts. Both incidents highlight the challenges faced by banks with aging IT systems in need of overhaul.
(b) The software failure incident having happened again at multiple_organization:
The article discusses how various high street banks, including Lloyds Banking Group and RBS, have faced IT issues due to aging systems and underinvestment in their IT infrastructure [23587]. These incidents indicate a broader industry issue where multiple organizations are struggling with the complexity of their legacy systems and the challenges of integrating new technologies and regulatory changes. |
| Phase (Design/Operation) |
design, operation |
(a) The article attributes software failures in banking systems largely to complexity introduced during system development and updates. Banks' legacy systems, which have been continuously modified and extended over the years, are described as resembling a "house of cards" in which a change to one part can have unforeseen consequences elsewhere [23587]. The article also notes that new functions are typically written in different programming languages, on different machines, by different teams, making it hard for any single person or team to understand the entire system and delaying the identification and fixing of issues [23587].
(b) On the operational side, the article discusses how banking systems run continuously, with no downtime for maintenance, which makes it hard to implement necessary changes and updates. It compares the situation to "trying to change the windscreen while you're driving down the M6" [23587]. Because payments go through around the clock, scheduling a maintenance window is nearly impossible, which further complicates keeping the IT systems reliable [23587]. |
| Boundary (Internal/External) |
within_system, outside_system |
(a) The articles highlight that the software failure incidents in the banking systems were primarily within the system. The failures were attributed to factors such as aging IT systems in need of overhaul, legacy systems that were not designed to handle modern banking channels like online and mobile banking, and the complexity arising from continuous bolt-on changes to the systems [23587].
(b) Additionally, the articles mention that external factors such as regulatory changes, increased capital requirements, and chronic underinvestment in IT systems also contributed to the software failures within the banking systems [23587]. |
| Nature (Human/Non-human) |
non-human_actions, human_actions |
(a) The software failure incident occurring due to non-human actions:
The article attributes the failures primarily to aging IT systems, legacy systems, continuous bolt-on changes, and the complexity that comes from integrating various technologies and channels over time. These are conditions of the systems themselves rather than a specific human action at the time of failure, yet they contributed to the failures. The systems were described as resembling a "house of cards," where even small changes could have cascading effects on the overall system [23587].
(b) The software failure incident occurring due to human actions:
On the other hand, human actions also played a role in the software failure incidents. The articles mention chronic underinvestment in IT systems, years of under-spending on computer systems, and the laying off of IT staff as contributing factors to the failures. Additionally, the prioritization of investments in more "sexy" or customer-facing technologies over the back-office systems was highlighted as a human decision affecting the reliability of the IT infrastructure [23587]. |
| Dimension (Hardware/Software) |
hardware, software |
(a) The article reports that the Lloyds Banking Group incident was caused by a server failure, which may indicate a hardware origin [23587]. More broadly, it describes the IT systems at high street banks as aging "legacy systems" that are 30-40 years old, were originally set up for branch banking, and have been continuously modified to accommodate new technologies such as ATMs, online banking, and mobile banking [23587]. These bolted-on changes have made the systems more complex, so that they resemble a "house of cards" where a change to a small part of the code can have far-reaching consequences.
(b) The articles also highlight that new functions in the banking systems are usually written in different programming languages, on different machines, by different teams, which makes it challenging for a single person or team to fully understand the entire structure of the system. This complexity in software development and integration contributes to software failures when changes are made or issues arise, leading to incidents where teams scramble to identify the root cause of the problem [23587]. |
| Objective (Malicious/Non-malicious) |
non-malicious |
(a) The article does not mention any malicious intent behind the software failure incident [23587].
(b) The software failure incident discussed in the articles is attributed to non-malicious factors such as aging IT systems, legacy systems, continuous bolt-on changes, underinvestment in IT infrastructure, and the complexity arising from multiple programming languages and teams working on different functions of the system [23587]. These non-malicious factors have contributed to the challenges faced by banks in maintaining reliable and robust IT systems, leading to incidents like payment problems, server failures, and ATM disruptions. |
| Intent (Poor/Accidental Decisions) |
poor_decisions, accidental_decisions |
(a) The articles highlight that the software failure incidents in the banking systems were partly due to poor decisions made over the years. There was chronic underinvestment in the systems, with banks laying off IT staff and cutting quality checks due to squeezed budgets [23587]. The legacy systems at banks, some of which are 30-40 years old, were not adequately updated to keep up with the evolving technology and regulatory changes, leading to a situation where new functions were added on top of old systems in a complex and interconnected manner [23587]. These poor decisions regarding underinvestment, lack of comprehensive updates, and reliance on outdated systems contributed to the software failures experienced by the banks.
(b) The software failures were also a result of accidental decisions or unintended consequences. The systems were described as resembling a "house of cards," where a change to a small piece of code could have far-reaching effects on other parts of the system, leading to failures [23587]. In addition, because different functions were written in different languages by different teams, it was hard to understand the entire structure of the system, which delayed identifying and fixing problems when they occurred [23587]. |
| Capability (Incompetence/Accidental) |
development_incompetence |
(a) The article suggests that development incompetence contributed to the software failure incident. The banks' systems are described as outdated, with some legacy systems being 30-40 years old and not originally designed to handle modern banking channels such as online and mobile banking [23587]. Their complexity has grown over time because changes were continuously bolted on rather than the systems being rebuilt from scratch, so even small code changes can have widespread impacts and cause failures [23587].
(b) The incident also reflects accidental failures caused by the continuous layering of new systems on top of old ones, an approach that the article says has made breakdowns more frequent. Chronic underinvestment in IT systems and the prioritization of spending on "sexier" technologies over back-office systems have also contributed to these accidental failures in the banking IT infrastructure [23587]. |
| Duration |
permanent, temporary |
The articles discuss software failure incidents that can be categorized as both temporary and permanent:
(a) Permanent: The articles mention that the banks' IT systems are facing ongoing issues due to a combination of factors such as aging legacy systems, continuous bolt-on changes, complex structures, chronic underinvestment, and the challenge of overhauling systems while operations are ongoing [23587].
(b) Temporary: Specific incidents like the server failure at Lloyds Banking Group causing debit card transactions to be declined and ATMs to not dispense cash for around three-and-a-half hours represent temporary software failures that were resolved within a relatively short timeframe [23587]. |
| Behaviour |
omission, other |
(a) crash: The articles mention incidents where customers were cut off from their cash due to server failures, leading to debit card transactions being declined and ATMs not dispensing cash [23587].
(b) omission: The articles discuss how some customers of RBS's banking brands went weeks without being able to access their accounts properly due to a "glitch," indicating an omission of the system to provide access to accounts [23587].
(c) timing: The articles highlight that the systems in banks are struggling to keep up with the real-time demands of modern banking, with the reconciliation between different banking channels becoming increasingly challenging due to the widening gulf between background processes and user activities [23587].
(d) value: There is no specific mention of the system performing its intended functions incorrectly in the articles.
(e) byzantine: The articles describe how changes made to a small part of the code can have far-reaching consequences, causing issues in seemingly unrelated areas of the system, resembling a "house of cards" where a change in one area can lead to failures in another [23587].
(f) other: The articles also discuss the chronic underinvestment in IT systems in banks, leading to quality checks being cut, compounded by changes in regulations and increased capital requirements, which further strain the already complex and aging systems [23587]. |