Published Date: 2018-04-28
Postmortem Analysis | |
---|---|
Timeline | 1. The software failure incident at TSB occurred in April 2018 [70984]. |
System | 1. TSB's IT system 2. Proteo system 3. Legacy systems 4. Lloyds Banking Group's system [70984] |
Responsible Organization | 1. TSB, Sabadell, and Lloyds Banking Group were responsible for the software failure incident [70984]. 2. TSB's rushed and inadequate testing and poor internal communication contributed to the failure [70984]. 3. Sabadell's decision to migrate TSB to its Proteo system without proper testing and communication also played a role in the incident [70984]. |
Impacted Organization | 1. TSB customers were impacted by the software failure incident [70984, 72311, 70069, 72314]. 2. TSB itself faced significant disruptions and financial costs due to the IT meltdown [70984, 72311, 70069, 72314]. 3. Sabadell Bank, the Spanish owner of TSB, was also impacted by the incident [70984, 72311, 70069]. |
Software Causes | 1. The software migration from legacy systems to Sabadell's in-house developed system, Proteo, at TSB was hindered by rushed and inadequate testing and poor internal communication [70984]. 2. Testing of different systems was not as thorough as it could have been due to time constraints and poor scoping of digital and mobile payment testing [70984]. 3. There were multiple daily failures in Proteo during testing, and testing was conducted based on knowledge of legacy systems rather than adequate training on Proteo [70984]. 4. Lack of communication between IT and the business about managing testing and poorly designed or rushed tests contributed to the failure incident [70984]. |
Non-software Causes | 1. Rushed and inadequate testing due to a self-imposed deadline and lack of time to test everything thoroughly [70984] 2. Poor internal communication between IT and the business regarding testing management [70984] |
Impacts | 1. The software failure incident at TSB led to chaos and left up to 1.9 million customers unable to access their accounts, affecting services like online banking and money transfers [70984]. 2. TSB customers faced problems making bill and mortgage payments, with reports of incorrect balances, accounts of other customers being visible, and issues with direct debits [70984]. 3. The IT meltdown caused severe disruptions for customers, including some losing money from their accounts due to fraud, with about 1,300 customers affected [72314]. 4. Businesses reported difficulties in paying staff, and some customers faced challenges in paying for services like restaurant bills and hotel stays [70069]. 5. TSB had to waive £10 million in overdraft fees, increase interest payments on current accounts, and cancel all overdraft fees for April to prevent customer losses [72311]. 6. The crisis led to TSB facing a compensation bill likely to run into tens of millions of pounds, tarnishing its reputation and potentially causing a customer exodus [70984]. 7. The incident resulted in TSB bringing in remedial teams from IBM to address the problems and facing investigations by regulators like the Financial Conduct Authority [70984]. |
Preventions | 1. Thorough Testing: The failures that followed the migration at TSB could have been prevented by more thorough and comprehensive testing of the new system before the migration took place. Rushed and inadequate testing led to various issues during the migration process [70984]. 2. Proper Communication: Improved internal communication between IT teams and the business side could have helped in managing the testing process more effectively. Lack of communication about testing responsibilities and progress contributed to the problems faced during the migration [70984]. 3. Adequate Training: Providing adequate training on the new system, Proteo, to employees involved in testing and using the system could have helped in conducting more accurate and relevant tests. Lack of proper training led to testing based on legacy systems knowledge, which was not sufficient for the new system [70984]. |
Fixes | 1. Thorough and Extensive Testing: The problems with the migration at TSB could have been addressed by thorough and extensive testing of the different systems involved, to ensure all functionality worked properly [70984]. 2. Improved Internal Communication: Better internal communication between IT and business teams could have helped in managing the testing process more effectively and ensuring all aspects were covered adequately [70984]. 3. Proper Training on New System: Providing adequate training on the new Proteo system to testers and employees could have improved the testing process and reduced errors caused by relying on legacy system knowledge [70984]. 4. Avoid Rushed Deadlines: Avoiding rushed deadlines and allowing sufficient time for testing and system readiness before the migration could have prevented the chaos that ensued after the upgrade [70984]. |
References | 1. Insiders with extensive knowledge of the systems involved [70984] 2. TSB chief executive, Paul Pester [72311] 3. Contractors who worked on the project [70984] 4. TSB chairman, Richard Meddings [72311] 5. Sabadell Bank [70984] 6. Financial Conduct Authority (FCA) [72311] 7. Lloyds Banking Group [70984] 8. IBM [70984] 9. Treasury committee [72311] 10. Reuters [70984] |
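
The kind of pre-go-live check that the "Thorough Testing" items point toward can be illustrated with a minimal sketch. It assumes a trial migration has produced two comparable extracts of account records, one from the legacy system and one from the new one. The `AccountRecord` type, its fields, and the `reconcile` function are hypothetical names invented for this illustration; nothing here reflects how TSB, Lloyds, or Proteo actually store or test data. The check flags the symptoms the articles describe: wrong balances, accounts attached to the wrong customer, and accounts that disappear or appear unexpectedly.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AccountRecord:
    """Hypothetical, simplified view of one account in either system."""
    account_id: str
    owner_id: str
    balance: Decimal


def reconcile(legacy, migrated):
    """Compare two extracts and return a list of discrepancy messages."""
    legacy_by_id = {r.account_id: r for r in legacy}
    migrated_by_id = {r.account_id: r for r in migrated}
    problems = []

    for account_id, old in legacy_by_id.items():
        new = migrated_by_id.get(account_id)
        if new is None:
            problems.append(f"{account_id}: missing after migration")
            continue
        if new.balance != old.balance:
            problems.append(
                f"{account_id}: balance {old.balance} became {new.balance}"
            )
        if new.owner_id != old.owner_id:
            problems.append(
                f"{account_id}: owner {old.owner_id} became {new.owner_id}"
            )

    # Accounts that exist only in the new system are also suspicious.
    for account_id in sorted(migrated_by_id.keys() - legacy_by_id.keys()):
        problems.append(f"{account_id}: unexpected account after migration")

    return problems


if __name__ == "__main__":
    legacy = [
        AccountRecord("A1", "C1", Decimal("100.00")),
        AccountRecord("A2", "C2", Decimal("50.00")),
    ]
    migrated = [
        AccountRecord("A1", "C1", Decimal("100.00")),
        AccountRecord("A2", "C9", Decimal("55.00")),
    ]
    for line in reconcile(legacy, migrated):
        print(line)
```

A check like this only covers data integrity. It says nothing about load, payment channels, or mobile access, which the contractors quoted in [70984] also identified as under-tested.
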
Category | Option | Rationale |
---|---|---|
Recurring | one_organization, multiple_organization | (a) The software failure incident having happened again at TSB: - TSB faced a major IT meltdown in 2018 when attempting to migrate customer records to a new system, causing chaos and leaving customers unable to access their accounts [Article 70984]. - The IT migration at TSB was hindered by rushed and inadequate testing, poor internal communication, and lack of thoroughness in testing different systems, leading to problems with digital and mobile payments after the migration [Article 70984]. - TSB's IT meltdown led to customers being locked out of their accounts, incorrect balances being displayed, and accounts of other customers being visible to some users [Article 70984]. - TSB's chief executive, Paul Pester, faced criticism for downplaying the access issues and prematurely declaring that the IT problems had been resolved for most customers [Article 72311]. - TSB customers continued to experience difficulties accessing their accounts for several days, with Pester admitting that the bank was "on its knees" and bringing in experts from IBM to address the issues [Article 72311]. - The IT migration at TSB was part of a project to transfer customer records from Lloyds Banking Group's system to Sabadell's Proteo system, with testing being conducted based on legacy systems knowledge rather than adequate training on Proteo [Article 70984]. (b) The software failure incident having happened again at other organizations: - RBS faced a similar IT failure in the past, resulting in millions of customers being unable to access their accounts, leading to fines and regulatory investigations [Article 70984]. - Banks globally are at risk of facing similar crises as they upgrade aging computer systems after years of under-investment and mergers, potentially leading to technical problems and customer disruptions [Article 70984]. |
Phase (Design/Operation) | design, operation | (a) In the TSB software failure incident, the design phase contributed to the failure. The migration to a new IT system, Proteo, from legacy systems was rushed and inadequately tested. Contractors involved in the project mentioned that testing was not thorough as TSB neared deadlines, and digital and mobile payment testing were not properly scoped. There were multiple daily failures in Proteo during testing, and there was a lack of adequate training on Proteo, leading to testing based on legacy system knowledge [70984]. (b) The operation phase also played a role in the TSB software failure incident. After the migration, customers reported problems making bill and mortgage payments, incorrect balances, and seeing other customers' accounts. TSB customers faced difficulties accessing their accounts, and some experienced issues with payments and account details. The bank struggled to provide a full service, with customers still unable to make payments or access key accounts weeks after the migration [70984]. |
Boundary (Internal/External) | within_system, outside_system | (a) within_system: The software failure incident at TSB was primarily due to issues within the system itself. The failure occurred during a computer systems migration where TSB was transferring customer records to a new IT system designed by its owner, Sabadell. The migration involved moving data from legacy systems to the new Proteo system, leading to problems with online banking, mobile app access, incorrect balances, and account access issues for customers [70984]. The testing of the different systems during the migration was rushed and inadequate, with poor internal communication between IT and the business about managing testing. Testing was not as thorough as it could have been, and there were failures in the Proteo system that persisted for days. The testing was conducted based on knowledge of the legacy systems, and there were multiple daily failures in Proteo before the migration went live [70984]. (b) outside_system: The software failure incident at TSB was also influenced by factors outside the system. TSB faced challenges due to the complex legacy systems inherited from Lloyds Banking Group, which were a result of the merger with HBOS during the banking crisis. The decision to migrate to a new system was influenced by Sabadell's acquisition of TSB and the need to integrate with Sabadell's Proteo system. The budget for the migration was considered low for the scale of the project, and warnings were given about the high risk and potential costs of the migration [70984]. |
Nature (Human/Non-human) | non-human_actions, human_actions | (a) The software failure incident occurring due to non-human actions: - The TSB IT meltdown in 2018 was caused by a computer systems migration that left up to 1.9 million customers unable to access their accounts. The migration was hindered by rushed and inadequate testing, poor internal communication, and a lack of thorough testing of different systems, leading to problems with digital and mobile payments. The testing was not as thorough as it could have been due to time constraints and a rushed deadline [70984]. - The TSB migration to Sabadell's Proteo system from legacy systems was a complex process involving the transfer of customer records and accounts. The system was a "bodge of many old systems" inherited from Lloyds Banking Group, leading to challenges in the migration process [70984]. - The TSB system was a mirror copy of the sprawling Lloyds Banking Group merged systems, which posed challenges for a smaller bank like TSB to inherit all the problems of a larger system. The Proteo system was designed to handle mergers like that of TSB into the Spanish group, but the migration faced difficulties due to the complexity of the legacy systems [70984]. (b) The software failure incident occurring due to human actions: - TSB faced criticism for rushed and inadequate testing, poor internal communication, and a lack of thorough testing of different systems during the IT migration. The testing was not as thorough as it could have been due to time constraints and a rushed deadline, indicating human factors contributing to the failure [70984]. - TSB rejected claims of shortcomings in testing but acknowledged the need for an investigation into why the migration did not go as expected. The bank is working to address the issues and keep customers informed about the situation [70984]. - TSB chief executive Paul Pester and chairman Richard Meddings are set to appear at a parliamentary hearing to explain how the problems occurred and what actions are being taken to fix them, indicating human accountability in addressing the software failure incident [70984]. |
Dimension (Hardware/Software) | hardware, software | (a) The software failure incident occurring due to hardware: - The TSB bank was migrating to Sabadell's in-house developed system, Proteo, from legacy systems for which it had been paying Lloyds around a hundred million pounds a year. The migration involved transferring records and accounts of its customers [70984]. - The Lloyds system, inherited by TSB, was described as a "bodge of many old systems" resulting from the integration of various banks during the banking crisis [70984]. - The TSB system was a mirror copy of the sprawling LBG merged systems, which was a bad fit for a smaller bank like TSB [70069]. - The Proteo system was designed in 2000 specifically to handle mergers like that of TSB into the Spanish group, but the development team did not have full control or understanding of the system they were migrating data from [70069]. (b) The software failure incident occurring due to software: - The IT migration at TSB was hindered by rushed and inadequate testing, poor internal communication, and lack of thorough testing of different systems [70984]. - Testing of the different systems was not as thorough as it could have been, with issues in digital and mobile payment testing, and poor communication between IT and the business about managing testing [70984]. - The testing of the new system was not properly scoped, and there were failures in Proteo that went on for days, indicating issues with the software [70984]. - The TSB system was described as a "nightmare" with major code changes on the hoof, and the system was problematic even before going live [70069]. - The TSB system was a "clusterfuck in the making" due to major code changes and lack of adequate training on Proteo, leading to testing based on legacy system knowledge [70069]. |
Objective (Malicious/Non-malicious) | non-malicious | (a) The software failure incident at TSB was non-malicious. The incident was a result of a computer systems migration that left up to 1.9 million customers unable to access their accounts. The migration was hindered by rushed and inadequate testing, poor internal communication, and a lack of thorough testing of different systems. The testing was not as thorough as it could have been due to rushing to meet deadlines, poorly designed or rushed tests, and a lack of communication between IT and the business about testing management. The failure was not attributed to malicious intent but rather to shortcomings in the testing and communication processes during the migration project [70984]. (b) The software failure incident at TSB was non-malicious. The incident was a result of a botched IT upgrade that left millions of customers locked out of their accounts. The problems began during an attempt to move data to a new computer system, which ultimately led to significant disruption across all areas of TSB's services. The issues were triggered by technical failures in the IT platform, inadequate risk management systems, insufficiently robust governance of the project, and a failure to plan for the IT migration properly. The incident resulted in chaos for customers, with some facing difficulties making payments, accessing accounts, and experiencing errors in account details. The failure was a result of systemic issues and shortcomings in the IT migration process, rather than any malicious intent [136658, 136614, 75434, 72355, 91842, 75402, 72314, 70069, 72311]. |
Intent (Poor/Accidental Decisions) | poor_decisions | The migration at TSB was hindered by rushed and inadequate testing and poor internal communication. Contractors involved in the project said that testing was not as thorough as it could have been because of the rush to meet deadlines, and that some tests were poorly designed or rushed. There were also issues with communication between IT and the business regarding testing management. This indicates contributing factors introduced by poor decisions in the software migration process [70984]. |
Capability (Incompetence/Accidental) | development_incompetence | (a) The software failure incident occurring due to development incompetence: - The TSB bank's computer systems migration was hindered by rushed and inadequate testing and poor internal communication, according to two contractors who worked on the project [70984]. - Testing of the different systems was not as thorough as it could have been due to rushing to meet deadlines, resulting in issues with digital and mobile payment testing [70984]. - There were failures in the Proteo system that went on for days, and testing was conducted based on knowledge of legacy systems rather than adequate training on the new system [70984]. - The Lloyds system inherited by TSB was described as complicated due to being created by amalgamating many systems, making the migration challenging [70984]. (b) The software failure incident occurring due to accidental factors: - TSB faced an IT meltdown during a computer systems migration that left up to 1.9 million customers unable to access their accounts [70984]. - The IT migration was initially planned to transfer customer records from Lloyds to Sabadell's Proteo system, but issues arose with incorrect balances, account access problems, and other system failures [70984]. - TSB customers reported problems making bill and mortgage payments, and the bank experienced a full-blown crisis as customers were locked out of their accounts [70984]. - The IT meltdown led to TSB customers facing severe disruptions, with issues ongoing for weeks and customers unable to access accounts or make payments [72311]. |
Duration | temporary | The software failure incident at TSB was temporary but prolonged. Customers started reporting problems with accessing their accounts and making payments within hours of the migration over the weekend of April 21-22. The issues persisted for days, and some customers were still facing disruptions almost a month after the botched IT upgrade. The bank's CEO, Paul Pester, tweeted prematurely that the problems had been resolved, but customers continued to experience difficulties. The bank had to bring in a team of experts from IBM to fix the problems, and the chaos continued for weeks, with some customers unable to make payments or access key accounts [70984]. |
Behaviour | crash, omission, timing, value, other | (a) crash: The TSB software failure incident can be categorized as a crash behavior. The incident led to up to 1.9 million customers being unable to access their accounts, with reports of incorrect balances, accounts of other customers being visible, and problems with making payments [70984]. (b) omission: The software failure incident also exhibited omission behavior. Customers reported issues with digital and mobile payments not working properly, indicating an omission in the system's performance [70984]. (c) timing: The timing behavior was evident in the software failure incident. TSB faced delays in the migration process, with the initial deadline extended to April 2018. The rushed testing and inadequate communication contributed to the system not being ready for the migration [70984]. (d) value: The software failure incident involved value behavior. Customers experienced problems such as receiving texts about card usage abroad, discovering extra funds in their accounts, and encountering issues with mortgage accounts, indicating incorrect values being processed by the system [70984]. (e) byzantine: The software failure incident did not exhibit byzantine behavior. (f) other: The software failure incident also involved other behaviors such as poor testing, rushed testing, lack of communication between IT and business teams, and inadequate training on the new system. These factors contributed to the system's failure during the migration process [70984]. |
Layer | Option | Rationale |
---|---|---|
Perception | None | None |
Communication | None | None |
Application | None | None |
Category | Option | Rationale |
---|---|---|
Consequence | basic, property, delay, other | (a) death: People lost their lives due to the software failure - There is no mention of any deaths resulting from the TSB software failure incident in the articles. (b) harm: People were physically harmed due to the software failure - There is no mention of physical harm to individuals due to the TSB software failure incident in the articles. (c) basic: People's access to food or shelter was impacted because of the software failure - The TSB software failure incident impacted customers' ability to access their accounts, make payments, and manage their finances, causing significant inconvenience and financial difficulties [Article 72311]. (d) property: People's material goods, money, or data was impacted due to the software failure - The TSB software failure incident resulted in customers experiencing issues such as incorrect balances, inability to access accounts, and seeing other customers' accounts [Article 72311]. (e) delay: People had to postpone an activity due to the software failure - Customers faced delays and disruptions in managing their finances, making payments, and accessing their accounts due to the TSB software failure incident [Article 72311]. (f) non-human: Non-human entities were impacted due to the software failure - The TSB software failure incident primarily affected customers and their financial transactions, with no specific mention of non-human entities being impacted. (g) no_consequence: There were no real observed consequences of the software failure - The TSB software failure incident led to significant disruptions, financial difficulties, delays, and inconvenience for customers, indicating real consequences of the failure [Article 72311]. (h) theoretical_consequence: There were potential consequences discussed of the software failure that did not occur - The articles do not mention any potential consequences discussed that did not occur as a result of the TSB software failure incident. (i) other: Was there consequence(s) of the software failure not described in the (a to h) options? What is the other consequence(s)? - The TSB software failure incident resulted in customers facing challenges such as being unable to complete financial transactions, access funds for essential activities, and experiencing frustration and inconvenience due to the prolonged disruptions [Article 72311]. |
Domain | finance | (a) The failed system was intended to support the finance industry. TSB, a bank, experienced a software failure incident related to an IT system meltdown that left millions of banking customers locked out of their accounts for weeks [Article 70984]. The system migration at TSB was hindered by rushed and inadequate testing, poor internal communication, and a lack of adequate training on the new system, impacting up to 1.9 million customers [Article 70984]. The IT migration involved transferring customer records and accounts from a system operated by Lloyds Banking Group to Proteo, the in-house system developed by TSB's current owner, Sabadell [Article 70984]. The failed IT upgrade left customers unable to access their accounts or make payments, and caused incorrect balances and account mix-ups [Article 70984]. |
Article ID: 136658
Article ID: 71226
Article ID: 136614
Article ID: 75434
Article ID: 72355
Article ID: 136812
Article ID: 91842
Article ID: 75402
Article ID: 72314
Article ID: 70069
Article ID: 72311
Article ID: 70984