Incident: M&S Website Redesign Causes Major Shopping Disruption

Published Date: 2014-02-21

Postmortem Analysis
Timeline
1. The software failure incident happened in February 2014.
System
1. Website redesign system
2. Software development team's in-house design and operation system
3. Platform developed by Amazon
4. New website design and operation system
5. Software development team of 50 computer and design experts
6. Three-year development program
7. Laura Wade-Gery's e-commerce strategy
8. Customer browsing and shopping experience system
Responsible Organization
1. The software development team of 50 computer and design experts at Marks & Spencer, who revamped the website and moved the design and operation in-house [24671].
Impacted Organization
1. Customers of Marks & Spencer, who were unable to make purchases on the new website. This led to frustration and threats to boycott the store [24671].
Software Causes
1. The incident followed the decision to revamp the website, moving design and operation in-house and away from a tried-and-tested platform developed by Amazon [24671].
2. Problems arose after M&S set up a software development team of 50 computer and design experts to build the new website [24671].
3. The issues surfaced despite a three-year development program that included two years of testing, which suggests software defects were not adequately addressed during the testing phase [24671].
Non-software Causes
1. A higher than usual number of customers overwhelmed the website [24671].
2. Items disappeared from virtual baskets [24671].
3. Customers faced difficulties after being asked to reset old passwords [24671].
Impacts
1. Customers were unable to make purchases on the new Marks & Spencer website due to the glitch, leading to frustration and anger among users [24671].
2. Items disappeared from virtual baskets, and some users encountered difficulties when asked to reset their old passwords [24671].
3. Customers threatened to boycott the store unless the old website was restored, potentially leading to a loss of business for Marks & Spencer [24671].
Preventions
1. Thorough testing: More extensive and rigorous testing, including stress testing, load testing, and user acceptance testing, could have helped identify and address potential issues before the launch of the new website [24671].
2. Incremental rollout: A phased rollout of the new website, starting with a smaller group of users or regions, could have helped detect and resolve issues in a controlled environment before a full-scale launch [24671].
3. Retaining external expertise: Maintaining the partnership with Amazon for the website platform could have ensured a smoother transition and potentially avoided the issues faced during the in-house redesign [24671].
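The phased rollout in prevention 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of anything M&S built: the function names, the salt value, and the use of a customer ID string are all hypothetical. Each customer is hashed into a stable bucket from 0 to 99, and only buckets below the current rollout percentage are sent to the new site.

```python
import hashlib


def rollout_bucket(user_id: str, salt: str = "new-site-rollout") -> int:
    """Map a user ID to a stable bucket in the range 0..99.

    The salt is a hypothetical label; changing it reshuffles the buckets.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def serves_new_site(user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the current rollout percentage."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return rollout_bucket(user_id) < rollout_percent
```

Because the bucket depends only on the ID and the salt, a customer who sees the new site at a low percentage keeps seeing it as the percentage rises, and setting the percentage to 0 sends everyone back to the old site.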
Fixes
1. Conduct thorough testing before launching the new website to identify and address any potential glitches or errors [24671].
2. Consider reverting to the old website temporarily while resolving the issues with the redesign [24671].
3. Implement a more robust infrastructure and platform for the website so that it can handle high traffic and customer volume without crashing [24671].
4. Communicate clearly with customers about the issues and the steps being taken to rectify them, to maintain transparency and trust [24671].
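Fix 3 asks for a site that handles high customer volume without crashing. One common approach is admission control: let a fixed number of shoppers in and hold the rest in an ordered queue. The sketch below is hypothetical and is not what the article says M&S implemented; the class name, method names, and in-memory storage are assumptions made for illustration.

```python
from collections import deque


class WaitingRoom:
    """Admit at most `capacity` sessions at once; queue the rest in arrival order."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.active = set()     # session IDs currently allowed to shop
        self.waiting = deque()  # session IDs waiting, oldest first

    def arrive(self, session_id: str) -> str:
        """Return "admitted" or "queued" for this session."""
        if session_id in self.active:
            return "admitted"
        if session_id in self.waiting:
            return "queued"
        if len(self.active) < self.capacity:
            self.active.add(session_id)
            return "admitted"
        self.waiting.append(session_id)
        return "queued"

    def leave(self, session_id: str) -> None:
        """Release a slot (or abandon the queue) and admit waiting sessions."""
        self.active.discard(session_id)
        if session_id in self.waiting:
            self.waiting.remove(session_id)
        while self.waiting and len(self.active) < self.capacity:
            self.active.add(self.waiting.popleft())
```

A production version would need shared state across servers and session timeouts, which this single-process sketch leaves out.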
References
1. Customers' complaints on Facebook [24671]
2. Error message displayed on the website [24671]
3. M&S spokesperson's statement [24671]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization The incident is categorized under the "one_organization" option. It occurred within Marks & Spencer as a result of the multi-million pound redesign of its website, which left customers unable to purchase items, facing error messages, and experiencing difficulties with their virtual baskets and passwords. Customers expressed frustration and threatened to boycott the store until the old website was restored. The rationale given is that the failure happened within the same organization; the details cited here describe this incident only [24671].
Phase (Design/Operation) design, operation (a) The software failure incident in the article can be attributed to the design phase of the system development. The article mentions that the problems started after the company decided to revamp the website, moving the design and operation in-house and away from a tried-and-tested platform developed by Amazon. M&S spent millions of pounds building the new website with a software development team of 50 computer and design experts. The new site was the result of a three-year development program that involved two years of testing [24671]. (b) The software failure incident can also be linked to the operation phase. Users encountered issues such as items disappearing from their virtual baskets, difficulties resetting old passwords, and encountering error messages when trying to pay for products. Customers were met with an error message asking them to bear with the company due to a higher than usual number of customers, indicating operational challenges. These issues suggest contributing factors introduced by the operation or misuse of the system [24671].
Boundary (Internal/External) within_system (a) within_system: The software failure incident in this case was primarily within the system. The problems started after the company decided to revamp the website, moving the design and operation in-house and away from a tried-and-tested platform developed by Amazon. M&S spent millions of pounds building the new website with a software development team of 50 computer and design experts. The issues such as items disappearing from virtual baskets, difficulties with password resets, and the error message during payment were all internal to the system changes made by M&S [24671].
Nature (Human/Non-human) non-human_actions, human_actions (a) Non-human actions: The immediate failures came from the behaviour of the new website itself. Items disappeared from virtual baskets, customers faced difficulties with password resets, and error messages appeared during the purchasing process [24671]. (b) Human actions: The decision to revamp the website and move design and operation in-house, away from a tried-and-tested platform developed by Amazon, the development of the new site by a team of computer and design experts, and the oversight of the project by the executive director for multi-channel e-commerce at M&S all contributed to the failure. Customer complaints and threats to boycott the store followed as a result of the issues with the new website [24671].
Dimension (Hardware/Software) software (a) hardware: The article does not attribute the incident to hardware issues [24671]. (b) software: The incident was primarily attributed to software issues. Customers faced problems with the new website after M&S decided to revamp it and move the design and operation in-house. These included being unable to buy items, error messages, items disappearing from virtual baskets, and password reset difficulties, which indicate software-related glitches rather than hardware issues [24671].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident in this case does not seem to be malicious. It appears to be a non-malicious failure caused by the multi-million pound redesign of the Marks & Spencer website. The incident resulted in customers experiencing difficulties in purchasing items, encountering error messages, items disappearing from virtual baskets, and being asked to reset passwords. The problems arose after the company decided to revamp the website and move the design and operation in-house, away from a platform developed by Amazon. The failure was likely a result of technical issues and challenges in the redesign process rather than any malicious intent [24671].
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident can be attributed to poor decisions made during the redesign process. The company decided to revamp the website, moving away from a tried-and-tested platform developed by Amazon and bringing the design and operation in-house. This decision led to issues such as customers facing error messages, items disappearing from virtual baskets, and difficulties with password resets [24671]. Additionally, M&S spent millions of pounds building the new website with a software development team of 50 computer and design experts, yet the result was customer dissatisfaction and threats of boycott due to the poor user experience [24671].
Capability (Incompetence/Accidental) development_incompetence (a) The software failure incident in this case seems to be related to development incompetence. The article mentions that the problems started after the company decided to revamp the website, moving the design and operation in-house and away from the platform developed by Amazon. M&S spent millions of pounds building the new website with a software development team of 50 computer and design experts. However, users faced issues such as items disappearing from their virtual baskets, difficulties with password resets, and error messages during the checkout process. This indicates that the failure may have been due to factors introduced by the development team's lack of professional competence [24671]. (b) The software failure incident could also be attributed to accidental factors. Users were met with error messages, items disappeared from their virtual baskets, and some faced difficulties after being asked to reset their old passwords. These issues could have been introduced unintentionally during the revamp of the website and the transition to in-house design and operation, leading to a negative user experience and potential loss of customers [24671].
Duration temporary The software failure incident reported in Article 24671 was temporary. The article mentions that users were met with an error message asking them to 'please bear with us' and informing them that the website was experiencing a higher than usual number of customers, putting them in a queue. Additionally, customers reported issues such as items disappearing from their virtual baskets and difficulties with password resets. These issues indicate a temporary disruption rather than a permanent failure [24671].
Behaviour crash, omission, other (a) crash: The software failure incident in the article is related to a crash. The new Marks & Spencer website crashed, leaving customers struggling to buy anything. Users were met with an error message asking them to 'please bear with us' as they experienced difficulties accessing the website [24671]. (b) omission: The software failure incident also involved omission. Customers reported that items disappeared from their virtual baskets, indicating that the system omitted to retain the selected items during the purchasing process [24671]. (c) timing: There is no specific information in the article indicating a timing-related failure where the system performed its intended functions but at the wrong time. (d) value: The software failure incident did not involve the system performing its intended functions incorrectly. (e) byzantine: The software failure incident did not exhibit behaviors of inconsistency or erratic responses that would classify it as a byzantine failure. (f) other: The software failure incident could be categorized as an overload issue where the system was unable to handle the higher than usual number of customers, leading to a crash and difficulties in accessing the website [24671].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence property, delay (d) property: People's material goods, money, or data were impacted by the software failure. Customers experienced issues such as items disappearing from their virtual baskets, difficulties with resetting passwords, and being unable to complete purchases on the new Marks & Spencer website [24671].
Domain sales (a) The failed system was intended to support the sales industry. The incident involved the new Marks & Spencer website crashing, which led to customers struggling to buy products online, encountering error messages, and facing difficulties with their virtual baskets and passwords [24671].

