Incident: SimCity Launch Week Server Connectivity Issues

Published Date: 2013-03-07

Postmortem Analysis
Timeline 1. The incident occurred during SimCity's launch week [17555]. 2. The article was published on 2013-03-07. 3. The incident therefore occurred in March 2013.
System 1. Server architecture [17555] 2. Connectivity system [17555] 3. Cloud save function [17555]
Responsible Organization 1. Electronic Arts (EA), the game's publisher, and the SimCity development team: the server architecture could not handle the heavy load from a high influx of new users logging in, which led to error messages, random disconnections, bugs, and long wait times [17555].
Impacted Organization 1. Gamers, who experienced error messages, random disconnections, bugs, and long wait times that prevented them from enjoying SimCity [17555]. 2. The SimCity developers and publisher Electronic Arts (EA), who had to deal with server architecture problems, bug reports, and server overload from the high influx of new users [17555]. 3. CNET editor Jeff Bakalar, who encountered connectivity issues and was unable to play the game for his review [17555].
Software Causes 1. Server architecture issues leading to bugs and long wait times for players to enter servers [17555] 2. Heavy server load due to a high influx of new users logging in to play, causing connectivity woes [17555] 3. Requirement of an Internet connection to play and stay in the game, even in single-player mode, contributing to the server load issues [17555]
Non-software Causes 1. A launch-week surge of new users logging in to play, which placed a heavy load on the servers [17555]. 2. The decision to require an Internet connection to play and stay in the game, even in single-player mode, intended as a safeguard against software piracy [17555]. 3. Insufficient allocation of resources by EA to handle the surge of gamers during launch week [17555]. 4. Unstable servers, which led to random shutdowns and problems saving to EA's cloud save function [17555].
Impacts 1. Players encountered error messages and random disconnections that prevented them from enjoying the game, leading to frustration and disappointment [17555]. 2. Players lost significant portions of the cities they had built when servers shut down or saving to the cloud failed, wasting their time and effort [17555]. 3. The always-on Internet requirement for both single-player and multiplayer modes caused inconvenience and backlash among players, especially because previous SimCity titles could be played offline [17555]. 4. The surge of new users during launch week overloaded the servers, causing long wait times to enter servers, bugs, and other technical issues that degraded the gameplay experience [17555]. 5. The incident damaged EA's reputation: players expressed frustration on social media and started a petition urging EA to abandon the always-on DRM feature in SimCity and future games [17555].
Preventions 1. Implementing a more robust server architecture to handle the high influx of new users and heavy server load could have prevented the software failure incident [17555]. 2. Allocating appropriate resources to handle the surge of gamers during the launch week could have helped prevent the connectivity issues experienced by players [17555]. 3. Allowing players to access SimCity's single-player mode offline while maintaining the always-on Internet requirement for multiplayer could have alleviated the server load and potentially avoided the server stability issues [17555].
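Fixes 1. Deploying more servers, pushing updates and hotfixes, and addressing bug reports, as the development team was doing during launch week [17555]. 2. Temporarily disabling non-critical gameplay features to lessen the load on the servers [17555]. 3. Allocating appropriate resources to handle the surge of gamers during launch week, to prevent server overload and connectivity issues [17555]. 4. Allowing players to access SimCity's single-player mode offline while keeping the always-on Internet requirement for multiplayer, to alleviate server load and provide a smoother gaming experience [17555].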
References 1. SimCity Senior Producer Kip Katsarelis [17555] 2. Electronic Arts (EA) [17555] 3. CNET Editor Jeff Bakalar [17555]
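The Preventions and Impacts rows above describe players losing progress when saving to EA's cloud save function failed, and suggest allowing offline single-player play. A minimal sketch of a local-first save path that would tolerate a server outage is shown below. It is illustrative only and does not reflect how SimCity was implemented; the class `LocalFirstSaver`, the exception `CloudUnavailable`, and the `upload` callable are all hypothetical names introduced for this example.

```python
import json
from pathlib import Path


class CloudUnavailable(Exception):
    """Raised by the (hypothetical) cloud client when the server cannot be reached."""


class LocalFirstSaver:
    """Write every save to local disk first, then try the cloud; queue failures for retry."""

    def __init__(self, save_dir, upload):
        self.save_dir = Path(save_dir)
        self.save_dir.mkdir(parents=True, exist_ok=True)
        self.upload = upload   # hypothetical callable(name, data) supplied by the caller
        self.pending = []      # names of saves that have not reached the cloud yet

    def save(self, name, city):
        data = json.dumps(city)
        # The local write happens first, so progress survives a server outage.
        (self.save_dir / f"{name}.json").write_text(data)
        try:
            self.upload(name, data)
            return "synced"
        except CloudUnavailable:
            if name not in self.pending:
                self.pending.append(name)
            return "saved_locally"

    def retry_pending(self):
        """Try to upload queued saves again; return how many are still waiting."""
        still_pending = []
        for name in self.pending:
            data = (self.save_dir / f"{name}.json").read_text()
            try:
                self.upload(name, data)
            except CloudUnavailable:
                still_pending.append(name)
        self.pending = still_pending
        return len(still_pending)
```

The design choice here is that a cloud failure degrades the save to "local only" instead of discarding it, and `retry_pending` can be called later, for example once the servers recover.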

Software Taxonomy of Faults

Category Option Rationale
Recurring multiple_organization (a) The software failure incident having happened again at one_organization: The article does not mention an earlier, similar incident at EA, so it is unknown whether the incident recurred within the same organization [17555]. (b) The software failure incident having happened again at multiple_organization: The article compares EA's SimCity launch issues with Activision's messy Diablo III launch, indicating that a similar launch failure had occurred at another organization [17555].
Phase (Design/Operation) design, operation (a) Design: The incident can be attributed to factors introduced during system development. The article attributes the connectivity issues and server instability to problems with the server architecture, which caused bugs and long wait times to enter servers [17555]. The requirement for an Internet connection to play, even in single-player mode, was also a design decision that added to the server load [17555]. (b) Operation: The incident can also be linked to factors introduced during operation of the system. A high influx of new users logging in during launch week produced a heavy server load, and EA had not allocated appropriate resources to handle the surge [17555].
Boundary (Internal/External) within_system (a) The software failure incident in the SimCity case can be categorized as within_system. The article mentions that the connectivity issues and server problems were due to technical issues with the server architecture, bugs, and long wait times to enter servers [17555]. The producer of SimCity acknowledged hitting a number of problems with their server architecture, which led to players encountering bugs and long wait times, indicating that the issues originated from within the system itself. Additionally, the article highlights that the software failure was exacerbated by the heavy server load caused by a high influx of new users logging in to play, further emphasizing internal system issues as the root cause of the failure.
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident in the SimCity game was primarily due to non-human actions, specifically heavy server load caused by a high influx of new users trying to log in to play the game. This led to players encountering bugs, long wait times to enter servers, error messages, random disconnections, and issues with saving to the cloud. The connectivity issues were exacerbated by the requirement for an Internet connection even in single-player mode, which was intended as a safeguard against software piracy [17555]. (b) However, human actions also played a role in the failure incident. The article criticizes Electronic Arts (EA) for not allocating appropriate resources to handle the surge of gamers during the launch week, despite the anticipation built over 10 years for a new SimCity game. The lack of preparation by EA in terms of server capacity and handling the high traffic volume contributed to the software failure incident [17555].
Dimension (Hardware/Software) hardware, software (a) Hardware: Insufficient server capacity contributed to the incident. A high influx of new users trying to log in placed a heavy load on the servers, which led to error messages, random disconnections, long wait times, and problems saving to the cloud; the developers responded by deploying more servers [17555]. Because an Internet connection was required even in single-player mode, every player depended on that capacity [17555]. (b) Software: The server architecture was identified as a problem, causing players to encounter bugs and long wait times to enter servers. The developers worked to resolve these issues by pushing updates and hotfixes and addressing bug reports, and issued server downtime notifications along the way [17555].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident in the SimCity case does not appear to be malicious. The issues reported in the article point to technical problems and server overload from a high influx of new users, not to any intentional attempt to harm the system. The failure was primarily attributed to server architecture problems, bugs, long wait times, and server shutdowns that affected the gameplay experience [17555]. The article describes the challenges faced by the development team and its efforts to resolve the issues, which indicates a lack of malicious intent behind the failure.
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident in the SimCity case can be attributed to poor decisions made by the developers and publishers. The decision to require an always-on Internet connection for a game that traditionally had offline capabilities led to server overload and connectivity issues [17555]. Additionally, the lack of appropriate resource allocation to handle the surge of gamers during the launch week reflects poor planning and decision-making by EA [17555]. The need for more servers and the issues with server architecture point towards poor decisions that contributed to the software failure incident.
Capability (Incompetence/Accidental) development_incompetence, accidental (a) The software failure incident in the SimCity game was partially attributed to development incompetence. The article mentions that the connectivity issues and server problems were a result of hitting a number of problems with the server architecture, leading to players encountering bugs and long wait times to enter servers [17555]. Additionally, it criticizes EA for not allocating appropriate resources to handle the surge of gamers during the launch week, indicating a lack of foresight and planning on the part of the development organization [17555]. (b) The software failure incident in SimCity was also influenced by accidental factors. The heavy server load and connectivity woes were primarily caused by a high influx of new users logging in to play, which was not anticipated adequately by the developers [17555]. Furthermore, the requirement for an always-on Internet connection, even in single-player mode, was highlighted as an absurd decision, indicating unintentional consequences of the design choices made for the game [17555].
Duration temporary (a) The software failure incident in the SimCity case can be considered temporary. The issues experienced by players, such as error messages, random disconnections, bugs, long wait times, and server instability, were acknowledged by the game's Senior Producer as problems with the server architecture and heavy server load caused by a high influx of new users [17555]. The development team was actively working to resolve these issues by deploying more servers, pushing updates, and addressing bug reports. Additionally, temporary measures like disabling non-critical gameplay features were implemented to lessen the load on servers [17555]. These actions indicate that the software failure was not permanent but rather caused by specific circumstances that could be addressed and resolved over time.
Behaviour crash, omission, value, other (a) crash: The game randomly shut down and players were disconnected from the servers, losing progress. Players reported losing significant portions of their cities when the game shut down or when saving to EA's cloud save function failed [17555]. (b) omission: Players could not use the full functionality of the game because of error messages, random disconnections, and the inability to log in. This omission of intended functions frustrated gamers who could not enjoy the game as intended [17555]. (c) timing: The article does not specifically mention timing failures, although the long wait times to enter servers and the wait for updates and hotfixes could be indicative of them [17555]. (d) value: The system did not perform its intended functions correctly. Players encountered bugs, server issues, and loss of progress, which degraded their overall gaming experience [17555]. (e) byzantine: The incident did not exhibit the inconsistent responses and interactions characteristic of a byzantine failure. The primary issues were crashes, omission of functions, timing delays, and value failures [17555]. (f) other: The game required an always-on Internet connection to play, even in single-player mode. Players and critics highlighted this as an issue because previous titles in the series allowed offline play. The requirement was seen as unnecessary and contributed to the server load and connectivity issues experienced by players [17555].
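The Duration rationale above notes that non-critical gameplay features were temporarily disabled to lessen the load on servers. A minimal sketch of that kind of load shedding follows. The article does not say which features were disabled or how, so the feature names, the thresholds, and the `FeatureGate` class are assumptions made for illustration, not a description of EA's implementation.

```python
class FeatureGate:
    """Turn non-critical features off under heavy load and back on once load drops."""

    def __init__(self, non_critical, shed_above=0.8, restore_below=0.6):
        # Hypothetical thresholds, expressed as a fraction of server capacity.
        # Two thresholds (hysteresis) keep features from flapping on and off.
        self.non_critical = set(non_critical)
        self.shed_above = shed_above
        self.restore_below = restore_below
        self.shedding = False

    def update(self, load):
        """Record the current load (0.0 to 1.0) and adjust the shedding state."""
        if load >= self.shed_above:
            self.shedding = True
        elif load <= self.restore_below:
            self.shedding = False
        return self.shedding

    def is_enabled(self, feature):
        """Critical features are always on; non-critical ones are off while shedding."""
        return not (self.shedding and feature in self.non_critical)


# Illustrative feature names only; the article does not list the disabled features.
gate = FeatureGate(non_critical={"leaderboards", "achievements"})
gate.update(0.9)
print(gate.is_enabled("leaderboards"), gate.is_enabled("save_city"))
```

Between the two thresholds the gate keeps its previous state, so a load hovering near the limit does not repeatedly toggle features.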

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence no_consequence (a) The article does not mention any deaths caused by the software failure incident [17555]. The impacts it describes are confined to the game itself: players unable to play and in-game progress lost [17555].
Domain entertainment (a) The failed system belongs to the entertainment industry, specifically the gaming sector. The incident occurred in SimCity, a city management simulation game [17555]. Gamers encountered error messages, random disconnections, bugs, long wait times, server connectivity issues, and loss of progress, which prevented them from enjoying the game as intended [17555]. The incident highlighted the challenges of managing heavy server loads and ensuring a smooth gaming experience for players. It did not involve any industry outside the entertainment sector [17555].

Sources
