Incident: Mac App Store Outage Due to Expired Security Certificate

Published Date: 2015-11-12

Postmortem Analysis
Timeline 1. The software failure incident happened on November 11, 2015 [53273].
System 1. Security certificate used by Apple to prevent piracy [53273] 2. Mac App Store authentication system
Responsible Organization 1. Apple, whose security certificate used to prevent piracy of Mac App Store apps expired, causing the failure for Mac users' apps [53273].
Impacted Organization 1. Mac users [53273]
Software Causes 1. Expiration of the security certificate used by Apple to prevent piracy, which made apps downloaded from the Mac App Store temporarily unavailable [53273].
Non-software Causes 1. The security certificate used by Apple to prevent piracy was allowed to expire without being renewed in advance [53273]. 2. Factors that prolonged the outage for some users: no internet connection with which to verify the new certificate, forgotten passwords or other iCloud login problems, and the need to delete and reinstall every app bought or downloaded from the App Store [53273].
Impacts 1. Applications downloaded from the Mac App Store were temporarily unavailable because of the expired security certificate [53273]. 2. Even after the new certificate was issued, users who could not connect to the internet could not verify it, and users who had forgotten their password or could not log in to iCloud for another reason could not use their downloaded apps until they logged in [53273]. 3. Some users had to delete and reinstall every app they had bought or downloaded from the App Store before the apps could be used, prompting frustration and complaints on Twitter [53273].
Preventions 1. Regular monitoring and proactive renewal of security certificates before their expiration could have prevented the incident [53273]. 2. Implementing a system that allows for seamless transition to a new certificate without disrupting user access to downloaded apps could have mitigated the impact of the expired certificate [53273]. 3. Enhancing the error handling mechanism to provide clearer instructions or guidance to users on how to address issues related to expired certificates or authentication failures could have improved user experience during such incidents [53273].
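The first prevention above, monitoring certificates and renewing them before they expire, can be sketched as a simple date check. This is a minimal illustration, not a description of Apple's tooling: the function names, the 90-day warning window, and the exact expiry timestamp are assumptions (the timestamp is only approximated from the roughly 10 pm UK time mentioned in the report).

```python
from datetime import datetime, timedelta, timezone


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days remaining before the certificate expires (negative once expired)."""
    return (not_after - now).days


def renewal_status(not_after: datetime, now: datetime, warn_days: int = 90) -> str:
    """Classify a certificate as 'expired', 'renew_now', or 'ok'.

    warn_days is a hypothetical policy value: how far ahead of expiry
    a renewal should be triggered.
    """
    if now >= not_after:
        return "expired"
    if not_after - now <= timedelta(days=warn_days):
        return "renew_now"
    return "ok"


# Illustrative expiry time only (assumption): about 10 pm UK time on 2015-11-11.
EXPIRY = datetime(2015, 11, 11, 22, 0, tzinfo=timezone.utc)

for check_date in (
    datetime(2015, 6, 1, tzinfo=timezone.utc),
    datetime(2015, 10, 1, tzinfo=timezone.utc),
    datetime(2015, 11, 12, tzinfo=timezone.utc),
):
    print(
        check_date.date(),
        renewal_status(EXPIRY, check_date),
        days_until_expiry(EXPIRY, check_date),
    )
```

Run on a schedule against every certificate an organization depends on, a check of this kind would flag a certificate well before its expiry date, leaving time to issue and distribute a replacement.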
Fixes 1. Apple issued a new security certificate for the apps, with an expiry date of April 2035 [53273]. 2. Users needed an internet connection so that the new certificate could be verified [53273]. 3. Users needed to log in to iCloud, recovering a forgotten password where necessary, before downloaded apps would run again [53273]. 4. Some users had to delete and reinstall every app bought or downloaded from the App Store to resolve the issue [53273].
References 1. Tweets from affected users and developers such as Graham (@greyham) and Graeme Devine (@zaphodgjd) [53273] 2. Tweet from Paul Haddad (@tapbot_paul), a developer at Tapbots [53273]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) The article reports that the security certificate expired five years after its creation, making applications downloaded from the Mac App Store temporarily unavailable [53273]. It does not describe an earlier occurrence of the same problem at Apple, so recurrence within the organization is inferred rather than reported. (b) Expired security certificates causing software disruptions are not unique to Apple. The article gives no details of similar incidents at other organizations, but certificate expiry is a known issue that can affect any organization using similar security measures.
Phase (Design/Operation) design, operation (a) The software failure incident in the article can be attributed to the design phase. The issue arose from the expiration of a security certificate used by Apple to prevent piracy. The certificate expired, causing applications downloaded from the Mac App Store to become temporarily unavailable. This problem was a result of the system design involving the use of security certificates to validate app authenticity and prevent unauthorized usage [53273]. (b) The software failure incident in the article can also be linked to the operation phase. Users faced difficulties in using the downloaded apps even after Apple issued a new certificate. Some users couldn't connect to the internet to verify the new certificate, while others had trouble logging in to iCloud, preventing them from using the apps. This operational issue impacted users' ability to access and utilize the apps they had downloaded, highlighting challenges in the operation and usability of the system [53273].
Boundary (Internal/External) within_system (a) within_system: The failure originated within the system. The security certificate used by Apple to prevent piracy expired, making applications downloaded from the Mac App Store temporarily unavailable. Authentication problems persisted for some users even after Apple issued a new certificate: they had difficulty verifying it, logging in to iCloud, or accessing their downloaded apps until Apple resolved the issue [53273].
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident occurred due to non-human actions, specifically the expiration of a security certificate used by Apple to prevent piracy. The certificate expired five years after its creation, leading to applications downloaded from the Mac App Store becoming temporarily unavailable [53273]. (b) The software failure incident also involved human actions, as highlighted by a developer at Tapbots who discovered the out-of-date security certificate late on Wednesday night. Additionally, users faced issues such as being unable to verify the new certificate if they couldn't connect to the internet or log in to iCloud, requiring them to delete and reinstall apps [53273].
Dimension (Hardware/Software) software (a) The software failure incident reported in Article 53273 was not attributed to hardware issues but rather to a security certificate expiration that caused trouble for Mac users with their apps. The issue stemmed from the security certificate used by Apple to prevent piracy expiring, leading to apps downloaded from the Mac App Store becoming temporarily unavailable. The root cause was the expiration of the security certificate, which is a software-related factor [53273].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident in this case was non-malicious. The issue stemmed from a security certificate used by Apple to prevent piracy expiring, causing disruptions for Mac users trying to use their downloaded apps from the Mac App Store [53273]. The expiration of the security certificate was not intentional but rather a result of oversight or a technical glitch, leading to inconvenience for users rather than any malicious intent to harm the system.
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident can be attributed to poor decisions. The security certificate used by Apple's Mac App Store expired five years after its creation, with no immediate replacement available [53273]. This lack of proactive renewal or management of the certificate caused widespread problems, inconvenience, and frustration for Mac users. The delay in issuing a new certificate, and the difficulties users then had in verifying it, further show the impact of poor decisions in managing the App Store's security infrastructure.
Capability (Incompetence/Accidental) accidental (a) The software failure incident in Article 53273 was not explicitly attributed to development incompetence. However, the expiration of the security certificate used by Apple to prevent piracy, leading to apps becoming temporarily unavailable and users facing authentication issues, could potentially be linked to oversight or negligence in managing the certificate's expiration date. (b) The software failure incident in Article 53273 was primarily accidental in nature. The issue arose from the accidental expiration of the security certificate used by Apple to validate apps downloaded from the Mac App Store. This accidental expiration caused widespread disruptions for Mac users, leading to authentication problems and the inability to run downloaded apps until a new certificate was issued.
Duration temporary (a) The software failure incident in this case was temporary. The issue arose when the security certificate used by Apple to prevent piracy expired, causing applications downloaded from the Mac App Store to be temporarily unavailable. Apple issued a new certificate to fix the error, but users still faced problems, such as being unable to verify the new certificate if they couldn't connect to the internet or log in to iCloud. Users had to delete and reinstall apps before they could be used, indicating a temporary disruption rather than a permanent failure [53273].
Behaviour crash, omission, timing, value, other (a) crash: The software failure incident in the article can be categorized as a crash. Mac users faced trouble with their apps overnight after a security certificate expired, leading to the applications downloaded from the Mac App Store being temporarily unavailable. Users were unable to use the downloaded apps until they could log in to the service, and some had to delete and reinstall every app they had bought or downloaded from the App Store before they could be used [53273]. (b) omission: The incident also involved omission as part of the software failure behavior. Users were unable to verify the new certificate if they couldn't connect to the internet, and those who had forgotten their password or couldn't log in to iCloud for some other reason were also unable to use the downloaded apps until they could log in to the service [53273]. (c) timing: The timing aspect of the software failure incident is evident as well. The security certificate expired late on Wednesday, causing the issues for Mac users starting from 10 pm UK time. Even after Apple issued a new certificate, users were still faced with problems, indicating a timing-related failure [53273]. (d) value: The software failure incident also involved a value-related failure. Users were unable to use the downloaded apps correctly due to the issues with the security certificate, leading to frustration and the need for some users to delete and reinstall every app they had bought or downloaded from the App Store before they could be used [53273]. (e) byzantine: There is no indication of a byzantine behavior in the software failure incident described in the article. (f) other: The software failure incident also involved the behavior of inconvenience and disruption to users. Users took to Twitter to vent their frustrations, highlighting the impact of the failure on their experience and the inconvenience caused by the need to delete and reinstall apps [53273].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence unknown The consequence of the software failure incident described in the article is mainly inconvenience and frustration for Mac users. Users had trouble with their apps, could not access downloaded apps, had to delete and reinstall apps, and in some cases could not verify the new certificate. The incident led to a customer service nightmare for some users and caused frustration, as highlighted in tweets [53273]. There were no reports of severe consequences such as death, physical harm, impact on basic needs, or significant property loss.
Domain information (a) The software failure incident reported in Article 53273 affected Mac users who were unable to access applications downloaded from the Mac App Store due to an expired security certificate. This incident impacted the production and distribution of information as users were unable to use the downloaded apps until the issue was resolved by Apple [53273].

