Incident: CDN Provider Glitch Causes Major Outage for Financial Institutions

Published Date: 2021-06-17

Postmortem Analysis
Timeline
1. The software failure incident happened on June 17, 2021 [115860].
System
1. Akamai's software [115860]
Responsible Organization
1. Akamai (AKAM.O): the software failure incident was caused by a bug in Akamai's software [115860].
Impacted Organization
1. Australian banks such as the Commonwealth Bank of Australia, Westpac Banking Corp, and Australia and New Zealand Banking Group [115860]
2. U.S. airlines including American Airlines, Southwest Airlines, United Airlines, and Delta Air Lines [115860]
3. Virgin Australia [115860]
Software Causes
1. The failure incident was caused by a bug in Akamai's software [115860].
Non-software Causes
1. None identified. According to Akamai, the outage was caused by a software bug that has since been fixed, not by a cyber-attack or vulnerability [115860].
Impacts
1. Websites of dozens of financial institutions and airlines in Australia and the United States were briefly down, disrupting services [115860].
2. Australian banks experienced server-related glitches, while U.S. airlines such as American Airlines and Southwest Airlines reported an hour-long outage [115860].
3. The outage affected popular websites and internet-based tools, temporarily disrupting online services [115860].
4. The software failure incident resulted in financial losses for companies that rely on the affected CDN services provided by Akamai [115860].
Preventions
1. Thorough software testing before deployment could have caught the bug and prevented the software failure incident [115860].
2. Regular audits and checks of critical internet infrastructure components such as content delivery networks (CDNs) could have identified and addressed potential glitches proactively [115860].
3. Stronger redundancy and failover mechanisms, allowing a quick switch to backup systems when the primary infrastructure fails, could have mitigated the impact of the outage [115860].
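The failover idea in the last prevention item can be illustrated with a minimal client-side sketch. It is not a description of how Akamai or any affected organization operates; the function names and the example hostnames are hypothetical, and a real deployment would more likely handle failover at the DNS or load-balancer level. The sketch only shows the ordering logic: try the primary endpoint first and fall back to the next one on a network error.

```python
from typing import Callable, Iterable, List, Tuple
import urllib.request


def http_get(url: str, timeout: float = 3.0) -> bytes:
    """Fetch a URL with the standard library; network failures raise OSError subclasses."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_get,
) -> Tuple[str, bytes]:
    """Try each URL in order and return (url, body) from the first one that succeeds.

    `urls` is an ordered list of equivalent endpoints (hypothetical example:
    a primary CDN hostname followed by a backup hostname).
    """
    errors: List[str] = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:  # urllib.error.URLError and timeouts derive from OSError
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))
```

Passing the fetch function as a parameter keeps the ordering logic testable without network access: a stub that raises `OSError` for the primary URL is enough to exercise the fallback path.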
Fixes
1. The incident was fixed by correcting the bug in Akamai's software, which, according to an Akamai spokesperson, has already been done [115860].
References
1. Akamai spokesperson [115860]
2. Southwest Airlines spokesperson [115860]
3. Virgin Australia [115860]

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization, multiple_organization (a) The software failure incident having happened again at one_organization: The article describes the Akamai outage as the second major blackout in just over a week caused by a glitch in an important piece of internet infrastructure [Article 115860]. This wording does not by itself establish an earlier incident at Akamai; read together with (b), the earlier blackout appears to be the outage at Fastly. (b) The software failure incident having happened again at multiple_organization: The article notes that the disruption linked to technical issues at Akamai follows an outage at rival Fastly Inc the previous week that affected a number of popular websites [Article 115860]. This suggests that multiple CDN providers, and the organizations that rely on them, experienced similar incidents within a short period.
Phase (Design/Operation) design (a) The software failure incident was attributed to a bug in Akamai's software, which caused the outage affecting websites of financial institutions and airlines in Australia and the United States. The bug was identified as the root cause of the disruption and was subsequently fixed by Akamai. The outage was not a result of a cyber-attack or vulnerability but rather a glitch in an important piece of internet infrastructure [115860]. (b) The operation of the affected websites was impacted by the glitch in Akamai's software, leading to brief outages for many financial institutions and airlines in Australia and the United States. The outage affected the availability of services provided by these companies, causing inconvenience to users accessing their websites. However, the issue was resolved, and the impacted platforms were back online after the bug in Akamai's software was fixed [115860].
Boundary (Internal/External) within_system (a) within_system: The software failure incident was caused by a bug in Akamai's software, which is a contributing factor originating from within the system. The outage was not caused by a cyber-attack or vulnerability but rather by an internal glitch in the important piece of internet infrastructure provided by Akamai [115860]. (b) outside_system: The software failure incident was not attributed to factors originating from outside the system in the articles provided.
Nature (Human/Non-human) non-human_actions (a) The software failure incident was attributed to a bug in Akamai's software, which caused the outage affecting websites of financial institutions and airlines in Australia and the United States. The outage was not caused by a cyber-attack or vulnerability but was due to a glitch in an important piece of internet infrastructure provided by Akamai [115860]. (b) The outage was not caused by human actions but rather by a bug in Akamai's software. The Akamai spokesperson mentioned that the issue was not the result of a cyber-attack or vulnerability but was a technical glitch that has since been fixed [115860].
Dimension (Hardware/Software) software (a) The software failure incident related to the outage of websites of financial institutions and airlines in Australia and the United States was caused by a bug in Akamai's software, which is a content delivery network (CDN) provider. The outage was specifically mentioned to be caused by a bug in Akamai's software that has since been fixed, and it was clarified that the issue was not due to a cyber-attack or vulnerability [115860]. (b) The software failure incident was directly attributed to a bug in Akamai's software, indicating that the contributing factors that originated in software led to the outage experienced by various websites of financial institutions and airlines [115860].
Objective (Malicious/Non-malicious) non-malicious The software failure incident reported in Article 115860 was non-malicious. The outage experienced by websites of financial institutions and airlines in Australia and the United States was caused by a bug in Akamai's software, which has since been fixed. The outage was not attributed to a cyber-attack or vulnerability, as confirmed by an Akamai spokesperson in the article [115860].
Intent (Poor/Accidental Decisions) accidental_decisions (a) The outage of websites of financial institutions and airlines in Australia and the United States is not attributed to poor decisions. The article presents it as an accident caused by a bug in Akamai's software, that is, a technical issue at the CDN provider rather than a cyber-attack or vulnerability [115860].
Capability (Incompetence/Accidental) accidental (a) The article does not mention development incompetence as a contributing factor [115860]. (b) The software failure incident was accidental, caused by a bug in Akamai's software that has since been fixed, and was not caused by a cyber-attack or vulnerability [115860].
Duration temporary The software failure incident reported in the news article was temporary. The outage experienced by websites of financial institutions and airlines in Australia and the United States was caused by a server-related glitch at the content delivery network (CDN) provider Akamai, an important piece of internet infrastructure. The outage lasted for about an hour, affecting services at Australian banks and U.S. airlines such as American Airlines and Southwest Airlines [Article 115860].
Behaviour crash (a) crash: The software failure incident can be categorized as a crash: the affected websites of financial institutions and airlines in Australia and the United States went down and did not perform any of their intended functions, as a result of a bug in Akamai's software, an important piece of internet infrastructure [115860]. (b) omission: The article does not describe the system omitting to perform its intended functions at particular instances. (c) timing: The outage was not due to the system performing its intended functions too late or too early. (d) value: The software failure incident was not related to the system performing its intended functions incorrectly. (e) byzantine: The software failure incident was not characterized by the system behaving erroneously with inconsistent responses and interactions. (f) other: No other behaviour is described; the incident is covered by (a) [115860].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence property, delay, non-human The consequence of the software failure incident described in the articles is mainly related to the impact on services and operations rather than direct harm to individuals. The outage caused by a bug in Akamai's software led to websites of financial institutions and airlines in Australia and the United States going down temporarily. This resulted in disruptions to online services, including banking and flight booking services. The outage affected the ability of users to access these services, but there were no reports of direct harm, deaths, or physical injuries to individuals as a result of the software failure incident [115860].
Domain information, transportation, finance, health, government (a) The software failure incident affected the information industry, as websites of financial institutions and airlines in Australia and the United States were briefly down due to a glitch in an important piece of internet infrastructure [115860]. (h) The finance industry was impacted: server-related glitches at content delivery network (CDN) provider Akamai hampered services at Australian banks. The same passage reports an hour-long outage at U.S. airlines, including American Airlines and Southwest Airlines, which supports the transportation domain [115860]. (j) The article does not describe a direct impact on the health industry. The evidence cited for this entry is that Virgin Australia, an airline, said it was one of many organizations to experience an outage with the Akamai content delivery system and that the situation was resolved, which also supports the transportation domain [115860]. (l) The government sector was impacted: websites of the central bank, as well as those of the Commonwealth Bank of Australia and other financial institutions in Australia, had begun to come back online by late afternoon after the outage caused by a bug in Akamai's software [115860].

Sources