Incident: Federal Reserve's Fedwire Funds and Check Clearing Services Disruption.

Published Date: 2021-02-24

Postmortem Analysis
Timeline:
  1. The software failure incident occurred on February 24, 2021 [110985].
System:
  1. Fedwire Funds
  2. Fedcash
  3. Check clearing services
  4. National Settlement Service
  5. Fedwire Securities Service [110985]
Responsible Organization:
  1. The Federal Reserve, which itself attributed the incident to a Federal Reserve operational error [110985].
Impacted Organization:
  1. The Federal Reserve, whose Fedwire Funds, Fedcash, and some check clearing services were disrupted by the incident [110985].
Software Causes:
  1. The Federal Reserve stated on its website that the cause was a Federal Reserve operational error [110985].
Non-software Causes:
  1. The Federal Reserve attributed the incident to a Federal Reserve operational error [110985].
Impacts:
  1. The incident disrupted more than a dozen critical central bank payment services, including Fedwire Funds, Fedcash, and some check clearing services, which are vital components of the U.S. banking system [110985].
  2. The National Settlement Service and the Fedwire Securities Service, which provides issuance, settlement, and transfer services for Treasuries and other government securities, were also affected [110985].
  3. The Federal Reserve acknowledged that the cause of the disruption was a Federal Reserve operational error, indicating that the failure originated internally [110985].
  4. The affected services carry billions of dollars of payments through the financial system each day, so the disruption interfered with the smooth operation of financial transactions [110985].
  5. Fedwire Funds handles a daily average dollar volume of $3.4 trillion, so its disruption could have caused delays and financial uncertainty for the banks, businesses, and government agencies that rely on it for same-day transactions [110985].
Preventions:
  1. Implementing thorough testing procedures: comprehensive testing, including stress testing and scenario testing, could potentially have identified operational errors or weaknesses in the system before they caused a disruption [110985].
  2. Enhancing redundancy and failover mechanisms: stronger redundancy and failover capabilities could have limited the impact of the operational error and allowed quicker recovery without disrupting critical services [110985].
  3. Continuous monitoring and alert systems: robust monitoring and alerting could have helped detect the issue earlier, enabling prompt intervention and mitigation [110985].
Fixes:
  1. Implementing stricter operational protocols and procedures to prevent operational errors like the one that caused the disruption [110985].
  2. Conducting a thorough review and enhancement of the resilience and recovery mechanisms of the Fedwire and NSS applications to better handle future failures [110985].
References:
  1. Federal Reserve (as the source of the information about the software failure incident) [110985]

Software Taxonomy of Faults

Category | Option | Rationale
Recurring | one_organization | (a) one_organization: The article [110985] reports that the Federal Reserve experienced a software failure incident that disrupted its Fedwire Funds, Fedcash, and check clearing services, and that the incident was attributed to a Federal Reserve operational error. On that basis the incident is recorded as occurring within a single organization, the Federal Reserve. (b) multiple_organization: The article gives no indication that a similar incident has occurred at other organizations or with their products and services.
Phase (Design/Operation) | design | (a) The incident was attributed to a Federal Reserve operational error, which points to contributing factors introduced by system development, system updates, or the procedures used to operate or maintain the system [110985]. The article states that the technical teams determined the cause to be a Federal Reserve operational error, suggesting that the issue may have originated during development or maintenance of the system. (b) The article does not provide specific information indicating that the incident was due to contributing factors introduced by the operation or misuse of the system.
Boundary (Internal/External) | within_system | The incident was attributed to a Federal Reserve operational error, so the cause originated internally, within the Federal Reserve's own operations [110985].
Nature (Human/Non-human) | non-human_actions | The incident was attributed to a Federal Reserve operational error, which this entry classifies as a contributing factor introduced without human participation [110985].
Dimension (Hardware/Software) | software | The incident was attributed to a Federal Reserve operational error, which this entry treats as a contributing factor originating in software rather than hardware. The article specifically states, "Our technical teams have determined that the cause is a Federal Reserve operational error" [110985].
Objective (Malicious/Non-malicious) | non-malicious | The Federal Reserve attributed the disruption to an operational error on its own part, indicating that the failure was caused by an unintentional mistake within the Federal Reserve rather than by malicious intent [110985].
Intent (Poor/Accidental Decisions) | accidental_decisions | The incident was attributed to a Federal Reserve operational error, which suggests a mistake or unintended decision within the Federal Reserve's operational processes [110985].
Capability (Incompetence/Accidental) | development_incompetence | The incident was attributed to a Federal Reserve operational error, which this entry classifies as a contributing factor introduced through a lack of professional competence by humans or the development organization [110985]. The Federal Reserve said its technical teams determined the cause to be a Federal Reserve operational error, suggesting that the failure resulted from a mistake or lack of professional competence during the operation of the systems.
Duration | temporary | The disruption to the Federal Reserve's Fedwire Funds, Fedcash, and check clearing services lasted more than three hours before normal operations resumed [110985]. The failure was therefore temporary: it arose under particular circumstances and was not permanent.
Behaviour | crash, other | (a) crash: Fedwire Funds, Fedcash, and some check clearing services were disrupted for more than three hours, affecting critical central bank payment services [110985]. (b) omission: The article does not specifically attribute the incident to omission. (c) timing: The article gives no indication that the system performed its intended functions too late or too early. (d) value: The incident was not attributed to the system performing its intended functions incorrectly. (e) byzantine: The incident was not described as exhibiting inconsistent responses or interactions. (f) other: The incident was described as a Federal Reserve operational error that led to the disruption of critical central bank payment services [110985].

IoT System Layer

Layer | Option | Rationale
Perception | None | None
Communication | None | None
Application | None | None

Other Details

Category | Option | Rationale
Consequence | property, delay, non-human | The consequences described in the article [110985] were primarily financial impacts and delays. The disruption affected critical central bank payment services, including Fedwire Funds, Fedcash, check clearing services, the National Settlement Service, and the Fedwire Securities Service, delaying transactions and financial operations. The article notes that the affected services facilitate check clearing and allow billions of dollars of payments to flow daily through the financial system, and that the Federal Reserve attributed the disruption to a Federal Reserve operational error.
Domain | finance | The failed system was intended to support the finance industry. The article describes the disrupted Fedwire Funds, Fedcash, and check clearing services as critical central bank payment services that form the backbone of the U.S. banking system [110985]. These services facilitate check clearing and allow billions of dollars of payments to flow daily through the financial system. Fedwire Funds in particular is described as the premier electronic funds-transfer service, relied upon by banks, businesses, and government agencies for same-day transactions, with a significant daily average dollar volume [110985].

Sources
