Incident Details

Incident: Dexcom G6 Service Outage Leaves Diabetics at Risk

Published Date: 2019-12-02

Postmortem Analysis
Timeline	1. Dexcom's service outage began around midnight on Friday [92979]. 2. The article was published on 2019-12-02. 3. Counting back from the publication date, the incident likely began on November 29, 2019.
System	1. Dexcom G6 continuous glucose monitor system [92979]
Responsible Organization	1. Dexcom, whose servers unexpectedly became overloaded, causing the software failure incident [92979].
Impacted Organization	1. Parents of children with diabetes relying on the Dexcom G6 continuous glucose monitor [92979] 2. Patients with diabetes using the Dexcom G6 device for glucose monitoring [92979] 3. Caregivers and loved ones remotely monitoring diabetic individuals using the Dexcom G6 device [92979]
Software Causes	1. Dexcom's servers unexpectedly became overloaded, causing a service outage for the Dexcom G6 continuous glucose monitor [92979].
Non-software Causes	1. Overloaded servers: The outage occurred because the company's servers unexpectedly became overloaded, affecting a large portion of Dexcom's users [92979]. 2. Lack of timely communication: Dexcom did not announce the outage until several hours after it began, leading to frustration among users who were not notified sooner [92979].
Impacts	1. Many parents relying on the Dexcom G6 continuous glucose monitor were left in the dark due to the service outage, causing them to scramble to ensure their children's safety [92979]. 2. The outage led to a situation where a child's blood sugar plummeted to dangerously low levels while he was asleep, and no alert was sent out, putting his life at risk [92979]. 3. Users who depended on the technology were left without alerts during nighttime emergencies, highlighting the risks associated with relying solely on such devices [92979]. 4. The outage caused frustration among users, with many expressing disappointment and anger towards Dexcom for not notifying them sooner about the issue [92979]. 5. Some parents had to revert to manual monitoring methods, such as setting alarms to check on their children every few hours, due to the failure of the Dexcom device [92979].
Preventions	1. Implementing better server capacity planning and load balancing mechanisms could have prevented the software failure incident [92979]. 2. Conducting more thorough testing and monitoring of the system to catch potential issues before they impact users could have helped prevent the outage [92979]. 3. Improving communication and notification processes to alert users promptly in case of service disruptions could have mitigated the impact of the software failure incident [92979].
Fixes	1. Implementing better server capacity planning and load balancing mechanisms to prevent server overload incidents like the one experienced by Dexcom [92979]. 2. Enhancing monitoring and alert systems within the software to promptly notify users of any service outages or disruptions [92979]. 3. Conducting regular system checks and testing to identify and address potential vulnerabilities or issues before they impact users [92979].
References	1. Dexcom users affected by the outage, including parents of children with diabetes like Virginia Coleman-Prisco and Heath Smith, and other individuals like Mark Turner [92979]. 2. Dexcom executives Jake Leach (chief technology officer) and Kevin Sayer (chief executive) [92979].
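
The notification gap described under Preventions and Fixes can be illustrated with a client-side staleness check: if no fresh reading has arrived within a set window, the follower's app raises a local warning instead of staying silent. The sketch below is a minimal illustration only. The class name, the method names, and the 600-second default are assumptions made for this example and do not describe Dexcom's software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StalenessWatchdog:
    """Hypothetical helper: flags when readings stop arriving.

    max_gap_seconds is an assumed threshold, not a value from the article.
    """

    max_gap_seconds: float = 600.0
    last_seen: Optional[float] = None  # timestamp of the latest reading

    def record_reading(self, timestamp: float) -> None:
        # Called whenever a reading is successfully received.
        self.last_seen = timestamp

    def is_stale(self, now: float) -> bool:
        # No reading at all counts as stale, so a silent outage is surfaced.
        if self.last_seen is None:
            return True
        return (now - self.last_seen) > self.max_gap_seconds


watchdog = StalenessWatchdog(max_gap_seconds=600.0)
watchdog.record_reading(timestamp=100.0)
if watchdog.is_stale(now=1000.0):
    print("No recent readings: check the monitor manually")
```

Because the check runs on the follower's device, it does not depend on the overloaded service being able to announce its own outage.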

Software Taxonomy of Faults

Category	Option	Rationale
Recurring	one_organization	(a) The software failure incident having happened again at one_organization: The article mentions that Dexcom experienced a similar outage less than a year ago, on Dec. 31, which it resolved within a day [92979]. This indicates that Dexcom has faced similar software failure incidents in the past. (b) The software failure incident having happened again at multiple_organization: There is no specific mention in the article about similar incidents happening at other organizations or with their products and services.
Phase (Design/Operation)	design, operation	(a) The software failure incident related to the design phase can be seen in the article where it mentions that the outage occurred because the company's servers unexpectedly became overloaded, leading to the service outage [92979]. This indicates that there may have been issues related to the system design or capacity planning that contributed to the failure. (b) The software failure incident related to the operation phase is evident in the article where users were relying on the Dexcom Follow feature to alert them to nighttime emergencies, but the service went down without warning, impacting the operation and use of the system [92979]. This highlights a failure in the operation or functioning of the system that affected users' ability to monitor critical information.
Boundary (Internal/External)	within_system	(a) within_system: The software failure incident with Dexcom's G6 continuous glucose monitor was due to internal factors within the system. The outage occurred because Dexcom's servers unexpectedly became overloaded, leading to the service disruption [92979]. Dexcom's chief technology officer mentioned that a "large portion" of Dexcom's users were affected by the outage, indicating an internal issue within the system [92979]. (b) outside_system: There is no specific mention in the article about the software failure incident being caused by contributing factors originating from outside the system.
Nature (Human/Non-human)	non-human_actions	(a) The software failure incident in Article 92979 occurred due to non-human actions. Specifically, the Dexcom service outage was attributed to the company's servers unexpectedly becoming overloaded, leading to the disruption in service affecting a large portion of users [92979].
Dimension (Hardware/Software)	hardware, software	(a) The software failure incident related to hardware: - The outage of the Dexcom Follow service was due to the company's servers unexpectedly becoming overloaded, which is a hardware-related issue [92979]. (b) The software failure incident related to software: - The Dexcom Follow service outage was primarily caused by the company's servers becoming overloaded, indicating a software-related issue in managing server capacity and performance [92979].
Objective (Malicious/Non-malicious)	non-malicious	(a) The software failure incident related to the Dexcom G6 continuous glucose monitor was non-malicious. The outage occurred because the company's servers unexpectedly became overloaded, leading to the service disruption. Dexcom's chief technology officer, Jake Leach, mentioned that it was not the first time the service went dark, indicating that it was a technical issue rather than a malicious attack [92979]. (b) The software failure incident was non-malicious as there is no indication in the article that the outage was caused by any malicious intent. The chief executive of Dexcom, Kevin Sayer, expressed regret for letting users down and not alerting them to the outage sooner, emphasizing the company's commitment to fixing the disruption and improving user interactions [92979].
Intent (Poor/Accidental Decisions)	poor_decisions, accidental_decisions	The software failure incident related to the Dexcom G6 continuous glucose monitor outage can be attributed to both poor decisions and accidental decisions. 1. Poor Decisions: The incident can be linked to poor decisions in terms of communication and response strategy by Dexcom. Users expressed frustration that Dexcom did not announce the outage until several hours after it began, leading to a lack of awareness and preparedness among affected individuals [92979]. 2. Accidental Decisions: The outage itself was described as unexpected, with Dexcom's chief technology officer mentioning that the servers became overloaded unexpectedly, leading to the service disruption [92979]. This indicates that the outage was not a deliberate action but rather an unintended consequence of server overload.
Capability (Incompetence/Accidental)	development_incompetence, accidental	(a) The software failure incident related to development incompetence is evident in the article as Dexcom's chief technology officer, Jake Leach, mentioned that the outage occurred because the company's servers unexpectedly became overloaded. This indicates a lack of proper capacity planning or scalability measures in the system's design and implementation, which can be attributed to development incompetence [92979]. (b) The software failure incident also appears to have an accidental aspect as Dexcom's chief executive, Kevin Sayer, expressed regret for letting users down and not alerting them to the outage sooner. This indicates that the failure was not intentional but rather an unintended consequence of the system's performance issues [92979].
Duration	temporary	(a) The software failure incident described in the article was temporary. The Dexcom G6 continuous glucose monitor suffered a service outage, leaving thousands of people without critical information for a period of time. The affected service, Dexcom Follow, was partly restored by Monday morning [92979]. This indicates that the failure was not permanent but rather a temporary disruption in service.
Behaviour	crash, omission, timing, value, other	(a) crash: The software failure incident in the Dexcom G6 continuous glucose monitor case can be categorized as a crash. The service outage caused the system to lose its state and not perform its intended function of sending alerts to users about their glucose levels, leading to potentially life-threatening situations [92979]. (b) omission: The software failure incident can also be classified as an omission. Users reported that the system omitted to perform its intended function of sending alerts when glucose levels required urgent action, resulting in instances where users did not receive critical notifications [92979]. (c) timing: The timing of the software failure incident is also relevant. The system was not performing its intended functions at the right time, as users were relying on it to alert them to nighttime emergencies, but the outage occurred during that critical period, causing delays in receiving necessary information [92979]. (d) value: The software failure incident can be associated with a failure in value. The system was not performing its intended functions correctly, as users did not receive accurate alerts about their glucose levels, leading to situations where individuals had dangerously low blood sugar levels without being notified [92979]. (e) byzantine: The software failure incident does not align with a byzantine failure, which involves inconsistent responses and interactions. In this case, the system's failure was more about losing its state and omitting critical functions rather than providing inconsistent or conflicting information [92979]. (f) other: The other behavior observed in this software failure incident is the lack of timely communication and transparency from the company regarding the outage. Users expressed frustration over the delayed notification about the issue, highlighting the importance of clear and prompt communication during such incidents to manage user expectations and safety concerns [92979].
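
The capacity-planning concern raised in the Phase (Design/Operation) row can be illustrated with load shedding that reserves headroom for urgent traffic, so that an overloaded server drops routine requests before it drops alerts. This is a minimal sketch under stated assumptions: the article does not describe Dexcom's server architecture, and the class name, parameters, and the split between urgent and routine requests are all hypothetical.

```python
class PriorityLoadShedder:
    """Hypothetical admission control that keeps slots free for urgent alerts."""

    def __init__(self, capacity: int, reserved_for_urgent: int) -> None:
        if not 0 <= reserved_for_urgent <= capacity:
            raise ValueError("reserved_for_urgent must be between 0 and capacity")
        self.capacity = capacity                        # total concurrent requests allowed
        self.reserved_for_urgent = reserved_for_urgent  # slots only urgent traffic may use
        self.in_flight = 0                              # requests currently being served

    def try_admit(self, urgent: bool) -> bool:
        # Routine requests see a lower ceiling, leaving the reserve untouched.
        limit = self.capacity if urgent else self.capacity - self.reserved_for_urgent
        if self.in_flight >= limit:
            return False  # shed this request instead of overloading the server
        self.in_flight += 1
        return True

    def release(self) -> None:
        # Called when a request finishes, freeing its slot.
        if self.in_flight > 0:
            self.in_flight -= 1


shedder = PriorityLoadShedder(capacity=3, reserved_for_urgent=1)
print(shedder.try_admit(urgent=False))  # routine request, admitted while below the routine ceiling
```

Rejecting routine requests early is one common way to keep a service partially available under overload; whether it would have applied to this incident cannot be determined from the article.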

IoT System Layer

Layer	Option	Rationale
Perception	sensor	(a) The failure was related to the sensor: The software failure incident with the Dexcom G6 continuous glucose monitor was primarily due to a service outage that affected the sensor's ability to transmit glucose readings wirelessly to smartphones or receivers. This outage left thousands of users, including parents of children with diabetes, without critical information about their children's glucose levels, leading to potentially dangerous situations [92979].
Communication	connectivity_level	The software failure incident reported in Article 92979 was related to the connectivity level of the cyber physical system. The failure was attributed to Dexcom's servers unexpectedly becoming overloaded, leading to the service outage affecting a large portion of users [92979]. The outage was not due to issues with the physical layer (link level) but rather with the network or transport layer, causing disruptions in communication between the Dexcom devices and the users' smartphones or receivers.
Application	TRUE	The software failure incident reported in the article was related to the application layer of the cyber physical system. The Dexcom G6 continuous glucose monitor's service outage, which left thousands of people without critical information, was attributed to the company's servers unexpectedly becoming overloaded, leading to the failure [92979]. This overload of servers causing the outage aligns with the definition of an application layer failure due to contributing factors introduced by system errors.

Other Details

Category	Option	Rationale
Consequence	harm, delay	(a) death: No deaths were reported. The article mentions only a parent's concern that their son could have died in his sleep due to the software failure incident with the Dexcom G6 continuous glucose monitor [92979]. (b) harm: People were put at risk of physical harm by the software failure. In one specific incident, a child's blood sugar plummeted to dangerously low levels while he was asleep, and no alert was sent out. The child had to be awakened and given glucose to bring his levels back to normal, indicating that physical harm could have occurred [92979]. (e) delay: People had to postpone activities due to the software failure. Parents had to scramble to ensure their children's safety after waking up to the news of the incident, disrupting their normal routines and activities [92979].
Domain	health	(a) The software failure incident reported in Article 92979 is related to the health industry. The Dexcom G6 continuous glucose monitor, which suffered a service outage, is a critical device for parents of children with diabetes as it tracks glucose levels and sends alerts when blood sugar levels are too high or too low, enabling quick corrective actions to be taken [92979]. The outage left thousands of people who rely on the device for critical information in the dark, causing distress and potential risks to the health and safety of individuals, especially children with diabetes [92979]. The Dexcom G6 device is a significant technological advancement in diabetes care, allowing patients to easily track their glucose levels without the need for frequent blood tests [92979].

Sources

In Weekend Outage, Diabetes Monitors Fail to Send Crucial Alerts (Published 2019) - The New York Times - Published on: 2019-12-02
Article ID: 92979
