Incident: Excel Autoformatting Errors in Genetics Research Papers.

Published Date: 2016-08-26

Postmortem Analysis
Timeline 1. The errors that Microsoft Excel introduced into genetics papers were reported in an article published on 2016-08-26 [46877]. The report therefore dates to around August 2016, although the researchers noted that the problem was first identified in a paper published more than a decade earlier [46877].
System 1. Microsoft Excel: its automatic date formatting feature caused errors in gene lists in scientific papers [46877].
Responsible Organization 1. Microsoft, the developer of Excel [46877]
Impacted Organization 1. Genetics researchers who published papers with errors in gene lists due to Excel autoformatting [46877]
Software Causes 1. Errors introduced by Microsoft Excel due to automatic conversion of gene names to dates or random numbers [46877]
Non-software Causes 1. Human error: The incident was caused by researchers forgetting to manually format columns to "Text" before entering gene names in Excel, leading to automatic conversion of gene names to dates or numbers [46877].
Impacts 1. Nearly 1 in 5 genetics papers contained gene-list errors introduced when Microsoft Excel automatically converted gene names to dates or random numbers, undermining the accuracy of the research findings [46877].
Preventions 1. Implementing a thorough data validation process to catch and correct any errors introduced by Excel's autoformatting feature [46877]. 2. Using alternative spreadsheet programs like Google Sheets that do not have the same autoformatting issues as Excel [46877]. 3. Transitioning to specialized programs and languages designed for statistical research, such as R and Python, to avoid reliance on general-purpose tools like Excel [46877].
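The data validation step in prevention 1 could be sketched roughly as follows. This is a minimal illustration, not a method described in the article: the function name `find_suspect_gene_names`, the two regular expressions, and the sample column are all assumptions, and the patterns only cover the most obvious date-like values (such as "2-Sep", the form SEPT2 can take after conversion) and scientific-notation values.

```python
import re

# Illustrative patterns (assumptions): they are not an exhaustive list of
# everything a spreadsheet might turn a gene symbol into.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(
    rf"^(\d{{1,2}}-({MONTHS})|({MONTHS})-\d{{1,2}})$", re.IGNORECASE
)
SCI_NOTATION = re.compile(r"^\d+(\.\d+)?E[+-]?\d+$", re.IGNORECASE)


def find_suspect_gene_names(values):
    """Return (index, value) pairs that look like dates or scientific notation."""
    suspects = []
    for i, raw in enumerate(values):
        value = raw.strip()
        if DATE_LIKE.match(value) or SCI_NOTATION.match(value):
            suspects.append((i, value))
    return suspects


# Hypothetical gene column as it might appear in a supplementary file.
gene_column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2.31E+13", "DEC1"]
print(find_suspect_gene_names(gene_column))
# prints [(1, '2-Sep'), (3, '1-Mar'), (4, '2.31E+13')]
```

A check like this only flags candidates for manual review; it cannot recover the original gene symbol once the conversion has happened.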
Fixes 1. Researchers and journal editors should remain vigilant when working with data files and manually format columns as "Text" before typing gene names into Excel [46877]. 2. Consider abandoning Excel completely in favor of programs and languages built for statistical research, such as R and Python [46877].
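As a rough illustration of fix 2, the sketch below reads a small gene table with Python's standard `csv` module, which returns every field as a string, so gene symbols are never reinterpreted as dates. The table contents, the column names `gene` and `log2_fold_change`, and the helper `load_gene_table` are hypothetical; only the numeric column is converted, and that conversion is explicit.

```python
import csv
import io

# Hypothetical supplementary-table contents; the values are illustrative only.
raw_table = "gene,log2_fold_change\nSEPT2,1.8\nMARCH1,-0.7\nTP53,2.4\n"


def load_gene_table(text):
    """Parse CSV text, keeping gene symbols as strings.

    Only the numeric column is converted, and only because we ask for it.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append((record["gene"], float(record["log2_fold_change"])))
    return rows


table = load_gene_table(raw_table)
print([gene for gene, _ in table])
# prints ['SEPT2', 'MARCH1', 'TP53']
```

The design point is that type conversion happens only where the analyst requests it, which is the opposite of a spreadsheet guessing a type for each cell.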
References 1. Australian researchers 2. Scientific journals like Nature, Science, and PLoS One 3. Harvard economists Carmen Reinhart and Kenneth Rogoff

Software Taxonomy of Faults

Category Option Rationale
Recurring one_organization (a) The software failure incident related to errors introduced by Microsoft Excel in scientific papers has happened again within the same organization or with its products and services. The Australian researchers noted that this problem was first identified in a paper published more than a decade ago, yet they found that these errors continue to persist in supplementary files in the scientific literature [46877]. This indicates that the issue of Excel automatically converting gene names to dates or random numbers leading to errors in genetics papers has not been fully resolved within the scientific community.
Phase (Design/Operation) design, operation (a) The software failure incident related to the design phase is evident in the article. The incident occurred due to errors introduced by Microsoft Excel during the development and data handling processes in scientific research. Specifically, the issue arose from Excel automatically converting gene names to dates or random numbers, leading to errors in genetics papers. This design flaw in Excel's autoformatting feature caused significant problems for researchers working with gene symbols and data manipulation [46877]. (b) The software failure incident related to the operation phase is also highlighted in the article. Researchers and genetics scientists faced challenges during the operation of Excel when handling gene lists and data files. The errors introduced by Excel's automatic date formatting impacted the accuracy of gene names in scientific papers, ultimately affecting the operation and reliability of the research data [46877].
Boundary (Internal/External) within_system, outside_system The software failure incident related to errors introduced by Microsoft Excel in genetics papers can be categorized as both within_system and outside_system. (a) within_system: The errors in the genetics papers were caused by Excel automatically converting gene names to dates or random numbers within the system. This issue originates from within the system itself, as Excel's autoformatting feature led to the incorrect conversion of gene symbols to date formats, impacting the accuracy of the data in the papers [46877]. (b) outside_system: On the other hand, the impact of Excel's autoformatting issues on the genetics papers can also be considered as originating from outside the system. While the errors occurred within the software system (Excel), the root cause of the problem lies in the design and functionality of Excel itself, which was not specifically tailored for scientific data entry requirements. This external factor of Excel's design flaw contributed to the software failure incident in the genetics research field [46877].
Nature (Human/Non-human) non-human_actions, human_actions (a) The software failure incident in the field of genetics, where errors were introduced in scientific papers due to Microsoft Excel automatically converting gene names to dates or random numbers, is an example of a failure due to contributing factors introduced without human participation [46877]. The automatic conversion of gene names by Excel led to errors in gene lists in approximately 1 in 5 genetics papers, highlighting how software behavior can lead to unintended consequences without direct human involvement. (b) On the other hand, the same incident also showcases a failure due to contributing factors introduced by human actions [46877]. The errors in gene lists were ultimately a result of researchers typing shortened gene names into Excel cells, which triggered the automatic date formatting feature of Excel. Despite the researchers' best intentions, the human action of inputting data into Excel cells inadvertently led to the software failure incident.
Dimension (Hardware/Software) software (a) The software failure incident occurring due to hardware: - The article does not mention any software failure incident occurring due to contributing factors originating in hardware. Therefore, it is unknown. (b) The software failure incident occurring due to software: - The software failure incident discussed in the article is due to errors introduced by Microsoft Excel's automatic conversion of gene names to dates or random numbers, leading to inaccuracies in genetics papers [46877].
Objective (Malicious/Non-malicious) non-malicious (a) The software failure incident discussed in the article is non-malicious. The errors in genetics papers arose because Excel automatically converted gene names to dates or random numbers; there was no intent to harm the system or the integrity of the data [46877].
Intent (Poor/Accidental Decisions) poor_decisions (a) The software failure incident related to the genetics papers containing errors introduced by Microsoft Excel can be categorized under poor_decisions. The incident was a result of Excel's automatic conversion of gene names to dates or random numbers, which was a design decision made by Microsoft that led to errors in scientific papers [46877]. Researchers highlighted that this issue was first identified more than a decade ago, indicating a persistent problem stemming from the initial decision to have Excel automatically format gene names [46877].
Capability (Incompetence/Accidental) development_incompetence, accidental (a) The software failure incident in the field of genetics, where errors were introduced by Microsoft Excel, can be attributed to development incompetence. The errors in gene lists were caused by Excel automatically converting gene names to calendar dates or random numbers, leading to inaccuracies in scientific papers [46877]. This issue highlights the lack of professional competence in handling data and software tools effectively within the scientific community. (b) The software failure incident related to Excel automatically converting gene names to dates or numbers can also be categorized as an accidental failure. Researchers often unintentionally input gene symbols into Excel, triggering the automatic formatting that leads to errors in the data. Despite efforts to correct the formatting to prevent errors, the issue persists due to the inherent behavior of Excel, indicating an accidental introduction of errors in scientific research [46877].
Duration temporary The software failure incident related to errors introduced by Microsoft Excel in genetics papers is temporary. The errors were due to Excel automatically converting gene names to dates or random numbers, which was a result of specific circumstances such as typing shortened gene names into Excel cells. The issue was not permanent as it could be prevented by manually formatting columns to "Text" before typing in new Excel sheets [46877].
Behaviour value (a) crash: The software failure incident related to the Excel autoformatting issue did not involve a crash where the system loses state and does not perform any of its intended functions. Instead, the issue was related to Excel automatically converting gene names to dates or random numbers, leading to errors in scientific papers [46877]. (b) omission: The incident did not involve the system omitting to perform its intended functions at an instance(s). Rather, the problem stemmed from Excel's automatic conversion of gene names to dates or numbers, introducing errors in genetics papers [46877]. (c) timing: The failure was not related to the system performing its intended functions correctly but too late or too early. The issue was with Excel misinterpreting gene names as dates or numbers, causing errors in genetics papers [46877]. (d) value: The software failure incident was related to the system performing its intended functions incorrectly. Excel's autoformatting feature led to errors in genetics papers by converting gene names to dates or random numbers, impacting the accuracy of the research [46877]. (e) byzantine: The incident did not involve the system behaving erroneously with inconsistent responses and interactions. Instead, the issue was with Excel's automatic formatting causing gene names to be misinterpreted, leading to errors in scientific papers [46877]. (f) other: The software failure incident involved Excel's autoformatting feature causing gene names to be automatically converted to dates or random numbers, which led to errors in genetics papers. Researchers found that this issue persisted despite being identified over a decade ago, highlighting a persistent problem in scientific literature caused by spreadsheet errors [46877].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence theoretical_consequence The consequence of the Excel errors in genetics papers falls primarily under (h) theoretical_consequence. The article discusses the potential consequences of these errors, such as undermining research integrity, causing inaccuracies in data analysis, and potentially leading to flawed conclusions. There were no reports of deaths, physical harm, impact on basic needs, or property damage resulting from the software failure; only the theoretical consequences of using Excel incorrectly in scientific research were highlighted [46877].
Domain information, knowledge (a) The software failure incident reported in the article falls under the information domain. It involved errors that Microsoft Excel introduced into scientific papers in the field of genetics, affecting the accuracy of the gene lists used in those papers [46877].
