Incident: WeChat Machine Translation Error Producing a Racial Slur

Published Date: 2017-10-12

Postmortem Analysis
Timeline
1. The incident of WeChat translating a neutral Chinese phrase into the n-word occurred in October 2017, as reported in [Article 63905] and [Article 64493].
System
1. WeChat's translation engine, a neural network-based service [63905, 64493]
2. The AI and machine learning system used by WeChat to train the translation engine [63905, 64493]
Responsible Organization
1. WeChat, whose neural network-based translation engine caused the failure [63905, 64493].
Impacted Organization
1. WeChat users, including Shanghai-based theatre producer and actor Ann James, a black American [63905, 64493].
2. Google users were affected by similar machine translation issues, as mentioned in the articles [63905].
Software Causes
1. An error in the artificial intelligence software WeChat uses for translation, which rendered a neutral Chinese phrase as the n-word [63905, 64493].
2. Biases and errors present in the data sources on which the neural network-based translation engine was trained, leading to inappropriate translations [63905].
3. The machine learning system may have been trained on a corpus of Chinese and English text containing racial slurs and stereotypical descriptions of black people [63905].
4. A lack of human oversight in the AI and machine learning system, which allowed incorrect and offensive translations to reach users [64493].
Non-software Causes
1. Cultural biases and stereotypes: the translation engine incorporated biases and errors from the data sources on which it was trained, leading to inappropriate translations [63905].
2. Lack of human oversight: using AI and machine learning without sufficient human oversight allowed incorrect and offensive translations to be generated [64493].
Impacts
1. WeChat's translation error rendered a neutral Chinese phrase as a racial slur, specifically the n-word, causing offense and outrage among users, especially the black community [63905, 64493].
2. The incident highlighted the biases and errors that can be incorporated into machine learning systems, emphasizing the need to continuously tweak and improve such systems for accuracy and appropriateness [63905].
3. WeChat's apology and subsequent fix demonstrated a swift response to user feedback, indicating a commitment to rectifying the issue and preventing recurrences [63905, 64493].
4. The incident shed light on the challenges automated translation engines face in learning and optimizing their vocabulary banks, underscoring the ongoing work of improving translation quality and user experience [63905].
5. The incident also raised broader concerns about racial insensitivity in technology, as seen in other cases such as Google's translation product making sexist assumptions and labeling photos inappropriately, prompting the need for greater awareness and oversight in AI and machine learning applications [63905].
Preventions
1. Implementing stricter data filtering and preprocessing to remove biased or offensive language from the training data used by the machine translation system [63905, 64493].
2. Thoroughly testing and validating the system with diverse sets of input data to identify and rectify potentially offensive translations before deployment [63905, 64493].
3. Incorporating human oversight and review mechanisms into the translation process to catch and correct inappropriate translations in real time [64493].
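The first prevention above, filtering offensive terms out of the training corpus, can be sketched as a simple corpus screen. This is a minimal illustration, not WeChat's actual pipeline; the blocklist entries and corpus format are hypothetical placeholders:

```python
# Sketch: screen a parallel training corpus for blocklisted terms before
# a translation model is trained on it. Blocklist entries are placeholders.

BLOCKLIST = {"offensive_term_1", "offensive_term_2"}

def is_clean(sentence, blocklist=BLOCKLIST):
    """Return True if no blocklisted term appears as a word in the sentence."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    return words.isdisjoint(blocklist)

def filter_corpus(pairs, blocklist=BLOCKLIST):
    """Keep only (source, target) pairs whose target side passes the screen."""
    return [(src, tgt) for src, tgt in pairs if is_clean(tgt, blocklist)]

corpus = [
    ("你好", "hello"),
    ("黑老外", "offensive_term_1 person"),  # a pair that would teach the slur
]
print(filter_corpus(corpus))  # only the clean pair survives
```

A real system would need a curated, multilingual blocklist and tokenization appropriate to each language; the word-splitting here is deliberately simplistic.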
Fixes
1. Implement stricter data filtering and preprocessing to remove biased and offensive language from the training data used by the machine translation system [63905, 64493].
2. Incorporate human oversight and review mechanisms into the AI and machine learning pipeline to catch and correct inappropriate translations before they reach users [64493].
3. Conduct regular audits and tests of the translation software to identify and address issues or biases in its output [63905].
4. Enhance user feedback mechanisms so that translation errors reported by users can be identified and rectified quickly [63905, 64493].
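The human-oversight fix above can be sketched as a runtime gate: translated output is checked against a blocklist before display, and flagged outputs fall back to the untranslated text while being queued for human review. This is an illustrative design under assumed names (`TranslationGate`, the blocklist contents), not the mechanism WeChat describes:

```python
# Sketch: a runtime safeguard that checks machine-translated output against a
# blocklist before showing it to users. Flagged outputs are withheld (the
# original text is shown instead) and queued for a human reviewer.

from dataclasses import dataclass, field

BLOCKLIST = {"offensive_term"}  # placeholder; a real list would be curated

@dataclass
class TranslationGate:
    review_queue: list = field(default_factory=list)

    def check(self, source, translation):
        words = {w.strip(".,!?").lower() for w in translation.split()}
        if words & BLOCKLIST:
            # Withhold the offensive output and escalate to a human reviewer.
            self.review_queue.append((source, translation))
            return source  # safe fallback: show the untranslated text
        return translation

gate = TranslationGate()
print(gate.check("你好", "hello"))             # clean output passes through
print(gate.check("黑老外", "offensive_term"))  # flagged: original shown instead
print(len(gate.review_queue))                  # one item queued for review
```

The design choice here is fail-safe behavior: when in doubt, show the original text rather than a potentially offensive machine output.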
References
1. WeChat spokesperson, quoted in both articles [63905, 64493]
2. Ann James, the American living in Shanghai who first noticed the issue [63905, 64493]
3. That's Shanghai, a local news site whose tests revealed the translation issues [63905]
4. The Guardian, a news outlet whose tests showed the translation software had been retooled [64493]

Software Taxonomy of Faults

Category Option Rationale
Recurring: one_organization, multiple_organization
(a) The failure mode of inappropriate translations using racial slurs has occurred at WeChat itself: its AI error translated a neutral Chinese phrase into the n-word, as reported by Shanghai-based theatre producer and actor Ann James, a black American [63905].
(b) The problem is not unique to WeChat. Google's translation product made sexist assumptions when translating gender-neutral Turkish sentences, and Google had previously labeled a photo of two black people as "gorillas" [63905].
Phase (Design/Operation): design, operation
(a) The failure can be attributed to the design phase: the AI translation engine was trained on biased data sources containing racial slurs and stereotypical descriptions of black people, and this bias in the training data produced the inappropriate translations [63905, 64493].
(b) Operation-related factors were also involved: users could trigger the error by using the phrase "hei laowai" in specific negative contexts, so the way the feature was used influenced whether offensive translations were generated [63905, 64493].
Boundary (Internal/External): within_system
(a) within_system:
- The failure was primarily attributed to the machine translation system itself: the neural network-based engine incorporated biases and errors from its training data sources, leading to inappropriate translations such as the n-word [63905, 64493].
- WeChat acknowledged the issue and noted that its automated translation engine is still undergoing learning and optimization to improve translation quality [63905].
- The company uses AI and machine learning to train the translation system, and the lack of human oversight contributed to incorrect and offensive translations [64493].
(b) outside_system:
- The articles do not explicitly link the failure to factors outside the system; the focus is on internal issues within the machine translation system [63905, 64493].
- While the incident touched on cultural sensitivity and racial biases, particularly Chinese perceptions of race, no external factor was cited as a cause of the failure [64493].
Nature (Human/Non-human): non-human_actions, human_actions
(a) Non-human actions:
- The failure was attributed to a neural network-based service that incorporated biases and errors from the data sources on which it was trained [63905].
- WeChat noted that its automated translation engine is still undergoing learning and optimization, indicating the issue stemmed from the system's training rather than direct human actions [63905].
- After the incident, the translation software was retooled and no longer produced racial slurs, suggesting the correction was made through adjustments to the system [64493].
(b) Human actions:
- The incident was first noticed by Ann James, an American living in Shanghai, who used WeChat's built-in translation feature and discovered the inappropriate translation [64493].
- WeChat's spokesperson said the problem was fixed immediately after receiving user feedback, indicating that human intervention was required to address the issue [63905].
- The system removes human oversight, which can allow incorrect and offensive words to be used, implying that this design choice contributed to the incident [64493].
Dimension (Hardware/Software): software
(a) The failure is primarily attributed to software: the incorrect translation of a neutral Chinese phrase into a racial slur resulted from an error in the artificial intelligence software used for translation [63905, 64493]. The machine learning system had been trained on data sources containing biases and errors, and WeChat acknowledged that the engine is still undergoing learning and optimization to improve accuracy [63905].
(b) The failure is not linked to hardware: the issue lay in the translation algorithm and its training data, not in any hardware malfunction [63905, 64493].
Objective (Malicious/Non-malicious): non-malicious
(a) The failure does not appear to be malicious. It was caused by biases and errors in the data sources on which the AI system was trained, which led it to incorporate racial slurs and stereotypical descriptions of black people into its translations, such as rendering a neutral Chinese phrase as the n-word [63905, 64493]. The company promptly apologized, fixed the issue, and stated that the automated translation engine was still undergoing learning and optimization [63905].
(b) The failure was non-malicious: it resulted from flawed training data rather than intentional harm by anyone involved in the system's development or operation.
Intent (Poor/Accidental Decisions): poor_decisions
(a) The failure can be attributed to poor decisions in training the machine translation system: it was trained on a corpus of Chinese and English text containing racial slurs and stereotypical descriptions of black people [63905, 64493]. This led to the translation of a neutral Chinese phrase as a racial slur, causing significant backlash and requiring immediate corrective action by WeChat.
Capability (Incompetence/Accidental): development_incompetence, accidental
(a) Development incompetence is evident: the translation error was attributed to biases and errors in the data sources on which the neural network-based engine was trained [63905, 64493], indicating a lack of professional care in ensuring the training data was free of racial slurs and biases.
(b) Accidental factors are also apparent: WeChat apologized for the inappropriate translation and attributed it to an error in the AI software that translates between Chinese and English [63905, 64493], suggesting the use of the racial slur was unintentional rather than deliberate.
Duration: temporary
(a) The failure was temporary. WeChat's translation software used the n-word to translate a Chinese phrase meaning "black foreigner" due to an error in the AI software [63905, 64493], but after receiving user feedback WeChat fixed the problem immediately, so the issue was rectified promptly rather than persisting [63905, 64493].
Behaviour: omission, value, other
(a) crash: The incident did not involve a crash in which the system loses state and performs none of its intended functions [63905, 64493].
(b) omission: At the affected instances the system omitted its intended function of producing an accurate, neutral translation of the phrase [63905, 64493].
(c) timing: The incident did not involve the system performing its functions too late or too early [63905, 64493].
(d) value: The system performed its intended function incorrectly, translating a neutral Chinese phrase into a racially offensive term [63905, 64493].
(e) byzantine: The system did not behave erroneously with inconsistent responses and interactions [63905, 64493].
(f) other: The system produced inappropriate translations because of biases and errors in the data sources on which it was trained, leading to offensive outputs [63905, 64493].

IoT System Layer

Layer Option Rationale
Perception None None
Communication None None
Application None None

Other Details

Category Option Rationale
Consequence: other
(a) death: There is no mention of deaths resulting from the incident [63905, 64493].
(b) harm: There is no mention of physical harm to individuals [63905, 64493].
(c) basic: There is no mention of people's access to food or shelter being impacted [63905, 64493].
(d) property: The translation error did not directly impact people's material goods, money, or data [63905, 64493].
(e) delay: The incident did not lead to any activities being postponed [63905, 64493].
(f) non-human: The incident impacted the perception of individuals, not non-human entities [63905, 64493].
(g) no_consequence: Does not apply; the failure had real observed consequences in the form of offensive translations shown to users [63905, 64493].
(h) theoretical_consequence: The articles do not discuss potential consequences that did not actually occur [63905, 64493].
(i) other: The primary consequence was the dissemination of inappropriate and offensive translations by WeChat's AI, causing public outrage and requiring immediate correction [63905, 64493].
Domain: information
(a) The incident relates to the information industry, specifically the language translation services provided by WeChat [63905, 64493].
(g) It does not relate to utilities, as it concerns a translation error in a messaging app rather than power, gas, steam, water, or sewage services [63905, 64493].
(m) It is not directly linked to any industry beyond the information industry, as it primarily involves a translation error in a messaging app [63905, 64493].
