Table 1.  Characteristics of Inaccurately Translated Sentences and Clinically Significant Potential Harm From Inaccurate Translations
Table 2.  Examples of Inaccurate Translations and Associated Level of Potential Clinical Harm
    Research Letter
    February 25, 2019

    Assessing the Use of Google Translate for Spanish and Chinese Translations of Emergency Department Discharge Instructions

    Author Affiliations
    • 1Division of General Internal Medicine, Department of Medicine at Zuckerberg San Francisco General Hospital, University of California, San Francisco
    • 2University of Michigan School of Medicine, Ann Arbor
    • 3Department of Emergency Medicine, University of California, San Francisco
    • 4Center for Vulnerable Populations at University of California, San Francisco
    JAMA Intern Med. 2019;179(4):580-582. doi:10.1001/jamainternmed.2018.7653

    Patients with limited English proficiency experience communication barriers to health care in English-speaking countries. Written communication improves comprehension,1 but pretranslated standard instructions cannot address patient-specific issues (eg, medication titration). Machine translation tools, including Google Translate (GT), have potential to improve communication with these patients, but prior studies showed limited accuracy; 1 study found that GT Spanish translations of patient education materials were 60% accurate, with 4% resulting in serious error.2

    In 2017, GT changed its translation algorithm, claiming significant improvement.3 In this study, we assess the use of GT to translate emergency department (ED) discharge instructions into Spanish and Chinese.

    Methods

    We abstracted 100 free-texted ED discharge instructions and oversampled for medication changes and common complaints.4 We analyzed each sentence by content category; Flesch-Kincaid readability score; use of medical jargon,5 such as atypical use of normal words (eg, positive test result) or medical terminology; and presence of nonstandard English (spelling or grammar errors, abbreviations, colloquial English, proper nouns). Content categories included explanation of diagnosis and/or results, follow-up instructions, medication instructions, return precautions, and greeting.
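
For readers unfamiliar with the readability measure named above, the Flesch-Kincaid grade level can be sketched as follows. The formula constants are the standard ones; the syllable counter is a crude heuristic assumed here for illustration, and the letter does not say which tool the authors used to score sentences.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the authors' method):
    count vowel groups and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def flesch_kincaid_grade(text: str) -> float:
    """Tokenize naively, then apply the grade-level formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)


if __name__ == "__main__":
    for sample in ("Take one pill.", "Administer anticoagulation medication immediately."):
        print(f"{flesch_kincaid_grade(sample):6.2f}  {sample}")
```

Scores for very short sentences are unstable under this formula, which is one reason a validated readability tool is preferable to a heuristic like this in practice.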

    Using GT, we translated the instructions into Spanish and Chinese; bilingual translators then translated the text back into English.

    The primary outcome was sentence translation accuracy, assessed for overall content accuracy, not word-for-word accuracy, and coded as a binary outcome. Two clinicians coded accuracy independently; a third adjudicated disagreements. A second translator reviewed back-translations deemed inaccurate to ensure these were not back-translator error.

    Potential for harm from inaccurate translations was assessed by 2 clinicians (with a third adjudicating) using an established rating system: clinically nonsignificant, clinically significant, and life-threatening potential harm.6 For analyses, we used a binary variable (clinically significant/life-threatening vs clinically nonsignificant/no harm).
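
The two-rater coding with third-rater adjudication, and the collapse of the harm scale into a binary variable, can be sketched as below. The class and function names are hypothetical, and the "no harm" level is included only because the binary variable described above groups it with clinically nonsignificant harm.

```python
from enum import Enum
from typing import Optional, TypeVar

T = TypeVar("T")


class Harm(Enum):
    # Levels named in the letter; identifiers are illustrative.
    NONE = 0
    NONSIGNIFICANT = 1
    SIGNIFICANT = 2
    LIFE_THREATENING = 3


def adjudicate(rater_a: T, rater_b: T, adjudicator: Optional[T] = None) -> T:
    """Return the agreed rating, or the third rater's call on disagreement."""
    if rater_a == rater_b:
        return rater_a
    if adjudicator is None:
        raise ValueError("raters disagree and no adjudicated rating was supplied")
    return adjudicator


def is_significant_harm(rating: Harm) -> bool:
    """Binary analysis variable: significant or life-threatening vs. the rest."""
    return rating in (Harm.SIGNIFICANT, Harm.LIFE_THREATENING)


if __name__ == "__main__":
    final = adjudicate(Harm.NONSIGNIFICANT, Harm.SIGNIFICANT, adjudicator=Harm.SIGNIFICANT)
    print(final.name, is_significant_harm(final))
```

The same `adjudicate` helper works for the binary accuracy coding, since it only compares the two ratings for equality.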

    We used logistic regression analyses stratified by language to assess associations between sentence characteristics and accuracy and/or harm. Variables with significance of P < .20 in bivariate analyses were used in multivariable analyses.
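
As a rough illustration of the bivariate screening step, the sketch below computes an odds ratio, Wald 95% CI, and two-sided P value from a 2x2 table, then applies the P < .20 threshold. For a single binary predictor, the exponentiated logistic regression coefficient equals this cross-product odds ratio. The cell counts are hypothetical, chosen only to exercise the code; this is not the authors' analysis, which was run per language and carried the retained variables into a multivariable model.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int,
                    z_crit: float = 1.96) -> Tuple[float, Tuple[float, float], float]:
    """2x2 table for one sentence characteristic:
    a = characteristic present, inaccurate   b = present, accurate
    c = characteristic absent, inaccurate    d = absent, accurate
    Returns (odds ratio, (CI lower, CI upper), two-sided Wald P value)."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for a Wald interval")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    ci = (math.exp(log_or - z_crit * se), math.exp(log_or + z_crit * se))
    return math.exp(log_or), ci, p_value


def passes_screen(p_value: float, threshold: float = 0.20) -> bool:
    """Keep a variable for the multivariable model if P < threshold."""
    return p_value < threshold


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    or_, (lo, hi), p = odds_ratio_wald(10, 40, 20, 160)
    print(f"OR {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f}); P = {p:.3f}; keep = {passes_screen(p)}")
```

A lenient screening threshold such as .20 keeps variables whose bivariate association is suggestive but not conventionally significant, so that they can still be evaluated after adjustment.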

    Results

    The 100 sets of patient instructions contained 647 sentences. Overall, 594 (92%) and 522 (81%) sentences were accurately translated into Spanish and Chinese, respectively, by GT (Table 1). A minority of inaccurate translations had potential for clinically significant harm: in Spanish, 15 (28%) of 53 inaccuracies and 15 (2%) of 647 sentences; in Chinese, 50 (40%) of 125 inaccuracies and 50 (8%) of 647 sentences. Some errors were correct translations of errant English instructions, but overall, content was inaccurate owing to grammar or typographical errors (Table 2) that would readily have been overlooked or understood by a reader of the English text.
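
The counts and percentages in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable and function names are illustrative.

```python
TOTAL_SENTENCES = 647
ACCURATE = {"Spanish": 594, "Chinese": 522}
POTENTIALLY_HARMFUL = {"Spanish": 15, "Chinese": 50}


def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


def summarize(language: str) -> dict:
    accurate = ACCURATE[language]
    harmful = POTENTIALLY_HARMFUL[language]
    inaccurate = TOTAL_SENTENCES - accurate
    return {
        "inaccurate": inaccurate,
        "pct_accurate": pct(accurate, TOTAL_SENTENCES),
        "pct_harm_of_inaccurate": pct(harmful, inaccurate),
        "pct_harm_of_all": pct(harmful, TOTAL_SENTENCES),
    }


if __name__ == "__main__":
    for language in ACCURATE:
        print(language, summarize(language))
```

The derived inaccuracy counts (53 and 125) and all six percentages match the values stated in the text.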

    Only spelling and grammar anomalies were associated with inaccurate translations in multivariable analyses: Spanish (odds ratio [OR], 2.6; 95% CI, 1.1-5.8); Chinese (OR, 2.6; 95% CI, 1.3-5.0).

    In multivariable analyses, potential harm was associated in Spanish with a Flesch-Kincaid reading level higher than eighth grade (OR, 4.0; 95% CI, 1.2-13.5) and follow-up instructions (OR, 3.5; 95% CI, 1.2-10.2); and in Chinese with medical terminology (OR, 2.4; 95% CI, 1.2-4.9), spelling or grammar anomalies (OR, 3.1; 95% CI, 1.4-7.2), and colloquial English (OR, 5.9; 95% CI, 1.4-24.7).
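
One way to read the reported intervals: if they are Wald-type intervals (an assumption; the letter does not state how they were computed), each is symmetric on the log scale, so the geometric mean of the bounds should sit close to the reported odds ratio, and the width gives the implied standard error of the log odds ratio. The sketch below applies this to the values reported above; small gaps are expected because the published figures are rounded.

```python
import math

# (label, reported OR, CI lower, CI upper), copied from the Results text.
REPORTED = [
    ("Spanish, inaccuracy: spelling/grammar anomalies", 2.6, 1.1, 5.8),
    ("Chinese, inaccuracy: spelling/grammar anomalies", 2.6, 1.3, 5.0),
    ("Spanish, harm: reading level above eighth grade", 4.0, 1.2, 13.5),
    ("Spanish, harm: follow-up instructions", 3.5, 1.2, 10.2),
    ("Chinese, harm: medical terminology", 2.4, 1.2, 4.9),
    ("Chinese, harm: spelling/grammar anomalies", 3.1, 1.4, 7.2),
    ("Chinese, harm: colloquial English", 5.9, 1.4, 24.7),
]


def implied_point_estimate(lower: float, upper: float) -> float:
    """Midpoint of the interval on the log scale (geometric mean of bounds)."""
    return math.sqrt(lower * upper)


def implied_log_se(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Standard error of log(OR) implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


if __name__ == "__main__":
    for label, or_, lo, hi in REPORTED:
        print(f"{label}: reported {or_}, "
              f"implied {implied_point_estimate(lo, hi):.2f}, "
              f"SE(log OR) {implied_log_se(lo, hi):.2f}")
```

The wide interval for colloquial English reflects a large implied standard error, which is consistent with that characteristic being present in relatively few sentences.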

    Discussion

    Discharge instructions were translated by the new GT algorithm with higher accuracy and fewer seriously harmful inaccuracies than previously,2 yet 2% of Spanish and 8% of Chinese sentence translations had potential for significant harm. While GT can supplement (not replace) written English instructions, machine-translated instructions should include a warning about potentially inaccurate translations.

    Clinicians using GT can reduce potential harm by having patients read translations while receiving verbal instructions; being vigilant about spelling and grammar; and avoiding complicated grammar, medical jargon (eg, fingerstick), and colloquial English.

    Study limitations include assessment of only 2 languages (though our inclusion of Chinese is a strength, since non-European languages are often less accurately translated by machines); no assessment of translation readability; and no comparison to human translators.

    Google Translate can be used to translate clinician-entered, patient-specific ED instructions for Spanish- and Chinese-speaking patients. Potential for harm can be minimized by using clear communication practices. We recommend including English instructions and automated warnings regarding the use of machine translation.
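
The recommendation to pair machine-translated text with the English original and an automated warning could be implemented along these lines. This is a minimal sketch: the class name, layout, and warning wording are all hypothetical, and the translation is passed in as a string rather than fetched from any translation service.

```python
from dataclasses import dataclass

# Hypothetical wording, not taken from the letter. In practice the warning
# would itself need a vetted human translation into the patient's language.
DEFAULT_WARNING = (
    "This translation was produced by a computer program and may contain "
    "errors. If anything is unclear, ask your care team or an interpreter."
)


@dataclass(frozen=True)
class TranslatedInstructions:
    english: str
    translation: str
    language: str
    warning: str = DEFAULT_WARNING

    def render(self) -> str:
        """Warning first, then the translation, then the English original."""
        if not self.english.strip():
            raise ValueError("the English original must always be included")
        return "\n\n".join([
            f"[{self.warning}]",
            f"{self.language}:\n{self.translation}",
            f"English (original):\n{self.english}",
        ])
```

Making the English text a required field, rather than an optional attachment, keeps the translation in the supplementary role the letter describes.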

    Article Information

    Accepted for Publication: November 13, 2018.

    Corresponding Author: Elaine C. Khoong, MD, MS, Division of General Internal Medicine, Department of Medicine at Zuckerberg San Francisco General Hospital, University of California, San Francisco, 1001 Potrero Ave, 1M, San Francisco, CA 94122 (elaine.khoong@ucsf.edu).

    Published Online: February 25, 2019. doi:10.1001/jamainternmed.2018.7653

    Author Contributions: Dr Khoong had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

    Study concept and design: Khoong, Brown, Fernandez.

    Acquisition, analysis, or interpretation of data: All authors.

    Drafting of the manuscript: All authors.

    Critical revision of the manuscript for important intellectual content: All authors.

    Statistical analysis: Khoong, Brown.

    Administrative, technical, or material support: Steinbrook, Fernandez.

    Study supervision: Fernandez.

    Conflict of Interest Disclosures: None reported.

    References
    1. Johnson A, Sandford J, Tyndall J. Written and verbal information versus verbal information only for patients being discharged from acute hospital settings to home. Cochrane Database Syst Rev. 2003;4(4):CD003716. doi:10.1002/14651858.CD003716
    2. Khanna RR, Karliner LS, Eck M, Vittinghoff E, Koenig CJ, Fang MC. Performance of an online translation tool when applied to patient educational material. J Hosp Med. 2011;6(9):519-525. doi:10.1002/jhm.898
    3. Wu Y, Schuster M, Chen Z, et al. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. https://arxiv.org/abs/1609.08144. Accessed January 17, 2019.
    4. Weiss AJ, Wier LM, Stocks C, Blanchard J. Overview of Emergency Department Visits in the United States, 2011. Agency for Healthcare Research and Quality, Healthcare Cost and Utilization Project Statistical Brief #174. June 2014. https://www.hcup-us.ahrq.gov/reports/statbriefs/sb174-Emergency-Department-Visits-Overview.pdf. Accessed January 17, 2019.
    5. Castro CM, Wilson C, Wang F, Schillinger D. Babel babble: physicians’ use of unclarified medical jargon with patients. Am J Health Behav. 2007;31(suppl 1):S85-S95. doi:10.5993/AJHB.31.s1.11
    6. Nápoles AM, Santoyo-Olsson J, Karliner LS, Gregorich SE, Pérez-Stable EJ. Inaccurate language interpretation and its clinical significance in the medical encounters of Spanish-speaking Latinos. Med Care. 2015;53(11):940-947. doi:10.1097/MLR.0000000000000422