    2 Comments for this article
    Proof-Reading Essential
    Stephan Fihn, MD MPH | University of Washington

    Errors in medical records were common long before speech recognition software created them, and proofreading notes has always been essential. Unfortunately, speech recognition does not obviate that responsibility, although the technology is constantly improving. Even so, in an era when patients can readily access their records, accuracy is ever more important. (Records are not tweets.)

    CONFLICT OF INTEREST: Deputy Editor, JAMA Network Open
    Broader Range of parameters essential in this and future speech recognition studies
    Charles Porter, MD | University of Kansas Medical Center
    The authors are commended for this innovative analysis of the effect of speech recognition (SR) on clinician workflow and of SR accuracy. The absence of oversight by the FDA or another agency of this technology’s impact on patient care creates a strong need for comprehensive studies of SR effects and side effects. Studies such as this one may establish error rates promoted by SR vendors (Dragon, in this case) used as “industry standards.” As a clinical cardiologist who has used Dragon SR exclusively for all outpatient encounters since the end of 2017, my direct observation of the inherent challenges of SR allows identification of three relevant process variables that are not provided in this report: 1. voice input platform, 2. clinician user profiles, and 3. organizational SR infrastructure.

    Three voice input platforms into Dragon/Nuance SR are available. The Nuance microphone (MIC), sold as a wired USB-connected device, in my anecdotal experience and that of colleagues provides faster connection time, fewer losses of connection, and a faster rate of speech with a lower error rate than the second option, the Nuance Medical “Mobile MIC” app on a clinician’s personal smartphone used with a Bluetooth connection to the desktop PC. The Mobile MIC platform has minimal expense for the health care system and demands the use of the clinician’s personal smartphone. Non-Nuance USB-connected digital MICs, which may cost a fraction of the Nuance MIC, can be used but have minimal dictation control options compared with the Nuance wired MIC and generate a warning message upon connection that the devices are not recommended because of uncertain accuracy rates of Dragon SR.

    In addition to reporting the SR input platforms being studied, SR analyses should characterize the technology commitment and expertise profiles of the clinician user cohort studied. Error rate reduction associated with sustained commitment, experience, and SR training, as well as willingness to allocate increased time as primary proofreader and editor of SR-generated documents, may vary widely among clinicians. Study results will be biased toward higher accuracy and a lower error rate if the clinicians studied were “early adopters” of SR with extensive motivation and commitment to mastering SR technology, proofreading, and editing, as compared with results seen with collections of “technophobe” clinicians who are responding to employer mandates to use SR and who are resistant to spending more time serving as replacements for professional medical transcriptionists (PMTs) and less time as clinicians.

    In this study, clinician review time (CRT), the interval between the return of a transcribed document and the time of physician signing, was reported, but the actual clinician time spent on document review and editing was not. Dragon-mediated errors are more insidious than errors from PMTs or hand-typed data entries. Distortions of the English language such as these that I’ve detected in Dragon SR, “the situation was Ms. Stated” and “the ejection fraction has been consistent Lee below 35%,” speak for themselves as uniquely insidious forms of SR-mediated errors that require a greater degree of attention to detect and correct than the errors clinicians face when editing PMT-supported documents. Demands to minimize CRT can increase time pressure on clinicians, who are compelled to do detailed document editing previously done by PMTs, and may contribute to physician burnout. Allowance for the time required when trainees are involved in document preparation is essential. The effect of SR on patient care time must be measured.
    Original Investigation
    Health Informatics
    July 6, 2018

    Analysis of Errors in Dictated Clinical Documents Assisted by Speech Recognition Software and Professional Transcriptionists

    Author Affiliations
    • 1 Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
    • 2 Harvard Medical School, Boston, Massachusetts
    • 3 Department of Information Systems, Partners HealthCare, Boston, Massachusetts
    • 4 Geisinger Commonwealth School of Medicine, Scranton, Pennsylvania
    • 5 Department of Emergency Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
    • 6 Department of Hospital Medicine, North Shore Medical Center, Salem, Massachusetts
    • 7 University of Colorado School of Medicine, Aurora
    • 8 Department of Computer Science, Brandeis University, Waltham, Massachusetts
    • 9 Department of Emergency Medicine, University of Colorado, Aurora
    JAMA Netw Open. 2018;1(3):e180530. doi:10.1001/jamanetworkopen.2018.0530
    Key Points

    Question  How accurate are dictated clinical documents created by speech recognition software, edited by professional medical transcriptionists, and reviewed and signed by physicians?

    Findings  Among 217 clinical notes randomly selected from 2 health care organizations, the error rate was 7.4% in the version generated by speech recognition software, 0.4% after transcriptionist review, and 0.3% in the final version signed by physicians. Among the errors at each stage, 15.8%, 26.9%, and 25.9% involved clinical information, and 5.7%, 8.9%, and 6.4% were clinically significant, respectively.

    Meaning  An observed error rate of more than 7% in speech recognition–generated clinical documents demonstrates the importance of manual editing and review.


    Importance  Accurate clinical documentation is critical to health care quality and safety. Dictation services supported by speech recognition (SR) technology and professional medical transcriptionists are widely used by US clinicians. However, the quality of SR-assisted documentation has not been thoroughly studied.

    Objective  To identify and analyze errors at each stage of the SR-assisted dictation process.

    Design, Setting, and Participants  This cross-sectional study collected a stratified random sample of 217 notes (83 office notes, 75 discharge summaries, and 59 operative notes) dictated by 144 physicians between January 1 and December 31, 2016, at 2 health care organizations using Dragon Medical 360 | eScription (Nuance). Errors were annotated in the SR engine–generated document (SR), the medical transcriptionist–edited document (MT), and the physician’s signed note (SN). Each document was compared with a criterion standard created from the original audio recordings and medical record review.

    Main Outcomes and Measures  Error rate; mean errors per document; error frequency by general type (eg, deletion), semantic type (eg, medication), and clinical significance; and variations by physician characteristics, note type, and institution.

    Results  Among the 217 notes, there were 144 unique dictating physicians: 44 female (30.6%) and 10 unknown sex (6.9%). Mean (SD) physician age was 52 (12.5) years (median [range] age, 54 [28-80] years). Among 121 physicians for whom specialty information was available (84.0%), 35 specialties were represented, including 45 surgeons (37.2%), 30 internists (24.8%), and 46 others (38.0%). The error rate in SR notes was 7.4% (ie, 7.4 errors per 100 words). It decreased to 0.4% after transcriptionist review and 0.3% in SNs. Overall, 96.3% of SR notes, 58.1% of MT notes, and 42.4% of SNs contained errors. Deletions were most common (34.7%), then insertions (27.0%). Among errors at the SR, MT, and SN stages, 15.8%, 26.9%, and 25.9%, respectively, involved clinical information, and 5.7%, 8.9%, and 6.4%, respectively, were clinically significant. Discharge summaries had higher mean SR error rates than other types (8.9% vs 6.6%; difference, 2.3%; 95% CI, 1.0%-3.6%; P < .001). Surgeons’ SR notes had lower mean error rates than other physicians’ (6.0% vs 8.1%; difference, 2.2%; 95% CI, 0.8%-3.5%; P = .002). One institution had a higher mean SR error rate (7.6% vs 6.6%; difference, 1.0%; 95% CI, −0.2% to 2.8%; P = .10) but lower mean MT and SN error rates (0.3% vs 0.7%; difference, −0.3%; 95% CI, −0.63% to −0.04%; P = .03 and 0.2% vs 0.6%; difference, −0.4%; 95% CI, −0.7% to −0.2%; P = .003).
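    As a rough illustration of the measures reported here (errors per 100 words, and error counts by general type such as deletion or insertion), the sketch below aligns a document against a reference text using word-level edit distance. This is a swapped-in, automated approximation: the abstract describes errors as annotated against a criterion standard and does not describe an automated alignment. The function names, the tokenization rule, the choice of the reference word count as the denominator, and the example sentences are all assumptions made for illustration, and the sketch cannot classify semantic type or clinical significance.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation (assumed rule)."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w]


def align_errors(reference, document):
    """Count word-level substitutions, deletions, and insertions.

    `reference` plays the role of a criterion standard; `document` is the
    text being checked. Returns (error counts, number of reference words).
    """
    ref, doc = tokenize(reference), tokenize(document)
    n, m = len(ref), len(doc)

    # dist[i][j] = minimum number of edits turning ref[:i] into doc[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == doc[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # reference word missing (deletion)
                dist[i][j - 1] + 1,         # extra word in document (insertion)
            )

    # Walk back through the table to label each edit by type.
    counts = Counter()
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and ref[i - 1] != doc[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(mismatch):
            if mismatch:
                counts["substitution"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts, n


def error_rate_per_100_words(counts, n_reference_words):
    """Errors per 100 reference words (the denominator is an assumption)."""
    if n_reference_words == 0:
        return 0.0
    return 100.0 * sum(counts.values()) / n_reference_words


# Invented example sentences, not taken from the study data.
reference = "Patient denies chest pain and shortness of breath."
document = "Patient has chest pain shortness of breath."
counts, n_words = align_errors(reference, document)
rate = error_rate_per_100_words(counts, n_words)
```

    In this invented example, one word is substituted ("denies" becomes "has") and one is deleted ("and"), giving 2 errors in 8 reference words. When several minimum-cost alignments exist, the split between error types can depend on tie-breaking in the backtrace, which is one reason an automated count is only a proxy for human annotation.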

    Conclusions and Relevance  Seven in 100 words in SR-generated documents contain errors; many errors involve clinical information. That most errors are corrected before notes are signed demonstrates the importance of manual review, quality assurance, and auditing.