Original Investigation
February 21, 2019

Evaluation of an Algorithm for Identifying Ocular Conditions in Electronic Health Record Data

Author Affiliations
  • (1) W. K. Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan Medical School, Ann Arbor
  • (2) Department of Health Management and Policy, University of Michigan School of Public Health, Ann Arbor
  • (3) Center for Eye Policy and Innovation, University of Michigan, Ann Arbor
  • (4) Data Office for Clinical and Translational Research, University of Michigan Medical School, Ann Arbor
  • (5) Department of Pediatrics, University of Michigan Medical School, Ann Arbor
JAMA Ophthalmol. 2019;137(5):491-497. doi:10.1001/jamaophthalmol.2018.7051
Key Points

Question  What method other than assessing administrative billing codes can researchers, using big data, apply to accurately identify patients with ocular diseases of interest?

Findings  In this study of the electronic health records of 122 339 eye care recipients, a newly developed and validated algorithm that searches structured and unstructured data in electronic health records successfully detected most patients with and without exfoliation syndrome.

Meaning  Algorithms may enhance the ability of researchers to make use of big data to study patients with ocular diseases.

Abstract

Importance  For research involving big data, researchers must accurately identify patients with ocular diseases or phenotypes of interest. Reliance on administrative billing codes alone for this purpose is limiting.

Objective  To develop a method to accurately identify the presence or absence of ocular conditions of interest using electronic health record (EHR) data.

Design, Setting, and Participants  This study is a retrospective analysis of the EHR data of patients (n = 122 339) in the Sight Outcomes Research Collaborative Ophthalmology Data Repository who received eye care at participating academic medical centers between August 1, 2012, and August 31, 2017. An algorithm that searches structured and unstructured (free-text) EHR data for conditions of interest was developed and then tested to determine how well it could detect the presence or absence of exfoliation syndrome (XFS). The algorithm was trained to search for evidence of XFS among a sample of patients with and without XFS (n = 200) by reviewing International Classification of Diseases, Ninth Revision or International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-9 or ICD-10) billing codes, the patient’s problem list, and text within the ocular examination section and unstructured (free-text) data in the EHR. The likelihood that each patient had XFS was estimated using logistic least absolute shrinkage and selection operator (LASSO) regression. The EHR data of all patients were run through the algorithm to generate an XFS probability score for each patient. The algorithm was validated with review of EHRs by glaucoma specialists.
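The scoring step described above can be illustrated with a minimal sketch of L1-penalized (LASSO) logistic regression. This is not the study's implementation: the four binary features (billing code, problem list, examination text, free text), the toy training rows, the penalty `lam`, and the proximal gradient fitting routine are all assumptions chosen for illustration, and the data are invented.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_logistic_lasso(X, y, lam=0.05, lr=0.5, n_iter=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    The intercept b is left unpenalized; lam controls how strongly the
    feature weights are shrunk toward zero.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the log-loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def predict_proba(w, b, x):
    # Estimated probability that a patient with feature vector x has XFS.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical feature columns (assumed, not from the study):
# [icd_code, problem_list, exam_text, free_text]; label 1 = XFS confirmed.
X = [
    [1, 1, 1, 1], [1, 0, 1, 1], [0, 1, 1, 0], [0, 0, 1, 1],
    [1, 1, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

w, b = fit_logistic_lasso(X, y)
for x in ([1, 1, 1, 1], [1, 0, 0, 0], [0, 0, 0, 0]):
    print(x, round(predict_proba(w, b, x), 3))
```

In this sketch, a larger `lam` drives more weights exactly to zero, which is the feature selection property that motivates a LASSO penalty when many candidate sources of evidence are available.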

Main Outcomes and Measures  Positive predictive value (PPV) and negative predictive value (NPV) of the algorithm were computed as the proportion of patients correctly classified with XFS or without XFS.
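As a worked illustration of these measures, the sketch below computes PPV and NPV from counts of correctly and incorrectly classified patients. The counts are hypothetical and are not the study's data. The Wilson score interval is included as one common way to attach a 95% CI to a proportion; the abstract does not state which interval method the authors used, so that choice is an assumption.

```python
import math

def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from chart-review counts.

    tp/fp: patients the algorithm flagged as having the condition who did /
           did not have it on specialist review.
    tn/fn: patients the algorithm flagged as not having the condition who
           did not / did have it on specialist review.
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

def wilson_interval(successes, n, z=1.96):
    # Wilson score interval for a binomial proportion (assumed CI method).
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical validation sample, for illustration only.
tp, fp, tn, fn = 45, 5, 38, 2
ppv, npv = predictive_values(tp, fp, tn, fn)
print("PPV", ppv, wilson_interval(tp, tp + fp))
print("NPV", npv, wilson_interval(tn, tn + fn))
```

Unlike the simple normal approximation, the Wilson interval stays within 0% to 100% and remains informative when every reviewed record is classified correctly.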

Results  This study included 122 339 patients, with a mean (SD) age of 52.4 (25.1) years. Of these patients, 69 002 (56.4%) were female and 99 579 (81.4%) were white. The algorithm assigned a less than 10% probability of XFS for 121 085 patients (99.0%) as well as an XFS probability score of more than 75% for 543 patients (0.4%), more than 90% for 353 patients (0.3%), and more than 99% for 83 patients (0.07%). Validated by glaucoma specialists, the algorithm had a PPV of 95.0% (95% CI, 89.5%-97.7%) and an NPV of 100% (95% CI, 91.2%-100%). When there was ICD-9 or ICD-10 billing code documentation of XFS, in 86% or 96% of the records, respectively, evidence of XFS was also recorded elsewhere in the EHR. Conversely, when there was clinical examination or free-text evidence of XFS, it was documented with ICD-9 codes only approximately 40% of the time and even less often with ICD-10 codes.

Conclusions and Relevance  The algorithm developed, tested, and validated in this study appears to be better at identifying the presence or absence of XFS in EHR data than the conventional approach of assessing only billing codes; such an algorithm may enhance the ability of investigators to use EHR data to study patients with ocular diseases.
