Original Investigation
September 12, 2019

Development and Validation of a Deep Learning System to Detect Glaucomatous Optic Neuropathy Using Fundus Photographs

Author Affiliations
  • 1Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing, China
  • 2Beijing Ophthalmology and Visual Science Key Lab, Beijing, China
  • 3School of Electronic and Information Engineering, Beihang University, Beijing, China
  • 4School of Biological Sciences, University of East Anglia, Norwich, United Kingdom
  • 5Department of Ophthalmology, Peking University Third Hospital, Beijing, China
  • 6Ophthalmology Hospital, First Hospital of Harbin Medical University, Harbin, Heilongjiang, China
  • 7Department of Ophthalmology, Beijing Children’s Hospital, Capital Medical University, Beijing, China
  • 8Department of Mathematics, Beijing University of Chemical Technology, Beijing, China
  • 9College of Computer Science, Nankai University, Tianjin, China
  • 10Beijing Shanggong Medical Technology Co., Ltd, Beijing, China
  • 11Department of Ophthalmology, Byers Eye Institute at Stanford University, Palo Alto, California
  • 12Department of Ophthalmology and Visual Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Kowloon, Hong Kong, China
  • 13Singapore Eye Research Institute, Singapore National Eye Center, Singapore
  • 14Shiley Eye Institute, University of California, San Diego, La Jolla, California
JAMA Ophthalmol. Published online September 12, 2019. doi:10.1001/jamaophthalmol.2019.3501
Key Points

Question  How does a deep learning system compare with professional human graders in detecting glaucomatous optic neuropathy?

Findings  In this cross-sectional study, the deep learning system showed a sensitivity and specificity of greater than 90% for detecting glaucomatous optic neuropathy in a local validation dataset, in 3 clinical-based datasets, and in a real-world distribution dataset. The deep learning system showed lower sensitivity when tested in multiethnic and website-based datasets.

Meaning  This assessment of fundus images suggests that deep learning systems can provide a tool with high sensitivity and specificity that might expedite screening for glaucomatous optic neuropathy.

Abstract

Importance  A deep learning system (DLS) that could automatically detect glaucomatous optic neuropathy (GON) with high sensitivity and specificity could expedite screening for GON.

Objective  To establish a DLS for detection of GON using retinal fundus images, glaucoma diagnosis with convolutional neural networks (GD-CNN), that can be generalized across populations.

Design, Setting, and Participants  In this cross-sectional study, a DLS for automated classification of GON was developed using retinal fundus images obtained from the Chinese Glaucoma Study Alliance (CGSA), the Handan Eye Study, and online databases. A total of 241 032 images were selected as the training dataset. The images were entered into the databases on June 9, 2009, and obtained on July 11, 2018; analyses were performed on December 15, 2018. The generalization of the DLS was tested in several validation datasets, which allowed assessment of the DLS in a clinical setting without exclusions; testing against variable image quality based on fundus photographs obtained from websites; evaluation in a population-based study that reflects a natural distribution of patients with glaucoma within the cohort; and evaluation in an additive dataset that has a diverse ethnic distribution. An online learning system was established to transfer the trained and validated DLS and generalize the results to fundus images from new sources. To better understand the DLS decision-making process, a prediction visualization test was performed that identified the regions of the fundus images used by the DLS for diagnosis.

Exposures  Use of a deep learning system.

Main Outcomes and Measures  Area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of the DLS, with professional graders as the reference.
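As a minimal sketch of how these three measures can be computed against a reference grading, the following Python example uses a small set of hypothetical labels and scores (not study data). The function names, the 0.5 threshold, and the toy values are assumptions introduced for illustration; the AUC is computed with the standard pairwise-ranking formulation rather than any method confirmed by the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = GON, 0 = not GON."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: reference grades and model scores for 7 images.
reference = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

# Assumed operating threshold for turning a score into a binary call.
threshold = 0.5
predicted = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(reference, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(reference, scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking performance across all thresholds, which is why the three measures are reported together.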

Results  Of 274 413 fundus images initially obtained from CGSA, 269 601 images passed initial image quality review and were graded for GON. A total of 241 032 images (definite GON 29 865 [12.4%], probable GON 11 046 [4.6%], unlikely GON 200 121 [83.0%]) from 68 013 patients were selected using random sampling to train the GD-CNN model. The GD-CNN model was validated and evaluated using the remaining 28 569 images from CGSA. The AUC of the GD-CNN model in primary local validation datasets was 0.996 (95% CI, 0.995-0.998), with sensitivity of 96.2% and specificity of 97.7%. The most common reason for both false-negative and false-positive grading by GD-CNN (51 of 119 [46.3%] and 191 of 588 [32.3%]) and manual grading (50 of 113 [44.2%] and 183 of 538 [34.0%]) was pathologic or high myopia.
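The dataset counts reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only the image counts stated in the Results; the variable names are illustrative and not part of the study.

```python
# Image counts as reported in the Results.
obtained = 274_413   # images initially obtained from CGSA
graded = 269_601     # images that passed quality review and were graded
training = 241_032   # images randomly sampled for training

# Grading categories within the training set.
category_counts = {
    "definite GON": 29_865,
    "probable GON": 11_046,
    "unlikely GON": 200_121,
}

# The three categories should account for every training image.
assert sum(category_counts.values()) == training

# Images removed at quality review, and images left for validation.
excluded = obtained - graded
held_out = graded - training

# Share of each category in the training set, to one decimal place.
shares = {
    name: round(100 * count / training, 1)
    for name, count in category_counts.items()
}

print(f"excluded at quality review: {excluded}")
print(f"held out for validation: {held_out}")
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The held-out figure matches the 28 569 images reported for validation, and the category shares round to the percentages given in the text.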

Conclusions and Relevance  Application of GD-CNN to fundus images from different settings and varying image quality demonstrated a high sensitivity, specificity, and generalizability for detecting GON. These findings suggest that automated DLS could enhance current screening programs in a cost-effective and time-efficient manner.
