What is the association of having a very low number (“low-shot”) of training images with the performance of artificial intelligence algorithms for retinal diagnostics?
This cross-sectional study found that performance degradation occurred when using traditional algorithms with low numbers of training images. When using only 160 training images per class, traditional approaches had an area under the curve of 0.6585; low-shot methods using contrastive self-supervision outperformed this with an area under the curve of 0.7467.
These findings suggest that low-shot deep learning methods show promise for use in artificial intelligence retinal diagnostics and may be beneficial for situations involving much less training data, such as rare retinal diseases or addressing artificial intelligence bias.
Recent studies have demonstrated the successful application of artificial intelligence (AI) for automated retinal disease diagnostics but have not addressed a fundamental challenge for deep learning systems: the current need for large, criterion standard–annotated retinal data sets for training. Low-shot learning algorithms, which aim to learn from relatively few training examples, may be beneficial for clinical situations involving rare retinal diseases or when addressing potential bias resulting from training data that may not adequately represent certain groups, such as individuals older than 85 years.
To evaluate whether low-shot deep learning methods are beneficial when using small training data sets for automated retinal diagnostics.
Design, Setting, and Participants
This cross-sectional study, conducted from July 1, 2019, to June 21, 2020, compared different diabetic retinopathy classification algorithms, traditional and low-shot, for 2-class designations (diabetic retinopathy warranting referral vs not warranting referral). The public domain EyePACS data set was used, which originally included 88 692 fundi from 44 346 individuals. Statistical analysis was performed from February 1 to June 21, 2020.
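Comparing algorithms at different training sizes requires class-balanced training subsets drawn from the larger pool of labeled images. The sketch below shows one way this could be done; it is not the authors' procedure. The function names, the fixed random seed, and the halving schedule between the largest and smallest sizes are illustrative assumptions (the study reports only that sizes ranged from 5120 to 10 samples per class).

```python
import random
from collections import defaultdict


def balanced_subsample(labels, n_per_class, seed=0):
    """Return sorted indices containing exactly n_per_class items of each class.

    `labels` is a sequence of class labels (e.g. 0 = not referable,
    1 = referable). The function name and seed handling are hypothetical.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # local generator so results are reproducible
    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        if len(pool) < n_per_class:
            raise ValueError(
                f"class {label!r} has only {len(pool)} samples, "
                f"need {n_per_class}"
            )
        # Sample without replacement within the class.
        chosen.extend(rng.sample(pool, n_per_class))
    return sorted(chosen)


def halving_schedule(largest=5120, smallest=10):
    """Assumed schedule: halve the per-class size until it drops below `smallest`."""
    sizes = []
    n = largest
    while n >= smallest:
        sizes.append(n)
        n //= 2
    return sizes


if __name__ == "__main__":
    # Toy label list standing in for referable / non-referable grades.
    toy_labels = [0] * 30 + [1] * 50
    subset = balanced_subsample(toy_labels, n_per_class=10)
    print(len(subset), sum(toy_labels[i] for i in subset))
    print(halving_schedule())
```

Sampling within each class separately keeps the 2 classes equally represented at every training size, so a change in performance reflects the amount of data rather than a shift in class balance.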
Main Outcomes and Measures
The performance (95% CIs) of the various AI algorithms was measured via receiver operating characteristic curves and their area under the curve (AUC), precision-recall curves, accuracy, and F1 score, evaluated for different training data sizes, ranging from 5120 to 10 samples per class.
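For readers less familiar with these measures, the following minimal sketch computes AUC, accuracy, and F1 score for a 2-class problem in plain Python. It is illustrative only: the function names, the 0.5 decision threshold, and the toy scores are assumptions, and the method used to obtain the 95% CIs is not described in this abstract, so it is not reproduced here.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg) and is
    meant for clarity, not for large test sets.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(y_true, scores, threshold=0.5):
    """Accuracy and F1 score after thresholding scores (threshold is an assumption)."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Toy data: 1 = referable, 0 = not referable (made-up scores).
    y_true = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y_true, scores))
    print(accuracy_and_f1(y_true, scores))
```

An AUC of 0.5 corresponds to ranking positives and negatives no better than chance, which is the reference point for interpreting results at very small training sizes.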
Deep learning algorithms, when trained with sufficiently large data sets (5120 samples per class), yielded comparable performance: an AUC of 0.8330 (95% CI, 0.8140-0.8520) for a traditional approach (eg, fine-tuned ResNet) compared with an AUC of 0.8348 (95% CI, 0.8159-0.8537) for a low-shot method using self-supervised Deep InfoMax (our method, denoted DIM). However, when far fewer training images were available (n = 160 per class), the AUC of the traditional deep learning approach decreased to 0.6585 (95% CI, 0.6332-0.6838), and it was outperformed by a low-shot method using self-supervision, with an AUC of 0.7467 (95% CI, 0.7239-0.7695). At very low shots (n = 10 per class), the traditional approach had performance close to chance, with an AUC of 0.5178 (95% CI, 0.4909-0.5447), compared with the best low-shot method (AUC, 0.5778 [95% CI, 0.5512-0.6044]).
Conclusions and Relevance
These findings suggest the potential benefits of using low-shot methods for AI retinal diagnostics when a limited number of annotated training retinal images are available (eg, with rare ophthalmic diseases or when addressing potential AI bias).
Burlina P, Paul W, Mathew P, Joshi N, Pacheco KD, Bressler NM. Low-Shot Deep Learning of Diabetic Retinopathy With Potential Applications to Address Artificial Intelligence Bias in Retinal Diagnostics and Rare Ophthalmic Diseases. JAMA Ophthalmol. Published online September 03, 2020. doi:10.1001/jamaophthalmol.2020.3269