Figure 1.
Web-based telemedicine system developed for project. Gestational age (GA), birth weight (BW), and postmenstrual age (PMA) are displayed. Three standard images from each retina are displayed (shown), along with up to 2 additional images per eye based on photographer's discretion (not shown). OD indicates right eye; OS, left eye.

Figure 2.
Examples of variation in accuracy and reliability. A, Posterior pole view of eye classified by ophthalmoscopy and all 3 telemedicine graders as treatment-requiring retinopathy of prematurity (ROP). B and C, Posterior and temporal views of eye classified as mild ROP by ophthalmoscopy and 1 telemedicine grader and as type 2 prethreshold ROP by 2 telemedicine graders.

Table 1.
Ordinal Diagnostic Classification of ROP in Study Eyes
Table 2.
Accuracy of Telemedical ROP Diagnosis by 3 Experienced Retinal Specialist Graders Based on Sensitivity, Specificity, and AUCs
Table 3.
Intergrader Reliability of Telemedical ROP Diagnosis Based on κ and Weighted κ Statistics Between Each Pair of Graders
Table 4.
Intragrader Reliability of Telemedical ROP Diagnosis Based on κ and Weighted κ Statistics for Each Grader
Table 5.
Evaluation of Image Quality of 248 Study Eyes Photographed by Trained Nurse
1.
Cryotherapy for Retinopathy of Prematurity Cooperative Group. Multicenter trial of cryotherapy for retinopathy of prematurity: preliminary results. Arch Ophthalmol. 1988;106(4):471-479.
2.
Early Treatment for Retinopathy of Prematurity Cooperative Group. Revised indications for the treatment of retinopathy of prematurity: results of the Early Treatment for Retinopathy of Prematurity Randomized Trial. Arch Ophthalmol. 2003;121(12):1684-1694.
3.
The Committee for the Classification of Retinopathy of Prematurity. An international classification of retinopathy of prematurity. Arch Ophthalmol. 1984;102(8):1130-1134.
4.
International Committee for the Classification of Retinopathy of Prematurity. The international classification of retinopathy of prematurity revisited. Arch Ophthalmol. 2005;123(7):991-999.
5.
Muñoz B, West SK. Blindness and visual impairment in the Americas and the Caribbean. Br J Ophthalmol. 2002;86(5):498-504.
6.
Steinkuller PG, Du L, Gilbert C, et al. Childhood blindness. J AAPOS. 1999;3(1):26-32.
7.
Hamilton B, Martin J, Ventura S. Births: preliminary data for 2005. http://www.cdc.gov/nchs/products/pubs/pubd/hestats/prelimbirths05/prelimbirths05.htm. Accessed March 7, 2007.
8.
Gilbert C, Fielder A, Gordillo L, et al. Characteristics of infants with severe retinopathy of prematurity in countries with low, moderate, and high levels of development: implications for screening programs. Pediatrics. 2005;115(5):e518-e525. http://pediatrics.aappublications.org/cgi/content/abstract/115/5/e518?rss=1. Accessed May 5, 2007.
9.
Gilbert C, Rahi J, Eckstein M, et al. Retinopathy of prematurity in middle-income countries. Lancet. 1997;350(9070):12-14.
10.
Section on Ophthalmology, American Academy of Pediatrics; American Academy of Ophthalmology; American Association for Pediatric Ophthalmology and Strabismus. Screening examination of premature infants for retinopathy of prematurity. Pediatrics. 2006;117(2):572-576 [published correction appears in Pediatrics. 2006;118:1324].
11.
American Academy of Ophthalmology. Ophthalmologists warn of shortage in specialists who treat premature babies with blinding eye condition. http://www.aao.org/newsroom/release/20060713.cfm. Accessed January 5, 2007.
12.
Perednia DA, Allen A. Telemedicine technology and clinical applications. JAMA. 1995;273(6):483-488.
13.
Bashshur RL, Reardon TG, Shannon GW. Telemedicine: a new health care delivery system. Annu Rev Public Health. 2000;21:613-637.
14.
Field MJ, ed. Telemedicine: a Guide to Assessing Telecommunications in Health Care. Washington, DC: National Academies Press; 1996.
15.
Shea S, Starren J, Weinstock RS, et al. Columbia University's Informatics for Diabetes Education and Telemedicine (IDEATel) project: rationale and design. J Am Med Inform Assoc. 2002;9(1):49-62.
16.
Schwartz SD, Harrison SA, Ferrone PJ, Trese MT. Telemedical evaluation and management of retinopathy of prematurity using a fiberoptic digital fundus camera. Ophthalmology. 2000;107(1):25-28.
17.
Ells AL, Holmes JM, Astle WF, et al. Telemedicine approach to screening for severe retinopathy of prematurity: a pilot study. Ophthalmology. 2003;110(11):2113-2117.
18.
Chiang MF, Keenan JD, Starren JB, et al. Accuracy and reliability of remote retinopathy of prematurity diagnosis. Arch Ophthalmol. 2006;124(3):322-327.
19.
Chiang MF, Starren JB, Du YE, et al. Remote image based retinopathy of prematurity diagnosis: a receiver operating characteristic analysis of accuracy. Br J Ophthalmol. 2006;90(10):1292-1296.
20.
Roth DB, Morales D, Feuer WJ, Hess D, Johnson RA, Flynn JT. Screening for retinopathy of prematurity employing the RetCam-120: sensitivity and specificity. Arch Ophthalmol. 2001;119(2):268-272.
21.
Yen KG, Hess D, Burke B, et al. The optimum time to employ telephotoscreening to detect retinopathy of prematurity. Trans Am Ophthalmol Soc. 2000;98:145-150.
22.
Wu C, Petersen RA, Vanderveen DK. Retcam imaging for retinopathy of prematurity screening. J AAPOS. 2006;10(2):107-111.
23.
Section on Ophthalmology, American Academy of Pediatrics; American Academy of Ophthalmology; American Association for Pediatric Ophthalmology and Strabismus. Screening examination of premature infants for retinopathy of prematurity. Pediatrics. 2001;108(3):809-811.
24.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
25.
Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70(4):213-220.
26.
Chiang MF, Keenan JD, Du YE, et al. Assessment of image-based technology: impact of referral cutoff on accuracy and reliability of remote retinopathy of prematurity diagnosis. AMIA Annu Symp Proc. 2005:126-130.
27.
Palmer EA, Flynn JT, Hardy RT, et al. Incidence and early course of retinopathy of prematurity. Ophthalmology. 1991;98(11):1628-1640.
28.
Good WV, Hardy RJ, Dobson V, et al; ETROP Cooperative Group. The incidence and course of retinopathy of prematurity: findings from the Early Treatment for Retinopathy of Prematurity Study. Pediatrics. 2005;116(1):15-23.
29.
Stanberry B. Legal and ethical aspects of telemedicine. J Telemed Telecare. 2006;12(4):166-175.
30.
Laws DE, Morton C, Weindling M, Clark D. Systemic effects of screening for retinopathy of prematurity. Br J Ophthalmol. 1996;80(5):425-428.
31.
Early Treatment Diabetic Retinopathy Study Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs: an extension of the modified Airlie House classification. ETDRS report No. 10. Ophthalmology. 1991;98(5)(suppl):786-806.
32.
Scott KE, Kim DY, Wang L, et al. Telemedical retinopathy of prematurity diagnosis: intra-physician agreement between ophthalmoscopic and image-based examinations. Ophthalmology. In press.
33.
Jackson KM, Scott KE, Graff Zivin J, et al. Cost-utility analysis of telemedicine and standard ophthalmoscopy for retinopathy of prematurity management. Ophthalmology. In press.
Clinical Sciences
November 2007

Telemedical Retinopathy of Prematurity Diagnosis: Accuracy, Reliability, and Image Quality

Author Affiliations: Departments of Ophthalmology (Drs Chiang, Wang, Busuioc, Kane, and Flynn and Mr Chan) and Biomedical Informatics (Drs Chiang and Starren) and Division of Neonatology (Ms Coki), Columbia University College of Physicians and Surgeons and Department of Epidemiology and Public Health, Albert Einstein College of Medicine (Dr Du), New York, New York; Division of Ophthalmology, Childrens Hospital Los Angeles, Los Angeles, California (Dr Lee); Retina Center of Vermont, Burlington (Dr Weissgold); and Department of Ophthalmology, Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, Florida (Dr Berrocal).

Arch Ophthalmol. 2007;125(11):1531-1538. doi:10.1001/archopht.125.11.1531
Abstract

Objective  To prospectively measure accuracy, reliability, and image quality of telemedical retinopathy of prematurity (ROP) diagnosis.

Methods  Two hundred forty-eight eyes from 67 consecutive infants underwent wide-angle retinal imaging by a trained neonatal nurse at 31 to 33 weeks' and/or 35 to 37 weeks' postmenstrual age (PMA) using a standard protocol. Data were uploaded to a Web-based telemedicine system and interpreted by 3 expert retinal specialist graders who provided a diagnosis (no ROP, mild ROP, type 2 prethreshold ROP, treatment-requiring ROP) and an evaluation of image quality for each eye. Findings were compared with a reference standard of indirect ophthalmoscopy by an experienced pediatric ophthalmologist.

Results  At 35 to 37 weeks' PMA, sensitivity and specificity for diagnosis of mild or worse ROP were 0.908 and 1.000 for grader A, 0.971 and 1.000 for grader B, and 0.908 and 0.977 for grader C. Sensitivity and specificity for diagnosis of type 2 prethreshold or worse ROP were 1.000 and 0.943 for grader A, 1.000 and 0.930 for grader B, and 1.000 and 0.851 for grader C. At 35 to 37 weeks' PMA, weighted κ for intergrader reliability was 0.791 to 0.889, and κ for intragrader reliability for detection of type 2 prethreshold or worse ROP was 0.769 to 1.000. Image technical quality was rated as “adequate” or “possibly adequate” for diagnosis in 93.3% to 100% of eyes.

Conclusion  A telemedicine system using nurse-captured retinal images has the potential to improve existing shortcomings of ROP management, particularly at later PMAs.

Retinopathy of prematurity (ROP) is a vasoproliferative disease that is diagnosed by serial dilated ophthalmoscopy. Progress has occurred in validation of treatment criteria through the Cryotherapy for Retinopathy of Prematurity (CRYO-ROP) and Early Treatment for Retinopathy of Prematurity (ETROP) trials1,2 and in development of an international classification system.3,4 However, ROP continues to be a leading cause of childhood blindness throughout the world.5,6

Retinopathy of prematurity management presents significant challenges: (1) Diagnosis at the neonatal intensive care unit (NICU) bedside requires extensive travel and coordination and is logistically difficult. (2) The number of infants requiring surveillance is increasing. In the United States, the rate of premature births has grown from 9.4% to 12.7% since 1981.7 Worldwide, ROP incidence is rising as neonatal survival improves.8,9 New guidelines have expanded the gestational-age cutoff for examination to decrease the likelihood of missing larger infants with disease.10 (3) Availability of ophthalmologists who perform ROP examination is limited. A 2006 American Academy of Ophthalmology survey found that only 54% of retinal specialists and pediatric ophthalmologists are managing ROP and that more than 20% plan to stop because of concerns including medicolegal liability and poor reimbursement.11

One strategy for improving accessibility and delivery of ROP care is store-and-forward telemedicine, an emerging technology in which medical data are captured for subsequent interpretation by a remote expert.12,13 Widespread adoption of telemedicine has been constrained by the lack of substantive evaluation data.12-15 Although studies have shown that interpretation of digital retinal photographs may be accurate enough to identify clinically significant ROP,16-19 concerns have been raised about image quality.20-22 Furthermore, all published work to our knowledge has used images captured by ophthalmologists or ophthalmic photographers, and most studies have involved only a single image grader. Several studies have been confounded by designs in which the image grader was the same investigator who performed reference standard ophthalmoscopic examinations. The accuracy and reliability of telemedical ROP examination by multiple expert graders based on photographs obtained by nonophthalmic personnel are not known. This is an important gap in knowledge because large-scale ROP telemedicine systems would likely require image capture by neonatal personnel available at the point of care.

This article describes a prospective study to determine the accuracy and reliability of telemedical ROP diagnosis among 3 expert graders and the quality of image capture by a trained neonatal nurse. Results are compared with a reference standard of ophthalmoscopy by 1 of 2 experienced examiners.

METHODS
NURSE TRAINING AND TELEMEDICINE SYSTEM DESIGN

This study was approved by the Columbia University institutional review board. A neonatal nurse was trained to perform wide-angle retinal imaging using a commercially available device (RetCam II; Clarity Medical Systems, Pleasanton, California). This included 2 day-long instructional sessions with the manufacturer, followed by 6 weekly sessions with 1 of us (M.F.C.) during regular ophthalmoscopic examinations. At each session, approximately 3 infants were photographed, and images were correlated with clinical findings.

A store-and-forward ROP telemedicine application was developed by 2 of us (L.W. and M.F.C.). This included a secure database system (SQL 2005; Microsoft, Redmond, Washington); a module allowing the photographer to upload data and images; and a Web-based interface for expert interpretation. The system was designed to represent real-world telemedicine examinations and scenarios. Images from both eyes were displayed side by side, along with the birth weight, gestational age, and postmenstrual age (PMA) at the time of examination (Figure 1).

OPHTHALMOSCOPIC EXAMINATION AND RETINAL IMAGING

Infants hospitalized in the Columbia University NICU from November 1, 2005, through October 31, 2006, were included if they met existing ROP examination criteria and if their parents provided informed consent for participation.10,23 Patients were excluded if they had structural ocular anomalies, had previously received laser or other ROP treatment, or were considered unstable for examination by their neonatologist.

Each subject underwent 2 procedures sequentially performed at the NICU bedside under topical anesthesia: (1) Dilated ophthalmoscopic examination by 1 of 2 pediatric ophthalmologists (S.A.K. and M.F.C.), according to standard protocols.10,23 Both ophthalmologists had served as certified investigators in the ETROP study. Findings were documented on clinical templates, based on the international classification standard.3,4 (2) Imaging by the study nurse (O.C.), according to a protocol by which an image set consisting of posterior, temporal, and nasal photographs was captured from each retina. Each image set included up to 2 additional photos from any area of the eye, if felt by the nurse to contribute diagnostic information. Imaging was performed without input from the examining ophthalmologist. No subjects were excluded because of poor image quality or inability to capture photographs. No complications such as apnea or corneal injury occurred to prevent imaging.

Study infants were imaged during up to 2 sessions: (1) 31 to 33 weeks' PMA, which was intended to represent the time at which initial examinations are performed,10,23 and (2) 35 to 37 weeks' PMA, which was intended to optimize a time at which clinically significant disease occurs, while minimizing the number of study infants lost from hospital discharge or laser treatment. The best images were selected by the nurse, and data were uploaded to the telemedicine system.

TELEMEDICAL EXAMINATION

Three graders (T.C.L., D.J.W., and A.M.B.) independently performed telemedical examinations using the Web-based system. Each grader was a retina specialist with extensive experience reviewing RetCam images and was responsible for ROP examination and treatment in a tertiary care medical center. Two graders had authored peer-reviewed ROP manuscripts and the third had served as a principal investigator in the ETROP trial. No graders had previously examined any study infants.

Eyes were classified using an ordinal scale based on CRYO-ROP and ETROP criteria1,2: (1) no ROP; (2) mild ROP, defined as ROP less than type 2 disease; (3) type 2 prethreshold ROP (zone 1, stage 1 or 2, without plus disease, or zone 2, stage 3, without plus disease); (4) treatment-requiring ROP, defined as type 1 ROP (zone 1, any stage, with plus disease; zone 1, stage 3, without plus disease; or zone 2, stage 2 or 3, with plus disease) or threshold ROP (at least 5 contiguous or 8 noncontiguous clock hours of stage 3 in zone 1 or 2, with plus disease); or (5) unknown, meaning that the grader was uncomfortable making a diagnosis from the data provided. Graders also rated the “technical quality” and “retinal coverage” of each image set as adequate, possibly adequate, or inadequate for diagnosis. Finally, to measure intragrader reliability for interpreting the same images at different times, 20% of study examinations were randomly selected for repeated presentation by the system.
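The ordinal scale above can be read as a decision rule over zone, stage, and plus disease. The following Python sketch is an illustration only and is not part of the study software; the function name `classify_rop` and its argument encoding are hypothetical. It assumes that stage 0 means no ROP, handles only stages 0 to 3 (stages 4 and 5 are not covered by the definitions quoted here), and omits the threshold clock-hour count because stage 3 with plus disease in zone 1 or 2 already meets the type 1 definition.

```python
from typing import Optional

CATEGORIES = (
    "no ROP",
    "mild ROP",
    "type 2 prethreshold ROP",
    "treatment-requiring ROP",
)


def classify_rop(zone: Optional[int], stage: int, plus: bool) -> str:
    """Map zone, stage, and plus disease to the ordinal category.

    zone: 1, 2, or 3 (ignored when stage is 0); stage: 0 (no ROP) to 3.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("this sketch covers stages 0 to 3 only")
    if stage == 0:
        return CATEGORIES[0]  # assumption: stage 0 is recorded as no ROP
    if zone not in (1, 2, 3):
        raise ValueError("zone must be 1, 2, or 3 when ROP is present")

    # Type 1 (treatment-requiring) per the definitions quoted in the text.
    type1 = (
        (zone == 1 and plus)
        or (zone == 1 and stage == 3 and not plus)
        or (zone == 2 and stage in (2, 3) and plus)
    )
    if type1:
        return CATEGORIES[3]

    # Type 2 prethreshold.
    type2 = (
        (zone == 1 and stage in (1, 2) and not plus)
        or (zone == 2 and stage == 3 and not plus)
    )
    if type2:
        return CATEGORIES[2]

    # Any remaining ROP is "less than type 2", that is, mild.
    return CATEGORIES[1]
```

For example, `classify_rop(2, 3, False)` falls in the type 2 prethreshold category, whereas adding plus disease to the same eye moves it to treatment-requiring ROP.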

ANALYSIS

Eyes were analyzed by examination session (31-33 weeks, 35-37 weeks). Sensitivity, specificity, and area under the receiver operating characteristic curves (AUCs)19 were determined for presence of mild or worse, type 2 prethreshold or worse, and treatment-requiring ROP. Ophthalmoscopic examination was used as the reference standard. “Unknown” responses were excluded from calculations, so as not to penalize graders for reporting that they were uncomfortable providing a diagnosis.
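As an illustration of these calculations, the Python sketch below dichotomizes the ordinal scale at a chosen cutoff and drops "unknown" responses before computing sensitivity and specificity. The function names, the integer coding (0 = no ROP through 3 = treatment-requiring ROP, `None` = unknown), and the sample data are assumptions made for the example. The AUC shown is a common nonparametric (Mann-Whitney) estimate and is not necessarily the method used in the cited analysis.

```python
from typing import List, Optional, Sequence, Tuple


def sens_spec(
    reference: Sequence[int],
    graded: Sequence[Optional[int]],
    cutoff: int,
) -> Tuple[Optional[float], Optional[float]]:
    """Sensitivity and specificity for 'cutoff or worse', skipping unknowns."""
    tp = fn = tn = fp = 0
    for ref, g in zip(reference, graded):
        if g is None:  # "unknown" responses are excluded from the calculation
            continue
        ref_pos, g_pos = ref >= cutoff, g >= cutoff
        if ref_pos and g_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif g_pos:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


def auc(reference: Sequence[int], graded: Sequence[Optional[int]], cutoff: int) -> float:
    """Mann-Whitney AUC using the grader's ordinal score as the rating."""
    pos: List[int] = [g for r, g in zip(reference, graded) if g is not None and r >= cutoff]
    neg: List[int] = [g for r, g in zip(reference, graded) if g is not None and r < cutoff]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative eye")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data for 8 eyes (not study data).
reference = [0, 0, 1, 1, 2, 3, 0, 1]
graded = [0, 1, 1, None, 2, 3, 0, 0]
print(sens_spec(reference, graded, cutoff=1))  # mild or worse
print(sens_spec(reference, graded, cutoff=2))  # type 2 prethreshold or worse
```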

Intergrader and intragrader reliability of telemedical examination were determined from the κ and weighted κ statistics for chance-adjusted agreement in ordinal diagnosis, using a well-known scale: 0 to 0.20 = slight agreement, 0.21 to 0.40 = fair agreement, 0.41 to 0.60 = moderate agreement, 0.61 to 0.80 = substantial agreement, and 0.81 to 1.00 = near perfect agreement.24,25
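To make these statistics concrete, the following Python sketch computes κ and weighted κ for two sets of ordinal ratings and maps the result onto the scale quoted above. It is a minimal illustration under stated assumptions: the function names are hypothetical, ratings are coded 0 to 3, and the weighted κ uses linear disagreement weights, which the text does not specify.

```python
from typing import Sequence


def kappa(a: Sequence[int], b: Sequence[int], k: int = 4, weighted: bool = False) -> float:
    """Cohen's kappa (or linearly weighted kappa) for ratings coded 0..k-1."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("rating lists must be non-empty and of equal length")

    # Joint proportions and marginals of the k x k agreement table.
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def w(i: int, j: int) -> float:
        if weighted:
            return 1.0 - abs(i - j) / (k - 1)  # assumption: linear weights
        return 1.0 if i == j else 0.0

    p_obs = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    p_exp = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if abs(1.0 - p_exp) < 1e-12:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


def agreement_label(value: float) -> str:
    """Label a kappa value using the scale quoted in the text."""
    if value < 0:
        return "below the quoted scale"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")):
        if value <= upper:
            return label
    return "near perfect"


# Made-up ratings from two graders for 6 eyes (not study data).
grader_1 = [0, 0, 1, 1, 2, 3]
grader_2 = [0, 0, 1, 2, 2, 3]
print(kappa(grader_1, grader_2), kappa(grader_1, grader_2, weighted=True))
```

Weighted κ gives partial credit when two graders differ by a single ordinal step, which is why it is usually at least as high as unweighted κ on the same ratings.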

Image quality was assessed from number of “unknown” diagnoses and from image acceptability ratings by graders. Logistic regression was used to determine whether there was a tendency toward improved diagnostic performance as additional photos were captured by the nurse and as additional telemedical examinations were performed by graders. This was performed with the outcomes being sensitivity and false-positive rate for diagnosis of mild or worse, type 2 prethreshold or worse, or treatment-requiring ROP by each grader and with the predictor being order of retinal imaging and grading.
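A trend analysis of this kind can be sketched as a one-predictor logistic regression of a binary outcome on examination order. The Python example below is a simplified illustration rather than the SPSS or R analysis actually performed: the function name `fit_logistic` is hypothetical, the data are made up, the model is fit by Newton-Raphson, and the P value is a 2-sided Wald test on the slope.

```python
import math
from typing import Sequence, Tuple


def fit_logistic(x: Sequence[float], y: Sequence[int], iters: int = 50) -> Tuple[float, float, float, float]:
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, standard error of b1, 2-sided Wald P value for b1).
    Assumes the outcomes are not perfectly separated by x.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix entries and its determinant.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: parameters += inverse(information) * score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return b0, b1, se_b1, p_value


# Made-up example: 1 = correct diagnosis, ordered by examination sequence.
order = list(range(1, 11))
correct = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
b0, b1, se_b1, p_value = fit_logistic(order, correct)
print(b1, se_b1, p_value)
```

A positive slope with a P value below .05 would suggest that performance improved with experience; the study reports no such statistically significant association.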

Analysis was performed using statistical software (SPSS 15.0; SPSS Inc, Chicago, Illinois, and R 2.5.0; Free Software Foundation, Boston, Massachusetts). Standard errors were calculated by the jackknife method because both eyes were used. Statistical significance was considered to be a 2-sided P value < .05.
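Because the 2 eyes of an infant are correlated, a jackknife that deletes one infant at a time (rather than one eye at a time) is a natural way to obtain such standard errors. The text does not give the exact resampling unit, so the Python sketch below assumes leave-one-infant-out deletion; the function names and the data are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Eye = Tuple[int, int]  # (reference positive, grader positive), each 0 or 1


def sensitivity(eyes: List[Eye]) -> float:
    """Proportion of reference-positive eyes that the grader called positive."""
    calls = [g for ref, g in eyes if ref == 1]
    return sum(calls) / len(calls)


def jackknife_se(by_infant: Dict[str, List[Eye]], stat: Callable[[List[Eye]], float]) -> float:
    """Delete-one-infant jackknife standard error of a statistic."""
    ids = list(by_infant)
    n = len(ids)
    leave_one_out = []
    for left_out in ids:
        eyes = [eye for i in ids if i != left_out for eye in by_infant[i]]
        leave_one_out.append(stat(eyes))
    mean = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - mean) ** 2 for t in leave_one_out))


# Made-up data for 4 infants, 2 eyes each (not study data).
example = {
    "a": [(1, 1), (1, 1)],
    "b": [(1, 1), (1, 0)],
    "c": [(0, 0), (0, 0)],
    "d": [(1, 1), (0, 0)],
}
all_eyes = [eye for eyes in example.values() for eye in eyes]
print(sensitivity(all_eyes), jackknife_se(example, sensitivity))
```

Deleting whole infants keeps both eyes of a patient together in every replicate, so the within-infant correlation is reflected in the spread of the leave-one-out estimates.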

RESULTS
OVERVIEW OF INFANTS AND EXAMINATIONS

Sixty-seven infants participated in this study, of whom 21 (31.3%) received 1 set of examinations at 31 to 33 weeks' PMA, 10 (14.9%) received 1 set at 35 to 37 weeks' PMA, and 36 (53.7%) received examinations at each session. Both eyes were examined at all sessions, for a total of 206 unique eyes. Bilateral images from 21 examinations were repeated to test intragrader reliability, for an overall total of 248 study eyes.
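The eye counts follow directly from the number of infants in each imaging schedule. The short Python check below reproduces that arithmetic from the figures reported in this paragraph; the variable names are illustrative only.

```python
# Infants by imaging schedule, as reported in the text.
early_only = 21  # imaged at 31 to 33 weeks' PMA only
late_only = 10   # imaged at 35 to 37 weeks' PMA only
both = 36        # imaged at both sessions

infants = early_only + late_only + both
eyes_early = 2 * (early_only + both)  # both eyes at each examination
eyes_late = 2 * (late_only + both)
unique_eyes = eyes_early + eyes_late
repeated_eyes = 2 * 21                # 21 bilateral examinations shown twice
study_eyes = unique_eyes + repeated_eyes

print(infants, eyes_early, eyes_late, unique_eyes, study_eyes)
# prints: 67 114 92 206 248
```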

Mean infant birth weight was 912.4 g (range, 398-1440 g), and mean gestational age was 26.7 weeks (range, 23-33 weeks). Examination results are summarized in Table 1. From ophthalmoscopic examinations, the incidence of mild or worse ROP was 36.8% (42 of 114 eyes) at 31 to 33 weeks' PMA and 58.7% (54 of 92 eyes) at 35 to 37 weeks' PMA. From telemedical examinations at 35 to 37 weeks' PMA, grader B had a tendency to diagnose more severe disease than ophthalmoscopy (P = .005) and grader C had a tendency to diagnose more severe disease than ophthalmoscopy (P < .001), grader A (P < .001), and grader B (P = .001).

ACCURACY OF TELEMEDICINE EXAMINATION

Table 2 reports telemedicine accuracy when “unknown” responses were excluded. For infants 31 to 33 weeks' PMA, all graders had sensitivity of 0.729 or greater, specificity of 0.893 or greater, and AUC of 0.840 or greater for diagnosis of mild or worse ROP. For infants 35 to 37 weeks' PMA, all graders had sensitivity of 1.000, specificity of 0.851 or greater, and AUC of 0.955 or greater for diagnosis of type 2 prethreshold or worse ROP and sensitivity of 1.000, specificity of 0.806 or greater, and AUC of 0.903 or greater for diagnosis of treatment-requiring ROP.

RELIABILITY OF TELEMEDICAL EXAMINATION

Table 3 displays intergrader reliability of telemedical examination based on ordinal classification. The mean κ and weighted κ among all pairs of graders were 0.615 and 0.654 at 31 to 33 weeks' PMA and 0.735 and 0.823 at 35 to 37 weeks' PMA, indicating substantial to near-perfect agreement. Table 4 shows intragrader reliability results. Among infants 31 to 33 weeks' PMA, κ was 0.462 to 0.769 (moderate to substantial agreement) for detection of mild or worse ROP and was 1.000 (perfect agreement) for detection of treatment-requiring ROP by each grader. Among infants 35 to 37 weeks' PMA, intragrader κ was 0.909 to 1.000 for detection of mild or worse ROP and 0.786 to 1.000 for detection of treatment-requiring ROP. Figure 2 displays examples of variations in accuracy and reliability among graders in this study.

QUALITY OF IMAGE CAPTURE BY TRAINED NURSE

At 31 to 33 weeks' PMA, grader A reported “unknown” diagnoses in 24 eyes (18.8%); grader B, in 52 eyes (40.6%); and grader C in 0 eyes (0%). At 35 to 37 weeks' PMA, grader A reported “unknown” diagnoses in 6 eyes (5.0%); grader B, in 8 eyes (6.7%); and grader C, in 0 eyes (0%) (Table 1). Ratings of image technical quality and retinal coverage are displayed in Table 5. Based on logistic regression analysis, there were no statistically significant associations between order of retinal imaging and diagnostic performance by any grader.

COMMENT

This study prospectively evaluates performance of telemedical ROP diagnosis by 3 expert graders compared with a reference standard of dilated ophthalmoscopy. The key findings are (1) telemedicine is highly accurate and reproducible, using images captured by a trained nurse and (2) accuracy, reliability, and image quality are better at later PMAs.

Accuracy of telemedical diagnosis in this study was high, although there were variations among graders. To understand whether this performance is adequate, it is essential to consider the underlying goal of an ROP telemedicine system. It could be argued that this should be either to fully classify retinal findings in each eye (diagnosis) or simply to identify infants with disease requiring referral for complete examination (screening).26 To evaluate a diagnostic approach, the accuracy of multiple graders must be examined at all levels of severity (Table 2). For example, factors leading to decreased sensitivity for detection of mild or worse ROP (eg, failure to identify stage 1 disease) are different from those leading to decreased sensitivity for detection of treatment-requiring ROP (eg, failure to identify plus disease). In a screening approach, a logical criterion for triggering full examination might be presence of type 2 prethreshold or worse ROP, which has been termed referral-warranted disease.17 The median onset of prethreshold disease has been shown to occur at 36 weeks' PMA.27,28 Therefore, to evaluate a screening approach, the accuracy for detection of type 2 or worse ROP at 35 to 37 weeks' PMA may be most relevant. Despite excellent discriminative ability under these latter conditions (Table 2), a strategy based solely on telemedical ROP screening might be impractical in developed nations. This is because the presence of subtle morphological features that are not represented by the international ROP classification system3,4 may necessitate custom-tailored approaches to individual patients that are beyond the scope of screening algorithms. Telescreening could also create medicolegal concerns, for example, if images are subjected to heavy scrutiny or if there is a perception that infants are not receiving “full” examinations.29 These issues may require further study.

Our findings show that accuracy, intergrader agreement, and image quality ratings are higher at 35 to 37 weeks' PMA than at 31 to 33 weeks. This is consistent with published results determining that sensitivity and specificity for image-based detection of mild or worse ROP were 0.46 and 1.00 at 32 to 34 weeks' PMA and 0.76 and 1.00 at 38 to 40 weeks.21 This is not surprising given that smaller infants often have corneal and vitreous haze, which may decrease image quality, as well as narrow palpebral fissures, which may limit peripheral retinal coverage by contact cameras.20,21 These factors presumably explain the number of “unknown” diagnoses from 2 graders at 31 to 33 weeks' PMA (Table 1). For comparison, 1 prior study showed that 21% of initial RetCam images taken by an ophthalmic photographer were considered unacceptable.22 A high rate of ungradeable images would create difficulty for telemedicine systems because these infants would require repeated imaging or referral for ophthalmoscopic examination. This may also be concerning because infants who develop “aggressive posterior ROP” before 35 weeks' PMA are at higher risk for adverse outcomes.4 For these reasons, it could be argued that “unknown” diagnoses should be regarded as both false-negative and false-positive errors. In that case, the sensitivity and specificity for detection of mild or worse ROP at 31 to 33 weeks' PMA would be 0.928 and 0.625 for grader A, 0.771 and 0.400 for grader B, and 0.729 and 0.938 for grader C, and the sensitivity and specificity for detection of type 2 or worse ROP at 31 to 33 weeks' PMA would be 0.714 and 0.777 for grader A, 0.857 and 0.529 for grader B, and 0.714 and 0.959 for grader C. No eyes with type 2 or worse ROP according to ophthalmoscopy were classified as “unknown” by any grader, although the number of these particular eyes (n = 7 at 31-33 weeks, n = 26 at 35-37 weeks) was likely too small to draw firm conclusions. Accuracy and reliability were uniformly high at 35 to 37 weeks' PMA, which is the most clinically relevant period.27,28 Because of this discrepancy in performance depending on infant age, a potential strategy for ROP management might combine ophthalmoscopy and telemedicine at different times.
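The stricter scoring described in this paragraph, in which an "unknown" response counts as a false negative when disease is present and as a false positive when it is absent, can be sketched as follows. The function name, the integer coding (0 = no ROP through 3 = treatment-requiring ROP, `None` = unknown), and the data are assumptions made for the illustration and are not taken from the study.

```python
from typing import Optional, Sequence, Tuple


def sens_spec_unknown_as_error(
    reference: Sequence[int],
    graded: Sequence[Optional[int]],
    cutoff: int,
) -> Tuple[Optional[float], Optional[float]]:
    """Sensitivity and specificity when every 'unknown' is scored as an error."""
    tp = fn = tn = fp = 0
    for ref, g in zip(reference, graded):
        ref_pos = ref >= cutoff
        if g is None:
            g_pos = not ref_pos  # unknown always disagrees with the reference
        else:
            g_pos = g >= cutoff
        if ref_pos and g_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif g_pos:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


# Made-up data for 8 eyes; the fourth eye was graded "unknown".
reference = [0, 0, 1, 1, 2, 3, 0, 1]
graded = [0, 1, 1, None, 2, 3, 0, 0]
print(sens_spec_unknown_as_error(reference, graded, cutoff=1))
```

Under this rule a grader who frequently answers "unknown" is penalized on whichever measure the affected eyes contribute to, which is why the recalculated values in the text fall most sharply for the graders with the most "unknown" responses.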

This is the first data set to our knowledge that has examined ROP image capture by nonophthalmic personnel. It is useful to compare our accuracy results with prior studies where images were obtained by ophthalmology personnel. For detection of mild or worse ROP, we previously found mean sensitivity and specificity of 0.84 and 0.79 among 3 graders,18 and Roth et al20 found sensitivity and specificity of 0.82 and 0.94. For detection of type 2 or worse ROP, Ells et al17 measured sensitivity and specificity of 1.00 and 0.96. Although these prior studies did not systematically categorize findings based on PMA, telemedical accuracy in the present study appears comparable or better (Table 2). This suggests that it is feasible for a neonatal nurse to capture and select acceptable images. Furthermore, technical quality was rated as either “adequate” or “possibly adequate” in 81.2% to 98.4% of images at 31 to 33 weeks' PMA and 93.3% to 100% of images at 35 to 37 weeks' PMA (Table 5). Trained technicians are responsible for performing sophisticated imaging studies in fields such as radiology and cardiology. Neonatal intensive care unit nurses would be a logical choice in an ROP telemedicine strategy, given their familiarity with neonatal physiology30 and their ability to perform complex procedures on infants.

Intergrader agreement in this study was substantial to near perfect, with a weighted κ of 0.791 to 0.889 at 35 to 37 weeks' PMA (Table 3). This is higher than previously published findings, which showed a weighted κ of 0.671 to 0.834 among ophthalmologist graders.18 This difference may be because the current study involved a clearly defined imaging protocol and used graders with extensive ROP experience. By comparison, intergrader weighted κ for image-based diabetic retinopathy diagnosis using the Early Treatment Diabetic Retinopathy Study (ETDRS) 7-field criterion standard was 0.41 to 0.80, depending on the lesion type.31 Taken together, these results suggest that reliability of telemedical ROP diagnosis is comparable with that of well-accepted diagnostic tests, even when images are captured by a neonatal nurse.

There are several study limitations: (1) Three standard photographs were taken of each eye, with up to 2 additional images at the nurse's discretion. This may have contributed to decreased sensitivity, lower retinal coverage ratings, and more “unknown” diagnoses if graders could not visualize sufficient peripheral findings. In designing the study, we established this protocol because we felt that the majority of clinically significant disease can be identified temporally and nasally and because from a practical perspective we hoped to obtain the most useful image data in the least time. Future research regarding diagnostic and logistical trade-offs of different protocols may be useful. (2) Telemedicine graders were not permitted to manipulate image parameters. These adjustments might either increase or decrease performance, particularly if some graders were less skilled at image manipulation than others. To avoid potential confounding effects, we did not incorporate this functionality into the telemedicine system. (3) Dilated ophthalmoscopy was considered the reference standard. Although this has been the design of previous telemedicine studies,16-22 image-based examinations may not be inherently less “correct” than ophthalmoscopy. In fact, we have demonstrated that there may be diagnostic disagreements between ophthalmoscopic and image-based examinations performed by the same physician and that there is photographic evidence that image-based diagnoses may often be more accurate.32 This has implications for the design of future telemedicine studies. (4) Data were analyzed by eye, although ROP diagnoses in 2 eyes of the same patient are not independent. Because ophthalmoscopy is performed on both eyes together, telemedical images were also presented side by side to simulate a real-world scenario. This was done to minimize bias favoring either examination and to permit analysis of both eyes in each infant.

We believe that this is the most extensive study of telemedical ROP diagnosis performed to date. Our results show that accuracy, reliability, and image quality are very high at later PMAs, even when images are captured by a trained nurse. Telemedicine is a promising strategy for addressing limitations of the current paradigm for ROP care, such as quality and accessibility, and we have shown that it is more cost-effective than ophthalmoscopy [33]. Unresolved issues include medicolegal liability, engineering of telemedicine strategies into existing neonatal workflows, standardization of imaging protocols, and uncertainty about image quality at earlier PMAs.

Article Information

Correspondence: Michael F. Chiang, MD, Columbia University College of Physicians and Surgeons, 635 W 165th St, Box 92, New York, NY 10032 (chiang@dbmi.columbia.edu).

Submitted for Publication: May 14, 2007; final revision received June 19, 2007; accepted June 19, 2007.

Author Contributions: Dr Chiang had full access to all the data in the study and takes responsibility for the integrity of the data and accuracy of the data analysis.

Financial Disclosure: Dr Chiang is an unpaid member of the Scientific Advisory Board of Clarity Medical Systems, Pleasanton, California.

Funding/Support: This work was supported by a Career Development Award from Research to Prevent Blindness (Dr Chiang) and by grant EY13972 from the National Eye Institute (Dr Chiang).

Role of the Sponsors: The sponsors had no role in the design and conduct of the study; in the collection, analysis, and interpretation of data; or in the preparation, review, or approval of the manuscript.

References
1. Cryotherapy for Retinopathy of Prematurity Cooperative Group. Multicenter trial of cryotherapy for retinopathy of prematurity: preliminary results. Arch Ophthalmol. 1988;106(4):471-479.
2. Early Treatment for Retinopathy of Prematurity Cooperative Group. Revised indications for the treatment of retinopathy of prematurity: results of the Early Treatment for Retinopathy of Prematurity Randomized Trial. Arch Ophthalmol. 2003;121(12):1684-1694.
3. International Classification of Retinopathy of Prematurity, The Committee for the Classification of Retinopathy of Prematurity. Arch Ophthalmol. 1984;102(8):1130-1134.
4. International Committee for the Classification of Retinopathy of Prematurity. The international classification of retinopathy of prematurity revisited. Arch Ophthalmol. 2005;123(7):991-999.
5. Muñoz B, West SK. Blindness and visual impairment in the Americas and the Caribbean. Br J Ophthalmol. 2002;86(5):498-504.
6. Steinkuller PG, Du L, Gilbert C, et al. Childhood blindness. J AAPOS. 1999;3(1):26-32.
7. Hamilton B, Martin J, Ventura S. Births: preliminary data for 2005. http://www.cdc.gov/nchs/products/pubs/pubd/hestats/prelimbirths05/prelimbirths05.htm. Accessed March 7, 2007.
8. Gilbert C, Fielder A, Gordillo L, et al. Characteristics of infants with severe retinopathy of prematurity in countries with low, moderate, and high levels of development: implications for screening programs. Pediatrics. 2005;115(5):e518-e525. http://pediatrics.aappublications.org/cgi/content/abstract/115/5/e518?rss=1. Accessed May 5, 2007.
9. Gilbert C, Rahi J, Eckstein M, et al. Retinopathy of prematurity in middle-income countries. Lancet. 1997;350(9070):12-14.
10. Section on Ophthalmology, American Academy of Pediatrics; American Academy of Ophthalmology; American Association for Pediatric Ophthalmology and Strabismus. Screening examination of premature infants for retinopathy of prematurity. Pediatrics. 2006;117(2):572-576 [published correction appears in Pediatrics. 2006;118:1324].
11. American Academy of Ophthalmology. Ophthalmologists warn of shortage in specialists who treat premature babies with blinding eye condition. http://www.aao.org/newsroom/release/20060713.cfm. Accessed January 5, 2007.
12. Perednia DA, Allen A. Telemedicine technology and clinical applications. JAMA. 1995;273(6):483-488.
13. Bashshur RL, Reardon TG, Shannon GW. Telemedicine: a new health care delivery system. Annu Rev Public Health. 2000;21:613-637.
14. Field MJ, ed. Telemedicine: a Guide to Assessing Telecommunications in Health Care. Washington, DC: National Academies Press; 1996.
15. Shea S, Starren J, Weinstock RS, et al. Columbia University's Informatics for Diabetes Education and Telemedicine (IDEATel) project: rationale and design. J Am Med Inform Assoc. 2002;9(1):49-62.
16. Schwartz SD, Harrison SA, Ferrone PJ, Trese MT. Telemedical evaluation and management of retinopathy of prematurity using a fiberoptic digital fundus camera. Ophthalmology. 2000;107(1):25-28.
17. Ells AL, Holmes JM, Astle WF, et al. Telemedicine approach to screening for severe retinopathy of prematurity: a pilot study. Ophthalmology. 2003;110(11):2113-2117.
18. Chiang MF, Keenan JD, Starren JB, et al. Accuracy and reliability of remote retinopathy of prematurity diagnosis. Arch Ophthalmol. 2006;124(3):322-327.
19. Chiang MF, Starren JB, Du YE, et al. Remote image based retinopathy of prematurity diagnosis: a receiver operating characteristic analysis of accuracy. Br J Ophthalmol. 2006;90(10):1292-1296.
20. Roth DB, Morales D, Feuer WJ, Hess D, Johnson RA, Flynn JT. Screening for retinopathy of prematurity employing the RetCam-120: sensitivity and specificity. Arch Ophthalmol. 2001;119(2):268-272.
21. Yen KG, Hess D, Burke B, et al. The optimum time to employ telephotoscreening to detect retinopathy of prematurity. Trans Am Ophthalmol Soc. 2000;98:145-150.
22. Wu C, Petersen RA, Vanderveen DK. Retcam imaging for retinopathy of prematurity screening. J AAPOS. 2006;10(2):107-111.
23. Section on Ophthalmology, American Academy of Pediatrics; American Academy of Ophthalmology; American Association for Pediatric Ophthalmology and Strabismus. Screening examination of premature infants for retinopathy of prematurity. Pediatrics. 2001;108(3):809-811.
24. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
25. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70(4):213-220.
26. Chiang MF, Keenan JD, Du YE, et al. Assessment of image-based technology: impact of referral cutoff on accuracy and reliability of remote retinopathy of prematurity diagnosis. AMIA Annu Symp Proc. 2005:126-130.
27. Palmer EA, Flynn JT, Hardy RT, et al. Incidence and early course of retinopathy of prematurity. Ophthalmology. 1991;98(11):1628-1640.
28. Good WV, Hardy RJ, Dobson V, et al; ETROP Cooperative Group. The incidence and course of retinopathy of prematurity: findings from the Early Treatment for Retinopathy of Prematurity Study. Pediatrics. 2005;116(1):15-23.
29. Stanberry B. Legal and ethical aspects of telemedicine. J Telemed Telecare. 2006;12(4):166-175.
30. Laws DE, Morton C, Weindling M, Clark D. Systemic effects of screening for retinopathy of prematurity. Br J Ophthalmol. 1996;80(5):425-428.
31. Early Treatment Diabetic Retinopathy Study Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs: an extension of the modified Airlie House classification. ETDRS report No. 10. Ophthalmology. 1991;98(5)(suppl):786-806.
32. Scott KE, Kim DY, Wang L, et al. Telemedical retinopathy of prematurity diagnosis: intra-physician agreement between ophthalmoscopic and image-based examinations. Ophthalmology. In press.
33. Jackson KM, Scott KE, Graff Zivin J, et al. Cost-utility analysis of telemedicine and standard ophthalmoscopy for retinopathy of prematurity management. Ophthalmology. In press.