Figure.  Example of Differences in Diagnostic Process Among Experts Reviewing the Same Image Independently

Experts were asked to provide a diagnosis (plus, pre-plus, or neither) and annotate key findings while being videotaped “thinking aloud” to describe their reasoning. Videotapes were transcribed, coded, and analyzed to examine qualitative diagnostic process. A, Diagnosed as plus disease by expert 1. “…looks like a very low gestational age baby; it’s taken quite a long time to get to this stage. There is a lot of arterial tortuosity [annotated]; there is a little bit of venous congestion in the superior temporal and superior nasal quadrant, more in the superior half of the retina [annotated]. By definition, I think this has to be plus, because it’s 2 quadrants at least, and even the other quadrants aren’t normal….” B, Diagnosed as pre-plus disease by expert 2. “…there is a lot of tortuosity of the arteries; the veins are about 2 to 1. This could either be a pre-plus eye or a normal variant, depending on a quick look to the periphery. Curiously, there is a lot of tortuosity down here [annotated]; it looks like there is disease up here [annotated].” C, Diagnosed as neither by expert 4. “…vessels seem to be branching excessively in that region [superonasal area annotated] and some increased tortuosity [superotemporal area annotated] as well, and this vein looks too fat [superotemporal area annotated]. If all the quadrants were like this quadrant [superotemporal], then it would be at least pre-plus and verging on plus, but since it’s only 1 quadrant that’s highly questionable, would not classify it as plus. I could see why some would call it pre-plus…I would call it no plus.”

Table 1. Interexpert Agreement in Plus Disease Diagnosis by 6 Retinopathy of Prematurity Experts From Reviewing 7 Wide-Angle Retinal Images
Table 2. Interexpert Agreement Among Experts Ranking Severity of Overall Vascular Abnormality, Arterial Tortuosity Alone, and Venous Dilation Alone in Retinopathy of Prematurity
Table 3. Intraexpert Agreement in Plus Disease Diagnosis by 6 Retinopathy of Prematurity Experts From Reviewing 7 Wide-Angle Retinal Images
Table 4. Relationship Between Perceived Vascular Abnormality and Overall Plus Disease Diagnosis
Table 5. Retinal Features Considered by Experts During Plus Disease Diagnosis
Original Investigation
Journal Club, Clinical Sciences
August 2013

Plus Disease in Retinopathy of Prematurity: Qualitative Analysis of Diagnostic Process by Experts

Author Affiliations
  • 1Department of Ophthalmology, Weill Cornell Medical College, New York, New York
  • 2Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, New York, New York
  • 3Department of Ophthalmology, Campus Benjamin Franklin, Charité-Universitaetsmedizin Berlin, Berlin, Germany
  • 4Department of Ophthalmology, Medical Informatics, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon
  • 5Department of Clinical Epidemiology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon
JAMA Ophthalmol. 2013;131(8):1026-1032. doi:10.1001/jamaophthalmol.2013.135
Abstract

Importance  Plus disease is the most important parameter that characterizes severe treatment-requiring retinopathy of prematurity, yet diagnostic agreement among experts is imperfect and the precise factors involved in clinical diagnosis are unclear. This study is designed to address these gaps in knowledge by analyzing cognitive aspects of the plus disease diagnostic process by experts.

Objective  To examine the diagnostic reasoning process of experts for plus disease in retinopathy of prematurity using qualitative research techniques.

Design  Cognitive walk-through, with qualitative analysis of videotaped expert responses and quantitative analysis of expert diagnoses.

Setting  Experimental setting in which experts were videotaped while reviewing study data.

Participants  A panel of international retinopathy of prematurity experts with experience using qualitative retinal features as their primary basis for clinical diagnosis.

Intervention  Six experts were video recorded while independently reviewing 7 wide-angle retinal images from infants with retinopathy of prematurity. Experts were asked to explain their diagnostic process in detail (think-aloud protocol), mark findings relevant to their reasoning, and diagnose each image (plus vs pre-plus vs neither). Subsequently, each expert viewed the images again while being asked to examine arteries and veins in isolation and answer specific questions. Video recordings were transcribed and reviewed. Diagnostic process of experts was analyzed using a published cognitive model.

Main Outcomes and Measures  Interexpert and intraexpert agreement.

Results  Based on the think-aloud protocol, 5 of 6 experts agreed on the same diagnosis in 3 study images and 3 of 6 experts agreed in 3 images. When experts were asked to rank images in order of severity, the mean correlation coefficient between pairs of experts was 0.33 (range, −0.04 to 0.75). All experts considered arterial tortuosity and venous dilation while reviewing each image. Some considered venous tortuosity, arterial dilation, peripheral retinal features, and other factors. When experts were asked to rereview images to diagnose plus disease based strictly on definitions of sufficient arterial tortuosity and venous dilation, all but 1 expert changed their diagnosis compared with the think-aloud protocol.

Conclusions and Relevance  Diagnostic consistency in plus disease is imperfect. Experts differ in their reasoning process, retinal features that they focus on, and interpretations of the same features. Understanding these factors may improve diagnosis and education. Future research defining more precise diagnostic criteria may be warranted.

Retinopathy of prematurity (ROP) is a vasoproliferative disease affecting low-birth-weight infants. An International Classification for ROP (ICROP) has been developed to standardize clinical diagnosis.1 Multicenter randomized trials such as the Cryotherapy for Retinopathy of Prematurity (CRYO-ROP) and Early Treatment for Retinopathy of Prematurity studies have found that severe ROP may be treated successfully by laser photocoagulation or cryotherapy.2,3 Pharmacological treatments are being studied.4 Despite these advances, ROP continues to be a leading cause of childhood blindness throughout the world.

Plus disease is a critical component of the ICROP system and is defined as arterial tortuosity and venous dilation in the posterior pole greater than or equal to that of a standard photograph selected by expert consensus during the 1980s.1,2 More recently, the revised ICROP system defines pre-plus disease as vascular abnormalities insufficient for plus disease but with more arterial tortuosity and venous dilation than normal.1 Presence of plus disease is a necessary feature for threshold disease and a sufficient feature for type 1 ROP, both of which have been shown to warrant prompt treatment. Therefore, accurate diagnosis of plus disease is essential.

However, there are limitations regarding the definition of plus disease. Studies have found diagnostic inconsistency, even among experts.5-7 The standard published photograph has a larger magnification and narrower field of view than clinical evaluation tools such as indirect ophthalmoscopy and wide-angle retinal imaging, and this difference in perspective may cause difficulty for ophthalmologists.8,9 Vessels in the standard published photograph have varying degrees of tortuosity and dilation, creating uncertainty regarding which vessels to focus on during examination. Finally, although plus disease is defined solely from arteriolar tortuosity and venous dilation within the posterior pole, it is possible that other vascular features or the rate of vascular change are relevant for diagnosis.9,10 Better understanding of the examination features characterizing plus disease may improve diagnostic accuracy and consistency.

It has been our observation that many ophthalmologists who trained within the past 25 years, after dissemination of ICROP and the CRYO-ROP study findings,1,2 perform ROP examination predominantly by classifying the zone, stage, and presence of plus disease based on venous dilation and arteriolar tortuosity of the posterior vessels. The premise of this study is that reliance only on this classification system, without attention to description of rich underlying retinal features, may oversimplify the characterization of clinically significant findings. This study is designed to encode detailed qualitative thoughts of experts during plus disease diagnosis, using research methods from cognitive informatics.11 The overall goals are to ascertain levels of agreement as well as to better understand underlying reasons for diagnostic discrepancy among experts and to obtain more precise information about specific retinal features of plus disease.

Methods

This study was approved by the institutional review boards at Columbia University and Oregon Health & Science University. Informed consent was obtained from all expert participants, and waiver of consent was obtained for use of deidentified retinal images.

Expert Participants

We assembled a panel of international ROP experts with experience using qualitative retinal features as their primary basis for clinical diagnosis. In our view, this could be accomplished by identifying experts who had practiced ophthalmology before publication of the CRYO-ROP findings,2 participated as CRYO-ROP principal investigators, or participated on national ROP standards committees. The rationale was that this would identify a small number of experts with the background and perspective to articulate their underlying qualitative reasons for diagnosis.

Image Selection

A set of 7 wide-angle retinal images (RetCam; Clarity Medical Systems) was captured from premature infants during routine clinical ROP care. Each image showed the posterior retina and, in our opinion, reflected some degree of vascular abnormality. Images were printed on high-resolution photograph paper (Kodak) in a 5-in × 7-in format.

No additional information such as birth weight, systemic findings, or postmenstrual age was provided. This was to ensure that experts focused only on retinal features, without potential confounding factors. Neither the standard photograph nor any definitions of plus or pre-plus disease were provided to experts. This was to simulate a real-world examination scenario and avoid biasing expert opinions. We believed that study experts would be intimately familiar with these definitions through previous experiences and through contributing to the creation of those definitions in many cases.

Think-Aloud Protocol and Specific Image Questions

This study was conducted in 2 rounds, in which each study expert was asked a series of scripted questions (eAppendix in Supplement) by one of us (N.J.H.): (1) Round 1 (“think-aloud protocol”). The 7 retinal images were shown individually and in the same order to each expert, who was asked to diagnose each image as either plus disease, pre-plus disease, or neither plus nor pre-plus. Each expert was asked to verbalize thoughts while reviewing the image, explain the process that led to the final diagnosis, and annotate the most important findings on the printed image using a marking pen. Finally, each expert was asked to rate the degree of confidence in the diagnosis for each image (certain, somewhat certain, or uncertain). Experts were encouraged by the observer (N.J.H.) to verbalize all of their thoughts but were not otherwise interrupted or coached during the think-aloud protocol. (2) Round 2 (“specific questions”). The 7 study images were displayed again in the same order, and each expert was asked a series of specific questions about each image. For each image, experts were asked whether the arteriolar tortuosity was sufficient for plus disease, whether the venous dilation was sufficient for plus disease, and whether the overall image reflected plus disease, pre-plus disease, or neither. Experts were asked to rank the 7 images in order of increasing arteriolar tortuosity, increasing venous dilation, and increasing overall severity of vascular abnormality. Additional specific questions were custom-tailored to each image regarding features used by experts to identify severe ROP, perceptions about the nature and location of vascular abnormalities, and other diagnostic heuristics (eAppendix in Supplement).

Each of the expert sessions was recorded using a video camera (Handycam; Sony). A digital recorder (GarageBand; Apple) was used as a backup. The video camera was directed to record the retinal images and hands of each expert. Personal features of experts were not recorded, and experts were identified only by a study number.

Data Analysis

In round 1 (think-aloud protocol), digital files were processed using video editing software (iMovie; Apple). All video and audio files were manually transcribed for analysis. A modified version of the Hassebrock coding scheme, designed for the analysis of medical reasoning through coding of verbal think-aloud protocols, was used to analyze the transcribed files.12 Interexpert agreement was examined based on the overall diagnosis provided after the think-aloud protocol. Specific examples were identified to represent differences in underlying qualitative diagnostic rationale among experts.

In round 2 (specific questions), interexpert agreement was examined by calculating correlation coefficients among each pair of experts who were asked to rank the 7 retinal images from least to most severe based on arterial tortuosity alone, venous dilation alone, and overall severity of vascular abnormalities related to plus disease. A published scale was used to interpret correlation coefficients: 0 to 0.30, small correlation; 0.31 to 0.50, medium correlation; and 0.51 to 1.00, strong correlation.13
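As a concrete illustration of this analysis step, the sketch below (not the authors' code) computes pairwise correlations between hypothetical expert orderings of the 7 images. Spearman rank correlation is an assumed choice, since the article does not state which coefficient was used, and the rankings are invented for illustration only.

```python
# Minimal sketch: pairwise agreement between expert rankings of 7 images.
# Assumes Spearman rank correlation; the rankings below are illustrative only.
from itertools import combinations
from scipy.stats import spearmanr

# Hypothetical orderings of images 1-7 from least to most severe.
rankings = {
    "expert_1": [1, 3, 2, 4, 5, 7, 6],
    "expert_2": [2, 1, 3, 5, 4, 6, 7],
    "expert_3": [1, 2, 4, 3, 6, 5, 7],
}

def interpret(r: float) -> str:
    """Scale quoted in the article: 0-0.30 small, 0.31-0.50 medium, 0.51-1.00 strong."""
    magnitude = abs(r)
    if magnitude <= 0.30:
        return "small"
    if magnitude <= 0.50:
        return "medium"
    return "strong"

pairwise = []
for (name_a, rank_a), (name_b, rank_b) in combinations(rankings.items(), 2):
    rho, _ = spearmanr(rank_a, rank_b)  # correlation between two experts' orderings
    pairwise.append(rho)
    print(f"{name_a} vs {name_b}: r = {rho:.2f} ({interpret(rho)})")

print(f"mean pairwise correlation = {sum(pairwise) / len(pairwise):.2f}")
```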

Intraexpert agreement in plus disease diagnosis was calculated. As described earlier, each expert initially provided a diagnosis (plus, pre-plus, or neither) in round 1 while “thinking aloud” to explain their rationale. Each expert then provided a diagnosis in round 2 after responding to a series of questions about specific image features. Absolute intraexpert agreement and κ statistic were calculated for each expert using these diagnoses. A published scale was used to interpret κ values: 0 to 0.20, slight agreement; 0.21 to 0.40, fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and 0.81 to 1.00, near-perfect agreement.14
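A minimal sketch of the intraexpert agreement calculation is shown below (an assumed implementation with invented diagnoses, not study data): absolute agreement and the κ statistic between the round 1 and round 2 calls of one expert, interpreted on the scale quoted above.

```python
# Minimal sketch: intraexpert agreement between round 1 (think-aloud) and
# round 2 (specific-questions) diagnoses for one expert across 7 images.
# Diagnoses are invented for illustration; they are not study data.
from sklearn.metrics import cohen_kappa_score

SCALE = {"neither": 0, "pre-plus": 1, "plus": 2}  # ordinal coding of the 3 categories

round1 = ["plus", "pre-plus", "neither", "plus", "pre-plus", "neither", "plus"]
round2 = ["plus", "plus", "neither", "plus", "pre-plus", "pre-plus", "plus"]

y1 = [SCALE[d] for d in round1]
y2 = [SCALE[d] for d in round2]

absolute = sum(a == b for a, b in zip(y1, y2)) / len(y1)
# Unweighted Cohen kappa; weights="linear" would give the weighted variant (reference 13).
kappa = cohen_kappa_score(y1, y2)

def landis_koch(k: float) -> str:
    """Interpretation scale cited in the article (Landis & Koch, 1977)."""
    bounds = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
              (0.80, "substantial"), (1.00, "near-perfect")]
    return next(label for upper, label in bounds if k <= upper)

print(f"absolute agreement = {absolute:.0%}, kappa = {kappa:.2f} ({landis_koch(kappa)})")
```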

Finally, data from rounds 1 and 2 were analyzed to identify qualitative retinal features contributing to the ROP diagnostic process by experts and identify the relationship among individual retinal features and overall diagnosis.

Results
Characteristics of Study Experts

Six ROP experts participated: 5 of 6 (83%) were principal investigators in the CRYO-ROP and/or Early Treatment for Retinopathy of Prematurity studies, 5 of 6 (83%) published 5 or more peer-reviewed ROP articles, 5 of 6 (83%) practiced ophthalmology before publication of initial CRYO-ROP findings, and 5 of 6 (83%) contributed to expert consensus activities such as selection of the standard published photograph, development of ICROP, or creation of screening guidelines.15,16 All experts met 2 or more of these criteria.

Interexpert Agreement

Table 1 summarizes interexpert agreement in plus disease diagnosis among the 6 experts based on the round 1 think-aloud protocol. In particular, 5 of 6 experts (83%) agreed on the same diagnosis in 3 images, 4 of 6 experts (67%) agreed in 1 image, and 3 of 6 (50%) agreed in 3 images. Several images were diagnosed differently by experts. For example, image 2 was diagnosed as plus disease by 3 of 6 experts (50%), pre-plus disease by 1 of 6 experts (17%), and neither by 2 of 6 experts (33%).

In round 2, experts were asked to rank all 7 images in order of increasing overall severity of vascular abnormality related to plus disease, in order of increasing arterial tortuosity alone, and in order of increasing venous dilation alone. The correlation in ordering of arterial tortuosity among pairs of experts was strong (mean [range] correlation coefficient, 0.89 [0.80 to 1.00]), whereas there was only a small correlation in ordering of venous dilation (mean [range] correlation coefficient, 0.27 [−0.04 to 1.00]) (Table 2).

Example of Qualitative Analysis of Diagnostic Discrepancy

To examine differences in underlying diagnostic process that may have led to discrepancies among experts, transcripts of think-aloud protocols were examined and compared. For example, image 5 was diagnosed as plus disease by 2 of 6 experts (33%), pre-plus by 3 of 6 experts (50%), and neither by 1 of 6 experts (17%). The Figure displays examples of differences in key retinal features that were discussed and annotated in that image by 3 different experts.

Intraexpert Agreement

Table 3 summarizes intraexpert agreement between plus disease diagnosis provided in the think-aloud protocol from round 1 and the diagnosis provided after responding to a series of specific questions about image features in round 2. Absolute intraexpert agreement ranged from 4 of 7 (57%, 1 expert) to 7 of 7 (100%, 1 expert), and κ ranged from 0.30 (fair agreement, 1 expert) to 1.00 (perfect agreement, 1 expert).

Retinal Features of Plus Disease

In specific questions of round 2, experts were asked to characterize arterial tortuosity as sufficient or insufficient for plus disease, characterize venous dilation as sufficient or insufficient for plus disease, and provide an overall diagnosis (Table 4). Five individual ratings were excluded because an expert provided no response about arterial tortuosity or venous dilation; therefore, there were 37 total ratings by the 6 experts for the 7 images. In 5 of 37 ratings (14%), there was inconsistency between the expert diagnostic process and the published definition of plus disease (which requires both sufficient arterial tortuosity and venous dilation).1 In another 5 of 37 ratings (14%), there was inconsistency with the published definition of pre-plus disease (which requires arterial tortuosity and venous dilation that are abnormal but insufficient for plus disease).1
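The consistency check described here can be made explicit with a small sketch (an illustrative encoding, not the authors' analysis): given one expert's round 2 answers about arterial tortuosity and venous dilation together with the overall diagnosis, test whether the combination can be reconciled with the published rule that plus disease requires both components to be sufficient.

```python
# Minimal sketch: does an expert's overall diagnosis agree with the published
# rule that plus disease requires BOTH sufficient arterial tortuosity AND
# sufficient venous dilation? The example rating is hypothetical.
from typing import NamedTuple

class Rating(NamedTuple):
    arterial_tortuosity_sufficient: bool  # round 2 answer: tortuosity sufficient for plus?
    venous_dilation_sufficient: bool      # round 2 answer: dilation sufficient for plus?
    overall_diagnosis: str                # "plus", "pre-plus", or "neither"

def consistent_with_definition(r: Rating) -> bool:
    """True if the overall diagnosis can be reconciled with the published definition."""
    both_sufficient = r.arterial_tortuosity_sufficient and r.venous_dilation_sufficient
    if r.overall_diagnosis == "plus":
        return both_sufficient  # plus requires both components at or above threshold
    # pre-plus or neither: at least one component should fall short of the plus threshold
    return not both_sufficient

example = Rating(arterial_tortuosity_sufficient=True,
                 venous_dilation_sufficient=False,
                 overall_diagnosis="plus")
print(consistent_with_definition(example))  # False: plus called despite insufficient dilation
```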

Table 5 displays retinal features considered by experts in plus disease diagnosis during the think-aloud protocol of round 1. In addition to retinal features mentioned in the published definition of plus disease (sufficient arterial tortuosity and venous dilation within ≥2 quadrants of the central retina),1 experts cited many different features such as venous tortuosity, arterial dilation, peripheral retinal features, and vascular branching.

Discussion

To our knowledge, this is the first study using qualitative research methods to examine the process of plus disease diagnosis by ROP experts. Key findings are that (1) there are inconsistencies in plus disease diagnosis among experts, (2) some diagnostic discrepancies may occur because experts are considering different retinal features, and (3) the current concept of plus disease as arteriolar tortuosity and venous dilation within the posterior pole appears oversimplified based on expert behavior.

Our results regarding interexpert disagreement in plus disease diagnosis support findings from previous studies involving image-based diagnosis5,6 and from a previous study showing that certified CRYO-ROP experts performing unmasked ophthalmoscopic examinations to confirm presence of threshold disease disagreed with the first expert diagnosis in 12% of cases.7 One demonstration of interexpert inconsistency in the current study is shown in Table 1. Another demonstration is summarized in Table 2, showing that correlation among experts for ranking severity of arterial tortuosity (mean correlation coefficient, 0.89) was much higher than correlation for ranking severity of venous dilation (mean correlation coefficient, 0.27) or overall vascular severity (mean correlation coefficient, 0.33). This suggests that arterial tortuosity is easier for experts to recognize and order visually. Conversely, venous dilation may be more difficult to identify visually, more subjective, or perhaps more difficult to represent using wide-angle images. There are several possible reasons explaining the low interexpert correlation for overall severity, including that there are differences in retinal features considered by different experts. Other possibilities are that there are differences in the interpretation of the same retinal features by different experts or that the significance of particular features is weighted differently among experts (Figure). A final demonstration of variability is summarized in Table 3, showing intraexpert differences in plus disease diagnosis using different methods. Taken together, these findings suggest that there are significant inconsistencies and that experts appear to consider different retinal features and interpret the same features differently.

The traditional definition of plus disease was created by expert consensus during the 1980s and has been used for major multicenter trials.2,3 However, another key finding from the current study is that experts consider retinal features beyond arterial tortuosity and venous dilation within the posterior pole when diagnosing plus disease. As shown in Table 5, experts explicitly mentioned many additional factors while explaining their diagnostic rationale during the think-aloud protocol. These included retinal features such as venous tortuosity and vascular branching, as well as anatomic factors such as peripheral retinal appearance and macular features, none of which are described in the published definition of plus disease.1,2 Furthermore, as shown in Table 4, there were 10 of 37 study ratings in which expert diagnoses of plus or pre-plus disease were inconsistent with published definitions.1 Overall, these findings suggest that plus disease diagnosis is considerably more complex than current rules, which combine arterial tortuosity and venous dilation in the posterior pole, and that experts do not appear to consider the same retinal features even when examining the same images.

Qualitative cognitive research methods have been used to characterize complex processes pertaining to visual diagnosis in fields such as dermatology, pathology, and radiology.17-19 The premise of this study is that current ROP management strategies are based on an international classification system,1 along with diagnosis and treatment guidelines resulting from groundbreaking multicenter trials.2,3 By nature, this translates the qualitative nuances of retinal examination into discrete evidence-based rules. Findings from the current study support the notion that plus disease diagnosis is oversimplified by these rules involving only central arterial tortuosity and venous dilation. In particular, most study experts had practice experience before publication of these definitions and rules and might therefore have greater insight about ROP diagnosis based on qualitative retinal characteristics. Follow-up research to encode the diagnostic methods and heuristics used by these experts may improve standardization and education in ROP care.20,21 Evidence-based protocols provide enormous benefits through guidelines to improve clinical management, and methods from this study can complement these protocols by providing additional information about subtle diagnostic factors.

Computer-based image analysis is an emerging method for improving accuracy and reproducibility of plus disease diagnosis using quantitative retinal vascular parameters.8,10,22-35 Development of these systems requires identification of the relevant vascular features to analyze (eg, arterial tortuosity); definition of algorithms for quantifying these features; selection of the appropriate vessels for analysis (eg, all vessels, worst vessels); and combination of individual feature values into an overall diagnosis.34 Currently, there are no standard methods for performing most of these tasks. This study provides information about the diagnostic process used by experts, which may eventually provide a scientific basis for developing computer algorithms that better mimic expert diagnosis.
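To make this pipeline structure concrete, the sketch below outlines one possible arrangement of the steps listed above; the feature definitions, vessel-selection rule, and thresholds are placeholders and do not correspond to any published system.

```python
# Minimal sketch of a feature-combination pipeline for image-based plus disease
# classification. All feature measures, selection rules, and cutoffs are
# placeholders for illustration, not values from any validated system.
from dataclasses import dataclass
from typing import Sequence

@dataclass
class Vessel:
    tortuosity_index: float  # e.g. arc length / chord length from a vessel tracing
    diameter_um: float       # estimated vessel width
    is_artery: bool

def worst_vessel(values: Sequence[float]) -> float:
    """One possible vessel-selection rule: summarize by the most abnormal vessel."""
    return max(values)

def classify(vessels: Sequence[Vessel],
             tortuosity_cutoff: float = 1.10,    # placeholder threshold
             dilation_cutoff_um: float = 120.0   # placeholder threshold
             ) -> str:
    arteries = [v.tortuosity_index for v in vessels if v.is_artery]
    veins = [v.diameter_um for v in vessels if not v.is_artery]
    tortuous = worst_vessel(arteries) >= tortuosity_cutoff  # assumes >=1 artery traced
    dilated = worst_vessel(veins) >= dilation_cutoff_um     # assumes >=1 vein traced
    if tortuous and dilated:
        return "plus"
    if tortuous or dilated:
        return "pre-plus"
    return "neither"

print(classify([Vessel(1.15, 90.0, True), Vessel(1.02, 130.0, False)]))  # "plus"
```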

Several additional study limitations should be noted: (1) Wide-angle retinal images were used for expert review, rather than ophthalmoscopic examinations. This may have biased findings to the extent that examiners were less familiar with image-based diagnosis. However, all study experts had experience with ROP imaging and multiple studies have shown that image-based ROP diagnosis agrees closely with ophthalmoscopic diagnosis.36-44 We felt that image review was the best study design to allow multiple experts to analyze the exact same retinal features. (2) Retinal images were reviewed by experts with no clinical information. This may have affected study findings to the extent that experts interpret retinal findings in the context of clinical data. However, the purpose of this study was to understand the significance of qualitative retinal features and the expert diagnostic process, not to simulate the process of ophthalmoscopic examination. (3) The number of study experts was limited. This may affect the generalizability of study findings to the extent that these 6 academic experts may not be representative of the larger group of clinical ROP specialists. However, the foundation of qualitative research is to collect detailed verbal descriptions to portray varying perspectives about complex phenomena.45 (4) This study focused only on identifying factors relevant to plus disease diagnosis. Other potentially relevant factors, such as location of retinal disease, were not explicitly asked about but were noted if mentioned by experts (Table 5). New research, such as studies relating vascular appearance with zone, may be useful.

In summary, this study suggests that agreement in plus disease diagnosis among experts is imperfect and that there are differences in the underlying diagnostic reasoning process and the retinal features examined. This study provides evidence that plus disease diagnosis is based on multiple factors that may depend on the specific examiner. Updated definitions based on detailed analysis of expert behavior, using qualitative research methods such as those used in this study, may lead to improved diagnostic accuracy and standardization. This may have implications for future definitions of plus disease, education and consistency of care, and development of computer-based diagnostic tools.

Article Information

Corresponding Author: Michael F. Chiang, MD, Departments of Ophthalmology, Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, 3375 Terwilliger Blvd SW, Portland, OR 97239 (chiangm@ohsu.edu).

Submitted for Publication: October 7, 2012; final revision received January 8, 2013; accepted January 9, 2013.

Published Online: May 23, 2013. doi: 10.1001/jamaophthalmol.2013.135

Author Contributions: Dr Chiang had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Conflict of Interest Disclosures: Dr Chiang is an unpaid member of the scientific advisory board for Clarity Medical Systems.

Funding/Support: This work was supported by grant EY19474 from the National Institutes of Health (Dr Chiang), the Dr Werner Jackstaedt Foundation (Dr Hewing), the Friends of Doernbecher Foundation (Dr Chiang), unrestricted departmental funding from Research to Prevent Blindness (Drs Chan and Chiang), and the St Giles Foundation (Dr Chan).

Previous Presentation: Portions of this study were presented at the 2012 ARVO Annual Meeting; May 9, 2012; Ft Lauderdale, Florida.

Additional Contributions: We are very grateful to the 6 experts who generously agreed to participate in this study.

References
1. International Committee for the Classification of Retinopathy of Prematurity. The International Classification of Retinopathy of Prematurity revisited [published comment appears in Arch Ophthalmol. 2006;124(11):1669-1670]. Arch Ophthalmol. 2005;123(7):991-999.
2. Cryotherapy for Retinopathy of Prematurity Cooperative Group. Multicenter trial of cryotherapy for retinopathy of prematurity: preliminary results. Arch Ophthalmol. 1988;106(4):471-479.
3. Early Treatment for Retinopathy of Prematurity Cooperative Group. Revised indications for the treatment of retinopathy of prematurity: results of the Early Treatment for Retinopathy of Prematurity randomized trial. Arch Ophthalmol. 2003;121(12):1684-1694.
4. Mintz-Hittner HA, Kennedy KA, Chuang AZ; BEAT-ROP Cooperative Group. Efficacy of intravitreal bevacizumab for stage 3+ retinopathy of prematurity. N Engl J Med. 2011;364(7):603-615.
5. Chiang MF, Jiang L, Gelman R, Du YE, Flynn JT. Interexpert agreement of plus disease diagnosis in retinopathy of prematurity. Arch Ophthalmol. 2007;125(7):875-880.
6. Wallace DK, Quinn GE, Freedman SF, Chiang MF. Agreement among pediatric ophthalmologists in diagnosing plus and pre-plus disease in retinopathy of prematurity. J AAPOS. 2008;12(4):352-356.
7. Reynolds JD, Dobson V, Quinn GE, et al; CRYO-ROP and LIGHT-ROP Cooperative Study Groups. Evidence-based screening criteria for retinopathy of prematurity: natural history data from the CRYO-ROP and LIGHT-ROP studies. Arch Ophthalmol. 2002;120(11):1470-1476.
8. Gelman SK, Gelman R, Callahan AB, et al. Plus disease in retinopathy of prematurity: quantitative analysis of standard published photograph. Arch Ophthalmol. 2010;128(9):1217-1220.
9. Rao R, Jonsson NJ, Ventura C, et al. Plus disease in retinopathy of prematurity: diagnostic impact of field of view. Retina. 2012;32(6):1148-1155.
10. Thyparampil PJ, Park Y, Martinez-Perez ME, et al. Plus disease in retinopathy of prematurity: quantitative analysis of vascular change. Am J Ophthalmol. 2010;150(4):468, e2.
11. Ericsson KS, Simon HA. Protocol Analysis: Verbal Reports as Data. 2nd ed. Boston, MA: MIT Press; 1993.
12. Hassebrock F, Prietula M. A protocol-based scheme for the analysis of medical reasoning. Int J Man Mach Stud. 1992;37(5):613-652. doi:10.1016/0020-7373(92)90026-H
13. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70(4):213-220.
14. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
15. Section on Ophthalmology American Academy of Pediatrics; American Academy of Ophthalmology; American Association for Pediatric Ophthalmology and Strabismus. Screening examination of premature infants for retinopathy of prematurity [published correction appears in Pediatrics. 2006;118(3):1324]. Pediatrics. 2006;117(2):572-576.
16. Wilkinson AR, Haines L, Head K, Fielder AR; Guideline Development Group of the Royal College of Paediatrics and Child Health; Royal College of Ophthalmologists; British Association of Perinatal Medicine. UK retinopathy of prematurity guideline. Eye (Lond). 2009;23(11):2137-2139.
17. Norman GR, Rosenthal D, Brooks LR, Allen SW, Muzzin LJ. The development of expertise in dermatology. Arch Dermatol. 1989;125(8):1063-1068.
18. Crowley RS, Naus GJ, Stewart J III, Friedman CP. Development of visual diagnostic expertise in pathology: an information-processing study. J Am Med Inform Assoc. 2003;10(1):39-51.
19. Azevedo R, Faremo S, Lajoie SP. Expert-novice differences in mammogram interpretation. In: Trafton DSMJG, ed. Proceedings of the 29th Annual Cognitive Science Society; August 1-4, 2007; Nashville, TN.
20. Paul Chan RV, Williams SL, Yonekawa Y, Weissgold DJ, Lee TC, Chiang MF. Accuracy of retinopathy of prematurity diagnosis by retinal fellows. Retina. 2010;30(6):958-965.
21. Myung JS, Paul Chan RV, Espiritu MJ, et al. Accuracy of retinopathy of prematurity image-based diagnosis by pediatric ophthalmology fellows: implications for training. J AAPOS. 2011;15(6):573-578.
22. Gelman R, Jiang L, Du YE, Martinez-Perez ME, Flynn JT, Chiang MF. Plus disease in retinopathy of prematurity: pilot study of computer-based and expert diagnosis. J AAPOS. 2007;11(6):532-540.
23. Wallace DK, Freedman SF, Zhao Z, Jung SH. Accuracy of ROPtool vs individual examiners in assessing retinal vascular tortuosity. Arch Ophthalmol. 2007;125(11):1523-1530.
24. Wallace DK, Jomier J, Aylward SR, Landers MB III. Computer-automated quantification of plus disease in retinopathy of prematurity. J AAPOS. 2003;7(2):126-130.
25. Wallace DK, Freedman SF, Zhao Z. A pilot study using ROPtool to measure retinal vascular dilation. Retina. 2009;29(8):1182-1187.
26. Swanson C, Cocker KD, Parker KH, Moseley MJ, Fielder AR. Semiautomated computer analysis of vessel growth in preterm infants without and with ROP. Br J Ophthalmol. 2003;87(12):1474-1477.
27. Gelman R, Martinez-Perez ME, Vanderveen DK, Moskowitz A, Fulton AB. Diagnosis of plus disease in retinopathy of prematurity using Retinal Image multiScale Analysis. Invest Ophthalmol Vis Sci. 2005;46(12):4734-4738.
28. Koreen S, Gelman R, Martinez-Perez ME, et al. Evaluation of a computer-based system for plus disease diagnosis in retinopathy of prematurity. Ophthalmology. 2007;114(12):e59-e67.
29. Rabinowitz MP, Grunwald JE, Karp KA, Quinn GE, Ying GS, Mills MD. Progression to severe retinopathy predicted by retinal vessel diameter between 31 and 34 weeks of postconception age. Arch Ophthalmol. 2007;125(11):1495-1500.
30. Johnson KS, Mills MD, Karp KA, Grunwald JE. Semiautomated analysis of retinal vessel diameter in retinopathy of prematurity patients with and without plus disease. Am J Ophthalmol. 2007;143(4):723-725.
31. Wilson CM, Cocker KD, Moseley MJ, et al. Computerized analysis of retinal vessel width and tortuosity in premature infants. Invest Ophthalmol Vis Sci. 2008;49(8):3577-3585.
32. Shah DN, Karp KA, Ying GS, Mills MD, Quinn GE. Image analysis of posterior pole vessels identifies type 1 retinopathy of prematurity. J AAPOS. 2009;13(5):507-508.
33. Wallace DK, Zhao Z, Freedman SF. A pilot study using "ROPtool" to quantify plus disease in retinopathy of prematurity. J AAPOS. 2007;11(4):381-387.
34. Chiang MF, Gelman R, Martinez-Perez ME, et al. Image analysis for retinopathy of prematurity diagnosis. J AAPOS. 2009;13(5):438-445.
35. Chiang MF, Gelman R, Williams SL, et al. Plus disease in retinopathy of prematurity: development of composite images by quantification of expert opinion. Invest Ophthalmol Vis Sci. 2008;49(9):4064-4070.
36. Ells AL, Holmes JM, Astle WF, et al. Telemedicine approach to screening for severe retinopathy of prematurity: a pilot study. Ophthalmology. 2003;110(11):2113-2117.
37. Chiang MF, Keenan JD, Starren J, et al. Accuracy and reliability of remote retinopathy of prematurity diagnosis. Arch Ophthalmol. 2006;124(3):322-327.
38. Wu C, Petersen RA, VanderVeen DK. RetCam imaging for retinopathy of prematurity screening. J AAPOS. 2006;10(2):107-111.
39. Chiang MF, Wang L, Busuioc M, et al. Telemedical retinopathy of prematurity diagnosis: accuracy, reliability, and image quality. Arch Ophthalmol. 2007;125(11):1531-1538.
40. Scott KE, Kim DY, Wang L, et al. Telemedical diagnosis of retinopathy of prematurity: intraphysician agreement between ophthalmoscopic examination and image-based interpretation. Ophthalmology. 2008;115(7):1222-1228, e3.
41. Balasubramanian M, Capone A, Hartnett ME, et al; Photographic Screening for Retinopathy of Prematurity (Photo-ROP) Cooperative Group. The photographic screening for retinopathy of prematurity study (photo-ROP): primary outcomes. Retina. 2008;28(3)(suppl):S47-S54.
42. Lorenz B, Spasovska K, Elflein H, Schneider N. Wide-field digital imaging based telemedicine for screening for acute retinopathy of prematurity (ROP): six-year results of a multicentre field study. Graefes Arch Clin Exp Ophthalmol. 2009;247(9):1251-1262.
43. Silva RA, Murakami Y, Lad EM, Moshfeghi DM. Stanford University Network for Diagnosis of Retinopathy of Prematurity (SUNDROP): 36-month experience with telemedicine screening. Ophthalmic Surg Lasers Imaging. 2011;42(1):12-19.
44. Dai S, Chow K, Vincent A. Efficacy of wide-field digital retinal imaging for retinopathy of prematurity screening. Clin Experiment Ophthalmol. 2011;39(1):23-29.
45. Li AC, Kannry JL, Kushniruk A, et al. Integrating usability testing and think-aloud protocol analysis with "near-live" clinical simulations in evaluating clinical decision support. Int J Med Inform. 2012;81(11):761-772.