Biopsy decisions by 39 dermatologists participating in the pilot reader study. Small black circles represent individual readers; big black circles, pairs of readers; and the diamond, examining clinicians in the clinical trial on the same set of lesions.
Monheit G, Cognetta AB, Ferris L, Rabinovitz H, Gross K, Martini M, Grichnik JM, Mihm M, Prieto VG, Googe P, King R, Toledano A, Kabelev N, Wojton M, Gutkowicz-Krusin D. The Performance of MelaFind: A Prospective Multicenter Study. Arch Dermatol. 2011;147(2):188-194. doi:10.1001/archdermatol.2010.302
In 2009, there were an estimated 68 700 new cases of invasive melanoma and over 53 000 new cases of melanoma in situ in the United States.1 The actual number might be significantly higher because melanoma incidence is considered to be underreported by 30% to 40% in cancer registries.2 The number of deaths due to melanoma was estimated at 8650 in 2009.1 Melanoma is virtually 100% curable if detected when it is confined to the epidermis (melanoma in situ). Thin melanomas, with a Breslow thickness of 1 mm or thinner, have a 94% rate of survival after 5 years.3 However, once melanoma has advanced and metastasized to other parts of the body, it is difficult to treat. The survival rate for patients with stage IV melanoma is less than 15% for 5 years, with most patients dying within 6 to 10 months.4 Therefore, early detection and prompt treatment are essential to improve the prognosis for patients with melanoma. The challenge is that early melanoma may be difficult to differentiate from many benign simulants.
This multicenter prospective trial was designed to establish the safety and effectiveness of MelaFind (MELA Sciences Inc, Irvington, New York) as an aid in evaluating pigmented lesions (PLs) that have 1 or more clinical or historical characteristics of melanoma. MelaFind is a noninvasive, fully automatic, computer-vision diagnostic system designed as an aid to detection of early melanoma and developed to identify PLs that should be considered for biopsy to rule out melanoma. This study evaluates the performance of MelaFind using sensitivity and specificity as metrics and comparing the specificity of MelaFind to that of the study investigators.
Patients with at least 1 PL scheduled for biopsy in toto were invited to participate in the trial. Exclusion criteria were as follows: (1) failure to give informed consent; (2) known allergy to isopropyl alcohol; (3) diameter of the PL less than 2 mm or greater than 22 mm; (4) anatomic site of PL not accessible to the device; (5) lesion previously biopsied, excised, or traumatized; (6) skin not intact (eg, open sores, ulcers, bleeding); (7) lesion within 1 cm of the eye; (8) lesion on palmar, plantar, or mucosal (eg, lips, genitals) surface or under nails; (9) lesion in an area of visible scarring; or (10) lesion containing foreign matter (eg, tattoo ink, splinter, marker).
The primary outcome measures were the sensitivity and specificity of the computer-vision system, MelaFind, and the specificity of clinicians, among PLs with the diagnoses of “melanoma cannot be ruled out” or “not melanoma.” Pigmented lesions with prebiopsy clinical diagnoses of melanoma were excluded from the analysis because such lesions would be biopsied by examining dermatologists regardless of MelaFind results. Melanomas and borderline lesions such as high-grade dysplastic nevi (HGDN) and atypical melanocytic hyperplasias (AMH) or proliferations (AMP) were defined as histologically positive lesions.
MelaFind produces a binary output: (1) positive, the lesion should be considered for biopsy to rule out melanoma; and (2) negative, the lesion should be considered for later evaluation. Clinicians were blinded to MelaFind output, and participation in the trial did not affect patient treatment. Thus investigators managed the patient care based on their clinical information, and in the study, the PL was considered positive if the prebiopsy dermatologic diagnosis was melanoma or melanoma cannot be ruled out, and negative if the prebiopsy dermatologic diagnosis was not melanoma.
The dermatologic diagnosis was the dermoscopic diagnosis, if available, or the clinical diagnosis without dermoscopy otherwise. When the prebiopsy dermoscopic diagnosis was not melanoma and the reason for the biopsy was “clinical concern,” the dermatologic diagnoses were the diagnoses without dermoscopy. The diagnostic performances of MelaFind and of the examining clinicians were evaluated using the histologic reference standard.
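The definitions above reduce to two proportions computed against histology. The following is a minimal sketch of that bookkeeping, not the trial's analysis software; the `Lesion` record, its field names, and the toy data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesion:
    # Hypothetical record: one biopsied lesion with a binary call and its histology.
    test_positive: bool       # device or clinician call: "consider for biopsy"
    histology_positive: bool  # reference standard: melanoma or borderline lesion


def sensitivity(lesions):
    """Fraction of histologically positive lesions that were called positive."""
    positives = [l for l in lesions if l.histology_positive]
    if not positives:
        raise ValueError("no histologically positive lesions")
    return sum(l.test_positive for l in positives) / len(positives)


def specificity(lesions):
    """Fraction of histologically negative lesions that were called negative."""
    negatives = [l for l in lesions if not l.histology_positive]
    if not negatives:
        raise ValueError("no histologically negative lesions")
    return sum(not l.test_positive for l in negatives) / len(negatives)


# Toy data (not trial data): 4 histologically positive, 5 histologically negative.
toy_lesions = (
    [Lesion(True, True)] * 3
    + [Lesion(False, True)]
    + [Lesion(True, False)] * 4
    + [Lesion(False, False)]
)

print(sensitivity(toy_lesions))  # 0.75
print(specificity(toy_lesions))  # 0.2
```

The same two functions apply to either rater: for clinicians, `test_positive` would hold the prebiopsy dermatologic diagnosis recoded as described above.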
Digital multispectral MelaFind images (in 10 bands) were acquired for every lesion in the trial. In addition, 2 standard clinical images were acquired with a Fuji FinePix Pro SR camera (FUJIFILM Corporation, Tokyo, Japan) (an overview from about 55 cm away and a close-up from about 20 cm away) and a contact dermoscopic image with a Nikon Coolpix 4300 camera (Nikon Corporation, Tokyo, Japan) with a 3Gen dermoscopic attachment (3Gen LLC, San Juan Capistrano, California). The electronic Case Record Forms contained information about patient demographics and melanoma risk factors. The presence of clinical and historical characteristics of melanoma such as ABCDE (asymmetry, border irregularity, color variegation, diameter ≥6 mm, and evolution),3,5,6 regression, “ugly duckling” sign,7 and patient's concern was also recorded for each lesion. If dermoscopic evaluation was used, dermoscopic characteristics of lesions enrolled were recorded. Prebiopsy diagnoses (without dermoscopy and, if available, with dermoscopy) by the examining dermatologists were also included; if the dermatologic diagnosis was not melanoma, the reason for the biopsy was selected from the following: nonmelanoma skin cancer, patient's concern, patient's discomfort, cosmetic, or, if dermoscopic evaluation was used, clinical concern. A histologic specimen with the standard hematoxylin-eosin staining was provided for each lesion.
Since the prebiopsy dermatologic diagnosis might not match the histologic diagnosis, the diagnostic performance of MelaFind and of clinicians was evaluated using dermatopathology as the reference standard. Borderline lesions that are currently excised in clinical practice, such as HGDN, AMP, and AMH, were included in the analysis because HGDN are sometimes considered melanoma precursors or early evolving melanomas and are difficult to differentiate from melanoma in situ.8,9 The diagnoses AMP and AMH may be used when dermatopathologists are uncertain about diagnosis and cannot rule out melanoma.10,11
Because concordance of histologic interpretation is limited with early melanomas,12-14 histologic slides for each lesion in the trial were evaluated by 2 independent dermatopathologists. In cases of significant discordance, histologic slides were evaluated independently by a third dermatopathologist. When 1 dermatopathologist diagnosed melanoma and 2 others diagnosed a benign lesion, histologic slides were sent again to the dermatopathologist who diagnosed melanoma for a blind rereview. The final histologic diagnosis was determined following the algorithm detailed in Table 1.
Clinicians did not receive the results of the MelaFind lesion classification algorithm, and information from MelaFind was not used in diagnosing or treating lesions. An independent entity acted as a secure data custodian and verified the integrity of the data received from the clinical sites. In addition, the data custodian analyzed all MelaFind images, using software provided by MELA Sciences Inc. At the conclusion of the accrual phase of the trial, the statistician (A.T.) received the electronic Case Record Forms and lesion classification results from the data custodian, as well as the results of histologic evaluations, and analyzed the performance of MelaFind and of clinicians participating in this trial.
Seven clinical sites with 23 investigators participated in this trial. Three sites were academic institutions (University of Pittsburgh, Duke University, and Northwestern University), and 4 sites were dermatologic practices highly experienced in managing PLs. All sites were approved to participate in the study by the appropriate institutional review boards.
The reader study investigated the biopsy sensitivity of dermatologists. It used 25 randomly selected melanomas (11 invasive and 14 in situ) and 25 nonmelanomas, matched to melanomas by anatomic location and patient age and sex. Borderline lesions (HGDN, AMP, and AMH) were excluded. The clinical history and images (clinical overview, clinical close-up, and dermoscopy images) were reviewed by the readers. The readers, dermatologists who did not participate in the clinical trial, reported for each lesion whether they would biopsy it to rule out melanoma.
MelaFind acquires digital multispectral images of a PL in 10 different spectral bands, from blue (430 nm) to near infrared (950 nm). These are contact images that use 91% isopropyl alcohol for refractive index matching. Each image is 1280 × 1024 pixels, with the pixel size in the lesion plane 20 × 20 μm. MelaFind uses automatic image analysis and statistical pattern recognition to help identify lesions to be considered for biopsy to rule out melanoma. The properties of these images as well as image analysis methods have been previously described.15-21
All image analysis and lesion classification algorithms are automatic and were tested prospectively in the clinical trial: (1) calibration algorithms reduce noise and artifacts in the images and determine the diffuse reflectance of the skin; (2) image quality control algorithms detect image problems (eg, overexposure, underexposure, lesion too big, lesion too small, too much hair on the lesion, too many bubbles on the lesion, motion of the handheld imaging device during imaging) and, when appropriate, request the operator to reimage; (3) lesion segmentation algorithm identifies image pixels that belong to the lesion; (4) feature extraction algorithms compute quantitative lesion parameters; (5) lesion classification algorithm differentiates lesions to be considered for biopsy to rule out melanoma (MelaFind positive) from those to be considered for later evaluation (MelaFind negative).
Between January 2007 and July 2008, 1383 patients with 1831 PLs were enrolled. More female than male patients entered the study, and the median age was 47 years. Most patients (about 98%) were white, consistent with the fact that melanoma is much more frequent among whites.22 There were no significant differences between the group of enrolled patients (1383) and those with evaluable lesions (1257) (Table 2).
Of the 1831 registered lesions, 1 was excluded because the patient withdrew from the trial. Three lesions were determined to be ineligible by clinicians, and 14 by dermatopathologists (mostly owing to previous scarring that was not identified clinically); 19 lesions were not evaluable because of missing or inadequate histologic slides. One hundred sixty-two lesions were not evaluable owing to unsuccessful imaging attempts: 65 owing to operator errors (eg, too much hair, too many bubbles, lesion not centered in the field of view), 36 owing to MelaFind or standard camera malfunctions, and 61 owing to causes that might have been either operator errors or MelaFind malfunctions (ie, either lesion too small or failure of automatic segmentation). MelaFind does not provide a result if the image fails automatic image quality control algorithms, but all enrolled lesions were imaged during the trial.
Of 1632 eligible and evaluable lesions, 143 (8.8%) required more than 2 evaluations by the dermatopathologists. Melanomas made up about 8% of all eligible and evaluable PLs; borderline lesions (HGDN, AMP, and AMH) accounted for about 3% (Table 3). Most lesions were nevi, with 61% being low-grade dysplastic nevi. About 15% of lesions were nonmelanocytic, including seborrheic keratoses, actinic lentigines, and pigmented nonmelanoma skin cancers. About 45% of melanomas were in situ, which are almost 100% curable by complete excision.23 The invasive melanomas were thin (median thickness, 0.36 mm). Most of the invasive melanomas were of the most common, superficial spreading type. Only 2 melanomas (both nodular) were relatively thick: 1.0 and 1.2 mm. Thus, almost all melanomas in this trial were early lesions that are difficult to differentiate from benign simulants.
Dermoscopic characteristics were provided for 645 lesions, including 60 melanomas. Among the 29 lesions considered not melanoma dermoscopically—but melanoma cannot be ruled out clinically—and that were biopsied because of clinical concern, 1 was found to be melanoma and another HGDN by histologic analysis (Table 4). There were 82 lesions with a final dermatologic diagnosis of not melanoma (5%), most of which were biopsied owing to patient's concern (57 of 82); among these lesions 1 was melanoma and 1 HGDN (Table 4). Most of the histologically verified melanomas were diagnosed prior to biopsy as melanoma cannot be ruled out, with about a third of melanomas considered unlikely (ie, likelihood between 1% and 33%). These results indicate that the lesions enrolled in this trial presented a significant diagnostic challenge to the investigators.
A small study investigated the biopsy sensitivity of dermatologists who did not participate in the clinical study but who served as readers. The average biopsy sensitivity to melanoma of the 39 readers was 78%. Interreader variability was high: κ (SD) was 0.22 (0.01), indicating only fair agreement. This variability is illustrated in the Figure, which shows that some of the readers made biopsy decisions very similar to those of examining clinicians, who biopsied all of these lesions to rule out melanoma; it also shows that many did not. Only 5 of 25 melanomas would have been biopsied by all readers, and different readers missed different melanomas.
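The text does not state which multireader κ statistic was used. As an illustration only, the sketch below assumes Fleiss' κ, a common choice when many readers rate the same lesions; the function name and the toy counts are hypothetical and are not the study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of readers
    who assigned lesion i to category j (eg, j = 0 "biopsy", j = 1 "no biopsy").
    Every lesion must be rated by the same number of readers (at least 2)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each lesion needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total = n_subjects * n_raters

    # Overall proportion of ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Per-lesion agreement: fraction of reader pairs that agree on that lesion.
    p_lesion = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(p_lesion) / n_subjects
    expected = sum(p * p for p in p_cat)
    if expected == 1.0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return (observed - expected) / (1.0 - expected)


# Toy example (not study data): 3 readers, 2 lesions.
print(fleiss_kappa([[3, 0], [0, 3]]))  # 1.0, perfect agreement
```

Values near 0 indicate agreement no better than chance, which is why a κ of 0.22 is read as only fair agreement.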
There were 1612 lesions (including 114 melanomas) evaluable for primary end points, ie, excluding lesions with the prebiopsy dermatologic diagnosis of melanoma. The data on these 114 melanomas were pooled to determine the sensitivity of MelaFind to melanoma and the 95% lower confidence bound (LCB) on sensitivity. Since the measured values of sensitivity were very high, the exact mid-P method was used to compute the LCB.24 Because of the high degree of variability among investigators (specificity range, 0%-25%), the specificity was determined separately for the set of lesions from each investigator. The specificities for MelaFind and clinicians were obtained by averaging over investigators and then compared (Table 5).
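For readers unfamiliar with the mid-P method, the sketch below shows one way to compute a one-sided lower confidence bound on a binomial proportion. It assumes a one-sided bound at α = .05 and solves for the bound by bisection; the function names and the example counts are hypothetical, and this is not the statistician's code.

```python
from math import comb


def mid_p_tail(x, n, p):
    """P(X > x) + 0.5 * P(X = x) for X ~ Binomial(n, p)."""
    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)
    return sum(pmf(k) for k in range(x + 1, n + 1)) + 0.5 * pmf(x)


def mid_p_lower_bound(x, n, alpha=0.05, tol=1e-10):
    """One-sided (1 - alpha) mid-P lower bound on p, given x successes in n.

    The bound is the value of p at which the mid-P upper tail equals alpha.
    The tail is increasing in p, so bisection on [0, 1] is sufficient."""
    if not 0 <= x <= n or n <= 0:
        raise ValueError("need 0 <= x <= n and n > 0")
    if x == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid_p_tail(x, n, mid) < alpha:
            lo = mid  # p is still too small to make x successes plausible
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical counts for illustration: 9 detected out of 10.
bound = mid_p_lower_bound(9, 10)
print(bound < 9 / 10)
```

When every case is detected (x = n), the bound has the closed form (2α)^(1/n), which provides a quick check on the bisection.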
The secondary end points included all eligible and evaluable lesions with any prebiopsy dermatologic diagnosis. Since no comparisons between MelaFind and study investigators were performed, the data were pooled. These end points are summarized in Table 6. Measured negative predictive values are very high (>98%) owing to the very high sensitivity of MelaFind.
In this multicenter prospective trial, MelaFind achieved very high sensitivity to thin melanomas and borderline lesions (>98%; 95% LCB, >95%). For lesions that were not melanomas and had prebiopsy diagnoses of melanoma cannot be ruled out or not melanoma, MelaFind had an average specificity of 9.5%, ie, significantly higher than that of investigators (3.7%) (P = .02). However, this trial could not determine the true sensitivity of dermatologists, since melanomas that were not scheduled for biopsy, ie, missed by the examining clinicians, would not be evaluable. The pilot reader study found that dermatologists miss thin melanomas. Thus, even though all lesions in the clinical trial were biopsied by the examining dermatologists, many of the melanomas would not have been biopsied by other dermatologists.
The sensitivity of physicians can be assessed by longitudinal studies, ie, long-term follow-up of the patients to determine whether lesions considered benign later turned out to be clinically suspect for melanoma. In the 9-year longitudinal study at a PL clinic in the United Kingdom by Bataille et al,25 221 melanomas (both invasive and in situ) were detected. Melanomas on 14 patients were diagnosed as benign on the first visit and biopsied on the second or third visit; one of these patients died of metastatic melanoma. The biopsy sensitivity to melanoma measured in this study was 93.7% (95% LCB, 90.5%). The median Breslow thickness of melanomas in this study was 0.9 mm. The study by Carli et al26 found biopsy sensitivity to melanoma to be 86.7% (95% LCB, 66.7%) based on a comparison with the cancer registry, but the sample was very small.26 Thus, the 95% LCB on MelaFind's sensitivity is higher than the biopsy sensitivities of longitudinal studies reported in the literature.
Another method of assessing physician sensitivity is through reader studies. Such studies have been widely used to evaluate diagnostic performance in mammography and colonoscopy.27,28 In dermatology, readers are presented with a series of images of lesions (clinical and/or dermoscopic) and, for each image, may be asked to decide whether a lesion is a melanoma and whether it should be biopsied to rule out melanoma. Assuming acceptable quality of images (clinical and/or dermoscopic) and availability of patient history, and taking into account the fact that most of the training of dermatologists is actually done with such images, reader studies can provide estimates of sensitivity to melanoma. Such studies have been performed in dermatology to evaluate the effectiveness of teledermatology,29,30 to compare the diagnostic performance of dermatologists and primary care physicians,31 to assess the dermoscopic examination of lesions,32 and to determine the diagnostic and biopsy sensitivities of dermatologists.33
The pilot reader study conducted as a part of this clinical trial included a random sample of lesions enrolled in the trial and provided readers with the clinical overview, a clinical close-up image, contact dermoscopic images, and patient demographic and melanoma risk factor information; all images were checked for quality by an experienced dermatologist. The average biopsy sensitivity of 39 dermatologist readers was 78%, which is similar to a prior reader study by Friedman et al33 for small (<6 mm in diameter) PLs from the MelaFind database. Among 10 expert dermoscopists reviewing a series of 99 small PLs, the average biopsy sensitivity to melanoma was 71%; the biopsy sensitivity to melanoma in situ was 63%. Interreader variability was very high for lesion management decisions (κ = 0.34), indicating that different experts make different biopsy decisions.
A limitation of the present study is that since only PLs scheduled for biopsy in toto were evaluable in the trial, the benign lesions are not representative of such lesions in the general population. As a result, the specificity reported here cannot be generalized to the general population, for either clinicians or MelaFind. Tsao et al34 estimated that, in the general population in the United States, there are about 70 000 nevi for every invasive melanoma or about 35 000 nevi per invasive or in situ melanoma; the overwhelming majority of these nevi have no clinical or historical characteristics of melanoma. A study by Schäfer et al35 examined common and atypical melanocytic nevi in the general adult population in Germany and found that the average number of atypical nevi per person was 0.074.35 In the US population of about 300 million, with about 120 000 new cases of invasive and in situ melanomas per year,1 this implies about 200 atypical nevi per melanoma. So, even if all atypical nevi were selected for biopsy to rule out melanoma, the specificity in the general population would be over 99% (ie, about [35 000 − 200]/35 000). Therefore, the fact that the specificities of examining clinicians and MelaFind are rather low in the clinical trial does not mean that the specificities of clinicians and MelaFind would be low in the general population. It is only a reflection of the fact that almost all the lesions in this trial were sufficiently atypical to be selected for biopsy to rule out malignant melanoma.
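The arithmetic in this paragraph can be checked directly. The sketch below uses only the figures quoted above; the function and variable names are illustrative. The unrounded result is about 185 atypical nevi per melanoma, which the text rounds to about 200.

```python
def atypical_nevi_per_melanoma(population, atypical_per_person, melanomas_per_year):
    """Atypical nevi in the population divided by new melanomas per year."""
    return population * atypical_per_person / melanomas_per_year


def specificity_if_all_atypical_biopsied(nevi_per_melanoma, atypical_per_melanoma):
    """Specificity if every atypical nevus, and no other nevus, were biopsied."""
    return (nevi_per_melanoma - atypical_per_melanoma) / nevi_per_melanoma


# Figures quoted in the text.
ratio = atypical_nevi_per_melanoma(
    population=300_000_000,     # US population, approximate
    atypical_per_person=0.074,  # Schäfer et al
    melanomas_per_year=120_000, # invasive plus in situ
)
spec = specificity_if_all_atypical_biopsied(35_000, ratio)

print(round(ratio))    # 185
print(round(spec, 3))  # 0.995
```

Using the rounded value of 200 instead of 185 changes the specificity only in the third decimal place, so the conclusion of over 99% holds either way.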
MelaFind is intended to be used on lesions with 1 or more clinical or historical characteristics of melanoma, ie, atypical lesions. If all atypical lesions were to be biopsied to rule out melanoma, the biopsy ratio (number of false-positive biopsy findings per true-positive biopsy finding) of about 200:1 would be very high. In the trial, the biopsy ratio for MelaFind was 10.8:1 for melanomas and 7.6:1 for melanomas and borderline lesions. The values of the biopsy ratio reported in the literature are highly variable. A study by Cohen et al36 reported biopsy ratios of about 135:1 for patients with a history of melanoma and about 576:1 for patients without personal history of melanoma. Among general practitioners in Australia, the biopsy ratio varied from 82:1 in the youngest patients to 10:1 in the oldest patients.37 For dermatologists, prospective studies reported biopsy ratios of about 8:1 in the general population,38 and from 33:1 to 47:1 among high-risk patients with atypical nevi.39-41 One study by Banky et al42 reported a very low biopsy ratio of about 3:1 in patients at high risk for melanoma using a combination of baseline images and dermoscopy. However, this study also reported that at least 5 of 18 melanomas (4 in situ and 1 invasive at the time of biopsy) were not detected on the first examination. Thus, MelaFind's biopsy ratio is at the lower end of the values reported in the literature.
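The biopsy ratio as defined in this paragraph is a simple quotient of counts. The sketch below makes the definition explicit; the function names are illustrative, and the counts are hypothetical rather than taken from the trial.

```python
def biopsy_ratio(false_positives, true_positives):
    """False-positive biopsy findings per true-positive biopsy finding."""
    if true_positives <= 0:
        raise ValueError("at least one true-positive biopsy is required")
    return false_positives / true_positives


def format_ratio(ratio):
    """Render a ratio in the 'N:1' form used in the text."""
    return f"{ratio:.1f}:1"


# Hypothetical counts: 40 benign lesions biopsied for every 5 melanomas found.
print(format_ratio(biopsy_ratio(40, 5)))  # 8.0:1
```

Because the numerator counts only lesions that were actually biopsied, the ratio depends on both the specificity of the rater and the mix of lesions presented, which is one reason the published values vary so widely.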
MelaFind is a safe and effective tool to help identify PLs to be considered for biopsy to rule out melanoma. In this trial, MelaFind demonstrated very high sensitivity to early melanomas and borderline lesions, specificity superior to that of clinicians, and a biopsy ratio of about 8:1. Direct comparison of the results of this study with those of other studies is not possible. As pointed out by Menzies et al,43 diagnostic performance depends on the difficulty of lesions included in the study. It would be helpful if the dermatologic community could agree on a standard mix of lesions to be used for future testing of methods developed for early detection of melanoma.
Correspondence: Dina Gutkowicz-Krusin, PhD, MELA Sciences Inc, 50 S Buckhout St, Ste 1, Irvington, NY 10533 (firstname.lastname@example.org).
Accepted for Publication: August 11, 2010.
Published Online: October 18, 2010. doi:10.1001/archdermatol.2010.302
Author Contributions: Drs Toledano and Gutkowicz-Krusin had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Rabinovitz, Toledano, Kabelev, and Gutkowicz-Krusin. Acquisition of data: Monheit, Cognetta, Ferris, Rabinovitz, Gross, Martini, Grichnik, Mihm, Prieto, and Googe. Analysis and interpretation of data: Gross, Martini, Mihm, Prieto, Googe, King, Toledano, Kabelev, Wojton, and Gutkowicz-Krusin. Drafting of the manuscript: Monheit, Prieto, and Gutkowicz-Krusin. Critical revision of the manuscript for important intellectual content: Monheit, Cognetta, Ferris, Rabinovitz, Gross, Martini, Grichnik, Mihm, Prieto, Googe, King, Toledano, Kabelev, and Wojton. Statistical analysis: Toledano and Gutkowicz-Krusin. Administrative, technical, and material support: Martini, Mihm, Wojton, and Gutkowicz-Krusin. Study supervision: Monheit, Mihm, Prieto, and Kabelev. Provided additional literature and references: Grichnik. Research and development: Kabelev.
Financial Disclosure: Dr Monheit has served and/or currently serves as consultant and/or clinical investigator for Allergan Corporation (Juvederm), Dermik Laboratories (Sculptra), Genzyme Corporation (Captique, Prevelle), Colbar LifeScience Ltd (now owned by Johnson & Johnson) (Evolence), Contura (Aquamid), Ipsen/Medici (Dysport), Stiefel, Electro-Optic Sciences Inc (MelaFind), Revance, Kythera, Galderma, Mentor, and Mertz. Drs Cognetta, Rabinovitz, and Mihm are members of the MELA Sciences Advisory Committee. Drs Monheit, Cognetta, Ferris, Grichnik, Rabinovitz, Prieto, King, and Toledano serve as consultants to MELA Sciences Inc. Messrs Kabelev and Wojton and Dr Gutkowicz-Krusin are employees of MELA Sciences Inc.
Funding/Support: This study was supported by MELA Sciences Inc.