Wechsler J, Bastuji-Garin S, Spatz A, Bailly C, Cribier B, Andrac-Meyer L, Vergier B, Fraitag S, Verola O, Wolkenstein P, and the French Cutaneous Cancerology Group. Reliability of the Histopathologic Diagnosis of Malignant Melanoma in Childhood. Arch Dermatol. 2002;138(5):625-628. doi:10.1001/archderm.138.5.625
Objective
To assess interrater reliability in the diagnosis of malignant melanoma in children.
Design, Setting, and Participants
We collected 85 slides of melanomas diagnosed in patients younger than 17 years through a network of dermatopathologists and dermatologists. The slides were classified into 3 categories: (1) slides from children with metastatic melanoma; (2) slides from disease-free children with a follow-up of less than 5 years; (3) slides from disease-free children with a follow-up of 5 years or longer. Category 1 was considered the gold standard. Four pairs of expert dermatopathologists reviewed the slides and classified them into melanoma, nevus (including Spitz nevus), or ambiguous tumors.
Main Outcome Measure
Concordance between pairs of experts.
Results
For category 1 slides (n = 20), the concordance was weak to moderate. For category 2 slides (n = 47), the concordance was weak. For category 3 slides (n = 18), the concordance was poor to moderate.
Conclusion
This study demonstrates that the reliability of the diagnosis of melanoma in childhood is poor, even when the slides are submitted to experts.
Melanoma remains a rare tumor in childhood: fewer than 0.5% of cases of melanoma occur in children and adolescents.1,2 Therefore, there are only small series of melanoma reported in children, and the definition of children varies. Some studies included patients aged from 15 to 20 years,3-12 and others included only prepubescent children.13-18 The diagnosis of malignant melanoma in children is more difficult than in adults because of unusual clinical and histologic features. Although Spitz nevus19-21 is now considered a benign melanocytic tumor, the differential diagnosis with melanoma can be extremely difficult. Misdiagnosis seems to be common, occurring in up to 40% of children eventually found to have the disease.12 Consequently, some studies included only children with metastatic disease13,22 because metastatic spread is the only definite proof of malignancy. However, pathologists faced with melanocytic tumors in children have to decide whether they are malignant. Using metastatic spread as the gold standard for malignancy, we conducted a study to evaluate the difficulties of melanoma diagnosis in childhood. Toward that end, we tested the interrater reliability of the diagnosis of melanoma in childhood among a group of expert dermatopathologists.
The expert dermatopathologist panel was made up of 8 of us (J.W., A.S., C.B., B.C., L.A.-M., B.V., S.F., and O.V.). Each member of this panel has actively practiced dermatopathology for at least 10 years in a French academic dermatology and/or oncology center and deals with melanoma in his or her daily practice. The members of the panel came from different centers throughout France and were not accustomed to working together. All members have an extensive record of publications in dermatopathology, and 3 of us (J.W., A.S., and C.B.) participated in a previous European study on melanoma in childhood.12 In most cases, the experts were not involved in making the initial diagnosis of the cases included in the present study. Because dermatopathologists often discuss difficult cases with colleagues in their practice, the experts were randomly assigned to 4 groups of 2 (groups A, B, C, and D). Each pair reviewed the same 120 slides according to a predetermined method.
We included patients younger than 17 years with a diagnosis of melanoma made on the basis of histopathologic examination. One representative slide per case was selected. As control cases, slides of patients with unambiguous tumors, either benign nevi (n = 14) or typical adult melanomas (n = 21), were collected from the slide collections of the following centers: Hôpital Henri-Mondor, Institut Gustave Roussy, Centre Léon Bérard, Hôpital Saint-Louis, University Hospital of Strasbourg, and Hôpital Necker-Enfants Malades. A total of 120 slides representing 120 lesions (1 slide per lesion) were collected: 85 childhood melanoma cases and 35 unambiguous cases. The study childhood cases were obtained from the following sources: the Cutaneous Cancerology Group of the French Society of Dermatology, Paris, France (n = 17); a federation of private pathology laboratories, the CRISAP (Centres de Regroupement Informatique et Statistiques en Anatomie Pathologique), Dijon, France (n = 18); Institut Gustave-Roussy (Villejuif, France) (n = 9); Centre Léon-Bérard (Lyon, France) (n = 21); Hôpital Henri-Mondor (Créteil, France) (n = 18); and University Hospital (Strasbourg, France) (n = 2). The vast majority of these slides were referred to these participating centers for a review after the diagnosis of melanoma was made. Four of the experts reviewed some of the cases in their centers after the diagnosis of melanoma was made (J.W., 18 cases; A.S., 9 cases; C.B., 21 cases; and B.C., 2 cases).
To classify childhood case slides, the following clinical data were collected in each case: age at diagnosis, presence or absence of metastatic spread and/or death, and date of last follow-up if the patient was disease free. Metastatic melanoma was defined by nodal metastasis and/or distant metastasis. The slides were then classified by the 2 nonpathologists (S.B.G. and P.W.) into the following categories: category 1—slides from children with metastatic melanoma (n = 20); category 2—slides from disease-free children with a follow-up of less than 5 years (n = 47); and category 3—slides from disease-free children with a follow-up of 5 years or longer (n = 18). A qualified person not involved in the study blinded these 120 slides, both childhood cases and unambiguous cases.
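The classification rule described above can be written as a short function. The following is only a sketch: the `ChildhoodCase` record and its field names are hypothetical, introduced here for illustration, and only the 5-year threshold and the definition of category 1 come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChildhoodCase:
    """Hypothetical record holding the clinical data used for classification."""
    metastatic: bool                  # nodal and/or distant metastasis
    follow_up_years: Optional[float]  # time to last follow-up if disease free


def slide_category(case: ChildhoodCase) -> int:
    """Return 1, 2, or 3 following the category definitions in the text."""
    if case.metastatic:
        return 1  # metastatic melanoma, used as the gold standard
    if case.follow_up_years is None:
        raise ValueError("a disease-free case needs a follow-up duration")
    # Disease free: split on the 5-year follow-up threshold.
    return 2 if case.follow_up_years < 5 else 3
```

A disease-free case followed for exactly 5 years falls in category 3, because that category is defined as "5 years or longer".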
The same set of 120 slides was sent to each pair of dermatopathologists (pairs A, B, C, and D) with 1 coding sheet per slide. There was no attempt to reach a consensus about the diagnosis of melanoma in childhood before the study. Our purpose was to follow the routine procedure for reading slides. Nevertheless, slides were sent to the experts without any clinical information. Each pair of experts was asked to classify each lesion as melanoma, nevus, or ambiguous tumor (ie, impossible to classify as benign or malignant). The 2 members of each pair had to agree on a diagnosis by joint analysis using a multiheaded microscope. If the 2 members of a pair could not reach agreement, the disagreement had to be specifically reported.
Unless a discrepancy between the 2 members of a pair occurred, the statistical unit of the analysis was the answer of each pair of experts. We first analyzed the diagnosis of experts on unambiguous adult case slides as controls by evaluating the 2 × 2 agreement between pairs of experts in paired comparisons. The agreement between the reviewers' diagnoses and the initial diagnoses was also analyzed.
For the childhood case slides, we first analyzed the univariate distribution of the pairs of dermatopathologists' diagnoses and tested, using the χ2 test, whether the percentage of each diagnosis was different. We then evaluated the agreement between the pairs of experts in paired comparisons. The mean percentages of agreement were also compared among the 3 diagnosis categories by analysis of variance.
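One possible reading of the χ2 comparison above is a Pearson test on a table with one row per pair of experts and one column per diagnosis. The sketch below assumes that layout, which the text does not spell out, and uses hypothetical counts that are not the study data. The P value uses a closed form that holds only for an even number of degrees of freedom, which covers the 4 × 3 case (6 degrees of freedom).

```python
import math


def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square variable; even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form only covers positive even df")
    half = statistic / 2
    term, series = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        series += term
    return math.exp(-half) * series


# HYPOTHETICAL counts (not from the study): rows are pairs A-D,
# columns are melanoma, nevus, ambiguous tumor.
HYPOTHETICAL_COUNTS = [
    [12, 5, 3],
    [9, 4, 7],
    [10, 6, 4],
    [11, 3, 6],
]

statistic, df = chi_square_independence(HYPOTHETICAL_COUNTS)
p_value = chi_square_p_value(statistic, df)
```

A statistics package would normally be used for this test; the pure-Python version is kept here only so the arithmetic behind the statistic stays visible.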
The agreement was evaluated by the percentage of concordance. The κ indexes and their 95% confidence intervals (when the sample size was sufficient) were calculated to evaluate the degree of agreement by taking into account the random concordance.23 All κ indexes have values approaching zero if agreement is due only to chance and can assume either positive or negative values. A value of 100% indicates perfect agreement. Concordance is considered excellent if the κ value is between 100% and 81%; good, between 80% and 61%; moderate, between 60% and 41%; weak, between 40% and 21%; poor, between 20% and 0%; and bad, lower than 0%.24
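As an illustration of how a κ index and its label relate to the raw percentage of concordance, here is a minimal sketch of Cohen's κ for two raters. It assumes the unweighted form of κ, which the text does not specify, and the ratings are hypothetical, not study data. The handling of fractional values between bands (for example, 20.5%) is also an assumption, because the scale above is stated only at whole-percent boundaries.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Return (observed agreement, kappa) for two equal-length rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return observed, 1.0  # both raters used one identical category throughout
    return observed, (observed - expected) / (1 - expected)


def concordance_label(kappa_percent):
    """Map a kappa value in percent to the scale quoted in the text."""
    if kappa_percent < 0:
        return "bad"
    # Assumption: fractional values are assigned to the band below the next threshold.
    for threshold, label in ((80, "excellent"), (60, "good"),
                             (40, "moderate"), (20, "weak")):
        if kappa_percent > threshold:
            return label
    return "poor"


# HYPOTHETICAL ratings of 10 slides by two pairs of experts.
pair_a = ["melanoma"] * 5 + ["nevus"] * 3 + ["ambiguous"] * 2
pair_b = (["melanoma"] * 4 + ["nevus"] + ["nevus"] * 3
          + ["ambiguous", "melanoma"])

observed, kappa = cohen_kappa(pair_a, pair_b)
label = concordance_label(100 * kappa)
```

In this made-up example the two pairs agree on 8 of 10 slides, yet κ is lower than 80% because part of that agreement would be expected by chance alone.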
The 120 coding sheets (85 childhood cases and 35 unambiguous cases) that were mailed to the 4 pairs of reviewers were completed and returned. The mean ± SD age of patients was 12 ± 4 years in category 1, 12 ± 4 years in category 2, and 11 ± 4 years in category 3 (P = .28). Each pair of experts reached a consensus in classifying the slides into 1 of the 3 categories.
For the 35 unambiguous case slides, the agreement among the 4 pairs of experts was perfect (κ indexes = 100%). There was no discrepancy between the initial and the review diagnoses. The distribution of diagnosis of childhood cases differed widely among pairs of dermatopathologists (P<.001). Diagnosis of melanoma ranged from 45.9% to 58.8% of the cases, and diagnosis of ambiguous tumor ranged from 15.3% to 35.3% (Table 1 and Table 2). The interrater reliability ranged from 54% to 66%, with weak to moderate κ indexes. Nevertheless, no significant difference was observed between the κ indexes. A complete agreement among the 4 pairs of pathologists was observed in 33 (38.8%) of the cases (ie, in 13 of the 20 category 1 slides; 14 of the 47 category 2 slides; and 6 of the 18 category 3 slides). Table 3 summarizes the analysis of the agreement between pairs of experts in the 3 categories of slides. The concordance was weak to moderate in category 1 slides, weak in category 2 slides, and poor to moderate in category 3 slides. The mean κ index was slightly higher in cases with metastatic spread than in the other categories, but the difference was not significant.
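The complete-agreement figures above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the text; the per-category percentages are computed here and do not appear in the source. The list of paired comparisons assumes that every pair of experts was compared with every other pair, and `is_complete_agreement` is a hypothetical helper.

```python
from itertools import combinations

# Counts reported in the text: (slides with complete agreement, slides in category).
COMPLETE_AGREEMENT = {1: (13, 20), 2: (14, 47), 3: (6, 18)}


def percent(part, whole):
    """Percentage rounded to 1 decimal place."""
    return round(100 * part / whole, 1)


agreed = sum(a for a, _ in COMPLETE_AGREEMENT.values())
slides = sum(n for _, n in COMPLETE_AGREEMENT.values())
overall = percent(agreed, slides)

# Derived here, not reported in the source.
by_category = {c: percent(a, n) for c, (a, n) in COMPLETE_AGREEMENT.items()}


def is_complete_agreement(diagnoses):
    """True when every pair of experts gave the same diagnosis for one slide."""
    return len(set(diagnoses)) == 1


# Assumption: each pair is compared with every other pair.
paired_comparisons = list(combinations("ABCD", 2))
```

The totals reproduce the 33 of 85 slides (38.8%) given in the text, and 4 pairs of experts yield 6 paired comparisons under the stated assumption.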
The distribution of the diagnoses in each category of slides is summarized in Table 4. The experts diagnosed melanoma for category 1 slides more often than for slides from categories 2 and 3 (P = .04; χ2 test).
The histopathologic diagnosis of melanoma in children is difficult because of the occurrence of lesions with indeterminate diagnosis such as atypical Spitz tumors.25 The distinction between melanoma and Spitz nevus or pigmented spindle cell nevus is one of the most difficult problems in pathology and may lead to overdiagnosis of melanoma. Thus, in a previous study,12 40% of lesions initially diagnosed as melanoma were reclassified as nevus after review.
Because of the difficulty of making the diagnosis of childhood melanoma, the ultimate gold standard is distant metastases and/or death of the patient.9 Nonetheless, certain authors have maintained that morphologic criteria such as resemblance to adult melanoma and "anaplasia" are sufficient criteria for histologic diagnosis of melanoma in childhood.18,26 In the present study, we used the rather stringent criterion of metastatic spread for the ultimate diagnosis of melanoma to test the reliability of histopathologic diagnosis among experts. Using a network of clinicians and pathologists, we were able to collect 20 cases of melanoma occurring in patients younger than 17 years who developed metastatic spread (category 1). On the other hand, some of the other cases in our study were "potential" childhood melanoma. As the vast majority of recurrences of childhood melanomas occur within 5 years of diagnosis,8 we decided to separate these slides into 2 groups according to the length of follow-up (ie, <5 years and ≥5 years). Theoretically, the group of disease-free patients with a follow-up of less than 5 years could include cases with a potential of future metastatic spread. The present study intended to quantify the reliability of the histopathologic diagnosis of childhood melanomas in a panel of pathologists.
Unpublished comparisons among experts at a pathology meeting have suggested (personal communication, Bernard Cribier, MD, PhD), and our investigation confirms, that melanoma in childhood is challenging to diagnose; it may not be easily diagnosed even by a panel of expert pathologists. Our study confirms the poor reliability of this diagnosis. The concordance among pairs of experts was moderate to weak in the group of documented metastatic tumors even though the diagnosis of melanoma was made more frequently in this category. The concordance among experts was even lower in the other cases.
This is the first study to evaluate the interrater reliability of the histopathologic diagnosis of childhood melanomas. To improve diagnosis reliability, we need to identify those morphologic criteria that will allow us to separate the metastatic cases from the benign or ambiguous tumors. Diagnostic criteria should be documented.27 Nevertheless, it may not be possible to develop definitive diagnostic criteria because there are many overlapping features with benign nevi and Spitz nevi, even in adult melanoma. Unless clearly delineated criteria are identified, progress is unlikely.
Accepted for publication January 22, 2002.
This study was supported by a grant from Vaincre le Mélanome (SANOFI), Paris, France (Dr Wolkenstein).
This study was presented in part at the Journées Dermatologiques de Paris, Paris, December 1-4, 1999.
We would like to thank L. Thomas, MD, Service de Dermatologie, Hôtel-Dieu, Lyon; M. Delaunay, MD,* Unité de Dermatologie-Cancérologie, Hôpital Saint André, Bordeaux; P. Bioulac-Sage, MD, Département d'Anatomie Pathologique, Hôpital Pellegrin, Bordeaux; B. Guillot, MD,* Service de Dermatologie, Hôpital Saint-Eloi, Montpellier; C. Labrèze, MD, Dermatologie Pédiatrique, Hôpital Pellegrin, Bordeaux; F. Truchetet, MD,* Hôpital Beauregard, Thionville; F. Boitier, MD,* Service de Dermatologie, Hôpital Henri-Mondor, Créteil; P. Joly, MD,* Service de Dermatologie, Hôpital Charles Nicolle, Rouen; C. Lok, MD,* Service de Dermatologie, Hôpital Sud, Amiens; B. Dreno, MD,* Clinique Dermatologique, Hôtel-Dieu, Nantes; G. Lorette, MD,* Service de Dermatologie, CHU Trousseau, Tours; J. J. Grob, MD,* Service de Dermatologie, Hôpital Sainte-Marguerite, Marseille; and la Fédération des CRISAP. (Asterisk indicates that the person is a member of the French Cutaneous Cancerology Group).
Corresponding author: Pierre Wolkenstein, MD, PhD, Department of Dermatology, Henri-Mondor Hospital, F-94010 Créteil CEDEX, France.