The otherwise carefully designed and executed study by Weinstock et al¹ comes to conclusions that seem rather at variance with the objective data. According to the authors, "melanocytic dysplasia can be reproducibly graded among diverse general dermatopathologists." A review of the salient data suggests otherwise.
Their values for interobserver concordance in the diagnosis of melanocytic dysplasia are presented according to several different measures, a few of which will have little intuitive meaning for most of those who deal with issues of diagnostic reliability. For example, how much reliability is indicated by a mean intraclass correlation coefficient of 0.67? How about a κ statistic of 0.76? Or a range of Pearson product moment correlations from 0.67 to 0.84? On the other hand, rates of exact agreement among observers are readily understood. Weinstock et al¹ report that for a 5-point rating scale (no dysplasia, slight dysplasia, moderate dysplasia, severe dysplasia, and melanoma) the average pairwise exact agreement rate was 50% for the 5 dermatopathologists; when the scale was condensed to 3 points (not dysplastic, dysplastic, melanoma), the rate of agreement improved to 63% among the observers. Weinstock et al¹ suggest from these data that melanocytic dysplasia can be diagnosed with "reasonable although imperfect reliability." How useful can a diagnosis be with a mere 50% to 63% rate of observer agreement? No wonder that a study using the dysplastic nevus as a genotypic marker in genetic linkage analyses and purporting the assignment of a melanoma susceptibility gene to chromosome 1p² has to date proved irreproducible.³
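For readers who want to see how an average pairwise exact agreement rate is arrived at, the following is a minimal sketch in Python. The ratings are hypothetical, invented purely for illustration, and are not data from the study; the number of observers and lesions, the numeric coding of the scale, and the function names (`collapse`, `exact_agreement`, `mean_pairwise_agreement`) are likewise assumptions rather than the authors' method.

```python
from itertools import combinations

# Hypothetical ratings (NOT data from the study): one row per observer,
# one column per lesion, coded on the 5-point scale as
# 0 = no dysplasia, 1 = slight, 2 = moderate, 3 = severe, 4 = melanoma.
RATINGS = [
    [0, 1, 2, 3, 4, 2, 1, 0],
    [0, 2, 2, 3, 4, 1, 1, 1],
    [1, 1, 3, 3, 4, 2, 0, 0],
]


def collapse(score):
    """Condense the 5-point scale to 3 points:
    0 = not dysplastic, 1 = dysplastic (any grade), 2 = melanoma."""
    if score == 0:
        return 0
    return 2 if score == 4 else 1


def exact_agreement(a, b):
    """Fraction of lesions on which two observers gave the identical rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_agreement(ratings):
    """Average the exact agreement rate over every pair of observers."""
    pairs = list(combinations(ratings, 2))
    return sum(exact_agreement(a, b) for a, b in pairs) / len(pairs)


five_point = mean_pairwise_agreement(RATINGS)
three_point = mean_pairwise_agreement(
    [[collapse(s) for s in row] for row in RATINGS]
)

print(f"5-point scale: {five_point:.0%}")
print(f"3-point scale: {three_point:.0%}")
```

With these invented ratings, condensing the scale raises the agreement rate simply because disagreements between adjacent grades of dysplasia no longer count. The figure is not corrected for agreement expected by chance, which is one reason the rate is easier to interpret than a κ statistic but is not interchangeable with it.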
Piepkorn M. The Diagnostic Reproducibility of Melanocytic Dysplasia. Arch Dermatol. 1998;134(8):1037-1038. doi: