Piepkorn raises several issues concerning the analysis of our data and our conclusions.
First, he criticizes our use of intraclass correlation coefficients and r coefficients because they have little "intuitive meaning," a deficiency that is not shared by the percentage of "exact agreement." However, our criteria were based on the concept that melanocytic dysplasia includes a spectrum of disordered hyperplasia and cellular findings, and that the difference between no dysplasia and dysplasia with slight cellular atypia is minimal compared with the difference between no dysplasia and dysplasia with severe cellular atypia. Hence, the appropriate measure is one that takes the grades of dysplasia into account. Percentage of exact agreement does not. It should be noted that exact agreement on our 5-point scale occurred only 50% of the time, but most of the disagreements were by only 1 point (grade) on the 5-point scale and hence were relatively inconsequential. It is unfortunate that the most appropriate measures are not intuitive to a substantial portion of readers, but we chose primary outcome measures that are less intuitive rather than ones that are misleading. When these types of measures are used more widely, we believe they will become more readily understood.
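The contrast between exact agreement and a grade-sensitive measure can be illustrated with a minimal Python sketch. The ratings below are invented for illustration and are not data from the study, the function names are hypothetical, and the Pearson r shown here is only one example of a measure that respects the ordering of grades; it is not a reproduction of the published analysis.

```python
import math

# Hypothetical grades (1-5) assigned by two observers to the same 10 specimens.
# These values are invented for illustration; they are NOT study data.
rater_a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
rater_b = [1, 3, 3, 5, 4, 2, 2, 4, 4, 5]


def exact_agreement(a, b):
    """Proportion of specimens given the identical grade by both raters."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def within_one_grade(a, b):
    """Proportion of specimens whose grades differ by at most 1 point."""
    return sum(abs(x - y) <= 1 for x, y in zip(a, b)) / len(a)


def pearson_r(a, b):
    """Pearson correlation coefficient, which uses the size of each difference."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


print("exact agreement: ", exact_agreement(rater_a, rater_b))
print("within one grade:", within_one_grade(rater_a, rater_b))
print("Pearson r:       ", round(pearson_r(rater_a, rater_b), 2))
```

In this invented example the raters match exactly on only half of the specimens, yet every disagreement is a single grade, so a measure that weighs the size of each disagreement reports much closer concordance than the exact-agreement percentage suggests.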
Weinstock MA, Barnhill RL, Rhodes AR, Brodsky GL. The Diagnostic Reproducibility of Melanocytic Dysplasia—Reply. Arch Dermatol. 1998;134(8):1038-1039. doi: