Brief Report
April 17, 2019

Interrater Reliability of Clinical Grading Measures for Cutaneous Chronic Graft-vs-Host Disease

Author Affiliations
  • 1Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina
  • 2Department of Dermatology, Duke University, Durham, North Carolina
  • 3Durham Veterans Affairs Medical Center, Durham, North Carolina
  • 4Division of Cellular Therapy, Department of Medicine, Duke University, Durham, North Carolina
  • 5Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina
JAMA Dermatol. 2019;155(7):833-837. doi:10.1001/jamadermatol.2018.5459
Key Points

Question  What is the interrater reliability of the skin-specific scores of the National Institutes of Health response criteria for chronic graft-vs-host disease?

Findings  In this study of 10 physicians (6 blood and marrow transplant specialists and 4 dermatologists) who evaluated 8 patients with cutaneous chronic graft-vs-host disease, interrater agreement across all groups was best for range of motion scoring. Dermatologists had acceptable agreement for the skin graft-vs-host disease and skin feature scores, near perfect agreement in identifying sclerosis, and poor agreement for skin sclerosis grading.

Meaning  Although dermatologists had significant agreement in identifying cutaneous sclerosis, methods of grading severity of cutaneous chronic graft-vs-host disease appear to need improvement.

Abstract

Importance  Cutaneous chronic graft-vs-host disease (cGVHD) is common after allogeneic hematopoietic stem cell transplant and is often associated with poor patient outcomes. A reliable and practical method for assessing disease severity and response to therapy among these patients is urgently needed.

Objective  To evaluate the interrater agreement and reliability of skin-specific and range of motion (ROM) variables of the 2014 National Institutes of Health (NIH) response criteria for cGVHD and a skin sclerosis grading scale (SSG).

Design, Setting, and Participants  In this observational study performed at a single tertiary academic center, 6 academic blood and marrow transplant specialists and 4 medical dermatologists examined 8 patients with diagnosed cutaneous cGVHD on July 10, 2015. The patient cohort was enriched for patients with sclerotic features. Each patient was evaluated by using the skin-specific and ROM criteria of the 2014 NIH response criteria for cGVHD and an SSG ranging from 0 to 3. Each patient was also asked to complete quality-of-life scoring instruments. Interrater agreement and reliability were estimated by calculating the Krippendorff α and Cohen κ statistics. Data were analyzed from September 29, 2015, through November 22, 2018.
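For readers unfamiliar with chance-corrected agreement, the following is a minimal sketch of how a Cohen κ statistic can be computed for two raters. It is not the authors' analysis code. The function name `cohen_kappa` and the ratings are hypothetical and serve only to illustrate the arithmetic. The abstract does not say how κ was applied across 10 raters, so this sketch covers only the two-rater case.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items (illustrative helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        # Kappa is undefined when both raters use one identical label
        # throughout; returning 1.0 here is a convention, not a standard.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical sclerosis calls (1 = present, 0 = absent) for 8 patients.
# These values are invented for illustration and are not study data.
rater_1 = [1, 1, 1, 0, 1, 1, 0, 1]
rater_2 = [1, 1, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

In this invented example the raters agree on 7 of 8 patients (0.875), but agreement expected by chance is 0.5625, so κ is lower than the raw agreement. This gap is the reason chance-corrected statistics are preferred over simple percent agreement.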

Main Outcomes and Measures  Interrater agreement, estimated with the Krippendorff α and Cohen κ statistics, for the skin-specific and ROM components of the 2014 NIH Response Criteria for Chronic GVHD and for the SSG.

Results  The median age of the patients evaluated was 54 years (range, 46-58 years). Patients were predominantly male (6 [75%]). Six of the 8 patients had a predominantly sclerotic cutaneous phenotype. Interrater agreement among our experts was acceptable for NIH skin feature score (0.68; 95% CI, 0.30-0.86) and good for NIH ROM scoring (0.80; 95% CI, 0.68-0.86). Dermatologists had acceptable agreement for NIH skin GVHD score (0.69; 95% CI, 0.25-0.82) and skin feature score (0.78; 95% CI, 0.17-0.98), good agreement in ROM grading (0.85; 95% CI, 0.69-0.90), and near perfect agreement in identifying sclerosis (0.82; 95% CI, 0.27-0.97).

Conclusions and Relevance  Although dermatologists had acceptable agreement in NIH skin GVHD score and skin feature score and near perfect agreement in identifying cutaneous sclerosis, better agreement in grading the severity of cutaneous cGVHD, especially in the intermediate grades, appears to be needed.
