What is the interrater reliability of the skin-specific scores of the National Institutes of Health response criteria for chronic graft-vs-host disease?
In this study of 10 physicians (6 blood and marrow transplant specialists and 4 dermatologists) who evaluated 8 patients with cutaneous chronic graft-vs-host disease, interrater agreement was best for range of motion scoring across all groups. Dermatologists had acceptable agreement for the skin graft-vs-host disease and skin feature scores, near perfect agreement in identifying sclerosis, and poor agreement for skin sclerosis grading.
Although dermatologists had significant agreement in identifying cutaneous sclerosis, methods of grading severity of cutaneous chronic graft-vs-host disease appear to need improvement.
Importance
Cutaneous chronic graft-vs-host disease (cGVHD) is common after allogeneic hematopoietic stem cell transplant and is often associated with poor patient outcomes. A reliable and practical method for assessing disease severity and response to therapy among these patients is urgently needed.
Objective
To evaluate the interrater agreement and reliability of skin-specific and range of motion (ROM) variables of the 2014 National Institutes of Health (NIH) response criteria for cGVHD and a skin sclerosis grading scale (SSG).
Design, Setting, and Participants
In this observational study performed at a single tertiary academic center, 6 academic blood and marrow transplant specialists and 4 medical dermatologists examined 8 patients with diagnosed cutaneous cGVHD on July 10, 2015. The patient cohort was enriched for patients with sclerotic features. Each patient was evaluated by using the skin-specific and ROM criteria of the 2014 NIH response criteria for cGVHD and an SSG ranging from 0 to 3. Each patient was also asked to complete quality-of-life scoring instruments. Interrater agreement and reliability were estimated by calculating the Krippendorff α and Cohen κ statistics. Data were analyzed from September 29, 2015, through November 22, 2018.
Main Outcomes and Measures
Estimation of interrater agreement by agreement coefficients (Krippendorff α and Cohen κ statistics) for the skin-specific and ROM components of the 2014 NIH response criteria for cGVHD and for the SSG.
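As an illustration of how a chance-corrected agreement statistic such as the Cohen κ is computed, the following is a minimal Python sketch for two raters scoring the same patients on a nominal scale. It is not the study's analysis code: the function name `cohen_kappa` and the example scores are hypothetical, and the study itself also used the Krippendorff α, which accommodates more than two raters and is not shown here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen kappa for two raters scoring the same subjects on a nominal scale."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: proportion of subjects given the same score by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal score frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if p_expected == 1.0:
        # Kappa is undefined when both raters use a single identical category;
        # returning 1.0 is a convention of this sketch, not a standard.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical SSG scores (0 to 3) for 8 patients from two raters; not study data.
rater_1 = [0, 1, 2, 3, 2, 1, 3, 0]
rater_2 = [0, 1, 1, 3, 2, 2, 3, 0]
kappa = cohen_kappa(rater_1, rater_2)
print(f"Cohen kappa: {kappa:.2f}")
```

In this sketch the two raters disagree only on intermediate grades (1 vs 2), which lowers κ even though the extreme grades match. Because this unweighted form treats every disagreement as equally serious, a weighted κ or an ordinal Krippendorff α may suit an ordered scale such as the SSG better.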
Results
The median age of the patients evaluated was 54 years (range, 46-58 years). Patients were predominantly male (6 [75%]). Six of the 8 patients had a predominantly sclerotic cutaneous phenotype. Interrater agreement among our experts was acceptable for NIH skin feature score (0.68; 95% CI, 0.30-0.86) and good for NIH ROM scoring (0.80; 95% CI, 0.68-0.86). Dermatologists had acceptable agreement for NIH skin GVHD score (0.69; 95% CI, 0.25-0.82) and skin feature score (0.78; 95% CI, 0.17-0.98), good agreement in ROM grading (0.85; 95% CI, 0.69-0.90), and near perfect agreement in identifying sclerosis (0.82; 95% CI, 0.27-0.97).
Conclusions and Relevance
Although dermatologists had acceptable agreement in NIH skin GVHD score and skin feature score and near perfect agreement in identifying cutaneous sclerosis, better agreement in grading the severity of cutaneous cGVHD, especially in the intermediate grades, appears to be needed.
Cardones AR, Sullivan KM, Green C, et al. Interrater Reliability of Clinical Grading Measures for Cutaneous Chronic Graft-vs-Host Disease. JAMA Dermatol. 2019;155(7):833–837. doi:10.1001/jamadermatol.2018.5459