The whiskers illustrate the minimum and maximum reported within each group; the lower and upper borders of the box, the first and third quartiles, respectively; the line in the box, the median; and the diamond, the mean change in VHI-10 score within each group. Greater global change scores were associated with greater mean changes.
Misono S, Yueh B, Stockness AN, House ME, Marmor S. Minimal Important Difference in Voice Handicap Index–10. JAMA Otolaryngol Head Neck Surg. 2017;143(11):1098–1103. doi:10.1001/jamaoto.2017.1621
What is the minimal important difference on the Voice Handicap Index–10 total score?
In this cohort study of 273 patients examined in a university voice clinic, the minimal important difference was 6 points on the Voice Handicap Index–10 total score.
A change of 6 points on the Voice Handicap Index–10 total score may represent a minimal important difference; future studies are needed to investigate how different temporal, medical, and diagnostic contexts may influence this estimate.
Importance
The minimal important difference (MID) on patient-reported outcome measures can indicate how much of a change on that scale is meaningful.
Objective
To use an anchor-based approach to estimate MID in the Voice Handicap Index–10 (VHI-10) total score.
Design, Setting, and Participants
In this cohort study, a volunteer sample of adult patients visiting the voice clinic at the University of Minnesota from April 7, 2013, through July 3, 2016, completed the VHI-10 (range, 0-40, with higher scores indicating greater voice-related handicap) at baseline and 2 weeks later in conjunction with a global rating of change. An anchor-based approach was used to identify an MID. The association between the global change score and change in VHI-10 score was analyzed using Pearson correlation. A distribution-based method was used to corroborate the findings.
Main Outcome and Measures
Global rating of change on the VHI-10.
Results
Of the 273 participants, 183 (67.0%) were women and 90 (33.0%) were men (mean [SD] age, 54.3 [15.6] years); 259 (94.9%) were white. Participants had a variety of voice disorders, most commonly muscle tension dysphonia, irritable larynx, benign vocal fold lesions, and motion abnormalities. Among patients reporting no change on the global change score, the mean (SD) change in VHI-10 score was 1 (5). Among those reporting a small change, the mean (SD) change in VHI-10 was also 1 (5). Among those reporting a moderate change in voice symptoms, the mean (SD) change in VHI-10 score was 6 (8). Among those with a large change, the mean (SD) change in VHI-10 score was 9 (13). The correlation between the global change score and the change in VHI-10 score was 0.32 (95% CI, 0.12-0.49). Distribution-based analyses identified effect sizes comparable to those of the anchor-based categories.
Conclusions and Relevance
These findings suggest that a difference of 6 on the VHI-10 may represent an MID. This difference was associated with a moderate change on the global rating scale, and the small-change and no-change categories were indistinguishable. Given the lack of differentiation between small and no change and the modest correlation between the global change score and change in the VHI-10 score, additional studies are needed.
Minimal important difference (MID) has been defined as the “smallest difference in score in the domain of interest which patients perceive as beneficial”1(p408) and underscores the importance of interpreting numerical findings within the context of the degree of their clinical impact.2 Specifically assessing what degree of numerical change on a symptom or a quality-of-life measure is meaningful to individuals with a relevant condition can allow investigators and others to look beyond whether a difference is statistically significant to determine its importance to patients.
Multiple approaches for determining the MID have been proposed that can generally be categorized as anchor-based and distribution-based methods.3 The anchor-based methods use a separate questionnaire instrument as a comparator for the index of interest, and the distribution-based methods estimate effects based on the distribution of scores on the scale of interest in a relevant population. Other studies4,5 have examined MIDs derived using a variety of approaches for a wide range of quality-of-life or symptom scales and have observed that MIDs frequently approximate 0.5 SD of measured scores on the indices.
The significance of a small change may depend on the perspective of the beholder,6 with potentially different magnitudes of change being important at the individual patient level vs at a population level. The relative contribution of different domains to a total outcome measure score and individual vs group responses may also influence interpretation of MIDs.7 The focus of this study was to identify patient-reported MIDs.8,9
The Voice Handicap Index–10 (VHI-10; range, 0-40, with higher scores indicating greater voice-related handicap)10 is a widely used shorter version of the 30-item VHI.11 The MID has been used to evaluate changes in other disease processes, such as asthma,8 but has not been fully explored with regard to voice outcomes. This information would be useful to interpret changes in the clinical setting and for calculations of power and sample size in study design. The primary objective of this study was to apply an anchor-based method to estimate an MID on the VHI-10; the secondary objective, to use a distribution-based method as a comparator to the anchor-based method. We hypothesized, based on existing literature reflecting expert opinion,12-14 that a change of 5 would be shown to be an MID on the VHI-10 using these methods. To our knowledge, no prior study has aimed to formally establish an MID for the VHI-10.
A single invitation to participate was emailed to patients who visited the voice clinic of the University of Minnesota, Minneapolis, from April 7, 2013, through July 3, 2016; who were included in the clinic research registry; and who had indicated an interest in hearing about future research study opportunities (N = 761). Reminders and response tracking were not performed, and participants were not offered a stipend. All data were stored online using the Research Electronic Data Capture (REDCap) system15 through the University of Minnesota. Data were extracted from the REDCap database and analyzed using SAS software (version 9.3; SAS Institute). The study was approved by the institutional review board of the University of Minnesota. All participants provided written informed consent.
All participants completed the VHI-1010 on paper forms or electronically as part of routine clinical care. The VHI-10 has been described as having good reliability, validity, and treatment responsiveness.10,16-18 Two weeks later,19 participants were invited via email to complete an online global questionnaire about changes in their voice-related quality of life. This question was adapted from a previously established anchor-based method to assess MIDs in patients with asthma,8 and the word voice was substituted for breathing to render it appropriate for patients with voice concerns. Patients could indicate whether their voice-related quality of life was better, about the same, or worse. Patients responding better or worse were then asked to report the level of change; for example, “If better, please check one of the following boxes: A little better, somewhat better, moderately better, a good deal better, a great deal better, a very great deal better.” Each potential response was assigned a numeric value (Table 1). Participants also completed the VHI-10 online at 2 weeks. No systematic voice-related interventions occurred during this interval. A 2-week time frame has been used in prior studies of MID19,20 and was chosen to reduce potential confounding effects and optimize feasibility in this initial study of voice-related MID.
In an anchor-based approach, changes in the outcome of interest are compared with corresponding changes in a global question measuring well-being or treatment effect that clinicians can easily interpret.8 The anchor-based approach uses external clinical or patient-reported markers to distribute patients into several different groups that reflect a level of change in clinical or health status. Anchor-based estimates were summarized using means, medians, and ranges of all available global scales of change. We used the Pearson correlation to assess the associations between individual total VHI-10 scores at baseline and after 14 days. Given the skewed distribution of our data, in which more patients indicated that they had improved than worsened, we also used the Spearman rank correlation, with similar findings. In accordance with prior work by Juniper et al,8 the absolute values of change on the global scale were categorized into magnitudes of change. Absolute change of 0 to 1 was categorized as no change; 2 to 3, as small; 4 to 5, as moderate; and 6 to 7, as large. Means and SDs for the change in total VHI-10 score within each of the change categories were calculated. Correlations between change on the global scale and baseline, follow-up, and change in VHI-10 total score were also calculated using Pearson correlations, as previously described.21
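The categorization described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example records are hypothetical, the global rating is assumed to be an integer from -7 to 7, and the change in VHI-10 score is assumed to be baseline minus follow-up so that positive values indicate improvement. How changes were signed for patients who reported worsening is not specified here, so the sketch leaves the VHI-10 change values as supplied.

```python
from statistics import mean, stdev


def change_category(global_score: int) -> str:
    """Map a global rating of change (assumed range -7..7) to a magnitude category.

    Cutoffs follow the text: |score| 0-1 none, 2-3 small, 4-5 moderate, 6-7 large.
    """
    magnitude = abs(global_score)
    if magnitude > 7:
        raise ValueError("global rating expected in the range -7..7")
    if magnitude <= 1:
        return "none"
    if magnitude <= 3:
        return "small"
    if magnitude <= 5:
        return "moderate"
    return "large"


def summarize_by_category(records):
    """Group (global_score, vhi10_change) pairs by category.

    Returns {category: (n, mean_change, sd_change)}; sd is None when n < 2.
    """
    groups = {}
    for global_score, vhi_change in records:
        groups.setdefault(change_category(global_score), []).append(vhi_change)
    return {
        category: (
            len(values),
            mean(values),
            stdev(values) if len(values) > 1 else None,
        )
        for category, values in groups.items()
    }


# Hypothetical illustration data, NOT taken from the study.
example = [(0, 1), (1, -2), (2, 3), (-3, 0), (4, 6), (5, 8), (7, 12)]
summary = summarize_by_category(example)
for category, (n, avg, sd) in summary.items():
    print(category, n, avg, sd)
```

With real data, the per-category means and SDs produced this way correspond to the kind of summary reported in Table 3.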
A second approach, the distribution-based method, relates the change in the outcome of interest to some measure of its normal variability, for example, the sample SD of the measure at baseline. This method was used to examine the magnitude of effect size that was represented by the MID identified using the aforementioned anchor-based method. The effect size technique that we used, attributed to Cohen,22 was calculated as the standardized measure of change obtained by dividing the difference in scores from baseline to after treatment by the SD of absolute change of baseline scores over the percentage score change.8,9,22 This effect size was then adjusted to account for the dependence between samples (formula for standardized response mean),23 with the Cohen d value representing the effect size for independent samples and the r value representing correlation between baseline and outcome. Although their application is limited, recommended cutoffs by Cohen22 to aid the interpretation of effect sizes are defined as small (effect size of 0.2), moderate (effect size of 0.5), and large (effect size of 0.8).
Post hoc analyses examining the impact of typical vocal demand were also completed. All patients in the voice clinic were asked to rate their typical vocal demand in 1 of the following 5 categories: undemanding, intermittent, routine, extensive, and extraordinary. To determine whether routine vocal demand influenced responses, patients reporting extensive and extraordinary vocal demands underwent analysis together as a subgroup. Anchor- and distribution-based analyses were conducted in this group as described above.
A total of 273 patients 18 years or older who had 14-day follow-up VHI-10 data were identified for analysis (183 women [67.0%] and 90 men [33.0%]; mean [SD] age, 54.3 [15.6] years). Distributions of patient demographics and voice-related clinical diagnoses were consistent with those previously published from this patient population,24 including a predominance of white race, a wide range of ages, female predominance, and most frequent diagnoses including muscle tension dysphonia, irritable larynx syndrome, and benign vocal fold lesions (Table 2).
Most respondents reported that their voice-related symptoms were about the same (165 [60.4%]), with 90 (33.0%) reporting that their symptoms were better and 19 (7.0%), worse. Mean (SD) total VHI-10 score at baseline was 20.0 (9.6) and at 14 days, 17.5 (8.4). The mean (SD) change score for all respondents was 3.0 (6.5).
Distributions of the change of the VHI-10 score after 14 days are shown in the Figure. For anchor-based analysis, we observed that the VHI-10 Δ range was smallest (−12 to 15) for patients reporting a small change after 14 days and largest (4 to 34) for the patients reporting a large change after 14 days (Table 3). The mean (SD) change score was 1 (5) for the no-change and small-change groups, 6 (8) for the moderate-change group, and 9 (13) for the large-change group. Pearson correlation between baseline and follow-up scores was 0.74 (95% CI, 0.68-0.79), and the Spearman correlation coefficient was 0.75 (95% CI, 0.69-0.79). The correlation between global change and baseline score was 0.08 (95% CI, −0.04 to 0.19); between global change and follow-up score, −0.14 (95% CI, −0.25 to −0.02); and between global change and VHI-10 Δ, 0.32 (95% CI, 0.12-0.49), supporting the validity of the global change score.
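The method used to obtain the confidence intervals is not stated here. One common choice, shown below as a sketch under that assumption, is the Fisher z transformation: the correlation is transformed with atanh, a normal interval with standard error 1/sqrt(n - 3) is built on that scale, and the limits are transformed back with tanh. The function name is hypothetical; the inputs r = 0.74 and n = 273 are the baseline vs follow-up correlation and sample size reported above.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pearson_ci(r: float, n: int, level: float = 0.95):
    """Approximate confidence interval for a Pearson correlation (Fisher z method)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = atanh(r)                                   # Fisher z transform
    se = 1 / sqrt(n - 3)                           # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return tanh(z - crit * se), tanh(z + crit * se)


low, high = pearson_ci(0.74, 273)
print(f"{low:.2f} to {high:.2f}")
```

For these inputs the interval rounds to 0.68 to 0.79, in line with the reported values. Intervals that are noticeably wider than this method gives for n = 273 would suggest that a correlation was estimated in a smaller subset of respondents.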
To corroborate the anchor-based findings, the estimated changes were also evaluated in a distribution-based fashion. The Cohen d was calculated for each category of change. We observed that a small change corresponded to an effect size of 0.14, a moderate change corresponded to an effect size of 0.71, and a large change corresponded to an effect size of 0.98 (Table 3).
Post hoc subgroup analysis was performed to assess the effect of routine vocal demand on patient responses. This analysis showed that in the 93 patients with extensive or extraordinary vocal demand, the no-change and small-change categories remained indistinguishable (Table 4). In these patients, a moderate change was associated with a mean (SD) VHI-10 change score of 8 (10; range, −4 to 29) and a large change with a mean (SD) change score of 15 (12; range, 4 to 34). Associated effect sizes are listed in Table 4.
Anchor-based analyses suggested that a mean change of 6 in the total VHI-10 score was associated with an MID, which was considered to be moderate in size by this group of 273 respondents. As would be expected, effect sizes calculated using a distribution-based approach were larger for categories reflecting greater change.22 The effect size corresponding to the MID of 6 that was identified using the anchor-based method was moderate. Anchor-based and distribution-based approaches have been heavily used in the literature, with some authors6 proposing integration of the 2 approaches and others25 suggesting that an anchor-based absolute change method may be more robust. For this study, we used simple and well-established methods for each approach with the goal of identifying an MID that could serve as a basis for future research in the area.
Based on clinical experience, experts have estimated a difference of 5 as a clinically important difference on the VHI-10,12 and this difference is comparable to the 0.5-SD estimate in other studies that have used the VHI-10 as an outcome measure. For example, prior reported SDs have included 9.5 in a single-center study of 533 new patients to the same tertiary care voice clinic as in the present study,24 8 in a study of 170 patients in a national practice-based research network,26 and 7.2 in a study of 139 patients with spasmodic dysphonia,14 suggesting that the 0.5-SD approach4 might identify 3.6 to 4.75 as an MID. On the 30-item VHI, a difference of 13 to 16 points has been proposed to be clinically meaningful27; scaling this difference according to proposed conversion factors for VHI to VHI-1010 suggests that an MID would be expected to range from 3.7 to 7.2 points on the VHI-10, which is consistent with the findings herein. In post hoc subgroup analysis of patients reporting extensive or extraordinary routine vocal demand, moderate and large degrees of change were associated with slightly larger differences in VHI-10 scores. The differences between responses from those with extensive or extraordinary vocal demand and the overall study population were not statistically significant and may not be large enough for meaningful interpretation. Of note, the same and small-change categories were comparable in the extensive or extraordinary vocal demand group; these findings were the same in the overall study population.
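The 0.5-SD arithmetic in the paragraph above can be checked with a short Python sketch. The SDs are the three values quoted from the prior studies; the dictionary keys are informal labels added here for readability.

```python
# SDs of VHI-10 scores quoted above from prior studies (labels are informal).
reported_sds = {
    "single-center voice clinic (n = 533)": 9.5,
    "practice-based research network (n = 170)": 8.0,
    "spasmodic dysphonia (n = 139)": 7.2,
}

# The 0.5-SD heuristic takes half of the SD as a candidate MID.
half_sd_mid = {label: 0.5 * sd for label, sd in reported_sds.items()}

for label, mid in half_sd_mid.items():
    print(f"{label}: {mid}")

print("range:", min(half_sd_mid.values()), "to", max(half_sd_mid.values()))
```

The resulting candidates span 3.6 to 4.75 points, which is the range given in the text.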
We estimated the MID from the patient’s perspective as recommended by Schünemann and Guyatt,28 who advocated for a shift away from minimal (clinically) important difference to minimal important difference to emphasize the central importance of the patient perspective29 when interpreting patient-reported outcome measures.28 Findings were gathered in a broad sample of patients with voice disorders, contributing to the robustness of the study. In addition, we observed a higher correlation of the global change score and the VHI-10 Δ value than with the baseline or follow-up scores, providing some evidence of the validity of the transition. This correlation was modest, suggesting that further investigation into the MID is warranted.
The finding that the small-change and no-change categories had comparable means suggests that perhaps a small change was not believed to be important or was difficult to distinguish by the participants, even those with extensive or extraordinary vocal demand. This finding was somewhat surprising because this scale has previously been described as having good sensitivity to change30; however, vocal function and voice-related handicap can be affected by a wide range of factors, including environment and vocal demand.31,32 Some items of the VHI-10 also may be more sensitive to change than others. For example, the item, “My voice problem upsets me,” may change more rapidly than the item, “My voice problem causes me to lose income,” because the VHI-10 was not designed around a specific time frame for assessment. These scale characteristics may influence patient perception of change.
Limitations of this study are important to acknowledge. First, the participants were volunteers from an academic voice clinic research registry. The baseline scores of the participants who volunteered for this study were not statistically different from those in the research registry who did not participate in this study, suggesting that the findings are not heavily influenced by selection bias; however, findings from a single academic voice center cannot necessarily be assumed to generalize to the entire population. The sample was of limited racial diversity and generally educated, reflecting the population of Minnesota and our clinic, respectively. Future studies would benefit from broader participant diversity. Second, most of the patients reporting a change indicated that their voice-related quality of life had improved. Although this change is clinically desirable, the MID derived from these data reflects a minimal important improvement, which cannot necessarily be assumed to be the same as a minimal important decrement. Another potential limitation is the incorporation of multiple types of media, including electronic completion and paper completion. However, a recent unrelated shift at the institutional level from the use of paper forms to electronic forms has not led to detectable differences in population mean baseline VHI-10 score among patients who seek voice care, suggesting that the medium used for questionnaire completion is unlikely to have greatly influenced the findings. Nonetheless, follow-up studies will be conducted with all questionnaires administered electronically.
This study was conducted with a 2-week follow-up time in a voice clinic population with a variety of laryngeal diagnoses. Future studies would benefit from longer follow-up to determine whether different time frames may influence patient responses. In addition, different voice-related diagnoses may be associated with different magnitudes of MID33 because the VHI-10 is not diagnosis-specific. Typical vocal demand, voice treatment, and psychosocial or other factors may influence responses on the VHI-1024 or on the global change scale. Larger studies would facilitate investigation of these issues.
In this study examining MID in VHI-10, small change and no change on an anchoring global scale were associated with indistinguishable changes on total VHI-10 score. A moderate change was associated with a mean change of 6 in the total VHI-10 score and represented an MID in this sample of patients with a variety of voice-related diagnoses. The correlation between the global change score and change in VHI-10 score was modest, suggesting that additional investigation is needed. Further studies, including additional time points, larger sample sizes, and inclusion of other patient factors, could provide more evidence for an MID in this widely used scale.
Corresponding Author: Stephanie Misono, MD, MPH, Department of Otolaryngology, 420 Delaware St SE, MMC Box 396, Minneapolis, MN 55410 (email@example.com).
Accepted for Publication: June 30, 2017.
Published Online: September 28, 2017. doi:10.1001/jamaoto.2017.1621
Author Contributions: Drs Misono and Marmor had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Misono, Yueh, Marmor.
Acquisition, analysis, or interpretation of data: Misono, Stockness, House, Marmor.
Drafting of the manuscript: Misono, Marmor.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Marmor.
Obtained funding: Misono.
Administrative, technical, or material support: Misono, Yueh, Stockness, House.
Study supervision: Misono.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.
Funding/Support: This study was supported by grants KL2TR000113 and UL1TR000114 from the National Institutes of Health (Dr Misono).
Role of the Funder/Sponsor: The sponsor had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: Gordon Guyatt, MD, MSc, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada, provided critical review and input. No compensation was received for this contribution.