Figure 1. Digital picture of maximum glottal opening. A, Patient with bilateral vocal cord paralysis. B, Healthy control subject. GA indicates glottal area; s, standard distance in pixels.
Figure 2. Göttingen Hoarseness Diagram showing voice quality. Ellipses reflect the distributions of vowel irregularity and noise (center of ellipses, group mean; semiaxes, SDs). *P ≤ .01 for the study group with bilateral vocal cord paralysis (BVCP) vs normal, aphonic, and untreated vocal cord paralysis voice data as published by Fröhlich et al.10
Harnisch W, Brosch S, Schmidt M, Hagen R. Breathing and Voice Quality After Surgical Treatment for Bilateral Vocal Cord Paralysis. Arch Otolaryngol Head Neck Surg. 2008;134(3):278-284. doi:10.1001/archoto.2007.44
Copyright 2008 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Objective
To evaluate long-term results of surgical treatment for bilateral vocal cord paralysis using objective and subjective measures of breathing and voice quality.
Design
Prospective cross-sectional case series.
Setting
Tertiary care otolaryngology and speech pathology referral center.
Patients
Ten patients with bilateral vocal cord paralysis who underwent surgical treatment between October 1996 and May 2006 at the Department of Otorhinolaryngology–Head and Neck Surgery, University of Würzburg, were examined at a mean of 27.2 months after surgery.
Main Outcome Measures
Glottal area, voice range profile, Voice Handicap Index, pulmonary function test results, Göttingen Hoarseness Diagram, microlaryngostroboscopic findings, chronic respiratory disease questionnaire, and European Organization for Research and the Treatment of Cancer quality-of-life questionnaire, including the head and neck module.
Results
Residual recurrent nerve function was seen in 9 of 10 patients. Pulmonary data varied widely and did not correlate with the size of the glottal area. Quality of life, subjective dyspnea, and physical functioning correlated with expiratory airflow measures. Voice range was reduced in all patients. High breathiness and reduced maximum phonation time led to increased Voice Handicap Index scores.
Conclusions
Microlaryngostroboscopic findings did not necessarily correlate with subjective dyspnea and vocal complaints. Reduction of inspiratory speaking efforts and acquisition of special breathing techniques improve airflow stability and effectiveness of respiration, leading to enhanced quality of life.
To relieve upper airway obstruction caused by bilateral vocal cord paralysis (BVCP), several techniques for surgical glottal widening have been developed. Previous studies evaluated the efficacy of different approaches such as vocal cord laterofixation,1 arytenoidectomy,2,3 cordectomy,4 and posterior transverse cordotomy.5,6 Individual success for increased airflow is often compromised by impaired vocal function and deglutition. Patient satisfaction is influenced by the surgeon's ability to maintain a balance between those various parts of laryngeal function. The objective of this study was to evaluate long-term results of surgical glottal widening by using objective methods and symptom-related quality-of-life questionnaires. Ethics approval was obtained before enrolling subjects in the study.
An overview of the patient characteristics and interventions is given in Table 1. Ten patients (2 men and 8 women) aged 41 to 76 years (mean age, 66.5 years) who underwent surgical glottal widening for BVCP between October 15, 1996, and May 8, 2006, at the Department of Otorhinolaryngology–Head and Neck Surgery, University of Würzburg, Würzburg, Germany, gave their informed consent to participate in the study. Subjects with a tracheostoma were excluded to allow comparison of pulmonary function test (PFT) results. The cause of BVCP was strumectomy (7 patients), idiopathic (2 patients), or thoracic surgery (1 patient). Initially, 2 patients had been treated with posterior cordectomy, while 1 patient received a laterofixation and 7 patients underwent either conventional (1 patient) or laser surgical (6 patients) posterior transverse cordotomy. None of the subjects received any topical agent such as mitomycin C. Surgery was performed by different senior consultants of our clinic (R.H. and coworkers). In 4 patients, revision surgery was necessary because of scarring (3 patients) or postoperative bleeding. Posterior transverse cordotomy was performed in all cases. One patient developed a polyp of the vocal fold, which was removed 3 months after glottal widening. Postoperative treatment included voice rest for 3 to 7 days (mean, 4.2 days) and anti-inflammatory medication as needed. Most of the patients received intravenous prednisolone-21-hydrogen succinate, 250 to 500 mg, once during the first postoperative week. In 4 patients, daily inhalation of budesonide was performed. One patient had been taking antireflux medication before surgical glottal widening, which was continued after surgery. The remaining patients were not using proton pump inhibitors or histamine2 blockers such as omeprazole or ranitidine hydrochloride. Nine of 10 patients were nonsmokers at the time of intervention and follow-up. One woman had been smoking for more than 30 years and continued to smoke after glottal widening. Seven patients attended postoperative speech therapy to reduce inspiratory speaking efforts and to improve breathing technique. The study protocol was conducted during a single visit at least 6 months (range, 6-57 months; mean, 27.2 months) after the last surgical vocal cord procedure.
Microlaryngostroboscopy was performed and videotaped by one of us (S.B.). Laryngeal status was assessed by mucosal wave, residual movement, vocal cord position, additional pathologic findings, and evaluation of laryngeal function during swallowing of colored water. A frame in the state of maximum glottal opening was selected for subsequent analysis.
Measurement of genuine glottal area (GA) during microlaryngostroboscopy requires special laryngoscopes with scales that take angle, distortion, and resolution of the lenses into account. A relative measure of GA can be derived by analyzing a digital picture of maximum glottal opening and by computing the GA fraction compared with a standard measure. As shown in Figure 1, we used the distance between the posterior laryngeal commissure and the anterior vocal fold angle during maximum glottal opening as the standard distance (s) in pixels. The GA index (GAi) was defined as GA expressed as a fraction of the standardized area (s²), both in pixels squared, as shown in the following equation: GAi = GA/s². Microlaryngoscopic videotapes of 40 healthy adults without disturbance of vocal cord movement were analyzed as a reference group.
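The GAi definition can be illustrated with a minimal sketch. The source does not say how GA was measured in the image, so the shoelace-formula tracing of a glottal outline below is an assumption, and the function names and the triangular example outline are hypothetical; only the ratio GA/s² comes from the text.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(outline: Sequence[Point]) -> float:
    """Area in pixels squared of a closed outline (shoelace formula).

    ASSUMPTION: the glottal opening is traced as a polygon; the source
    does not describe how GA was extracted from the picture.
    """
    n = len(outline)
    if n < 3:
        raise ValueError("an outline needs at least 3 points")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]  # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def glottal_area_index(ga_px2: float, s_px: float) -> float:
    """GAi = GA / s^2, with GA in pixels squared and s in pixels."""
    if s_px <= 0:
        raise ValueError("standard distance must be positive")
    return ga_px2 / (s_px * s_px)


# Hypothetical triangular outline: base 40 px, height 200 px.
outline = [(0.0, 0.0), (40.0, 0.0), (20.0, 200.0)]
ga = polygon_area(outline)            # 0.5 * 40 * 200 = 4000 px^2
gai = glottal_area_index(ga, 200.0)   # 4000 / 200^2 = 0.1
print(f"GA = {ga:.0f} px^2, GAi = {gai:.3f}")
```

Because both GA and s² are measured in the same image, the index is dimensionless and does not depend on image magnification, which is the reason a relative measure can replace a calibrated laryngoscope.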
Pulmonary function tests were administered by a single operator in the Department of Medicine using a commercially available device (JAEGER Master Screen Body; VIASYS Healthcare GmbH, Höchberg, Germany). Full calibration and verification of equipment were performed before each set of tests. Measurements included registration of flow-volume loops during forced ventilation and body plethysmography. We analyzed forced vital capacity (FVC), forced expiratory volume in the first second of expiration (FEV1), peak expiratory flow (PEF), forced inspiratory volume in the first second of inspiration (FIV1), and peak inspiratory flow (PIF), as well as intrathoracic gas volume, total airway resistance, resistance during inspiration, and resistance during expiration.
During clinical assessment, all patients were asked whether they experienced shortness of breath at rest or on exertion. However, during subsequent analysis, a more detailed characterization of subjective dyspnea seemed desirable. Therefore, we administered the German version (originally developed by Guyatt et al7) of the Standardized Self-administered Chronic Respiratory Disease Questionnaire8 (CRQ-SAS). The CRQ-SAS is divided into 4 domains: dyspnea, fatigue, emotional functioning, and mastery. Answers are given on a 7-point scale expressing the degree of disability from 1 (maximum impairment) to 7 (no impairment). The standardized dyspnea domain comprises 5 items validated by patients with chronic respiratory diseases as being the most important everyday life activities causing shortness of breath.7 All patients received the questionnaire via mail after their visit. Previous studies5,6 showed stable breathing and voice measures after the sixth postoperative month in patients who underwent surgical treatment for BVCP. When returning the questionnaire, none of our patients reported any changes in breathing or quality of life since their last visit. Therefore, objective measures obtained during the study visit could be correlated with subsequently reported CRQ-SAS scores.
For each patient, a voice range profile (phonetogram) was recorded using commercially available equipment (lingWAVES Phonetogram Plus; lingCOM GmbH, Forchheim, Germany). Analysis included minimum and maximum intensity in decibels, dynamic range in decibels, minimum and maximum frequency in hertz, frequency range on a logarithmic halftone scale, and the mean fundamental frequency. Maximum phonation time in seconds was recorded during voicing of the vowel /a/ after maximal inspiration at spontaneous comfortable pitch and loudness. The phonatory quotient was calculated by dividing vital capacity in liters by maximum phonation time in seconds. Voice acoustics were evaluated using standard variables such as jitter and shimmer, as well as the Göttingen Hoarseness Diagram (GHD). The GHD is a voice analysis software program that was developed by Michaelis et al9 and provides objective voice analysis even in irregular voices. The recording protocol requires phonation of 4 series of the vowels /ε/, /a/, /e/, /i/, /o/, /u/, and /ε/, during which the patient phonates at comfortable pitch (first series), at low pitch (second series), at high pitch (third series), and again at comfortable pitch after reading a standardized text passage (fourth series [afterload]). The stationary part of the signal is used for computerized analysis. Results are plotted in a 2-dimensional diagram, where x values reflect aperiodicity of the voice (irregularity component), calculated from jitter, shimmer, and period correlation, and y values represent the noise component of the signal, calculated from the glottal-to-noise excitation ratio. To allow easy interpretation, ellipses are used to illustrate the distribution of single-vowel measures (28 vowels per test). The center of the ellipses represents the mean value of the noise and irregularity component, while semiaxes are defined by the associated standard deviation. 
The GHD offers a graphic illustration of voice quality that has proven to be statistically significantly different for specific pathophysiologic phonation conditions.10
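Two of the voice measures described above reduce to simple arithmetic and can be sketched as follows. This is not the GHD software: it only shows how an ellipse center and semiaxes follow from per-vowel irregularity and noise values, and how the phonatory quotient is formed. The function names and all input values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption because the source does not specify it.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def phonatory_quotient(vital_capacity_l: float, mpt_s: float) -> float:
    """Vital capacity (L) divided by maximum phonation time (s), in L/s."""
    if mpt_s <= 0:
        raise ValueError("maximum phonation time must be positive")
    return vital_capacity_l / mpt_s


def ghd_ellipse(irregularity: Sequence[float],
                noise: Sequence[float]) -> Dict[str, float]:
    """Ellipse summary of per-vowel values: center = means, semiaxes = SDs.

    ASSUMPTION: sample SD (n - 1 denominator); the source only says
    'standard deviation'.
    """
    if len(irregularity) != len(noise) or len(noise) < 2:
        raise ValueError("need paired values for at least 2 vowels")
    return {
        "center_x": mean(irregularity),   # irregularity component (x axis)
        "center_y": mean(noise),          # noise component (y axis)
        "semiaxis_x": stdev(irregularity),
        "semiaxis_y": stdev(noise),
    }


# Hypothetical values; a full GHD recording would supply 28 vowels.
ellipse = ghd_ellipse([5.0, 6.0, 7.0], [2.0, 3.0, 4.0])
pq = phonatory_quotient(3.5, 10.0)  # hypothetical 3.5 L over 10 s
print(ellipse, f"PQ = {pq:.2f} L/s")
```

A shorter phonation time at the same vital capacity raises the phonatory quotient, which is why the quotient is read as an indicator of air loss during phonation.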
A standardized reading passage recorded during measurement of the GHD was evaluated by 6 experienced listeners (4 phoniatric specialists and 2 speech therapists [S.B. and coworkers]) according to a simplified version of the GRBAS scale, which is widely used in German-speaking clinics. Grades of hoarseness (G), roughness (R), and breathiness (B) are reported using a 4-point scale ranging from 0 (no deviance) to 3 (severe deviance). Speech samples were presented anonymously in random order. Within the test set, 1 sample was presented twice to evaluate intrapersonal judgment reliability.
To assess patients' voice-related quality of life, the German version of the Voice Handicap Index (VHI), validated by Nawka et al,11 was applied. The questionnaire contains 30 items in 3 subscales (functional, emotional, and physical [10 items per subscale]), designed to quantify patients' self-assessment of everyday voice handicap. Answers are given on a 5-point scale ranging from 0 (never) to 4 (always). The overall VHI score (raw score) can be used to grade subjective handicap from 0 (no handicap [raw score, 0-14]) to 3 (severe handicap [raw score, 51-120]).
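The VHI scoring described above can be sketched as a raw sum followed by a banded grade. The text gives only the lowest band (0-14, grade 0) and the highest band (51-120, grade 3); the two middle bands used below (15-28 for grade 1 and 29-50 for grade 2) are an assumption made for illustration and should be checked against the validated German VHI before any real use. Function and constant names are hypothetical.

```python
from typing import Sequence

# (upper bound of raw score, grade).
# From the text: 0-14 -> grade 0, 51-120 -> grade 3.
# ASSUMED for illustration: 15-28 -> grade 1, 29-50 -> grade 2.
VHI_BANDS = [(14, 0), (28, 1), (50, 2), (120, 3)]


def vhi_raw_score(answers: Sequence[int]) -> int:
    """Sum of 30 item answers, each from 0 (never) to 4 (always)."""
    if len(answers) != 30:
        raise ValueError("the VHI has 30 items")
    if any(a not in range(5) for a in answers):
        raise ValueError("each answer must be an integer from 0 to 4")
    return sum(answers)


def vhi_grade(raw: int) -> int:
    """Map a raw score (0-120) to a handicap grade (0-3)."""
    if not 0 <= raw <= 120:
        raise ValueError("raw score must lie between 0 and 120")
    for upper, grade in VHI_BANDS:
        if raw <= upper:
            return grade
    raise AssertionError("unreachable: bands cover 0-120")


# Hypothetical answer sheet: every item answered 2 ("sometimes").
raw = vhi_raw_score([2] * 30)   # 60
print(raw, vhi_grade(raw))      # grade 3, since 60 falls in 51-120
```

Only the grades at the two ends of the scale (for example a raw score of 96 giving grade 3) follow directly from the text; grades for raw scores between 15 and 50 depend on the assumed middle bands.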
The European Organization for Research and the Treatment of Cancer quality-of-life questionnaire (core questionnaire, QLQ-C30 version 3.0),12 including the head and neck module (QLQ-H&N35), was used for assessment of overall and ear, nose, and throat–specific quality of life. The core questionnaire is composed of 30 items on quality of life and various symptom and function scales, while the head and neck module consists of 35 symptom-related questions. On the symptom and function scales, answers are dichotomous (yes or no) or on a scale ranging from 1 (not at all) to 4 (very much). Quality of life is assessed by an analog scale ranging from 1 (very bad) to 7 (excellent). For interpretation of scores, data from the QLQ-C30 and QLQ-H&N35 subdomains were compared with published reference values,13,14 as recommended by the authors of the questionnaire.12 Michelson et al13 presented QLQ-C30 data from a large sample (range, 1536-1613) from a healthy Swedish population. Bjordal et al14 published QLQ-C30 and QLQ-H&N35 scores from a cohort of patients with head and neck cancer grouped into newly diagnosed (n = 204), recurrent (n = 58), and disease-free (n = 360) subjects.
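The questionnaire scores reported later in the article lie on a 0 to 100 scale, which suggests the standard linear transformation from the EORTC scoring manual; the source does not state this, so the sketch below is an assumption about how raw answers were converted. Function names and example answers are hypothetical.

```python
from typing import Sequence


def raw_score(items: Sequence[int]) -> float:
    """Mean of the answered items that make up one scale."""
    if not items:
        raise ValueError("a scale needs at least one answered item")
    return sum(items) / len(items)


def symptom_or_global_score(items: Sequence[int], item_range: int) -> float:
    """Symptom scales and global quality of life: higher = more symptoms
    (or, for global quality of life, better quality of life).

    item_range is max answer minus min answer: 3 for 1-4 items,
    6 for the 1-7 quality-of-life items.
    """
    return (raw_score(items) - 1) / item_range * 100


def functional_score(items: Sequence[int], item_range: int) -> float:
    """Functional scales are reversed so that higher = better functioning."""
    return (1 - (raw_score(items) - 1) / item_range) * 100


# Hypothetical answers for one patient.
qol = symptom_or_global_score([4, 4], item_range=6)       # (4-1)/6*100 = 50
dyspnea = symptom_or_global_score([3], item_range=3)      # (3-1)/3*100
physical = functional_score([2, 2, 2, 2, 2], item_range=3)  # (1-1/3)*100
print(f"QoL {qol:.2f}, dyspnea {dyspnea:.2f}, physical {physical:.2f}")
```

Under this transformation a single 1-4 item can only take the values 0, 33.3, 66.7, and 100, and the two-item quality-of-life scale moves in steps of 8.33, which explains why group means for small samples cluster on such fractions.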
Normal distribution of data was verified. Correlation analyses were performed among size of GA, PFT findings, objective data of voice quality, and all questionnaire results. Published quality-of-life data in healthy subjects13 and in patients with head and neck cancer14 as reflected by the European Organization for Research and the Treatment of Cancer quality-of-life questionnaire were used for comparison. An independent-sample t test was conducted to verify statistical significance of observed differences. Computerized statistical analysis was performed using available software (R, version 2.2.1; http://www.r-project.org).
All patients experienced various degrees of hoarseness and dyspnea during moderate physical activity but denied difficulties during deglutition. Patients' reports on the number of stairs they were able to climb gave unreliable results. Random testing (climbing stairs in the clinic, supervised by medical staff) revealed that subjects were often overestimating or underestimating their true capacity.
Minimal vocal cord movement was present in 8 patients (6 unilaterally and 2 bilaterally), and all showed residual mucosal waves. In 1 patient, unilateral mucosal wave without visible vocal cord movement was evident. Deglutition of colored water was undisturbed in all patients.
In healthy subjects, the GAi ranged from 0.286 to 0.716 (mean [SD], 0.438 [0.099]). No statistically significant difference between men and women was found. Compared with the control group, patients with BVCP had a statistically significantly lower GAi (range, 0.071-0.214; mean [SD], 0.122 [0.050]; P < .001). Glottal area was sufficient for low to moderate physical activity in 9 of 10 patients. One woman had dyspnea at rest and had the lowest GAi (0.071) in the group. Additional surgery was recommended to relieve airway obstruction.
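The group comparison of the GAi can be retraced from the summary statistics given above (controls: mean 0.438, SD 0.099, n = 40; patients: mean 0.122, SD 0.050, n = 10). The sketch below computes the t statistic and degrees of freedom only. Whether the authors pooled variances is not stated, so both the pooled (Student) and unpooled (Welch) variants are shown as assumptions, and the function name is hypothetical.

```python
import math
from typing import Tuple


def t_from_summary(m1: float, sd1: float, n1: int,
                   m2: float, sd2: float, n2: int,
                   equal_var: bool = True) -> Tuple[float, float]:
    """Return (t, df) for an independent two-sample t test from summary data.

    equal_var=True  -> Student's t test with pooled variance
    equal_var=False -> Welch's t test (Welch-Satterthwaite df)
    """
    v1, v2 = sd1 ** 2, sd2 ** 2
    if equal_var:
        df = float(n1 + n2 - 2)
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# Summary values taken from the text: healthy controls vs BVCP patients.
t_pooled, df_pooled = t_from_summary(0.438, 0.099, 40, 0.122, 0.050, 10)
t_welch, df_welch = t_from_summary(0.438, 0.099, 40, 0.122, 0.050, 10,
                                   equal_var=False)
print(f"pooled: t = {t_pooled:.2f}, df = {df_pooled:.0f}")
print(f"Welch:  t = {t_welch:.2f}, df = {df_welch:.1f}")
```

Both variants give a t statistic far above the usual critical values, which is in line with the reported P < .001; turning t and df into an exact P value would additionally need the tail area of the t distribution from a statistics library.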
An overview of respiratory measures is given in Table 2. The mean PFT results were within normal values for FVC and for intrathoracic gas volume. Expiratory and inspiratory flow measures (FEV1, PEF, FIV1, and PIF) were reduced, while resistance (total airway resistance, resistance during inspiration, and resistance during expiration) was increased. No statistically significant correlation was found between PFT results and the GAi. Although patient 3 had the highest GAi (0.214) in the group, and resistance within normal values, he exhibited the smallest PEF (1.58 [19.2% of predicted]) in the group and a small PIF (1.34). In contrast, patient 2, with the lowest GAi (0.071) in the group, had a mean PEF of 2.49 (44.8% of predicted) and the third largest PIF (1.76) in the BVCP group, while resistance was moderately increased.
One patient did not return the questionnaire. Among the returned forms, no data were missing. In the dyspnea domain of the CRQ-SAS, patients reported themselves on average to be “quite a bit short of breath” (mean [SD] score, 3.24 [0.51]). Dyspnea was worst when subjects were “angry or upset” (mean [SD] score, 2.25 [0.89] [“very short of breath”]). In general, patients were “moderately tired” (mean [SD] score, 3.67 [1.17]) and “some of the time” were able to master their breathing difficulties (mean [SD] score, 4.44 [0.58]). Emotional functioning was reported to be “a good bit of the time” impaired (mean [SD] score, 3.54 [0.63]). A positive correlation (r = 0.75, P = .02) was found between the CRQ-SAS dyspnea score and PEF. In addition, a statistically significant correlation between self-reported mastery and FIV1 (r = 0.73, P = .03), as well as between self-reported mastery and resistance during inspiration (r = −0.70, P = .04), was evident.
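The link between a correlation coefficient and its P value can be made explicit with a short sketch. It assumes Pearson correlation (the source says only "correlation analyses") and n = 9, because one of the 10 patients did not return the questionnaire; both points are assumptions, and the function names and the small demonstration data set are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two paired samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r: float, n: int) -> float:
    """t statistic (df = n - 2) for testing that the true correlation is 0."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Reported dyspnea-PEF correlation, with ASSUMED n = 9 responders.
t_dyspnea_pef = t_for_r(0.75, 9)   # 0.75 * sqrt(7 / 0.4375) = 3.0

# Hypothetical perfectly linear data, only to exercise pearson_r.
r_demo = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(f"t = {t_dyspnea_pef:.2f} on 7 df, demo r = {r_demo:.2f}")
```

A t statistic of 3.0 on 7 degrees of freedom sits right at the tabulated two-sided 0.02 critical value (2.998), so under these assumptions the sketch is consistent with the reported P = .02.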
An overview of voice quality measures is given in Table 3. Voice range was reduced in all patients but varied widely. Patient 8 had normal values in all categories except shimmer (score, 15.8) and aerodynamics (maximum phonation time, 9.3 seconds; phonatory quotient, 0.35 L/s). Computerized voice analysis (based on the GHD) showed a mean (SD) vowel irregularity of 6.12 (1.12) and a mean (SD) noise component of 2.91 (0.70). Ellipses reflecting the distribution of means (center of ellipses) and the standard deviations (semiaxes) are shown in Figure 2. Only complete data sets based on analysis of 28 vowels per patient were included. For comparison, published data10 of normal and aphonic voices, as well as of subjects with untreated vocal cord paralysis, were added. Analysis revealed statistically significant differences between the BVCP group and normal voices (irregularity, P < .001; noise, P = .001), aphonic voices (irregularity, P < .001; noise, P = .007), and voices of patients with untreated vocal cord paralysis (irregularity, P = .01; noise, P = .01).
Perceptive voice evaluation according to the GRBAS scale varied widely between listeners. Patients' scores ranged from G1R1B0 to G3R3B2, with mean (SD) values of 2.0 (0.67) for G, 1.6 (0.7) for R, and 1.4 (0.84) for B. Self-reported voice handicap was in general moderate to severe (mean [SD], 2.5 [0.8]), with raw scores ranging from 25 to 96 (mean [SD], 54.8 [19.5]). Patient 8 scored herself as “mildly handicapped” and obtained the best GRB grades in the group. Computerized voice analysis assessed her voice quality as close to normal on the GHD. For the remaining patients in the study group, no correlation between VHI score, GRB grade, and GHD position was found. Neither the GAi nor PFT results were correlated with any objective or subjective voice measure.
Patients reported their overall quality of life on average as “moderate” (mean [SD] quality of life, 49.17 [14.41]), which was statistically significantly lower than that in healthy controls (P = .003)13 and in patients with cancer who were newly diagnosed (P = .02) or in complete remission (P = .001).14 Physical functioning and social functioning were impaired as well, with scores in patients with BVCP (mean [SD], 58.67 [15.01] for physical functioning; mean [SD], 41.67 [36.22] for social functioning) being statistically significantly lower than those in healthy controls (P < .001 for physical functioning and P = .004 for social functioning) or in patients with head and neck cancer (P < .05 for all categories). On the function scales, fatigue and dyspnea were scored statistically significantly higher in the BVCP group (mean [SD], 53.33 [25.01] for fatigue; mean [SD], 66.67 [35.14] for dyspnea) than in healthy subjects (P = .002 for fatigue and P = .002 for dyspnea) or in patients with head and neck cancer (P < .01 for all categories [except for P = .08 for fatigue in the group with recurrence]). The other subdomains did not demonstrate statistically significant differences. In some head and neck–specific items of the QLQ-H&N35 (such as opening mouth, social eating, and pain), patients with BVCP scored statistically significantly lower than patients with cancer (P < .01), indicating fewer symptoms. In contrast, the presence of laryngeal palsy led to increased speaking difficulties compared with patients with cancer who were newly diagnosed or disease free (P < .05).
Analysis revealed a statistically significant correlation between expiratory PFT results and subjective quality of life and physical functioning. Quality of life was statistically significantly correlated with resistance during expiration (r = 0.73, P = .02), while physical functioning was correlated with the ratio of FEV1 to FVC (r = 0.68, P = .03). Cross-sectional validity between the QLQ-C30 and the CRQ-SAS was high for the dyspnea domain (r = −0.84, P = .002) but was low for the fatigue domain (r = −0.62, P = .08) and the emotional domain (r = 0.25, P = .52).
The objective of the study was to evaluate long-term results of glottal widening surgery on breathing, voice quality, and deglutition as reflected by objective measures and self-assessed quality of life. Ten patients who received surgical treatment for BVCP at least 6 months before clinical evaluation were included in our study. Previous studies15,16 reported cases of aspiration after glottal widening. However, consistent with more recent studies,6,17 neither subjective nor objective swallowing difficulties occurred in our patients, indicating improved surgical techniques. Analysis of maximum glottal opening revealed similar GAis in the BVCP group, which were statistically significantly lower than those in healthy controls. In contrast, outcomes in terms of voice quality and dyspnea varied widely.
From a respiratory point of view, BVCP causes variable extrathoracic stenosis,18 resulting in reduced inspiratory flow measures (PIF and FIV1) and almost normal expiratory flow measures (PEF and FEV1), reflected by PEF/PIF ratios greater than 1. Although PEF was reduced in all patients, the PEF/PIF ratio exceeded 1 (mean [SD], 2.09 [0.69]). No statistically significant correlation was found between the GAi and PFT results. Because flow measures are based on forced maneuvers and the GAi was derived from pictures during quiet breathing, additional factors might apply such as increased turbulence or the Bernoulli effect (passive movement of paralyzed vocal cord into the line of flow due to suction during inspiration), which were not evident during microlaryngostroboscopy. Previous studies3-6 on surgical outcomes compared preoperative and postoperative PFT results, demonstrating a statistically significant relationship between glottal widening and an increase in flow measures and a respective decrease in airway resistance. Our study did not investigate PFT changes caused by surgical treatment but compared the present glottal status with PFT results and subjective dyspnea symptoms. Because no correlations between the GAi and lung function measures were found, the GAi cannot serve as a predictor of pulmonary function or subjective dyspnea. However, because GAi measurement was based on microlaryngostroboscopy findings during quiet breathing, future investigations should evaluate the effect of forced inspiration and expiration on GA in patients with BVCP.
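The PEF/PIF ratio used here as a marker of variable extrathoracic obstruction is a plain quotient, shown below for the two patients whose flow values are quoted in the results. The sketch assumes that PEF and PIF are expressed in the same unit (the text gives no unit), and the function name is hypothetical.

```python
def pef_pif_ratio(pef: float, pif: float) -> float:
    """Peak expiratory flow divided by peak inspiratory flow.

    ASSUMPTION: both flows are given in the same unit, so the ratio
    is dimensionless.
    """
    if pif <= 0:
        raise ValueError("peak inspiratory flow must be positive")
    return pef / pif


# (PEF, PIF) pairs quoted in the results section.
flows = {
    "patient 3 (highest GAi)": (1.58, 1.34),
    "patient 2 (lowest GAi)": (2.49, 1.76),
}
ratios = {name: pef_pif_ratio(pef, pif) for name, (pef, pif) in flows.items()}
for name, ratio in ratios.items():
    print(f"{name}: PEF/PIF = {ratio:.2f}")
```

Both ratios exceed 1, and the patient with the larger glottal opening has the ratio closer to 1, which illustrates the lack of a simple relationship between the GAi and flow measures reported in this study.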
For surgical outcomes, the effects on a patient's quality of life and physical capacity are of major interest. In patients with chronic pulmonary diseases, previous studies19 tried to establish physiologic measures as surrogate markers for subjective dyspnea, but no single lung function measure sufficiently described the effect of shortness of breath on patients' daily life. In addition, changes in subjective dyspnea (eg, during rehabilitation or short-term bronchodilatation) did not result in corresponding changes in PFT results (for a review, see Ries19). Therefore, the use of PFTs should be reconsidered. Ries19 suggests that dyspnea-specific quality-of-life questionnaires would be more suitable measures. The CRQ-SAS scores of our patients with BVCP in the dyspnea, fatigue, and mastery domains were similar to those reported by subjects with chronic obstructive pulmonary disease.20 In the emotional domain, our patients with BVCP scored statistically significantly lower than the subjects with chronic obstructive pulmonary disease (P = .01),20 indicating reduced emotional functioning. This may be related to additional handicaps in patients with BVCP such as voice-related problems. Concerning evaluation of physical capacity, asking subjects to state the number of stairs they were able to climb without shortness of breath proved to be an unreliable measure. Patients' ability to recall this information varied, and true capacity was often overestimated or underestimated. Further studies should include more reliable exercise measures such as the 6-minute walk test.21
Perceptive voice evaluation according to the simplified GRBAS scale varied widely. Intrapersonal reliability was weak. In some cases, experienced listeners obtained different scores when listening twice to an identical voice sample. Computerized voice analysis using the GHD has been proven to be a useful tool for discrimination of pathologic voices and for longitudinal voice evaluation.10 Statistically significant differences of objective voice measures were found between our subjects and published data of normal voices, aphonia, and voices of patients with untreated vocal cord paralysis.10 Similar results were published by Olthoff et al,4 who examined 17 patients who had been treated by laser surgical bilateral posterior cordectomy. Comparison of their postoperative noise and irregularity component findings with those of our cohort revealed no statistically significant difference (P = .87 for irregularity; P = .83 for noise).
In a single case of mild voice pathologic function, self-assessed VHI score corresponded to GHD position. The remaining patients, who reported themselves as being moderately to severely impaired, could not be distinguished using the GHD data. Further studies, including a larger number of subjects, should investigate the relationship between VHI scores and GHD position. In patients in whom speech therapy led to effective changes in breathing technique and speaking habits, a positive effect on subjective physical capacity and voice quality was visible. Detailed information about how those techniques can improve quality of life should be provided during consultations to increase patients' motivation for speech and breathing therapy.
Surgical treatment of BVCP is a safe and effective method to improve patients' quality of life. However, microlaryngostroboscopic findings did not necessarily correlate with subjective dyspnea symptoms, PFT results, and vocal complaints. Specific quality-of-life questionnaires can be useful for evaluating the everyday effect of glottal widening. Voice rehabilitation should be monitored objectively using computerized voice analysis, which is superior to perceptive voice evaluation. Reduction of inspiratory speaking efforts and acquisition of special breathing techniques improve airflow stability and effectiveness of respiration, leading to enhanced quality of life and patient satisfaction.
Correspondence: Wilma Harnisch, Dr med, Department of Otorhinolaryngology–Head and Neck Surgery, University of Würzburg, Josef-Schneider Strasse 11, 97080 Würzburg, Germany (Wilma.Harnisch@uni-wuerzburg.de).
Submitted for Publication: March 19, 2007; final revision received May 31, 2007; accepted July 16, 2007.
Author Contributions: Dr Harnisch had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Harnisch and Brosch. Acquisition of data: Harnisch, Brosch, and Schmidt. Analysis and interpretation of data: Harnisch, Brosch, Schmidt, and Hagen. Drafting of the manuscript: Harnisch. Critical revision of the manuscript for important intellectual content: Brosch, Schmidt, and Hagen. Statistical analysis: Harnisch. Administrative, technical, and material support: Harnisch. Study supervision: Brosch, Schmidt, and Hagen.
Financial Disclosure: None reported.
Additional Contributions: Perceptual voice evaluation according to the simplified GRBAS scale was conducted by Katrin Baumbusch, Dr med, Ulrike Beßler, Wafaa Shehata-Dieler, Dr med, and Christiane Völter, Dr med. Christina Motschmann performed all objective voice measures and evaluated the voice samples. Pulmonary function tests were conducted by Karin Kretzer. Stefan Brill, Dipl Ing, reviewed the manuscript critically.