Figure 1. Still frame of high-speed digital imaging of the larynx.
Figure 2. Age range of the 50 healthy subjects undergoing high-speed digital imaging of the larynx and videostroboscopy.
Figure 3. Vocal fold vibration amplitude for videostroboscopy vs high-speed digital imaging (HSV) of the larynx. Ratings are on a 100-point scale. A rating of 42 is considered normal. Higher ratings indicate larger amplitudes of vibration.
Figure 4. Vocal fold mucosal wave for videostroboscopy vs high-speed digital imaging (HSV) of the larynx. Ratings are on a 100-point scale. A rating of 42 is considered normal. Higher ratings indicate a larger mucosal wave.
Figure 5. Percentage of open phase for videostroboscopy vs high-speed digital imaging (HSV) of the larynx. Ratings are on a 100-point scale.
Kendall KA. High-Speed Laryngeal Imaging Compared With Videostroboscopy in Healthy Subjects. Arch Otolaryngol Head Neck Surg. 2009;135(3):274–281. doi:10.1001/archoto.2008.557
Copyright 2009 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Objective
To describe normal vocal fold vibratory characteristics recorded with high-speed digital imaging (HSV) of the larynx.
Design
Prospective study of healthy subjects who volunteered to undergo laryngeal HSV and videostroboscopy. Image analysis was randomly assigned to 3 blinded raters.
Setting
Community-based clinic with a specialty in laryngology.
Participants
Fifty healthy subjects aged 21 to 65 years who were nonsmokers and who had no voice problems, laryngopharyngeal reflux, or reactive airway disease.
Main Outcome Measures
The following characteristics of vibration were described: glottal configuration, phase closure, vibratory symmetry, mucosal wave propagation, amplitude of vibration, and periodicity of vibration. Interrater and intrarater reliabilities were calculated for both imaging modalities.
Results
The range of findings for each measure is described. The comparison of videostroboscopy ratings with ratings from HSV studies did not reveal any significant difference between the 2 modalities for any of the measures except for the assessment of periodicity. Aperiodic vibratory characteristics were noted on 30% of the videostroboscopy studies (n = 15) and in only 4% of the HSV studies (n = 2) (P < .001). Although interrater and intrarater agreement were considered to be generally acceptable, a significant rater effect was identified.
Conclusion
This preliminary study describes a range of normal values for vocal fold vibratory characteristics as recorded with laryngeal HSV, providing a basis for comparison of studies in patients with voice problems.
Cameras capable of recording laryngeal vocal fold vibration at a rate of 2000 frames per second by means of high-speed digital imaging (HSV) are now commercially available; however, to my knowledge, there are no published normative data regarding vocal fold vibratory characteristics using this method. As a result, the ability to identify pathologic vocal fold function using HSV is limited by the inability to distinguish it from normal function. To evaluate the range of normal function as documented with HSV, laryngeal recordings in 50 healthy individuals without voice problems were made with HSV and compared with videostroboscopic studies performed in the same subjects.
Since the 1960s, videostroboscopy has been the primary method used to evaluate vocal fold vibration and is the clinical criterion standard for laryngeal imaging. However, a major limiting factor of this imaging method is that the slow-motion image seen with videostroboscopy is actually a composite image averaged over several vibratory cycles rather than a real-time image.1 In addition, the clinical use of videostroboscopy is limited because it relies on periodic vocal fold vibration and a stable phonation frequency to activate the strobe light—conditions that are not always present in disordered voices. Patients with aperiodic vocal fold vibration cannot undergo adequate assessment using videostroboscopy because the strobe is able to illuminate only 1 frequency of vibration at a time and cannot track aperiodic vibrations. Aperiodic movement due to lesions on the folds or asymmetry of vocal fold function, characterized clinically as a hoarse or rough vocal quality, cannot be visualized with stroboscopy.1,2
High-speed digital imaging of the larynx is not subject to the limitations of videostroboscopy because it uses a conventional rigid endoscope to record images of the larynx at a rate of 2000 images per second, irrespective of vibratory frequency (Figure 1). This allows imaging of aperiodic vibration. In addition, HSV can visualize vocal onset and offset, which occur too quickly to be captured by videostroboscopy and where significant aperiodicity occurs as a normal phenomenon.3
This preliminary study proposes to establish a foundation for the clinical use of HSV by describing normative high-speed values for several variables currently used in the evaluation of vocal fold vibration with videostroboscopy. Although many of the study variables involve subjective evaluation by clinicians, the technique has been previously well established with videostroboscopy as a useful, reproducible, and valuable method of analysis. It is widely used for the diagnosis of vocal fold disorders and is familiar to most clinicians involved in the evaluation and management of vocal disorders.3-9
Study approval was granted by the institutional review board. Fifty volunteer subjects aged 21 to 65 years were recruited to participate in the study, and informed consent was obtained. Participants were required to be generally healthy. Smokers and individuals taking medications for asthma or acid reflux disease were excluded from the study. Every subject was able to walk up 2 flights of stairs without stopping to take a breath.
Data were collected by 2 speech/language pathologists with a physician supervising (K.A.K.). Each participant was asked to make a voice recording while reading a standard passage (the Rainbow Passage). The average fundamental frequency was then determined from the voice recording (Visi-Pitch IV 3950; Kay Elemetrics Corp, Lincoln Park, New Jersey). After the acoustic analysis established an average fundamental frequency for the study participant, he or she was asked to voice at a sustained pitch within the range of 20 Hz above and 20 Hz below the average fundamental frequency. This method minimized the impact of frequency of vibration on mucosal wave and amplitude of vibration.10 Several practice recordings were made to ensure that each participant was able to sustain a pitch in the desired range. A visual display was used to give immediate feedback to each participant about the pitch.
Next, each participant underwent standard videostroboscopy while voicing at the average fundamental frequency using a rhinolaryngeal stroboscope (RLS 9100 B; Kay Elemetrics Corp), followed by HSV of the larynx (HSV model 9700; Kay Elemetrics Corp), also while voicing at the average fundamental frequency.
Each imaging study was assigned a randomly generated study number so that later data analysis was blinded. Each of 3 judges was randomly assigned to review one-third of the studies. The judges reviewed the studies on their own so that rating conditions reflected those found in clinical practice, where a clinician will choose the part of the imaging study believed best for making measures. From the studies assigned to each investigator, 5 of the videostroboscopic studies and 5 of the HSV studies were chosen randomly to be blindly reviewed twice by the investigator. These data were used to establish intrarater reliability. In addition, 10 of the HSV studies and 10 of the videostroboscopic studies were randomly assigned to be rated by more than 1 investigator. These data were used to determine interrater reliability. Each of the 3 investigators who participated in the study as a judge had significant experience (4-15 years) in the clinical evaluation of voice disorders and specifically in the use of laryngeal imaging.
The following variables of vocal fold vibration were analyzed for each study sample. Glottal configuration, the shape of the glottic opening at maximum closure, was judged as complete closure, hourglass configuration, anterior gap, posterior gap, or incomplete closure. Phase closure, the amount of open/closed time, was recorded as the percentage of time the vocal folds were open during a single vibratory cycle. A normal phase closure percentage is 50%.11 Symmetry of vocal fold vibration recorded in each study was judged to be symmetric or asymmetric. Periodicity was measured by determining whether variation in the cycle duration was present. The studies were judged to exhibit periodic or aperiodic vibration. Aperiodic vibration was defined as at least 5 seconds of poor tracking on videostroboscopy, despite stable phonation and a greater than 30% variation in cycle length during 10 vibratory cycles. Mucosal wave propagation, for which the right and left vocal folds were judged separately, was rated on a 100-point scale, with 42 as normal. Amplitude of vibration was also rated on a continuous scale from 0 to 100, with 42 representing the normal vibration amplitude. The normal amplitude of vibration was defined as approximately one-half to one-third of the visible width of the true cords, with each cord contributing equally to the vibratory amplitude.11,12
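The two frame-based measures defined above can be illustrated with a minimal sketch. It computes the percentage of open phase from frame counts and checks cycle-length variation against the 30% threshold. The function names are hypothetical, and expressing variation as (longest cycle minus shortest cycle) divided by the mean cycle length is an assumption; the study does not state the formula the raters used.

```python
def open_phase_percent(open_frames: int, frames_in_cycle: int) -> float:
    """Percentage of one vibratory cycle during which the vocal folds are open."""
    if frames_in_cycle <= 0 or not 0 <= open_frames <= frames_in_cycle:
        raise ValueError("need 0 <= open_frames <= frames_in_cycle and frames_in_cycle > 0")
    return 100.0 * open_frames / frames_in_cycle


def cycle_length_variation_percent(cycle_lengths: list[int]) -> float:
    """Range of cycle lengths as a percentage of the mean (assumed definition)."""
    if not cycle_lengths:
        raise ValueError("need at least one cycle length")
    mean_length = sum(cycle_lengths) / len(cycle_lengths)
    return 100.0 * (max(cycle_lengths) - min(cycle_lengths)) / mean_length


def exceeds_variation_threshold(cycle_lengths: list[int], threshold_percent: float = 30.0) -> bool:
    """True when cycle-length variation is greater than the threshold (30% in the text)."""
    return cycle_length_variation_percent(cycle_lengths) > threshold_percent
```

Cycle lengths here are frame counts per cycle; 10 consecutive cycles would be passed in to mirror the 10-cycle window described in the text.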
For each imaging modality, a total of 77 ratings were recorded because some of the studies were rated more than once by the same rater or were rated by multiple raters. The data were submitted to 2 types of analysis. One analysis used all of the data for the comparison of the videostroboscopy and HSV ratings. These analyses tested for systematic differences between image forms and raters and asked whether the differences between raters depend on the image form (logistic regression) (SAS GENMOD procedure, version 9.0; SAS Institute Inc, Cary, North Carolina). When possible, a second analysis evaluated the results when 1 rater judged both imaging modalities from the same subject (exact 2-tailed McNemar test). The percentage of interrater and intrarater agreement was calculated and, where possible, Cohen κ was determined. In addition, a general description of the data is given to establish the range of normal findings.
Thirty-six women and 14 men were recruited to participate in the study, for a total of 50 subjects. The ages of the subjects ranged from 21 to 65 years and were relatively evenly distributed (Figure 2). Twenty-four female participants (67%) reported that they enjoyed singing on a regular basis, and 8 (22%) characterized themselves as professional singers. Of the men, 10 (71%) enjoyed singing regularly, and 7 (50%) characterized themselves as professional singers.
Aperiodic vibratory characteristics were noted on 30% of the videostroboscopy studies (n = 15) and in only 4% of HSV studies (n = 2) (P < .001). McNemar tests of the 25 studies in which the same individual judged both imaging modalities for a given subject demonstrated evidence of bias toward videostroboscopy identifying aperiodicity and HSV failing to identify aperiodicity (P = .03). Interrater agreement for periodicity was noted in 86% (12 of 14) of the HSV studies and in 85% (11 of 13) of the videostroboscopy studies evaluated by more than 1 rater. Intrarater agreement was 93% (14 of 15) for the HSV studies and 80% (12 of 15) for the videostroboscopy studies.
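For readers unfamiliar with the exact 2-tailed McNemar test used for the paired comparison, the following is a minimal sketch of the calculation. It depends only on the 2 discordant counts: pairs rated aperiodic on one modality only (b) and pairs rated aperiodic on the other modality only (c). The function name is hypothetical, and the study's discordant counts are not reported, so none are assumed here.

```python
from math import comb


def mcnemar_exact_two_tailed(b: int, c: int) -> float:
    """Exact two-tailed McNemar P value from the two discordant-pair counts.

    Under the null hypothesis each discordant pair falls in either cell with
    probability 0.5, so the smaller count follows Binomial(b + c, 0.5).
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    lower_tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)
```

Concordant pairs (both modalities periodic, or both aperiodic) do not enter the calculation, which is why the test is suited to same-rater, same-subject comparisons.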
The 77 ratings of glottal closure can be viewed in Table 1. Because some of the studies were rated more than once by the same or multiple individuals, the numbers reported in Table 1 do not reflect the incidence of these various glottal configurations in the study population. For the purposes of the statistical analysis, the glottal configuration ratings were grouped into closed vs other configurations. There was no statistical difference found between the 2 imaging modalities with respect to ratings of closed vs other glottic configurations (P = .75). When considering only studies in which both imaging modalities for a given subject were judged by the same rater (n = 25), the Cohen κ was 0.72, indicating good agreement between imaging modalities. Interrater agreement for glottal configuration was found in 69% of the videostroboscopy studies (9 of 13 studies) (κ = 0.53) and in 64% of the HSV studies (9 of 14) (κ = 0.54). Intrarater agreement for glottal configuration was found in 87% of the videostroboscopy studies (13 of 15 studies) (κ = 0.76) and in 73% of the HSV studies (11 of 15) (κ = 0.64).
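The distinction between percentage agreement and Cohen κ reported above can be made concrete with a short sketch. It computes κ for 2 sets of categorical ratings of the same studies as observed agreement corrected for the agreement expected by chance from each rater's category frequencies. The function name and the example labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def cohen_kappa(ratings_a: list[str], ratings_b: list[str]) -> float:
    """Cohen kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed proportion of studies on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the product of the marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[cat] * counts_b[cat] for cat in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return (observed - expected) / (1.0 - expected)


# Illustrative labels only: 3 of 4 ratings agree (75%), yet kappa is 0.5.
example = cohen_kappa(["closed", "closed", "gap", "gap"],
                      ["closed", "gap", "gap", "gap"])
```

Because κ discounts chance agreement, a given percentage agreement can correspond to quite different κ values depending on how often each category is used.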
Ten male subjects (71%) were judged to have a closed glottic configuration on the videostroboscopy and HSV studies, and 1 male subject was judged to exhibit a posterior gap with both imaging modalities. The raters judged the other 3 male subjects differently within and between the imaging modalities, indicating significant variability within the examinations. Only 8 of the female participants (22%) were judged to exhibit a closed glottic configuration with both imaging modalities. Eight of the female participants (22%) were judged to exhibit a posterior glottic gap with both imaging modalities, and 1 (3%) was judged to exhibit an anterior gap with both imaging modalities. The judgment of glottic configuration in the other 19 female participants (53%) was inconsistent, with 10 of them being judged differently between and within an imaging modality. Female subjects demonstrating a posterior glottic gap were more likely to be younger than 40 years (7 of 8 subjects), whereas those demonstrating a closed glottic configuration were more likely to be 40 years or older (7 of 8).
Twenty-six percent of the videostroboscopy studies (n = 13) demonstrated asymmetry of vibration between the right and left vocal folds. Asymmetry was noted on 38% of the HSV studies (n = 19). There was no statistically significant difference in the incidence of asymmetry between the 2 imaging modalities (P = .21). However, in only 29 of the subjects (58%) was there agreement between the study modalities with regard to symmetry. The κ was 0.09 (essentially 0), meaning that the agreement between the 2 imaging modalities for symmetry was no better than chance. In addition, interrater agreement was found in only 50% of the HSV studies evaluated by more than 1 rater and in 77% of the videostroboscopy studies evaluated by more than 1 rater. On the other hand, intrarater agreement was found in 93% of the HSV studies and in 80% of the videostroboscopy studies rated more than once by the same rater.
Most of the ratings for amplitude of vibration fell within 5 points of the normal mark, irrespective of imaging modality. As expected, the ratings of vibration amplitude were clustered around the designated normal mark; however, the range of ratings was skewed toward larger amplitudes of vibration, with some of the subjects demonstrating vocal fold vibration amplitudes rated to be as high as 28 points above the normal mark (Table 2 and Figure 3). In studies that were judged by more than 1 rater (n = 25), the standard deviation of the difference between 2 raters was small for both imaging modalities, indicating good overall interrater agreement (<8 points on a 100-point scale). The standard deviation of the difference between the 2 ratings given by a single rater for both modalities was also small (<10 points on a 100-point scale).
Similar to the range of ratings for amplitude of vibration, most of the ratings of the mucosal wave fell within 5 points of the normal mark, irrespective of image modality. As expected, the ratings of the mucosal wave were clustered around the designated normal mark; however, the range of ratings was skewed toward an increased mucosal wave, with some of the subjects demonstrating a mucosal wave rated to be as high as 25 points above the normal mark (Table 3 and Figure 4). The standard deviation of the difference between 2 raters was calculated for mucosal wave ratings and found to be very small for both modalities (<6.3 points on a 100-point scale), indicating good interrater agreement. The standard deviation of the difference between 2 ratings of the same study by the same rater was also small (<4.3 on a 100-point scale), indicating that overall the intrarater agreement was excellent.
No statistically significant difference in the overall ratings for percentage of open phase could be identified between the videostroboscopy and HSV studies (Table 4). The mean value of the percentage of open phase as measured by HSV was 62.3% and ranged from 44% to 100% (Figure 5). Evaluation of the standard deviation of the difference between 2 raters for the percentage of open phase showed higher variation between the raters than for the other continuous data (12.95 on a 100-point scale for HSV and 11.02 on a 100-point scale for videostroboscopy). The standard deviation of the difference between 2 ratings given by a single rater was 6.75 for HSV and 9.91 for videostroboscopy.
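The agreement statistic used for the continuous measures, the standard deviation of the difference between 2 ratings of the same study, can be sketched as follows. The function name is hypothetical, and the use of the sample standard deviation (n minus 1 denominator) is an assumption, because the article does not specify which form was used.

```python
from statistics import stdev


def sd_of_rating_differences(first: list[float], second: list[float]) -> float:
    """Sample standard deviation of paired differences between two sets of ratings.

    `first` and `second` hold ratings of the same studies, in the same order,
    either from two raters (interrater) or from one rater twice (intrarater).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    differences = [a - b for a, b in zip(first, second)]
    return stdev(differences)
```

A smaller value indicates closer agreement; a constant offset between raters does not inflate this statistic, because only the spread of the differences is measured.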
The data analysis showed significant differences in the ratings of each variable based on the raters. The differences were consistent across measures, with one of the raters differing from the other two for all measures except periodicity. For periodicity, there was no difference in the ratings based on the rater (Table 5). The tests of interaction to determine whether the differences between raters depend on the image form (the rater × image form interaction) showed no interaction. In other words, the rater influence was the same for the videostroboscopy and HSV modalities.
The evaluation of the dynamic properties of vocal fold vibration is essential for understanding the pathophysiology of voice disorders. Owing to the inability of videostroboscopy to image vocal fold vibrations in aperiodic voices, HSV of the larynx is now used more frequently for voice assessment. Recent studies comparing videostroboscopy with HSV have found that use of HSV results in less difficulty in obtaining studies suitable for interpretation.7 That interpretation, however, requires an established range of normal findings against which to compare the vibratory characteristics seen in patients with hoarseness. To that end, this preliminary study was performed to document vocal fold vibratory characteristics as recorded with HSV in a group of healthy subjects. Videostroboscopy was performed in the same group of subjects for comparison, and the study results establish the clinical reliability of HSV as similar to that of videostroboscopy.
The assessment of periodicity was the only variable that differed significantly between the HSV and videostroboscopy studies. Aperiodic vibrations are relatively easy to identify on videostroboscopy because the strobe is unable to track during aperiodic vibrations, and the resulting video sequence jumps from one part of the vibratory sequence to another. On the other hand, the only way to identify aperiodic vibration on HSV is to carefully analyze the number of images from cycle to cycle over several cycles, which is time consuming and may be impractical for the busy clinician. Use of the kymography function linked to most HSV systems allows for a quick visual inspection of symmetry and periodicity over multiple cycles and may be used to overcome this difficulty.3,9 However, clinicians probably should not rely on HSV to assess periodicity until the reliability of the kymography function to detect aperiodicity has been established. With respect to stroboscopy, the fact that 30% of healthy individuals demonstrate aperiodic vibration indicates that periods of aperiodic vibration should be considered as the baseline when judging abnormal voices by this modality.
Every type of glottal configuration was identified in the subject population, although the closed configuration predominated, identified in 52% of the studies. The next most common configuration was a posterior glottic gap, identified in 31% of the study population. The findings in this study are consistent with those of previous videostroboscopic studies of glottal configuration in women that report a posterior glottic opening in 30%, seen more frequently in younger women.5,10,13 This study found more variability in the ratings of glottic configuration in the women under study.
Asymmetry of vibration was seen in approximately 25% of the study population with both imaging modalities. However, there was essentially no agreement about symmetry of vibration between modalities, indicating that periods of asymmetric vibration may occur in up to 50% of the population. In addition, interrater agreement was poor for this measure, implying, again, that short periods of asymmetry may occur in the healthy population. This finding is supported by other authors using HSV techniques who have noted a significant incidence of vibratory asymmetry in healthy subjects.14
Unlike the categorical measures, the continuous measures made in this study (mucosal wave, percentage of open phase, and vibratory amplitude) demonstrated less variability and better interrater agreement, consistent with what would be expected in a healthy population. The distribution of the mucosal wave and amplitude of vibration, however, did not reflect a normal distribution curve but demonstrated a skewed distribution with an elongated tail toward increased mucosal wave and amplitude.
The distribution of the open-phase ratings demonstrated 2 distinct patterns when the 2 imaging modalities were compared. The videostroboscopy ratings were based on the montage function of the strobe system that selects 10 images from a single virtual composite cycle of vibration and displays them on the screen. The rater then counts the number of images with open vocal folds. This method results in an estimated percentage of open phase that is a multiple of 10. Consequently, the percentage of open-phase data from videostroboscopy falls into distinct groups. On the other hand, because HSV of the larynx has a fixed rate of image capture, the number of images per cycle varies. The measurement of the number of frames with open vocal folds per cycle results in data of a more continuous nature owing to the potentially smaller increments of measurement.
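The difference in measurement increments described above follows from simple arithmetic, sketched below. At a fixed capture rate, the number of frames per vibratory cycle is the frame rate divided by the phonation frequency, and the smallest measurable step in percentage of open phase is 100 divided by that frame count. The 2000 frames per second rate is taken from the text; the function names and the example frequencies of 100 Hz and 200 Hz are illustrative assumptions, not study values.

```python
HSV_FRAME_RATE = 2000  # frames per second, the capture rate stated in the text


def frames_per_cycle(f0_hz: float, frame_rate: float = HSV_FRAME_RATE) -> float:
    """Number of captured frames spanning one vibratory cycle at frequency f0_hz."""
    if f0_hz <= 0 or frame_rate <= 0:
        raise ValueError("frequency and frame rate must be positive")
    return frame_rate / f0_hz


def open_phase_step_percent(f0_hz: float, frame_rate: float = HSV_FRAME_RATE) -> float:
    """Smallest increment of open-phase percentage resolvable by counting frames."""
    return 100.0 / frames_per_cycle(f0_hz, frame_rate)


# Illustrative frequencies: 100 Hz gives 20 frames per cycle (5% steps),
# and 200 Hz gives 10 frames per cycle (10% steps).
low_pitch_step = open_phase_step_percent(100)
high_pitch_step = open_phase_step_percent(200)
```

Under these assumptions, lower phonation frequencies yield finer increments on HSV than the fixed 10% steps of the 10-image stroboscopic montage, whereas at about 200 Hz the two methods would resolve similar steps within a single cycle.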
With respect to the range of normal findings identified in this study, one may conclude the following: First, asymmetry of vibration and aperiodicity can be found in at least 25% of the population and may not represent a vocal fold vibratory abnormality. Second, the most frequently identified glottic configuration is completely closed, but a posterior glottic opening is found in 31% of subjects, and an anterior gap, hourglass configuration, and even incomplete closure may be identified. Third, the average normal percentage of open phase at the fundamental frequency is approximately 60%. Fourth, diminished amplitude of vibration and mucosal wave likely represent abnormality or disease because the healthy population is skewed toward larger values for these 2 variables. Finally, aperiodicity of vibration may be difficult to identify with HSV unless the kymography function is used.
The reliability of measures obtained from HSV studies did not differ significantly from the reliability of measures obtained from videostroboscopy studies except for the measure of symmetry. Interrater agreement was high for symmetry on videostroboscopy (77%) and much lower on HSV studies (50%). Because intrarater agreement was very high for HSV studies (93%), interrater agreement could likely have been improved with better definition of the criteria for asymmetry before rating the studies.
A lack of objectivity in the analysis of laryngeal imaging is an inherent drawback to the methods. In the usual clinical situation, the imaging studies are subjectively evaluated by clinicians. Previous studies of videostroboscopy judgments with high interrater agreement generally depend on the use of specific segments of the study to establish interrater agreement.10,14,15 When clinicians are free to make assessments from any part of the study, there may be differences in interpretation based on variables such as the loudness, pitch, effort level, and modal register used by the subject.6,11 Recent studies that report the reliability of clinicians evaluating images from a large patient population report reliabilities similar to those found in this study.7,15 Because each clinician who participated in this study likely chose a slightly different part of the imaging study by which to judge the vibratory characteristics, the findings varied. Recording was conducted near the subject's fundamental frequency to minimize this variability but could not eliminate it.
Furthermore, in this study, one of the raters typically differed from the other two regarding the study interpretation. This individual identified fewer instances of abnormality when rating mucosal wave, amplitude, glottic configuration, and percentage of open phase. The same individual identified more instances of asymmetry than the other 2 raters. This individual is an otolaryngologist and the other 2 raters are speech-language pathologists. The difference in ratings between an otolaryngologist and 2 speech pathologists may reflect a difference in the training, background, and clinical perspective of the raters. This finding highlights the importance of communication between clinicians regarding the findings of laryngeal imaging and the importance of using the clinical picture to help interpret the results. There is also a need to continue the development of quantitative methods of measurement using imaging studies.16,17 Strict imaging protocols with respect to frequency, phonation mode, and loudness may help to minimize variability.
In a research setting, HSV has been used to learn more about normal vocal fold vibratory behaviors. Tissue characteristics and the influence of other forces such as aerodynamics, muscle tension, and vocal fold length have been studied.16,18-22 Most reports have been limited, however, to results from a few human subjects or animal larynges.2,9 This information is critical for the development of successful surgical and medical techniques used to restore normal vocal fold vibratory function to patients with hoarseness but is not directly applicable to patient evaluation in a clinical setting.
Studies analyzing the application of HSV in the clinical setting are beginning to emerge.15 High-speed digital imaging analysis of the glottal area and the amplitude of vibration for each vocal fold has been applied to the evaluation of unilateral vocal fold paralysis before and after medialization.8,23 Another area in which HSV has great potential is in the assessment of vocal tremor and in the differentiation of spasmodic dysphonia from muscle tension dysphonia. Because of the acoustic characteristics of tremor and breaks in phonation, videostroboscopy cannot capture the vibratory pattern in these patients. High-speed digital imaging is being applied to assist with quantification of vocal tremor, which previously has been very difficult.8,9
In summary, HSV offers benefits over standard videostroboscopy in the analysis of aperiodic vocal fold motion and will likely develop as an important adjunct to videostroboscopy in the evaluation of voice disorders. This technology is still in the early stages of clinical application. As the knowledge base with this form of laryngeal analysis expands, our ability to evaluate and treat patients with dysphonia will improve.
Correspondence: Katherine A. Kendall, MD, Division of Otolaryngology, Minneapolis Veterans Affairs Medical Center, University of Minnesota, 1 Veterans Dr, Minneapolis, MN 55417 (email@example.com).
Submitted for Publication: December 29, 2007; final revision received May 14, 2008; accepted July 24, 2008.
Financial Disclosure: None reported.
Funding/Support: This study was supported by a grant from the KayPENTAX Corporation and the Park Nicollet Research Institute. The grant provided funds to buy a microphone for our recording equipment, paid the salaries of the 2 speech pathologists who worked on the study, and paid for the statistical analysis of the data.
Additional Contributions: Carol Rue, MS, and Kari Urberg-Carlson, MS, worked on data collection and image analysis, and James Hodges, PhD, helped with the statistical analysis.