A, Mean fundamental frequency (P = .23). B, Jitter (P < .001). C, Shimmer (P < .001). D, Noise to harmonic ratio (P < .002). MDVP indicates Multi-Dimensional Voice Program (KayPENTAX); VES, Voice Evaluation Suite (Estill Voice International).
Rohrer J, Maturo S, Hill C, Bunting G, Ballif C, Hartnick C. Pediatric Voice Analysis: Comparison of 2 Computerized Analysis Systems. JAMA Otolaryngol Head Neck Surg. 2014;140(8):742-745. doi:10.1001/jamaoto.2014.1162
Copyright 2014 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
This research contributes to the pediatric objective voice measurement database while identifying comparable measurements between 2 available voice analysis systems.
To compare selected normative pediatric acoustic variables between the Multi-Dimensional Voice Program (MDVP) and the Voice Evaluation Suite (VES) computerized voice analysis systems and to describe the first comprehensive pediatric database analyzing fundamental frequency, jitter, shimmer, and noise to harmonic ratio using the VES.
Design, Setting, and Participants
Cross-sectional study with planned data collection conducted at a tertiary referral otolaryngologic clinic. Participants were 335 children, aged 4 to 18 years, with normal voices.
Objective voice data were collected on the MDVP and the VES systems.
Main Outcomes and Measures
Fundamental frequency, jitter, shimmer, and noise to harmonic ratio.
The fundamental frequencies agreed with previous pediatric normative values. There was not a statistically significant difference between MDVP and VES measurements of mean fundamental frequency (P = .23). Jitter percentage (P < .001), shimmer percentage (P < .001), and noise to harmonic ratio (P < .002) for all children were statistically different between the 2 voice evaluation systems.
Conclusions and Relevance
These data show that the measured fundamental frequency of normal voices in children is comparable between the MDVP and VES voice analysis systems. Jitter, shimmer, and noise to harmonic ratio values are not interchangeable between voice analysis systems. The voice analysis system should be reported when providing voice measurement outcomes in the literature.
The evaluation of the pediatric voice is a challenging undertaking. An examiner may have difficulty detecting subtle abnormalities or small changes over time. Self-report and proxy quality-of-life questionnaires are subject to bias. Additionally, video recordings or stroboscopy using rigid or flexible laryngoscopy can be anxiety provoking in the uncooperative child. Computerized voice analysis is an objective, noninvasive technique that provides valuable information. Common voice characteristics analyzed include fundamental frequency, jitter percentage, shimmer percentage, and noise to harmonic ratio. These characteristics have been used to assess adult voices, yet until recently, pediatric norms had not been available. Recently, a pediatric normative database has been published with a large cohort covering ages 4 to 18 years using the Multi-Dimensional Voice Program (MDVP) (KayPENTAX), a common commercially available computerized analysis system.1 There is no consensus on the best objective voice measurements, and concerns remain about their validity and reliability, as discussed in a review by Carding et al.2 Lopes et al3 did find that there was a correlation between fundamental frequency and strain severity in children aged 3 to 9 years. Shimmer was also noted to correlate with roughness, breathiness, and voice instability in that study.3
To our knowledge, no studies have been performed to evaluate the intersystem agreement between computerized analysis systems in pediatric patients. Previous work by Karnell et al4 and Smits et al5 has indicated that, in adults, different acoustic analysis systems often were comparable in fundamental frequency. Smits et al5 found that shimmer and noise to harmonic ratio could be comparable if a correction factor was applied, but jitter was not comparable. Maryn et al6 found that 2 commercially available systems were not comparable in any of the parameters for the adult voice.
Intersystem reliability information is important because today patients may be forced to change clinicians secondary to insurance coverage, relocation, or inability to bear the financial burden of traveling to tertiary centers. If patients change clinicians, the usefulness of previous measurements may depend on the equipment at the new location. Additionally, ongoing research may only allow extrapolation to compatible analysis systems. The goal of this study is to describe a comprehensive computerized pediatric voice database using the Voice Evaluation Suite (VES) (Estill Voice International) and compare the results to previously established data using the MDVP.
This study was approved by the institutional review board of the Massachusetts Eye and Ear Infirmary. Written informed consent was obtained from legal guardians for young children and from both legal guardian and child for children older than 12 years.
Patients between ages 4 and 18 years were recruited from an outpatient pediatric otolaryngology clinic. A “yes” answer to any question on the screening form excluded the child from the study. These questions were designed to discover a history of voice disorders, developmental delay, cognitive delay and/or learning disability, smoking, and parental concern for hearing loss. Children also had to have normal hearing reported by the parent or guardian; if there was concern for hearing abnormalities, audiograms were evaluated. Children with abnormal hearing were excluded from the study. Each parent completed a pediatric voice-related quality-of-life (PVRQOL) questionnaire. If the PVRQOL indicated voice concerns, the child was excluded from the study.
Voice recordings were created in a soundproof room using a Dell (Dell Inc) Optiplex 960 personal computer (Microsoft Windows XP Professional, version 2002). Children used a headset-mounted microphone placed 3 cm from the right oral commissure. Children’s voices were recorded with the MDVP, Model 5105 software option for the computerized speech laboratory model 4500. Immediately thereafter, their voices were recorded with the VES system. Identical voice tasks were performed with both systems. This involved producing a sustained “ah” at a comfortable volume and pitch. The voice task was performed 4 times with recording starting 1 second after the start of the fourth ah. A total of 3.5 seconds of recording was obtained and analyzed. Fundamental frequency, jitter, shimmer, and noise to harmonic ratio were acquired.
Descriptive statistics were performed by sex and age group. The t test was then performed to compare the 2 recording systems against each other.
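The comparison step can be illustrated with a minimal sketch. The article does not state whether a paired or an independent-samples t test was used; the sketch below assumes a paired test, because the same children were recorded on both systems. The function name `paired_t_statistic` and all numeric values are hypothetical and are not study data.

```python
import math
import statistics


def paired_t_statistic(system_a, system_b):
    """Return (t, degrees_of_freedom) for a paired t test.

    Each position in system_a and system_b must hold the measurement
    of the same child on the two analysis systems.
    """
    if len(system_a) != len(system_b):
        raise ValueError("paired samples must have the same length")
    if len(system_a) < 2:
        raise ValueError("at least two pairs are required")

    # Work on the within-child differences between the two systems.
    diffs = [a - b for a, b in zip(system_a, system_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff / standard_error, len(diffs) - 1


# Hypothetical jitter percentages for five children (illustration only).
mdvp_jitter = [1.2, 0.9, 1.5, 1.1, 1.3]
ves_jitter = [0.8, 0.7, 1.0, 0.9, 0.9]

t_value, dof = paired_t_statistic(mdvp_jitter, ves_jitter)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

Converting the statistic to a P value requires the t distribution; a statistics package such as SciPy (`scipy.stats.ttest_rel`) performs both steps at once.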
The data were compiled and stratified based on age and sex. As shown in Figure, A, there was no statistically significant difference between the mean fundamental frequencies measured by the VES and MDVP systems (P = .23). Figure, A, also shows the trends of fundamental frequency as they change based on age and sex.
The next variable was jitter percentage. A statistically significant difference between the analysis systems was seen for the compiled group (P < .001) (Figure, B). The MDVP system measures had a larger range and consistently showed a larger jitter percentage until age 10 years. Shimmer percentage showed statistically significant differences in 27 of the 30 stratified groups and in the group as a whole (P < .001) (Figure, C). The MDVP system consistently reported a larger shimmer percentage. The last measurement was noise to harmonic ratio. The compiled group showed statistically significant differences between the 2 systems (P < .002) (Figure, D). Large differences between the systems were seen in boys younger than 9 years. The MDVP system had very little variability in boys’ voices across the entire age spectrum.
Our results agree with previous work showing that the pediatric voice is dynamic through development, and therefore both age and sex need to be accounted for during voice measurements.1 Changes of the fundamental frequency are seen starting at age 11 years for girls and age 13 years for boys, and these changes progress into adult voice characteristics. Adult women’s voices have fundamental frequencies ranging from 165 to 255 Hz with a mean of 217 Hz. Men’s voices, on the other hand, are significantly lower, ranging from 85 to 155 Hz with a mean of 116 Hz.7 The VES data agree with and fit into these ranges.
Fundamental frequency measurements were found to be equivalent between the MDVP and VES voice analysis systems. The 2 systems did not produce equivalent values for the perturbation measures of shimmer and jitter percentage or for noise to harmonic ratio. Jitter had much more overall variability in the MDVP system and was also higher in the younger age groups. Shimmer consistently measured higher on the MDVP system and also showed greater overall variability. The noise to harmonic ratio had very little variation among boys measured with the MDVP system and large variations in the boys younger than 9 years on the VES system. The variations between systems might have arisen from subjects changing their voice between tasks or from large outliers, although such outliers were not seen.
Large variations between systems were found in the noise to harmonic ratio for boys aged 5 years and 8 years. Shimmer had maximum divergence for boys aged 6 years and 13 years, and jitter had maximum divergence in boys at ages 6, 10, and 13 years. If the voice was changed between tests, we would expect to see large variation in all the parameters at those same age and sex groups. These changes did not exist in the fundamental frequency, and while the jitter and shimmer mismatches correlate with each other, they do not match the noise to harmonic ratio mismatches. Likely explanations include small variations between test runs but also the fact that each voice analysis system has distinct, proprietary software analysis tools and data point extraction algorithms from the recorded voice samples. The variability of these measures agrees with similar findings in adult studies.5,6
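To make concrete why perturbation measures depend on how each system extracts data points, the following sketch computes a perturbation percentage from a cycle-by-cycle sequence. It uses one commonly cited textbook definition (mean absolute difference between consecutive cycles divided by the mean value); this is an assumption for illustration, as the article does not describe the MDVP or VES algorithms. The function name and the values are hypothetical.

```python
def perturbation_percent(cycle_values):
    """Cycle-to-cycle perturbation as a percentage of the mean value.

    Pass glottal cycle periods to obtain a jitter-style percentage, or
    cycle peak amplitudes to obtain a shimmer-style percentage.
    """
    if len(cycle_values) < 2:
        raise ValueError("at least two cycles are required")

    # Mean absolute difference between each cycle and the next one.
    consecutive_pairs = zip(cycle_values, cycle_values[1:])
    mean_abs_diff = sum(abs(a - b) for a, b in consecutive_pairs) / (
        len(cycle_values) - 1
    )

    mean_value = sum(cycle_values) / len(cycle_values)
    if mean_value == 0:
        raise ValueError("mean cycle value is zero; percentage is undefined")
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical cycle periods in milliseconds (illustration only).
periods_ms = [4.00, 4.04, 3.98, 4.02, 4.00]
print(f"jitter-style perturbation: {perturbation_percent(periods_ms):.2f}%")
```

Because the result is driven by small differences between adjacent cycles, two programs that mark cycle boundaries slightly differently can report different percentages for the same recording while still agreeing on the mean fundamental frequency.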
By using the same participants for both systems, we eliminated the intersubject variability. However, using the same participants also introduced a possible training effect because the same order of system recording was used in all patients. The methods included practice of the vocal commands prior to the first recording, so we believe that this bias is likely inconsequential. Small variations in voice between voice tasks would also be expected. The large number of patients should have decreased the effects of these small variations and of any outliers in the data set.
This study shows that normal pediatric voice analysis measurements of fundamental frequency are comparable between the MDVP and VES systems. The other voice characteristics of jitter, shimmer, and noise to harmonic ratio should not be compared between MDVP and VES systems. The recording and voice analysis system used should be noted in future publications.
Submitted for Publication: October 17, 2013; final revision received May 19, 2014; accepted May 21, 2014.
Corresponding Author: Stephen Maturo, MD, Department of Otolaryngology, San Antonio Military Medical Center, 3851 Roger Brooke Dr, San Antonio, TX 78234 (Stephen.email@example.com).
Published Online: July 10, 2014. doi:10.1001/jamaoto.2014.1162.
Author Contributions: Drs Rohrer and Maturo had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Maturo, Bunting, Hartnick.
Acquisition, analysis, or interpretation of data: Rohrer, Hill, Bunting, Ballif.
Drafting of the manuscript: Rohrer, Maturo, Bunting.
Critical revision of the manuscript for important intellectual content: Rohrer, Maturo, Hill, Ballif, Hartnick.
Statistical analysis: Rohrer, Maturo, Hartnick.
Obtained funding: Maturo.
Administrative, technical, or material support: Maturo, Hill, Bunting, Ballif.
Supervision: Bunting, Ballif.
Conflict of Interest Disclosures: None reported.
Funding/Support: This study was supported by Massachusetts Eye and Ear Infirmary internal funding (Dr Maturo).
Role of the Sponsor: Massachusetts Eye and Ear Infirmary had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.