Importance
Scientific understanding of human voice production to date is a product of indirect investigations including animal models, cadaveric tissue study, or computational modeling. To our knowledge, direct experimentation of human voice production has previously not been possible owing to its invasive nature. The feasibility of an ex vivo perfused human phonatory model has recently allowed systematic investigation in virtually living human larynges with parametric laryngeal muscle stimulation.
Objective
To investigate the association between adductor muscle group stimulation and the open quotient (OQ) (the fraction of the cycle during which the glottis remains open) of vocal fold vibration.
Design, Setting, and Participants
An ex vivo perfused human tissue study was conducted at a physiology laboratory. Human larynx recovered from organ donors within 2 hours of cardiac death was used. The study was performed on May 19, 2014; data analysis took place from June 1, 2014, to December 15, 2014.
Interventions
Perfusion with donated human blood was reestablished shortly after cardiac death. Ex vivo perfused human phonation was then achieved by providing subglottal airflow under graded neuromuscular electrical stimulation bilaterally to the intrinsic adductor groups and cricothyroid muscles.
Main Outcomes and Measures
Phonation resulting from the graded states of neuromuscular stimulation was evaluated using high-speed vibratory imaging; the OQ was derived through digital kymography and glottal area waveform analysis.
Results
During constant glottal flow, a stepwise increase in adductor muscle group stimulation decreased the OQ. Quantitatively, OQ values decreased with increased stimulation levels from 2 V (OQ, 1) to 5 V (OQ, 0.68) and reached a lower limit at 8 V (OQ, 0.42). Increasing stimulation beyond the level that produced maximal muscle deformation did not lower the OQ further.
Conclusions and Relevance
To our knowledge, a negative association between adductor muscle group stimulation and phonatory OQ has been demonstrated for the first time in a neuromuscularly activated human larynx. Further experience with the ex vivo perfused human phonatory model will aid in systematically defining this causal relationship.
The compendium of human voice production understanding is a product of observational associations and direct-control laryngeal models. However, such direct systematic control of physical variables has been confined to comparative laryngeal models. Specifically, laryngeal modeling has been applied in a number of methods, including phonation within excised human larynges,1,2 in vivo physiologically active animal larynges,3,4 and computational modeling.5,6 However, these experimental constructs are limited in how far they can be extrapolated to fully describe human voice production. For instance, creating phonation by mechanically adducting excised cadaveric human larynges does not incorporate the effects of physiologically active thyroarytenoid muscle, which increases the bulk and tension of the vocal fold body. In addition, within the cadaveric laryngeal model, many changes to the soft tissue are seen without physiologic blood flow, such as epithelial dehydration and loss of tissue elasticity. In vivo animal phonation is limited by anatomic and physiologic variances between the number of proposed mammalian larynges described in the literature7 and human larynges. Computational phonatory modeling is also limited because it uses specific quantitative data from the other described models.8 Because they rely on specific data related to vocal fold compliance or elasticity from limited scientific platforms, the computational models are insufficient for direct clinical application.
In response to these shortcomings in human voice production research, a novel phonatory model using ex vivo perfused human larynges was developed by Berke and his colleagues.9,10 As described in those studies, human larynges are recovered shortly after cardiac death and are reperfused to maintain physiologic viability of the larynges. By applying neuromuscular electrical stimulation, physiologic laryngeal muscle activation was accomplished, and the feasibility has been described.9,10 In continued preliminary experience with this new phonatory model, an association between muscle stimulation level and phonatory open quotient (OQ) (the fraction of 1 cycle during which the glottis remains open) was observed. We therefore set out to describe our preliminary observations regarding this association for what we believe to be the first time within physiologically viable human larynges.
Institutional review board evaluation of the methods determined that this study was exempt from the board’s approval. Written informed consent for the recovery of human larynges from transplant donors was obtained from the donor patients’ families by the patient care coordinators of One Legacy, the Southern California organ transplant distribution center. The study was conducted on May 19, 2014; data analysis took place from June 1, 2014, to December 15, 2014.
The general methods of the ex vivo perfused human phonatory model have been described9,10; however, improvements in the methods have been selectively applied during the intervening period. A focused description of the updated ex vivo methods specific to the presently reported phonatory data is discussed below.
Following the recovery of clinically transplantable organs, the larynx, esophagus, trachea, strap muscles, thyroid, carotid arteries, vagus nerves and branches, and internal jugular veins were excised en bloc. The carotid arteries were then irrigated with Wisconsin perfusate solution, placed in 4°C sterile storage, and transported by car to the laboratory.
In the laboratory, the recurrent and superior laryngeal nerves were identified, and the arterial branches not related to laryngeal perfusion were tied off. The common carotid arteries were cannulated with 12F cannulae (DLP Pediatric One-Piece Arterial Cannulae; Medtronic). Venous outflow was via gravity drainage into the collection system (RM3 Renal Preservation System; Waters Medical Systems). Type O Rh-negative, commercially obtained human whole blood was diluted with lactated Ringer solution to a hematocrit between 25% and 40% (to convert to a proportion of 1.0, multiply by 0.01) and was used to reestablish blood flow to the larynx. A perfusion pump (RM3; Waters Medical Systems) provided blood flow in a pulsatile manner, resulting in arterial systolic and diastolic blood pressure.
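Bringing whole blood to a target hematocrit by adding lactated Ringer solution is a volume-conservation calculation. The following is a minimal sketch, assuming that red cell volume is conserved and that the added solution contains no cells. The function name `diluent_volume_ml`, the 500 mL starting volume, and the starting hematocrit of 0.45 are hypothetical illustrations and are not values reported in this study.

```python
def diluent_volume_ml(blood_ml, hct_initial, hct_target):
    """Volume of cell-free diluent needed to lower hematocrit to hct_target.

    Hematocrit values are proportions of 1.0 (percentage multiplied by 0.01).
    Assumes red cell volume is conserved and the diluent contains no cells.
    """
    if not 0 < hct_target <= hct_initial < 1:
        raise ValueError("need 0 < hct_target <= hct_initial < 1")
    # Conservation of red cell volume:
    # blood_ml * hct_initial = total_ml * hct_target
    total_ml = blood_ml * hct_initial / hct_target
    return total_ml - blood_ml


# Hypothetical example: 500 mL of whole blood at hematocrit 0.45,
# diluted to 0.30 (inside the 25% to 40% range used in this study).
added_ml = diluent_volume_ml(500.0, 0.45, 0.30)
print(round(added_ml, 1))
```

In practice the achieved hematocrit would be confirmed by measurement rather than by calculation alone.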
Dialysate solution (PrismasateBgk 4-2.5; Gambro Renal Products) was used in line with a pediatric dialysis filter to remove the severe hyperkalemia (potassium >5.0 mEq/L; to convert to millimoles per liter, multiply by 1) resulting from the previously infused Wisconsin perfusate solution and maintain electrolyte homeostasis. During the experiments, organ pH, oxygen pressure, and serum electrolyte levels were monitored by blood gas and adjusted as needed. The perfusion pump strength was adjusted to provide a systolic pressure ranging from 60 to 80 mm Hg, with a minimally acceptable pulse pressure of 10 mm Hg, at a pump rate of 60 pulses/min.
After approximately 1 hour of reapplication of blood flow, the organ was assessed for neuromuscular contractility. Initially, the recurrent laryngeal nerves were activated with constant current stimulators (WPI 301-T; World Precision Instruments), and the larynx was observed for muscular activation. If muscular contraction was not achieved with nerve stimulation, such contraction was produced by direct muscle stimulation. Hook-wire bipolar electrodes11 were inserted directly into laryngeal muscles. The cricothyroid (CT) and adductor muscle groups were electrically stimulated by 2 constant current stimulators (Model WPI 301-T) set at 60 Hz. The electrodes were placed directly through the thyroid cartilage into the thyroarytenoid muscles. The CT muscle needles were placed under direct vision of the anterior/superior laryngeal surface. The CT muscles were stimulated during phonatory trials based on our group’s prior work12 in canine phonatory experimentation. The quality and strength of the phonation produced is improved with constant high levels of CT activation. Within the present study, the level of CT activation was not changed throughout the experiments, and the effect of CT activation was not evaluated. This qualitative improvement is likely the result of slight adductory action of the CT muscle, although specific investigation will be required to elucidate the phonatory effects of the human CT muscle.
For initial stimulation observations, current ranged from 0.5 to 10 mA to identify the range in which observable muscular deformation was produced. After confirmation of appropriate levels of neuromuscular stimulation, warmed (37°C) 100% humidified air was flowed rostrally through the larynx at 350 mL/s using an endotracheal tube with an inflated cuff. Apart from physiologic neuromuscular activation–induced laryngeal adduction, no methods of physical approximation of the vocal process were used. With a constant rate of subglottic airflow, muscular activation was produced in a steadily increasing ramp in a voltage-dependent manner ranging from 2 V to 10 V; this ramp was repeated 3 or more times. Our findings have been corroborated by continuous experience with the ex vivo model.
Phonatory Data Extraction
Vocal fold vibration was recorded using a high-speed camera (Phantom v210; Vision Research) at 3000 frames per second and a resolution of 512 × 512 pixels. Spatial-temporal plots of vocal fold vibration (kymogram) were first extracted from the high-speed recordings. To generate the kymogram, a medial-lateral line was extracted from each frame of the recordings. These image lines were then stacked consecutively to form a kymogram.13 For our analysis, a line near the medial point of the glottal midline was chosen. The glottal area waveform was also extracted from the recordings using a region-growing algorithm14 from which the OQ was calculated for each oscillation period, and the mean was determined for each recording.
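The two extraction steps described above can be sketched in a few lines. This is an illustrative sketch only, not the analysis code used in the study: it assumes the frames are already available as nested lists of pixel intensities, it assumes the glottal area waveform has already been segmented (the region-growing step is not reproduced), and it assumes that a simple area threshold marks the glottis as open. The function names `build_kymogram` and `open_quotients` and the synthetic data are hypothetical.

```python
def build_kymogram(frames, row):
    """Stack the same medial-lateral pixel row from every frame.

    frames: list of 2-D images (each a list of rows of pixel intensities).
    Returns one row per frame, so time runs down the kymogram.
    """
    return [list(frame[row]) for frame in frames]


def open_quotients(area, threshold=0.0):
    """Per-cycle open quotient from a glottal area waveform.

    A cycle is taken to run from one opening instant (area rises above
    threshold) to the next. OQ = open samples / samples in that cycle.
    """
    is_open = [a > threshold for a in area]
    # Sample indices where the glottis goes from closed to open
    onsets = [i for i in range(1, len(is_open))
              if is_open[i] and not is_open[i - 1]]
    oqs = []
    for start, end in zip(onsets, onsets[1:]):
        cycle = is_open[start:end]
        oqs.append(sum(cycle) / len(cycle))
    return oqs


# Synthetic frames: three tiny 2 x 3 images; take row 1 from each.
frames = [[[1, 2, 3], [4, 5, 6]],
          [[7, 8, 9], [10, 11, 12]],
          [[13, 14, 15], [16, 17, 18]]]
kymogram = build_kymogram(frames, row=1)

# Synthetic area waveform: 10 samples per cycle, open for 6 of them.
pattern = [0, 0, 0, 0, 1, 2, 3, 3, 2, 1]
area = pattern * 4
per_cycle = open_quotients(area)
mean_oq = sum(per_cycle) / len(per_cycle)
print(kymogram)
print(round(mean_oq, 2))
```

Because cycles are delimited by opening instants, this sketch returns no cycles when the glottis never closes. In that case the OQ is 1 by definition, and cycle boundaries would have to be taken from another landmark, such as area minima.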
After confirmation of perceptually normal phonation in the ex vivo human larynx, stimulation ramps were applied symmetrically to the adductor muscle groups under constant CT muscle stimulation. Figure 1 displays the vibratory images seen with high-speed laryngoscopy, which highlight the OQ variation between the lowest (2 V) and highest (10 V) levels of adductor muscle stimulation. With high levels of adductor group stimulation, the glottis remained closed longer. Figure 2 shows the kymographs for each of the recorded adductor muscle stimulation levels that quantitatively display the OQ variances. Figure 3 presents the glottal area waveforms that were extracted to complement the kymographs, since kymography is limited to the selected horizontal line of analysis, whereas glottal area waveforms integrate the whole glottal aperture. Figure 4 graphs the calculated OQ by level of adductor stimulation, suggesting an inverse relationship. The OQ decreased with increased stimulation levels from 2 V (OQ, 1) to 5 V (OQ, 0.68) to 8 V (OQ, 0.42) and plateaued thereafter (10 V: OQ, 0.42). The OQ did not change with stimulation levels above 8 V, suggesting that, after maximal adductor neuromuscular activation, no further change in OQ can be produced through adductor muscle stimulation. The phonatory observations reported here were reproduced multiple times to ensure the reliability and repeatability of the model.
In this study, human voice production within a physiologically active human larynx was investigated. For this report, we have chosen to describe the vibratory characteristic of the OQ. The observed relationship suggests that increased adductor muscle group stimulation directly decreased the OQ.
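The reported OQ values can be tabulated to make the decrease and the plateau explicit. The sketch below uses only the four values stated in this section and computes the change in OQ per volt between consecutive reported levels. The piecewise slopes are a descriptive convenience assumed here for illustration; they are not a statistical analysis and were not part of the reported methods.

```python
# Values reported in this section: stimulation level (V) -> OQ
reported_oq = {2: 1.0, 5: 0.68, 8: 0.42, 10: 0.42}

levels = sorted(reported_oq)
slopes = []
for lower, upper in zip(levels, levels[1:]):
    # Change in OQ per volt between consecutive reported levels
    slope = (reported_oq[upper] - reported_oq[lower]) / (upper - lower)
    slopes.append(slope)
    print(f"{lower} V to {upper} V: {slope:+.3f} OQ per volt")

# The OQ never increases with stimulation, and the last interval is flat.
never_increases = all(s <= 0 for s in slopes)
plateau_reached = slopes[-1] == 0.0
print(never_increases, plateau_reached)
```

A zero slope between 8 V and 10 V corresponds to the lower limit described in the text, where further stimulation produced no additional change in the OQ.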
Descriptions of the OQ originated with Timcke et al,15 who defined the OQ as the fraction of a single vibratory cycle during which the glottis is in an open configuration. Timcke and colleagues were among the first to identify OQ variability with vocal modulation, specifically voice intensity and pitch. Although variations in OQ have been associated with changes in voice quality,16,17 these alterations and other reported associations between OQ and voice measures are only loosely associated, and the control of OQ has been poorly described. However, control of the vibratory cycle during phonation is thought to be a critical aspect in normal and abnormal voice production.
For instance, several studies15,18-20 have examined the association between the OQ and vocal intensity without a consensus conclusion regarding a causal relationship. In the canine in vivo model,12 increased medial tension of the phonating vocal folds produced increased vocal intensity, although a follow-up clinical study19 could not show a causal relationship between increased vocal intensity and the OQ. To explain this discrepancy, it has been suggested21 that global laryngeal aerodynamics and vocal tract mechanics can supersede the associations between the OQ and vocal factors seen in experimental data. Otherwise stated, human voice production with the numerous postural and behavioral complexities can obscure the physiologic mechanisms for OQ control.
The components that directly modulate the OQ are thought to be directly associated with the viscoelastic properties of the vibrating folds.22 With prior evidence that adductor group stimulation increases vocal fold tension of the body layer,23 the use of physiologically viable larynges is critical toward the study of phonatory OQ. With the use of the ex vivo perfused human larynx, the data presented here suggest that an increase in adductor group stimulation leads to direct modulation of the OQ. Further systematic variations in the ex vivo model will provide greater understanding of the source mechanisms of OQ modulation and, thereby, voice quality.24 Future studies will be targeted to further define normal phonatory mechanisms, such as vocal fold tension during phonation, as well as pathologic states of voicing, such as unilateral vocal fold paresis with asymmetric tension.
As the ex vivo human phonation model continues to be refined, the current shortcomings of this model can be addressed. Planned refinements include a quieter blood pump system to allow for high-fidelity acoustic recordings to accurately investigate the vocal changes that are observed with vibratory alterations. The model will also include electromyographic analysis as well as subglottic pressure recordings during ex vivo phonation. However, a continued challenge for this model is the lack of statistical analysis, which appears to result largely from the inherent organ-to-organ variability. Factors that lead to this interorgan variability include the variances in clinical data, including sex, age, body height, medical comorbidities, cause of death, length of intubation before organ recovery, and knowledge of prerecovery laryngeal function, as well as methodologic data, including intraoperative medication administration (eg, neuromuscular blockade), time from aortic cross-clamp to laryngeal preservate solution infusion, transport time from recovery hospital to laboratory, and inability to measure consistent reestablishment of whole organ microcirculation perfusion. Because the available donor pool is limited, these variables cannot be strictly controlled, and statistical variance would preclude accurate analysis. However, as experience with the phonation model increases, phonation from organs with similar variables may be combined to allow for reliable statistical analysis.
Using human larynx ex vivo perfused phonation, we investigated the association between adductor muscle group stimulation and glottal OQ. This observational study suggests that increased adductor group neuromuscular stimulation may modulate the OQ. Specifically, increasing adductor stimulation resulted in decreasing the OQ ratio. Future work using the ex vivo perfused human phonatory model will concentrate on direct study of human voice production.
Submitted for Publication: February 20, 2015; final revision received May 3, 2015; accepted May 24, 2015.
Corresponding Author: Abie H. Mendelsohn, MD, Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, 924 Westwood Blvd, Ste 515, Los Angeles, CA 90024 (Amendelsohn@mednet.ucla.edu).
Published Online: July 16, 2015. doi:10.1001/jamaoto.2015.1249.
Author Contributions: Dr Mendelsohn had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Mendelsohn, Zhang, Luegmair, Berke.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Mendelsohn, Zhang, Berke.
Critical revision of the manuscript for important intellectual content: Mendelsohn, Zhang, Orestes, Berke.
Obtained funding: Zhang, Berke.
Administrative, technical, or material support: All authors.
Study supervision: Mendelsohn, Zhang, Berke.
Conflict of Interest Disclosures: None reported.
Funding/Support: This study was supported by grant R01 DC009229 from the National Institute on Deafness and Other Communication Disorders, the National Institutes of Health.
Role of the Funder/Sponsor: The funding organization had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Previous Presentation: This study received the Broyles-Maloney award for outstanding thesis in the field at the 95th annual meeting of the American Broncho-Esophagological Association; April 22, 2015; Boston, Massachusetts.
1. Hirano M. Morphological structure of the vocal cord as a vibrator and its variations. Folia Phoniatr (Basel). 1974;26(2):89-94.
2. Hast MH. Subglottic air pressure and neural stimulation in phonation. J Appl Physiol. 1961;16:1142-1146.
3. Rubin HJ. Experimental studies on vocal pitch and intensity in phonation. Laryngoscope. 1963;73:973-1015.
4. Moore DM, Berke GS. The effect of laryngeal nerve stimulation on phonation: a glottographic study using an in vivo canine model. J Acoust Soc Am. 1988;83(2):705-715.
5. Ishizaka K, Flanagan JL. Synthesis of voiced sounds from a two-mass model of the vocal cords. Bell Syst Tech J. 1972;51:1233-1268.
7. Kim MJ, Hunter EJ, Titze IR. Comparison of human, canine, and ovine laryngeal dimensions. Ann Otol Rhinol Laryngol. 2004;113(1):60-68.
9. Berke G, Mendelsohn AH, Howard NS, Zhang Z. Neuromuscular induced phonation in a human ex vivo perfused larynx preparation. J Acoust Soc Am. 2013;133(2):EL114-EL117.
10. Howard NS, Mendelsohn AH, Berke GS. Development of the ex vivo laryngeal model of phonation. Laryngoscope. 2015;125(6):1414-1419.
11. Hirano M, Ohala J. Use of hooked-wire electrodes for electromyography of the intrinsic laryngeal muscles. J Speech Hear Res. 1969;12(2):362-373.
12. Berke GS, Hanson DG, Gerratt BR, Trapp TK, Macagba C, Natividad M. The effect of air flow and medial adductory compression on vocal efficiency and glottal vibration. Otolaryngol Head Neck Surg. 1990;102(3):212-218.
14. Lohscheller J, Eysholdt U, Toy H, Dollinger M. Phonovibrography: mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Trans Med Imaging. 2008;27(3):300-309.
15. Timcke R, Von Leden H, Moore P. Laryngeal vibrations: measurements of the glottic wave, I: the normal vibratory cycle. AMA Arch Otolaryngol. 1958;68(1):1-19.
16. Alku P, Vilkman E. A comparison of glottal voice source quantification parameters in breathy, normal and pressed phonation of female and male speakers. Folia Phoniatr Logop. 1996;48(5):240-254.
17. Klatt DH, Klatt LC. Analysis, synthesis, and perception of voice quality variations among female and male talkers. J Acoust Soc Am. 1990;87(2):820-857.
18. Orlikoff RF. Assessment of the dynamics of vocal fold contact from the electroglottogram: data from normal male subjects. J Speech Hear Res. 1991;34(5):1066-1072.
19. Hanson DG, Gerratt BR, Berke GS. Frequency, intensity, and target matching effects on photoglottographic measures of open quotient and speed quotient. J Speech Hear Res. 1990;33(1):45-50.
20. Sundberg J, Cleveland TF, Stone RE Jr, Iwarsson J. Voice source characteristics in six premier country singers. J Voice. 1999;13(2):168-183.
21. Henrich N, D’Alessandro C, Doval B, Castellengo M. Glottal open quotient in singing: measurements and correlation with laryngeal mechanisms, vocal intensity, and fundamental frequency. J Acoust Soc Am. 2005;117(3, pt 1):1417-1430.
22. Patel RR, Dubrovskiy D, Döllinger M. Measurement of glottal cycle characteristics between children and adults: physiological variations. J Voice. 2014;28(4):476-486.
23. Berke GS, Smith ME. Intraoperative measurement of the elastic modulus of the vocal fold, part 2: preliminary results. Laryngoscope. 1992;102(7):770-778.
24. Kreiman J, Shue Y-L, Chen G, et al. Variability in the relationships among voice quality, harmonic amplitudes, open quotient, and glottal area waveform shape in sustained phonation. J Acoust Soc Am. 2012;132(4):2625-2632.