Mitchell RB, Garetz S, Moore RH, Rosen CL, Marcus CL, Katz ES, Arens R, Chervin RD, Paruthi S, Amin R, Elden L, Ellenberg SS, Redline S. The Use of Clinical Parameters to Predict Obstructive Sleep Apnea Syndrome Severity in Children: The Childhood Adenotonsillectomy (CHAT) Study Randomized Clinical Trial. JAMA Otolaryngol Head Neck Surg. 2015;141(2):130-136. doi:10.1001/jamaoto.2014.3049
Importance
It is important to distinguish children with different levels of severity of obstructive sleep apnea syndrome (OSAS) preoperatively using clinical parameters. This can identify children who most need polysomnography (PSG) prior to adenotonsillectomy (AT).
Objective
To assess whether a combination of factors, including demographics, physical examination findings, and caregiver reports from questionnaires, can predict different levels of OSAS severity in children.
Design, Setting, and Participants
Baseline data from 453 children from the Childhood Adenotonsillectomy (CHAT) study were analyzed. Children 5.0 to 9.9 years of age with PSG-diagnosed OSAS, who were considered candidates for AT, were included.
Interventions
Polysomnography for diagnosis of OSAS.
Main Outcomes and Measures
Linear or logistic regression models were fitted to identify which demographic, clinical, and caregiver reports were significantly associated with the apnea hypopnea index (AHI) and oxygen desaturation index (ODI).
Results
Race (African American), obesity (body mass index z score > 2), and the Pediatric Sleep Questionnaire (PSQ) total score were associated with higher levels of AHI and ODI (P = .05). A multivariable model that included the most significant variables explained less than 3% of the variance in OSAS severity as measured by PSG outcomes. Tonsillar size and Friedman palate position were not associated with increased AHI or ODI. Models that tested for potential effect modification by race or obesity showed no evidence of interactions between any clinical measure and AHI or ODI (P > .20 for all comparisons).
Conclusions and Relevance
This study of more than 450 children with OSAS identifies a number of clinical parameters that are associated with OSAS severity. However, information on demographics, physical findings, and questionnaire responses does not robustly discriminate different levels of OSAS severity.
Trial Registration
clinicaltrials.gov Identifier: NCT00560859
Obstructive sleep apnea syndrome (OSAS) is a common disorder of breathing during sleep characterized by episodic upper airway obstruction that interferes with normal respiratory gas exchange.1 In addition to a sleep disturbance, there is compelling evidence that pediatric OSAS negatively affects behavior and quality of life (QOL). The most common cause of OSAS in children is adenotonsillar hypertrophy. Consequently, adenotonsillectomy (AT) is the first-line surgical treatment for pediatric OSAS,2,3 with more than 500 000 procedures performed per annum in the United States.4
Polysomnography (PSG) is the gold standard for the diagnosis and quantification of OSAS in children. However, most children (90%)3,5 have an AT performed for OSAS based on clinical criteria without objective data from PSG. This is because PSG is costly, can be burdensome, and is often unavailable for children. Also, it remains unknown whether PSG thresholds can predict clinically significant long-term morbidity or disease burden. The diagnosis is, therefore, often made based on clinical parameters that include nighttime and daytime symptoms of OSAS in the presence of adenotonsillar hypertrophy. However, evidence to date has shown a poor correlation between clinical parameters and the presence of OSAS. Several studies,6-8 including a systematic review of the literature,9 have shown that clinical evaluation is inaccurate at distinguishing primary snoring from OSAS. These studies have been mostly small cohort studies or retrospective reviews with a variety of diagnostic criteria used to define OSAS. A study by Brouillette et al10 developed an “OSAS score” that was able to distinguish children with OSAS from normal, nonsnoring children. However, the utility of this OSAS score was not confirmed by subsequent studies that demonstrated its inability to distinguish primary snoring from OSAS.6,11
A previous study, based on the Childhood Adenotonsillectomy (CHAT) study data, reported that a higher apnea hypopnea index (AHI) correlated with race, body mass index (BMI) z score, environmental tobacco smoke, family income, and referral source. However, it did not look specifically at whether these associations can distinguish severe from mild to moderate OSAS.12 It remains unclear how individual or a combination of clinical parameters can predict OSAS severity as measured by PSG. Predicting OSAS severity is important. In particular, it can identify children who most need an objective investigation to evaluate their risk for perioperative respiratory complications or for persistent OSAS after AT. There is, therefore, a compelling reason to identify children with severe OSAS preoperatively using clinical parameters and to prioritize those needing PSG.
The present study reports an analysis of baseline data from the CHAT study, a randomized controlled clinical trial (RCT) designed to assess neuropsychological and health outcomes in children randomized to early AT compared with watchful waiting with supportive care.13 The objective of the current analysis is to assess whether a combination of factors, including demographics, physical examination findings, and caregiver reports from validated questionnaires can predict OSAS severity in children. Because we hypothesized that symptoms and physical findings may differ across subgroups of children, we also explored whether the predictive value of these parameters differed in obese vs normal weight and African American vs non–African American children.
The CHAT study was a multicenter, single-blinded RCT conducted at 6 academic sleep centers. The methodology of the CHAT study is described in detail elsewhere,13 and the primary outcomes have been reported.14 A total of 453 children were recruited and randomized to AT or watchful waiting with supportive care and followed up at 7 months.14 Participants were recruited from pediatric sleep centers and sleep laboratories, pediatric otolaryngology clinics, general pediatric clinics, and the general community. Institutional review board approval was obtained from each participating institution, children provided assent, and parents provided written informed consent. Patients were compensated for the primary study.
The targeted study population was children 5.0 to 9.9 years of age who were considered candidates for AT. Children were included in the study based on a parental report of snoring and an otolaryngology evaluation that confirmed the child as a candidate for AT (tonsillar hypertrophy score ≥1 based on a standardized scale of 0-4) and a standardized, centrally scored PSG showing an obstructive apnea index (OAI) (the number of obstructive apneas per hour of sleep) of at least 1 or an AHI of at least 2 events per hour.13 Exclusion criteria included an OAI greater than 20, an AHI greater than 30, a percentage of sleep time at an oxyhemoglobin saturation of less than 90% for more than 2% of total sleep time, comorbidities (craniofacial or cardiac disorders), recurrent tonsillitis requiring surgical intervention, use of medications for attention-deficit/hyperactivity disorder, and extreme obesity (BMI z score ≥3.00 for age group and sex).13
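The PSG and clinical thresholds above can be restated as a simple eligibility check. The Python sketch below is illustrative only and is not the study's screening software: the `Screening` record and its field names are hypothetical, and the criteria that require clinical judgment (comorbidities, recurrent tonsillitis, medication use) are collapsed into a single assumed flag.

```python
from dataclasses import dataclass


@dataclass
class Screening:
    """Hypothetical screening record; field names are illustrative."""
    age_years: float
    snoring_reported: bool
    tonsil_score: int                    # standardized scale of 0-4
    oai: float                           # obstructive apneas per hour of sleep
    ahi: float                           # apnea hypopnea index, events per hour
    pct_sleep_time_spo2_below_90: float  # % of total sleep time with SpO2 < 90%
    bmi_z: float
    other_exclusion: bool = False        # assumed flag for the clinical exclusions


def is_eligible(s: Screening) -> bool:
    """Apply the inclusion and exclusion thresholds stated in the text."""
    included = (
        5.0 <= s.age_years <= 9.9
        and s.snoring_reported
        and s.tonsil_score >= 1
        and (s.oai >= 1 or s.ahi >= 2)
    )
    excluded = (
        s.oai > 20
        or s.ahi > 30
        or s.pct_sleep_time_spo2_below_90 > 2
        or s.bmi_z >= 3.0
        or s.other_exclusion
    )
    return included and not excluded
```

Because the upper PSG limits act as exclusions, every child who passes this check has an AHI of 30 or less, which is the truncated severity range that the Discussion notes as a limitation.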
Data were derived from screening and baseline measurements.13 Caregivers completed questionnaires about demographic information, including age, sex, race, height, and weight. Race was categorized as African American and non–African American. Height and weight were converted into age- and sex-adjusted BMI percentiles and z scores (http://www.cdc.gov/growthcharts/). Tonsillar size was based on an otolaryngologic evaluation and categorized as grades 1 and 2 (≤50% obstruction) and grades 3 and 4 (>50% obstruction).13 The Friedman palate position score15 was also recorded. The Friedman palate position score is based on visualization of structures in the mouth with the mouth open widely without protrusion of the tongue or phonation. Palate grade I allows the observer to visualize the entire uvula and tonsils, grade II allows visualization of the uvula but not the tonsils, grade III allows visualization of the soft palate but not the uvula, and grade IV allows visualization of the hard palate only. For the Friedman scores, grades I and II were compared with grades III and IV for easier interpretation of the data.15
Data from the following 3 questionnaires were included in the analysis:
1. Epworth Sleepiness Scale modified for children, an 8-item questionnaire in which scores range from 0 to 24, with higher scores indicating greater daytime sleepiness.16
2. Pediatric Sleep Questionnaire (PSQ) Sleep-Related Breathing Disorder scale, a 22-item questionnaire in which scores range from 0 to 1, with higher scores indicating greater severity.17
3. The OSA-18, an 18-item disease-specific QOL questionnaire18 assessing symptoms in domains of sleep disturbance (score range, 4-28), physical suffering (range, 4-28), emotional distress (range, 3-21), daytime problems (range, 3-21), and caregiver concerns (range, 4-28). Total scores ranged from 18 to 126, with higher scores indicating worse QOL.
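The OSA-18 domain ranges quoted above are consistent with each item being rated from 1 to 7. The short Python sketch below checks that arithmetic; the per-domain item counts are inferred from the quoted ranges and are an assumption, not a restatement of the instrument itself.

```python
# Items per OSA-18 domain, inferred from the score ranges quoted in the text
# (assumption: each item is rated on a 1-7 scale).
DOMAIN_ITEMS = {
    "sleep disturbance": 4,
    "physical suffering": 4,
    "emotional distress": 3,
    "daytime problems": 3,
    "caregiver concerns": 4,
}
ITEM_MIN, ITEM_MAX = 1, 7


def domain_range(n_items):
    """Return the (minimum, maximum) possible score for n_items items."""
    return n_items * ITEM_MIN, n_items * ITEM_MAX


def total_range():
    """Return the possible range of the OSA-18 total score."""
    return domain_range(sum(DOMAIN_ITEMS.values()))
```

Under these assumptions a 4-item domain spans 4 to 28, a 3-item domain spans 3 to 21, and the 18 items together span 18 to 126, matching the ranges reported.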
All children underwent full-night PSG by study-certified technicians using a standardized protocol and following the American Academy of Sleep Medicine (AASM) guidelines. Scoring was performed according to the AASM pediatric criteria by certified technologists blinded to all other study data at a central PSG reading center (Case Western Reserve University/Brigham and Women’s Hospital). The AHI was defined as the sum of all obstructive and mixed apneas, plus hypopneas associated with a 50% reduction in airflow and either a greater than 3% desaturation or electroencephalographic arousal, divided by hours of total sleep time. The ODI was defined as the number of times per sleep hour that oxygen saturation dropped by 3% or more.
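Both indices are event counts divided by hours of sleep. A minimal Python sketch of the two definitions follows; the function and parameter names are hypothetical, and the event counts are assumed to have already been scored according to the criteria described above.

```python
def apnea_hypopnea_index(obstructive_apneas, mixed_apneas, hypopneas, total_sleep_hours):
    """AHI: qualifying apneas plus hypopneas per hour of total sleep time."""
    if total_sleep_hours <= 0:
        raise ValueError("total_sleep_hours must be positive")
    return (obstructive_apneas + mixed_apneas + hypopneas) / total_sleep_hours


def oxygen_desaturation_index(desaturations_3pct, total_sleep_hours):
    """ODI: number of desaturations of 3% or more per hour of sleep."""
    if total_sleep_hours <= 0:
        raise ValueError("total_sleep_hours must be positive")
    return desaturations_3pct / total_sleep_hours
```

For a made-up night with 12 obstructive apneas, 3 mixed apneas, and 33 hypopneas over 8 hours of sleep, `apnea_hypopnea_index(12, 3, 33, 8.0)` gives 6.0 events per hour.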
Summary statistics are presented as means (SDs) for continuous independent variables and frequencies (percentages) for categorical independent variables. The primary outcomes, the AHI and ODI, were examined in continuous form using linear regression and in binary form using logistic regression. To satisfy the assumptions for the linear regression models, the natural log of AHI (lnAHI) and of ODI (lnODI) were used in analyses. In the logistic regression models, AHI and ODI were each dichotomized at a value of at least 10. To identify which of the demographic, clinical, and caregiver reports were significantly associated with AHI and ODI, univariate linear or logistic regression models were fitted. To further examine potential effect modification by obesity status, linear and logistic regression models were fitted that added an interaction term between each characteristic and obesity status (BMI percentile ≥95th or <95th). Similar models were fitted to assess if there was significant interaction between each characteristic and race (African American vs non–African American). No additional covariates were added to these models.
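The outcome transformations and a univariate linear fit can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the authors' code (the analyses were run in SAS and R): the data values are made up, the helper names are hypothetical, and the logistic and interaction models are not implemented here.

```python
import math


def log_transform(values):
    """Natural log of each value, as used for lnAHI and lnODI."""
    return [math.log(v) for v in values]


def dichotomize(values, cutoff=10.0):
    """1 if the index is at least the cutoff, otherwise 0."""
    return [1 if v >= cutoff else 0 for v in values]


def simple_linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; also returns R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_resid / ss_total


# Made-up values for illustration only; these are not CHAT data.
ahi = [2.1, 4.8, 12.5, 3.3, 25.0, 7.9]
psq = [0.30, 0.45, 0.60, 0.40, 0.70, 0.55]

ln_ahi = log_transform(ahi)
severe = dichotomize(ahi)          # binary outcome for a logistic model
slope, intercept, r_squared = simple_linear_fit(psq, ln_ahi)
```

The R-squared returned here is the proportion of variance in lnAHI explained by the single predictor, which is the same kind of quantity the Results report as being below 3%.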
In addition, multiple linear or logistic regression models were fitted to assess which characteristics simultaneously predicted each outcome. Any characteristic that was significant at α < .20 in the univariate analysis was entered into a multiple linear or logistic regression model. The final multivariable models for AHI and ODI included all variables identified by the selection procedures (backward, stepwise) that remained simultaneously significant at α < .05. Diagnostic procedures were conducted to ensure that none of the model assumptions were violated. All tests were 2-sided. All analyses were performed using SAS statistical software (version 9.3; SAS Institute Inc) and R software (version 2.13.0 or higher).
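The two-stage variable selection described above (univariate screening at α < .20, then retention at α < .05) can be sketched as follows. This Python sketch covers backward elimination only, not the stepwise variant, and it is not the authors' SAS or R code; `fit_p_values` is a hypothetical caller-supplied function that refits the model and returns a P value per variable, and all P values shown are placeholders.

```python
def screen_univariate(p_values, alpha_entry=0.20):
    """Keep characteristics whose univariate P value is below the entry threshold."""
    return [name for name, p in p_values.items() if p < alpha_entry]


def backward_eliminate(candidates, fit_p_values, alpha_stay=0.05):
    """Drop the least significant variable until all remaining are below alpha_stay.

    fit_p_values(names) must refit the model on `names` and return
    a dict mapping each name to its P value (hypothetical interface).
    """
    selected = list(candidates)
    while selected:
        p = fit_p_values(selected)
        worst = max(selected, key=lambda name: p[name])
        if p[worst] < alpha_stay:
            break
        selected.remove(worst)
    return selected


# Placeholder P values for illustration only; not results from the study.
univariate_p = {"race": 0.01, "psq_total": 0.03, "age": 0.60, "epworth": 0.15}
multivariable_p = {"race": 0.02, "psq_total": 0.04, "epworth": 0.30}

entered = screen_univariate(univariate_p)
final = backward_eliminate(entered, lambda names: {n: multivariable_p[n] for n in names})
```

In a real analysis the refit changes every P value at each step, so the fixed lookup used here only demonstrates the control flow.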
Analyses are based on 453 children with baseline characteristics reported in Table 1. Their mean age was 7.0 years, and 52% were girls. Most of the children (55%) were African American. The mean BMI z score was 0.9, and 33% of children were obese. Seventy-four percent had grade 3 or 4 tonsils, and 64% had a grade III or IV Friedman palate position score. The mean Epworth score was 7.7, the mean OSA-18 total score was 53.6, and the mean PSQ total score was 0.5. A total of 88 (19.5%) had an AHI of at least 10.
In univariate analyses (Table 2), the OSA-18 and PSQ total scores were each significantly associated with higher levels of AHI or ODI, either measured continuously or dichotomized. Race (African American) was significantly associated with higher levels of AHI, either measured continuously or dichotomized, and ODI measured continuously. Obesity (BMI z score >2; >95th percentile) was associated with higher levels of AHI and ODI measured continuously. When dichotomized, obesity (BMI z score >2) was significantly associated with ODI of at least 10 but not AHI of at least 10. Conversely, obesity (>95th percentile) was significantly associated with AHI of at least 10 but not ODI of at least 10. The modified Epworth Sleepiness Scale score was significantly associated with ODI measured on a continuous scale and with either AHI or ODI dichotomized. Age, sex, tonsillar size, and Friedman palate position scores were not associated with AHI or ODI (Table 2). Models that tested for potential effect modification by race or obesity showed no evidence of interactions between any clinical measure and AHI or ODI (P > .20 for all comparisons).
In multivariate analysis modeling of lnAHI (Table 3), explanatory variables that remained significant were race (African American) and the PSQ total score. For models predicting the lnODI, significant variables were BMI z score and the OSA-18 total score. The partial R-squared, which measures the proportion of the variance of OSAS severity explained by these variables, was less than 3% for all calculations (Table 3). In the multivariate logistic regression models of the AHI of at least 10 (Table 4), explanatory variables that remained significant were race (African American) and the OSA-18 total score. For the model predicting the ODI of at least 10, the only significant variable was BMI z score. The odds ratio (OR) for race (African American) predicting the AHI of at least 10 was 1.65 (95% CI, 1.01-2.69) and for the OSA-18 total score 1.02 (95% CI, 1.00-1.03). The C statistics for these logistic models, measuring the probability that outcomes are predicted better than chance, were 0.62 for predicting AHI of at least 10 and 0.60 for predicting ODI of at least 10.
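The C statistic and the odds ratio interval can be made concrete with a short Python sketch. The C statistic is computed here as the concordance probability: the fraction of (event, non-event) pairs in which the event has the higher predicted score, with ties counted as one-half, so that 0.5 corresponds to chance. The coefficient and standard error in the usage note are back-calculated from the reported OR and CI and are an assumption, not values taken from the study tables.

```python
import math


def c_statistic(scores, outcomes):
    """Concordance over all (event, non-event) pairs; ties count as 0.5."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

As a rough check, a coefficient of ln(1.65) with an assumed standard error of about 0.25 gives an interval of approximately 1.01 to 2.69 from `odds_ratio_ci(math.log(1.65), 0.25)`, in line with the interval reported for race.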
The present study indicates that a statistical model that incorporates information on demographics, physical findings, and questionnaire responses does not robustly discriminate different levels of OSAS severity. The study does, however, identify a number of clinical parameters that are associated with OSAS severity as measured by PSG variables. The results were similar when considering these variables as continuous or categorical outcomes. These clinical parameters include indices of obesity, African American race, OSA-18 total score, and PSQ total score. Associations of these variables with PSG variables were not significantly modified by obesity or race. Indeed, even a multivariable model that included the most significant variables explained less than 3% of the variance in the PSG outcomes.
The severity of OSAS was measured in this study using the AHI and ODI, 2 common metrics derived from PSG. The AHI is a measure of respiratory event frequency, whereas the ODI was used as a measure of intermittent hypoxemia. Each index is a “count” per sleep hour of the number of respiratory disturbances or desaturation events, respectively, with higher numbers considered to represent more severe OSAS. The weak association between these indices and other clinical parameters is consistent with prior studies in children.6-9 However, studies to date have mostly looked at the predictive value of clinical parameters in distinguishing primary snoring from OSAS, not at predicting OSAS severity. Both individual studies and reviews have concluded that neither a single clinical parameter nor a combination of parameters can distinguish primary snoring from OSAS.6-9 The present study of a large and well-characterized sample indicates that clinical parameters, even when collected using standardized approaches and instruments, are also poor predictors of different levels of OSAS severity in children.
Some prior studies have included an analysis of the association between clinical history or symptoms and OSAS severity and have agreed with the findings of our study.11,19 Carroll et al11 studied 83 children and, as part of the analysis, defined OSAS as an AHI greater than 5 or greater than 10. As the AHI cutoff was raised, the proportion of reported symptoms increased, but the predictive ability of these symptoms in distinguishing primary snoring from OSAS did not change significantly. Similarly, Goodwin et al19 showed that the frequency of snoring, excessive daytime sleepiness, and learning problems increased with OSAS severity with an AHI of 1 to 5 and with hypoxemia. However, Goodwin et al19 studied the impact of increasing OSAS severity on clinical outcomes rather than the value of clinical parameters in predicting OSAS severity. These studies,11,19 in keeping with our results, highlight the difficulty in identifying a set of clinical parameters that can distinguish among children with different severities of OSAS.
Our results indicate that assessment of tonsillar size and palate position by physical examination provides limited information on the severity of OSAS and raises questions about the current clinical practice of using this information as part of surgical decision-making. Howard and Brietzke20 compared tonsil size, Friedman palate position scores, and preoperative AHI in 34 children and also found no correlation between these examination findings and PSG outcomes. They postulated that this was because most children fall in the middle group (size or score II-III), and this corresponds to a wide range of OSAS severity. They reported on some patients with size 1+ tonsils and severe OSAS and others with size 4+ tonsils without evidence of OSAS. Equally, our clinical examination during wakefulness was unlikely to quantify the dynamic interactions between the adenoids, tonsils, and soft palate and thus could not represent the 3-dimensional aspect of airway restriction during sleep.21 It is likely, therefore, that our clinical examination assessed some, but not all, sources of upper airway obstruction in these children.
Multiple linear regression analysis showed that the best model included only 2 clinical parameters (race and PSQ total score) for AHI and 3 (race, BMI z score and OSA-18 total score) for ODI. However, when the models were tested for their ability to predict OSAS severity, they performed poorly. Similarly, multivariable logistic regression showed that race (African American vs non–African American) made it more likely (OR, 1.7) to have an AHI and ODI of at least 10, but that this had limited predictive ability (C statistics of 0.62 and 0.60, where 0.5 indicates chance). There are several possible explanations for our inability to predict OSAS severity more accurately. The symptoms of OSAS predominantly occur at night when the child may not be closely observed, and this may contribute to an inaccuracy of the reported symptoms. Questionnaires are also inherently subjective, with some parents overreporting and others underreporting symptoms. As mentioned herein, examination findings also show no correlation with OSAS severity. Most children with different levels of OSAS severity seem to have similar examination findings when evaluated in the clinic, and these findings may not correlate with upper airway narrowing while the child is asleep. This reflects research that shows that OSAS severity is influenced by multiple factors other than anatomic factors, including abnormal upper airway neuromuscular tone.22 The poor correlations we observed may also reflect the modestly truncated range of values of OSAS severity in this research sample because no child had an AHI greater than 30.
The findings of this study indicate that there is unique information contained within the PSG and other clinical assessment tools. A key challenge is to identify which measures are needed to improve decision making, including approaches to treatment, perioperative treatment, and long-term follow-up. There remains a debate about whether all children with a sleep disturbance who are candidates for AT should undergo PSG or whether this should be limited to high-risk groups.23-25 Identifying children with severe OSAS facilitates perioperative planning and may reduce the risk of surgical complications but currently can only be done using PSG. The role of PSG compared with the use of clinical parameters and symptom measures to provide long-term prognostic information requires further investigation. The growing recognition of the importance of “patient-centered research,” in which outcomes such as QOL are prioritized over the results of objective physiological tests, supports the need for further research in this area.
There are a number of strengths to this study. It recruited a large diverse patient population from 6 centers across the United States and used rigorous, standardized approaches for data collection. The PSGs were scored blinded to all other clinical data by a central sleep reading center with high levels of scoring reliability; all children provided a standard medical history and underwent examination, and caregivers completed validated questionnaires. There are also several limitations. The sample had a truncated distribution of AHI levels, and thus we could not compare a “normal” to a “diseased” group. There is also no universal definition of severe OSAS in children. We therefore used an AHI of 10 to define severe OSAS.25 This is a commonly used definition but has not been shown to correlate clinically with severity or disease burden. Children with an obstructive apnea index greater than 20, AHI greater than 30, or percentage of sleep time at an oxyhemoglobin saturation of less than 90% for more than 2% of total sleep time were excluded as part of the study method and were not analyzed. The sample was also limited to children 5.0 to 9.9 years of age. It is possible, but unlikely, that different associations between clinical parameters and PSG findings would be observed in younger or older children or in children with very severe OSAS and hypoxemia who were excluded from this study. However, it is worth noting that only 43 of 1244 children who had screening PSGs (3.5%) were excluded from the CHAT study because of OSAS severity.12
This study of more than 450 children with OSAS shows that some clinical parameters are correlated with more severe OSAS, but none individually or in combination explains a substantial proportion of the variance in either the AHI or ODI, nor do they discriminate the presence or absence of more severe OSAS defined by PSG. Our findings further confirm that African American race and obesity are associated with more severe OSAS in children ages 5.0 to 9.9 years and indicate that, of the instruments examined, the OSA-18 and PSQ most significantly correlated with PSG outcomes. However, the weak associations indicate that OSAS severity cannot be accurately predicted by traditional clinical measures or commonly used instruments.
Corresponding Author: Ron B. Mitchell, MD, Department of Otolaryngology–Head and Neck Surgery, UT Southwestern and Children’s Medical Center Dallas, 2350 N Stemmons Freeway, ENT Administration, Sixth Floor, F6.212, Dallas, TX 75207 (Ron.Mitchell@UTSouthwestern.edu).
Submitted for Publication: June 14, 2014; final revision received September 17, 2014; accepted October 4, 2014.
Published Online: December 4, 2014. doi:10.1001/jamaoto.2014.3049.
Author Contributions: Dr Mitchell had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Mitchell, Garetz, Moore, Marcus, Arens, Chervin, Amin, Ellenberg, Redline.
Acquisition, analysis, or interpretation of data: Mitchell, Garetz, Moore, Rosen, Marcus, Katz, Arens, Chervin, Paruthi, Elden, Ellenberg, Redline.
Drafting of the manuscript: Mitchell, Garetz, Moore, Katz, Arens, Redline.
Critical revision of the manuscript for important intellectual content: Mitchell, Garetz, Moore, Rosen, Marcus, Arens, Paruthi, Amin, Elden, Ellenberg, Redline.
Statistical analysis: Moore, Ellenberg.
Obtained funding: Marcus, Arens, Amin, Ellenberg, Redline.
Administrative, technical, or material support: Mitchell, Arens, Chervin, Paruthi, Ellenberg, Redline.
Study supervision: Mitchell, Rosen, Chervin, Elden.
Conflict of Interest Disclosures: Dr Rosen has been a consultant about childhood narcolepsy for Jazz Pharmaceuticals and a short-term expert consultant for Natus Medical. Dr Marcus has received research support from Philips Respironics and Ventus. Dr Chervin is named in or has developed patented and copyrighted materials, owned by the University of Michigan, and designed to assist with assessment or treatment of sleep disorders; these materials include the Pediatric Sleep Questionnaire Sleep-Related Breathing Disorder scale, used in the research now reported. Dr Chervin serves on the boards of the American Academy of Sleep Medicine and the International Pediatric Sleep Society; is a section editor for UpToDate and a book editor for Cambridge University Press; has received support for biomedical innovation education or research from Philips Respironics and Fisher Paykel; and has consulted for MC3 and Zansors. Dr Ellenberg has received payments from Bristol-Myers Squibb for data monitoring center service; from Merck, Chelsea, Salix, and GSK for general consulting; and from Merck, Janssen, and 23andMe for internal lectures. No other disclosures are reported.
Funding/Support: This study was funded by National Institutes of Health grants: HL083075, HL083129, UL1 RR024134, UL1 RR024989. Dr Redline and Brigham and Women’s Hospital have received a research grant from ResMed Foundation and research equipment from Philips Respironics and ResMed Inc.
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.