Scatterplot of Epworth Sleepiness Scale and apnea-hypopnea index findings for all subjects who underwent full-night polysomnography (n = 50). Note the apparent lack of association between these variables.
Scatterplot of Epworth Sleepiness Scale and apnea-hypopnea index for all subjects who underwent split-night polysomnography (n = 46). Note the apparent lack of association between these variables.
Weaver EM, Kapur V, Yueh B. Polysomnography vs Self-reported Measures in Patients With Sleep Apnea. Arch Otolaryngol Head Neck Surg. 2004;130(4):453-458. doi:10.1001/archotol.130.4.453
While obstructive sleep apnea syndrome is defined by both polysomnographic (PSG) abnormalities and symptoms, severity is quantified primarily by the apnea-hypopnea index (AHI) alone.
To determine the correlation between standard PSG indices (AHI and others) and self-reported sleepiness, mental health status, and general health in patients with sleep apnea.
University-affiliated outpatient sleep laboratory.
Ninety-six consecutive patients with PSG-confirmed sleep apnea (AHI ≥5).
Patients completed a questionnaire that included the Epworth Sleepiness Scale, Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36) mental health domain, and self-rated health on the evening of diagnostic PSG. Spearman correlation coefficients were computed. This sample had 85% power to detect a correlation of 0.3 or greater. The associations between PSG indices and self-reported measures were further assessed with multivariable regression techniques, adjusting for age, sex, body mass index, comorbidity, and PSG type.
The PSG parameters correlated poorly with self-reported measures (15 correlations; range of magnitude, 0.004-0.24; mean, 0.09). AHI was not associated with self-reported sleepiness or general health, and it was associated with the SF-36 Health Status mental health domain only on multiple linear regression (P = .04) but not on multiple logistic regression (adjusted odds ratio, 1.02; 95% confidence interval, 1.00-1.04; P = .09).
In general, PSG measures, and AHI in particular, correlated poorly with self-reported measures in a clinical sleep laboratory sample. After adjustment for potentially confounding variables, weak associations were found between some PSG indices and selected self-reported measures. These findings suggest that sleep apnea disease burden should be quantified with both physiologic and subjective measures.
Polysomnography (PSG) is considered the gold standard for the diagnosis of obstructive sleep apnea syndrome (OSAS), the estimation of its severity, and measurement of treatment response.1 While OSAS involves both a PSG abnormality and symptoms, its severity is often defined by the apnea-hypopnea index (AHI) alone.2 Improvement of symptoms is an important outcome, especially for patients with mild OSAS, in which serious medical complications are less likely to occur.3-6
Clinically, we have noted discordance between the severity indicated by PSG results and the degree of symptoms reported by some patients with OSAS. Thus, we sought to determine the correlation between the severity of abnormalities as measured by standard PSG indices and the severity of self-reported sleepiness, symptoms of depression, and general health status in a cross-sectional analysis of a clinical OSAS population.
We retrospectively reviewed the records of all patients (N = 96) who had diagnostic PSG findings consistent with OSAS (AHI ≥5) in the University of Washington Sleep Laboratory between July 1, 1998, and January 31, 1999. Age, sex, height, weight, and self-reported comorbid conditions (specifically, angina, myocardial infarction, congestive heart failure, stroke, and chronic obstructive pulmonary disease) were recorded for each patient on the evening of PSG. The comorbidity score (range, 0-5) was the number of comorbid conditions listed above. Comorbidity score was considered missing if the patient answered "don't know" or did not respond to the comorbidity questionnaire (n = 24). Body mass index was calculated as weight in kilograms divided by the square of height in meters. This study was approved by the University of Washington Human Subjects Review Committee.
All patients underwent overnight, monitored, in-laboratory diagnostic PSG. When split-night PSG studies were performed (combined diagnostic and continuous positive airway pressure titration; n = 46), only data from the diagnostic portion (mean ± SD sleep time, 118 ± 72 minutes) were analyzed for this study. The PSG included recordings of sleep state parameters (4-lead electroencephalogram, bilateral electro-oculogram, and submental and bilateral leg electromyogram), breathing (oronasal airflow by thermistors and thoracic and abdominal excursion by strain gauge), oximetry, electrocardiogram, and infrared video. All studies were manually scored by trained technicians and confirmed by a single board-certified sleep physician (V.K.).
Sleep stages7 and arousals8 were scored in standard fashion. Apnea was defined as a 75% or greater reduction of airflow for 10 seconds or longer. Apnea index was defined as the number of apneas per hour of sleep. Hypopnea was defined as a 25% or greater reduction of airflow for 10 seconds or longer associated with either cortical arousal or 3% oxyhemoglobin desaturation. Apnea-hypopnea index was defined as the number of apneas and hypopneas per hour of sleep and was clinically categorized as normal (0-4.9), mild (5-15), moderate (15.1-30), or severe (>30).2 Arousal index was defined as the number of arousals per hour of sleep. The lowest oxyhemoglobin saturation was the lowest recorded during sleep, and hypoxemic burden was defined as the percentage of sleep time with an oxyhemoglobin saturation of less than 90%. Each of these PSG indices was analyzed as a continuous variable, and AHI was also analyzed as an ordinal variable categorized clinically and as 33rd percentiles.
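As a minimal illustration of the index and category definitions above, the following Python sketch computes AHI from event counts and sleep time and maps it to the clinical categories. The function names and the example counts are hypothetical and are not taken from the study. Treating values between 4.9 and 5 as normal is an assumption, since the published ranges do not say how such values were handled.

```python
def apnea_hypopnea_index(apneas, hypopneas, sleep_minutes):
    """Apneas plus hypopneas per hour of sleep."""
    if sleep_minutes <= 0:
        raise ValueError("sleep_minutes must be positive")
    return (apneas + hypopneas) * 60.0 / sleep_minutes


def ahi_category(ahi):
    """Map an AHI value to the clinical categories listed in the text.

    Assumption: values in the unspecified gap between 4.9 and 5 count as normal.
    """
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "normal"
    if ahi <= 15:
        return "mild"
    if ahi <= 30:
        return "moderate"
    return "severe"


# Hypothetical example: 12 apneas and 30 hypopneas in 120 minutes of sleep.
example_ahi = apnea_hypopnea_index(12, 30, 120)
example_category = ahi_category(example_ahi)
```

Computing the rate as events times 60 divided by minutes avoids an intermediate division, which keeps values that land exactly on a category boundary from being pushed over it by floating-point rounding.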
During the evening of PSG, each patient had completed a self-administered clinical questionnaire that included the Epworth Sleepiness Scale (ESS),9 the mental health domain of the Medical Outcomes Study 36-Item Short-Form Health Survey questionnaire (SF-36),10 and self-rated health.11 The clinical questionnaire did not include other parts of the SF-36. Patients were blinded to PSG results (not yet measured) at the time of the questionnaire. Eighty-four patients completed all questions.
The ESS was analyzed as a continuous variable, as a clinically categorized ordinal variable (normal, 0-10; excessive daytime sleepiness, 11-15; and severely excessive daytime sleepiness, 16-24), and as a dichotomized variable (normal, 0-10; excessive daytime sleepiness, 11-24).9
The SF-36 mental health domain measures symptoms of depression and anxiety, which are common complaints in patients with sleep apnea.12 The score ranges from 0 to 100, with 100 representing the best score. This variable was analyzed as a continuous variable and as a dichotomized variable with a cut point 1 SD above the mean for clinically depressed people (depressed, 0-67; normal, 68-100).10
Self-rated health was a single item that asked, "In general would you say your health is . . . ?" and was scored on a 5-point Likert scale ranging from excellent (1) to poor (5). It has been shown to be a broad predictor of clinically important outcomes.11,13,14 Self-rated health was analyzed as an ordinal variable (scale, 1-5) and as a dichotomized variable (excellent/very good vs good/poor).
Contingency tables were analyzed to demonstrate basic associations. Spearman correlations were tabulated for the self-reported measures and PSG parameters. Spearman correlations are presented because the assumptions of normal distributions for Pearson correlations were not met. With 96 subjects, this study had 85% power to detect an important correlation coefficient (≥0.3) at the 2-tailed significance level of .05.15
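The stated power can be checked approximately. The sketch below uses the Fisher z approximation for a correlation coefficient, which is an assumption on our part: the article cites reference 15 for its power calculation and may have used a different method. The function name `correlation_power` is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate 2-tailed power to detect a true correlation r with n subjects.

    Uses the Fisher z transform: atanh(r_hat) is roughly normal with
    standard deviation 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    shift = atanh(r) * sqrt(n - 3)           # expected test statistic under r
    # Probability of landing in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


power_n96 = correlation_power(0.3, 96)
```

With r = 0.3 and n = 96, this approximation gives a power of roughly 0.85, in line with the figure quoted in the text.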
Multiple linear regression was used to further assess for significant associations between each self-reported measure (ESS or SF-36 mental health domain) as the dependent (continuous) variable and each PSG index as the independent (continuous) variable, adjusting for age, sex, body mass index, comorbidity score, PSG type (split-night vs full-night), and presence of periodic limb movement disorder. The AHI was also analyzed separately as an ordinal variable of clinical categories and as an ordinal variable of 33rd percentiles. Residual diagnostics were analyzed visually to confirm assumptions of linearity and homogeneity of variance (no heteroscedasticity). Each adjustment variable is a potential confounder because each may be associated with PSG indices and self-reported measures. Adjustment for the presence of periodic limb movement disorder did not affect the results, so only analyses without this adjustment are shown. Adjustment for the presence of individual comorbid conditions (dummy variable model) was not significantly different from adjustment for comorbidity score; the latter is reported.
Multivariable logistic regression was used to further assess for significant associations between each dichotomized self-reported measure (ESS, SF-36 mental health domain, or self-rated health) as the dependent variable and each PSG index as the independent variable, adjusting for age, sex, body mass index, comorbidity score, PSG type (split-night vs full-night), and presence of periodic limb movement disorder. Multivariable ordinal logistic regression was then used to further evaluate the association between each ordinal self-reported measure (ESS categories or self-rated health) as the dependent variable and each PSG index as the independent variable, adjusting for the same potential confounders as above. Ordinal logistic regression produces a common odds ratio that estimates the odds of the dependent ordinal variable increasing by 1 higher category for each unit increase in the independent variable. The common odds ratio is the same between any 2 consecutive categories of the dependent variable, and it is not dependent on the category cut points. The assumption of proportional odds was confirmed by an approximated score test for each model.
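The common odds ratio described above corresponds to the standard proportional odds model. The notation below is ours rather than the authors': Y is the ordinal self-reported measure with K categories, x is the PSG index, z is the vector of adjustment variables, and the alpha, beta, and gamma symbols are the model parameters.

```latex
\[
\operatorname{logit} P(Y > j \mid x, \mathbf{z})
  = \alpha_j + \beta x + \boldsymbol{\gamma}^{\top}\mathbf{z},
  \qquad j = 1, \dots, K-1
\]
\[
\mathrm{OR}_{\text{common}} = e^{\beta}
\]
```

Only the intercepts depend on the cut point j. The slope for the PSG index is shared across cut points, which is why a single odds ratio applies between any 2 consecutive categories and why the proportional odds assumption has to be tested.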
All data were analyzed with Intercooled Stata 7.0 software (Stata Corp, College Station, Tex). A P value lower than .05 was considered significant.
The sample population was middle aged, obese, and predominantly male and had mild comorbidity (Table 1). The study sample manifested severe sleep apnea based on AHI and excessive subjective daytime somnolence based on ESS (Table 2). The patients had a significant deficit in the SF-36 mental health domain relative to the general US population norms (Table 2). They scored slightly worse on self-rated health relative to the general US population norms (Table 2). There were broad distributions for age, body mass index, each PSG index, and each self-reported measure. When stratified across AHI clinical categories, there was no trend of association between AHI category and any self-reported measure (Table 3).
Approximately half of the patients underwent full-night PSG, and the other half had split-night PSG. The full-night and split-night PSG sample populations were similar in age, comorbidity score, ESS, SF-36 mental health domain, and self-rated health (P>.05 for all). Patients qualified for split-night PSG by demonstrating sufficiently severe OSAS in the first few hours of the PSG to justify titration of continuous positive airway pressure therapy. Consistent with this qualifying criterion, these patients had significantly worse PSG parameters and OSAS risk factors (male sex and obesity; P<.05 for all) than those who did not qualify for a split-night PSG.
Most of the Spearman correlations between the various standard PSG indices were significant, with a mean correlation magnitude of 0.45 (Table 4). Most of the correlations between the various self-reported measures were significant, with a mean correlation magnitude of 0.26. However, nearly all of the correlations between the standard PSG indices and the self-reported measures were not significant, with a mean correlation magnitude of 0.09 (Table 4). The AHI correlated particularly poorly with all self-reported measures (mean correlation magnitude, 0.02). Symptoms (ESS and SF-36 mental health domain) correlated poorly with the PSG indices (mean correlation magnitude, 0.06).
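For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks, with tied values sharing the average of their ranks. The following self-contained Python sketch illustrates the calculation. The function names are hypothetical, and the example values are made up for illustration; they are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(sxx * syy)


# Made-up values for illustration only (not study data).
toy_x = [6, 12, 25, 40, 55]
toy_y = [5, 6, 7, 8, 7]
toy_rho = spearman(toy_x, toy_y)
```

Because only ranks enter the calculation, the statistic does not require normally distributed variables, which is the reason given in the methods for preferring it to the Pearson correlation.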
These poor overall correlations persisted when split-night and full-night PSG were analyzed separately. Figure 1 and Figure 2 depict the poor association between ESS and AHI in each PSG group. For split-night PSG, the mean correlation between all PSG indices and all self-reported measures was 0.13; between AHI and all self-reported measures, it was 0.08. For full-night PSG, the mean correlation between all PSG indices and all self-reported measures was 0.13; between AHI and all self-reported measures, it was 0.09. Since these correlations were slightly higher than before stratifying on PSG type, PSG type was adjusted for in all multivariable regression analyses.
Of the 15 relationships between PSG indices and self-reported measures, the only statistically significant correlation was self-rated health with hypoxemic burden (correlation, 0.24; P = .02). The relationship of self-rated health to arousal index (correlation, 0.19; P = .07) and lowest saturation (correlation, 0.19; P = .07) approached significance, as did ESS and lowest saturation (correlation, 0.21; P = .06).
Thirteen of the 14 multiple linear regression models of each self-reported measure (continuous variable ESS or SF-36 mental health domain) and each PSG index revealed no significant association. The only significant association was between the SF-36 mental health domain and AHI (continuous variable): AHI regression coefficient, –0.18 (P = .04; 95% confidence interval [CI], –0.36 to 0.00). Thus, for every increase in AHI by 10 events per hour, there was a decrease in the SF-36 mental health domain score of 1.8, when all confounding variables were held constant. The SF-36 mental health domain was not associated with the AHI clinical category (P = .39) or 33rd percentile category (P = .40).
Twenty of the 21 multivariable logistic regression models of each dichotomized self-reported measure (ESS, SF-36 mental health domain, or self-rated health) and each PSG index demonstrated no significant association. The only significant association was between the dichotomized SF-36 mental health domain and the arousal index (adjusted odds ratio, 1.02; P = .05; 95% CI, 1.00-1.04). This model suggests that for each increase in arousal index of 10 arousals per hour, there is a 22% increased odds of being depressed when all confounding variables are constant.
Twelve of the 14 multivariable ordinal logistic regression models of each ordinal self-reported measure (ESS categories or self-rated health) and each PSG index showed no significant association. The only significant associations were between self-rated health and arousal index (adjusted common odds ratio, 1.02; P = .04; 95% CI, 1.00-1.03) and between self-rated health and lowest saturation (adjusted common odds ratio, 0.94; P = .02; 95% CI, 0.90-0.99). For example, for each increase in arousal index of 10 arousals per hour, the odds of a worse self-rated health increased by 19% when all other modeled variables were held constant. For each decrease in lowest saturation of 5%, the odds of a worse self-rated health increased by 34%, with all confounding variables constant.
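The per-10-unit and per-5-unit interpretations in the regression results above follow from rescaling per-unit estimates: a linear coefficient is multiplied by the number of units, and an odds ratio is raised to that power. The Python sketch below illustrates this with the rounded values printed in the text. The function names are hypothetical. The published percentages were presumably computed from unrounded estimates (an assumption on our part), so the rounded inputs reproduce them only approximately.

```python
def scale_coefficient(beta_per_unit, units):
    """Change in the outcome for a change of `units` in the predictor."""
    return beta_per_unit * units


def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units` in the predictor.

    A negative `units` gives the odds ratio for a decrease in the predictor.
    """
    if or_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return or_per_unit ** units


def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percentage increase (or decrease) in odds."""
    return (odds_ratio - 1.0) * 100.0


# SF-36 mental health vs AHI: coefficient of -0.18 per event per hour.
sf36_change_per_10 = scale_coefficient(-0.18, 10)       # about -1.8 points

# Arousal index: rounded odds ratio of 1.02 per arousal per hour.
arousal_or_per_10 = scale_odds_ratio(1.02, 10)          # about 1.22

# Lowest saturation: rounded odds ratio of 0.94 per 1% increase,
# rescaled to a 5% decrease.
saturation_or_per_5_drop = scale_odds_ratio(0.94, -5)   # about 1.36
```

From the rounded odds ratio of 1.02, the 10-unit odds ratio is about 1.22, which matches the 22% reported for the dichotomized SF-36 model; the ordinal self-rated health models report 19% and 34%, consistent with underlying estimates slightly different from the rounded 1.02 and 0.94.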
Among all the regression models, AHI was significant only with the SF-36 mental health domain and only on multiple linear regression (Table 5). The AHI clinical category or 33rd percentile category was not significantly associated with any self-reported measure. Among the confounding variables in all the models, sex and comorbidity were most commonly significant (data not shown).
Symptoms and health-related quality of life deficits are important elements of OSAS.2,12,16 Among subjects at risk for OSAS, the symptoms and subjective complaints are independent predictors of important clinical and psychosocial outcomes such as amount of sick leave, worse self-rated health, impaired work performance, and divorce rate.17 Clinicians routinely record the qualitative complaints of patients with OSAS, but in research and clinical practice, the quantitative PSG indices are used as the primary assessment of disease severity. The importance of measuring, researching, and reporting the subjective effects of OSAS is gaining attention.16
For any given subject, it may be difficult to predict the severity of OSAS on PSG from the symptoms and vice versa. In large epidemiologic studies, only a minority of individuals with an AHI greater than 5 complain of subjective sleepiness. Although a correlation between subjective sleepiness and AHI is present, it is modest.18 In a clinical sleep medicine practice, one can see discordance between the severity of subjective complaints and PSG findings in some patients with OSAS.16 Our data confirmed that these subjective complaints and objective measures of OSAS severity were, at best, weakly associated in our clinical laboratory sample.
While the standard physiologic measures (PSG) of OSAS correlate significantly with the ESS18 and SF-36 mental health domain19 in population-based studies, the present study and others12,20-27 suggest that the association may not hold in sleep laboratory–based clinical populations. Limiting analyses to subjects with sleep apnea may remove the major driver of the association between quantitative objective and subjective measures of disease burden. That is, subjects from the general population without apnea have no PSG abnormalities and tend to have few symptoms, while those with OSAS have PSG abnormalities and tend to have clinically important symptoms. When both subgroups are included, the association between PSG and symptoms is observed. When the analysis is limited to 1 subgroup, namely, those with OSAS, the association may be lessened. There were few subjects without OSAS in our clinical laboratory sample, and all had clinically important symptoms, which reflects a natural selection bias in the clinical setting. Because sleep laboratory patients without sleep apnea (ie, no PSG abnormality) were symptomatic, including them in the analysis reduced the association between PSG severity and symptom severity even further (data not shown).
Descriptions of community-based populations help us understand OSAS, but descriptions of these relationships in clinical populations help clinicians understand individual patients. Since the severity of OSAS is defined by AHI,2 but patients often self-assess severity by symptoms (like sleepiness and depression), understanding the relationship in individual patients is useful to clinicians. After all, it is patients, not the "general population," who seek evaluation and treatment. While PSG does measure the specific physiologic derangements of OSAS and may estimate future medical risk of OSAS,3 in individual patients it appears not to estimate some of the day-to-day effects often associated with OSAS. As a whole, our patients experienced these day-to-day effects but not in correlation with the severity of their OSAS as measured by PSG. There may be 2 (relatively) distinct effects of OSAS, namely, long-term medical risk on one hand and day-to-day symptoms and quality of life deficits on the other. The physiologic measures of disease may not estimate both consequences well. Subjective outcomes are best determined through patient inquiry. Some have attributed the discordance between physiologic and subjective experience to a differential susceptibility of individual patients to the physiologic effects of OSAS.28
A second explanation for the discrepancy is that many factors besides OSAS influence sleepiness, depression, and health status. While OSAS severity may play a role, it may be diluted by other factors like poor sleep hygiene, life stressors, and comorbidity. However, our patients reported a mean nightly sleep time of 6.9 hours, which suggests fair sleep hygiene. The discrepancies largely persisted even after adjusting for comorbidity and presence of periodic limb movement disorder.
Another interpretation of these findings may simply be that the ESS, SF-36 mental health domain, and self-rated health may not measure important aspects of OSAS. However, sleepiness is a diagnostic criterion for OSAS,2 and the ESS is a common measure of subjective sleepiness. In interviews of patients with sleep apnea, investigators have found that sleepiness and depression are 2 important issues to these patients.29,30 Self-rated health has been associated with the presence of sleep-disordered breathing19 and with sleepiness31 in community-based studies and with OSAS symptoms in an obesity trial-based study.17
Our analyses did reveal some weak associations between self-reported measures and PSG indices. No single PSG index stood out as a consistent predictor of self-reported measures. It is notable that AHI was a poor predictor (Table 5). It makes sense that arousal index, a measure of sleep fragmentation, may correlate best with symptoms. However, other studies support only a weak association.20,25,32
The incompleteness of the self-reported measures is a limitation of this study. While sleepiness, depression, and self-rated health are important to patients with OSAS, they do not measure the effects of OSAS on function and quality of life. It will be useful to study the correlation between PSG and quality of life using psychometrically validated instruments that measure general and OSAS-specific symptoms, functional status, health status, bother, and satisfaction in patients and their sleep partners. Quality of life instruments specific to OSAS will be more sensitive than generic instruments for evaluating the burden of OSAS on patients.
Another potential limitation is that almost half of the PSGs were performed as split-night studies. These abbreviated diagnostic studies in severe cases may not fully measure the extent of OSAS and may thus confound the relationship between severity of OSAS and subjective measures. This minor source of confounding, however, does not explain the lack of association. Even when full-night and split-night subjects were analyzed separately, the correlation point estimates were poor in both groups (individual correlations not shown). In the regression analyses, adjustment for PSG type did not change the overall results.
The incomplete comorbidity adjustment is another limitation of these results. Our comorbidity adjustment was similar to that of other studies that have included a comorbidity adjustment.19,27 Standard comorbidity indices (eg, the Charlson Index33 and Kaplan-Feinstein Index34) are based on mortality outcomes and thus may not be valid for application in subjective outcomes. An optimal comorbidity index has not been well characterized for adjustment in quality of life studies. As with any observational study, it is possible that other important confounding variables were not analyzed.
The cross-sectional nature of this study tells us only about the assessment of OSAS burden. It will be important to investigate whether a change in physiologic measures after treatment correlates with a change in symptoms and quality of life. If the changes in measures do not correlate well, as was found in a recent study using the SF-36,25 then one should be very careful about what primary outcome to use in determining the best treatment strategies for OSAS. The physiologic measures alone may misread the outcome of most importance to many patients.
In conclusion, these results suggest that there are only weak associations between the severity of PSG index abnormalities and the degree of self-reported sleepiness, mental health, and general health among our OSAS patient population. The primary measure of OSAS severity, AHI, was not associated with self-reported sleepiness or general health. These conclusions imply that it may be prudent for clinical investigators to assess physiologic and subjective severity of OSAS distinctly when studying patients with OSAS. Future prospective longitudinal studies using validated disease-specific quality of life instruments will help discern the association between PSG index outcomes and quality of life outcomes.
Corresponding author and reprints: Edward M. Weaver, MD, MPH, 1660 S Columbian Way (112-OTO), Seattle, WA 98108 (e-mail: email@example.com).
Submitted for publication May 19, 2003; accepted September 3, 2003.
This research was supported by the Veterans Health Administration and the Robert Wood Johnson Clinical Scholars Program (Dr Weaver) and Career Development Award CD-98318 from the Health Services Research and Development Service of the Veterans Health Administration, Department of Veterans Affairs (Dr Yueh).
This work was presented at the Associated Professional Sleep Societies 14th annual meeting; June 21, 2000; Las Vegas, Nev.
The views expressed in this article are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs.