To review the published literature on uvulopalatopharyngoplasty (UPPP), assess the methodological quality of the research, and compare the findings with those of a similar review published in 1995, in order to determine what, if any, improvement in the methodological quality of the research occurred during the ensuing 10 years.
Methodological and statistical evaluation of the published literature on UPPP. Thirty articles representing the clinical studies on UPPP and related procedures written from January 1996 to August 2005 were reviewed. Only articles reporting polysomnography data were included.
Overall, the articles demonstrated fair methodological and statistical quality. Compared with the previous review by Schechtman et al, there was a slight increase in the number of articles that discussed statistical power and reported confidence intervals. There were increases in the mean sample size, the percentage of randomized controlled studies, the number of end points, and the use of validated subjective outcome measures; longer mean follow-up time; and more complete reporting of age and sex information. There was no increase in the percentage of published studies that used a prospective study design. None of the studies that required minimum acceptable baseline values of objective sleep parameter measures for enrollment indicated the use of separate screening and baseline assessments. There were 7 different definitions of sleep apnea and 17 different definitions of success in treatment.
There has been an overall improvement in the quality of the articles published on UPPP since 1995. Several areas still need improvement: use of more prospective studies, decrease in number of end points, use of separate screening and baseline assessments, and consensus in the definitions of sleep apnea and success.
Evidence-based medicine is the application of clinical research findings to clinical care. The ability to successfully apply the findings of research studies to clinical situations depends on the quality of the studies available. Several researchers1,2 have expressed concern about the methodological and statistical problems that are prevalent in clinical research studies. These problems are prevalent in the sleep apnea literature.3
Schechtman et al3 reviewed 37 articles on uvulopalatopharyngoplasty (UPPP) that were published from January 1966 to December 1994. They identified 9 key methodological and statistical problems in these articles and discussed ways to improve the quality of studies performed on treatments for sleep apnea. The 9 problems included inadequate sample size and little statistical power, failure to report confidence bounds, uncontrolled studies, inadequate follow-up, results with uncertain generalizability, failure to assess quality of life (QOL), multiple end points, missing data and missing or inconsistent definitions, and biased baseline data. This article3 has been cited 7 times in other articles published since 1995.
The purpose of our study was to review articles that have been written on UPPP since 1995 and the publication of the Schechtman et al3 article and to determine what, if any, improvement in the methodological quality of the research has resulted.
A methodological and statistical evaluation of the current literature on UPPP was performed. The literature search protocol was similar to that used by Schechtman et al.3 A search of the published medical literature was performed using the MEDLINE bibliographic database. Medical subject headings included uvulopalatopharyngoplasty and sleep apnea syndromes. All articles were written in English, included only adult subjects (age >18 years), and were published from January 1996 through August 2005. Reviews, editorials, and letters were excluded from the review. Articles were excluded if they contained information about snorers who may not have had sleep apnea, or if they lacked appropriate baseline data (ie, apnea index, apnea-hypopnea index, or respiratory disturbance index). Articles were also excluded if (1) the patient population numbered less than 10, (2) they did not report postoperative sleep study results, or (3) the patients had already been described in another study. Thirty articles representing the clinical studies on UPPP and related procedures remained: 12 on UPPP,4-15 5 on laser-assisted UPPP,16-20 3 on UPPP combined with tongue reduction,21-23 6 on UPPP combined with maxillofacial surgery,24-29 1 on UPPP without tonsillectomy,30 1 comparing UPPP with lateral pharyngoplasty,31 1 comparing UPPP with transpalatal advancement pharyngoplasty,32 and 1 on modified UPPP.33
A standard data collection form was created to capture information on the key methodological problems identified by Schechtman et al.3 We reviewed the articles separately using the data collection form. After independent review, answers were evaluated, and where there was a discrepancy between us, consensus was reached through discussion.
Subjective outcome measures included measures of snoring, sleepiness, general well-being, and reports of patient symptoms. Validated subjective outcome measures were defined as measures that have demonstrated construct validity and internal consistency. We defined an end point as any measure of outcome that was included in the statistical analysis of outcome. We did not include variables that were evaluated but not reported.
Inadequate sample size and little statistical power
Among the 30 articles, the mean sample size for each article was 55 (median size, 46; range of sample size, 13-277). Among the 30 articles, only 3 (10%) discussed statistical power.
Although 26 of the 30 UPPP articles (87%) reported P values, only 4 (13%) presented confidence bounds.34 Of these 4, none incorporated confidence bounds in the interpretation of the results.
Eight studies (27%) used a control group to compare one treatment with another. Of the 8 prospective studies that used control groups, only 2 (25%) included randomization for treatment assignment.
Among the 30 UPPP articles, 27 (90%) provided some information about the length of follow-up, 13 (43%) provided fixed follow-up times, 4 (13%) provided mean follow-up times, 3 (10%) provided minimum follow-up times, 4 (13%) provided a range of follow-up times, and 3 (10%) provided both mean follow-up times and a range of follow-up times. One article reported the follow-up time as “a couple of months,” which we assumed to be 2 months. Four articles provided both short- and long-term follow-up times. Using the shorter time in the articles with both short- and long-term follow-ups, the overall mean follow-up time was 5.1 months and the median was 4 months; the range was 1 to 18 months in the 27 articles that provided the necessary data.
Results with uncertain generalizability
Among the 30 UPPP articles, 13 (43%) were definitely prospective, 11 (37%) were definitely retrospective, 2 (7%) were probably prospective, 1 (3%) was probably retrospective, and 3 (10%) were indeterminate. A mean of 26% of patients was lost to follow-up (range, 0%-76%). Seventeen of the 30 studies (57%) had more than 20% of patients lost to follow-up.
QOL measures and multiple end points
Twenty of the articles (67%) used subjective outcome measures of improvement, such as measures of snoring and excessive daytime sleepiness. Of these 20 articles, 10 (50%) used validated subjective measures.
In the 30 UPPP articles, the mean number of end points was 7.3, and the range of end points was 1 to 33. Five articles (17%) reported 1 or 2 end points, 14 (47%) had 3 to 5 end points, 5 (17%) had 6 to 9 end points, and 6 articles (20%) had at least 10 end points.
Missing data and missing or inconsistent definitions
As mentioned in the “Inadequate Follow-up” subsection, 3 articles (10%) did not provide mean follow-up data, and 6 articles (20%) did not specify whether the study was prospective or retrospective. Three articles (10%) did not provide sufficient information to determine conclusively whether the study was prospective or retrospective; however, there were 3 other articles in which the status could be determined from the context, although there was no clear statement as to whether the research was prospective or not. Two articles (7%) did not provide information on the age of the population evaluated, and 1 article (3%) did not provide information on the sex distribution of the sample.
The definition of obstructive sleep apnea (OSA) was not specified in 16 articles (53%). Among the articles that defined OSA, there were 7 different definitions (Table 1). Of the 30 articles, 26 articles (87%) defined criteria for success in treatment. Among these articles, there were 17 different definitions (Table 2). Three articles had separate criteria for “success” and “cure.”
Required minimum acceptable baseline values of outcome measures for enrollment were defined in 13 of the 30 studies (43%). Of these 13, none indicated the use of separate screening and baseline assessments.
Comparison of studies published in the periods 1966-1994 and 1995-2005
In Table 3, we compare our findings in 2005 with those of Schechtman et al.3 Compared with their study, the percentage of articles that discussed statistical power increased, although the 95% confidence interval (CI) around this increase suggests that this difference is not statistically significant (95% CI, −3.4 to 35.2). The mean sample size increased, and thus the power of the studies increased. The percentage of articles reporting CIs increased, although the 95% CI around this increase suggests that this difference also is not statistically significant (95% CI, −2.6 to 23.8). Confidence intervals are important to report because they provide information about the precision of the results; CIs are also helpful in the interpretation of clinical research results.34 Both the percentage of controlled studies and that of randomized controlled studies increased, although the 95% CIs demonstrate that this difference is not statistically significant (95% CIs, −3.4 to 35.2 and −5.0 to 55.0, respectively). The mean follow-up time, use of prospective studies, and the mean number of end points increased, although none of these increases were statistically significant (95% CIs, −7.7 to 25.5, −17.0 to 31.0, and −4.1 to 3.1, respectively). There was a statistically significant increase in the use of subjective outcome measures (95% CI, 16.4-50.2), especially in the use of validated subjective outcome measures. There was also a statistically significant improvement in the reporting of age and sex information (95% CIs, 7.8-42.8 and 20.1-54.3, respectively). Slightly more articles cited in the current study defined OSA than in the earlier study,3 although this difference was not significant (95% CI, −18.2 to 29.6). It should be noted that there is still no consensus on the definition of sleep apnea. In both reviews, none of the studies that required minimum acceptable baseline values of outcome measures for enrollment indicated the use of separate screening and baseline assessments.
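As a rough illustration of how an interval for a change in percentage can be obtained and read, the sketch below computes a Wald 95% CI for the difference between two independent proportions. The function name `diff_proportion_ci` is hypothetical, and the Wald method is an assumption, since the text does not state which method was used. Only the 3 of 30 articles and the total of 37 earlier articles come from the text; the earlier-review count of 1 is an illustrative placeholder, not a figure from Schechtman et al.

```python
import math


def diff_proportion_ci(x1, n1, x2, n2, z=1.96):
    """Wald CI for p1 - p2, returned in percentage points (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return 100 * (diff - z * se), 100 * (diff + z * se)


# 3 of 30 articles discussed statistical power in this review (from the text).
# 1 of 37 is a HYPOTHETICAL count for the earlier review, used only to
# show the mechanics of the calculation.
low, high = diff_proportion_ci(3, 30, 1, 37)

# An interval that contains 0 is compatible with no change at all.
significant = not (low <= 0 <= high)
print(f"95% CI for the increase: {low:.1f} to {high:.1f} percentage points")
print("statistically significant:", significant)
```

With small samples such as these, an exact or score-based interval may be preferable to the Wald approximation; the point here is only that an interval spanning zero does not support a claim of improvement.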
The purpose of this study was to assess the quality of articles that have been published on UPPP in the decade since the 1995 study by Schechtman et al3 and to determine if there has been an improvement in the methodological quality of the research. Overall, it seems that there has been modest improvement in the quality of the published literature. Notable areas include more randomized studies and the use of subjective outcome measures in assessing the efficacy of treatment modalities. It is well known that the results of objective tests often correlate poorly with symptoms of sleep apnea and the functional impairments associated with it.35 Although objective measures of sleep apnea severity are important, disease-specific health status and QOL measures need to be evaluated to give clinical relevance to results. Although there has been an increase in the use of validated subjective outcome measures (eg, the Epworth Sleepiness Scale), the relatively infrequent use of validated QOL measures remains a weakness in the UPPP literature. Several disease-specific QOL measures are now available. The OSA Patient-Oriented Severity Index,36 the Calgary Sleep Apnea QOL index,37 and the Functional Outcomes of Sleep Questionnaire38 have been shown to be valid disease-specific, health-related QOL measures for OSA.
Several methodological areas still need improvement. There was no appreciable increase in the proportion of the studies that were prospective. Well-performed retrospective cohort studies are very valuable; however, they are more prone to bias than prospective studies. Also, the mean percentage of patients lost to follow-up was 26%, with 57% of the studies having more than 20% of patients lost to follow-up. This high number of patients lost to follow-up highlights the inherent challenge in longitudinal studies. Complete follow-up is difficult to achieve and requires considerable effort, but it is necessary to minimize bias and achieve a high level of confidence in the results. Patients lost to follow-up introduce bias into the study because the reason for failure to follow up may be linked to the outcome, resulting in a study population that may not be typical of the larger population. Consequently, it may be difficult to assess the generalizability of results. This problem is more prevalent in retrospective studies because they effectively require follow-up as an entry criterion: patients with missing follow-up data are excluded.
There has been an increase in the mean number of end points. Large numbers of end points increase the incidence of type I or “false-positive” errors (the probability of wrongly claiming significance). At the P = .05 level of significance, a statistical test has a 5% probability of claiming significance when none exists. When more than 1 statistical test is performed, the probability of wrongly claiming significance in at least 1 of the tests exceeds 5%. In fact, when 10 statistically independent tests are performed, the chance of at least 1 test being significant when in fact no true difference exists is approximately 40%.39 Given this, significant results in some of the studies with large numbers of end points may have occurred by chance alone. Multiple end points are not necessarily bad, but they have to be managed appropriately, for example by (1) defining the primary end point, (2) statistical correction, and/or (3) demonstrating consistency across all end points.
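The 40% figure follows from the formula 1 − (1 − α)^k for k independent tests at significance level α. The sketch below reproduces that arithmetic and shows one common form of statistical correction, the Bonferroni threshold. The function names are hypothetical, the Bonferroni method is offered only as an example of a correction (the text does not name a specific one), and the calculation assumes the tests are statistically independent.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false-positive result among
    n_tests independent tests, each performed at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise
    error rate at or below alpha (Bonferroni correction)."""
    return alpha / n_tests


# 1 and 33 are the smallest and largest numbers of end points
# reported in the reviewed articles; 10 matches the example in the text.
for k in (1, 10, 33):
    rate = familywise_error_rate(k)
    threshold = bonferroni_threshold(k)
    print(f"{k:>2} end points: P(at least 1 false positive) = {rate:.2f}, "
          f"Bonferroni per-test threshold = {threshold:.4f}")
```

End points in a clinical study are usually correlated rather than independent, so the true family-wise rate is typically somewhat lower than this bound suggests, and Bonferroni correction is correspondingly conservative.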
None of the studies that required minimum baseline values of sleep parameter measures for enrollment indicated the use of separate screening and baseline assessments. This problem is almost universal in the sleep apnea literature. When a minimum baseline value of an objective measure is required as a criterion for enrollment in a study, it is important to perform a baseline assessment separate from the screening assessment because results on screening assessments may reflect day-to-day variability. Setting minimum baseline values for enrollment biases the baseline data toward higher values because patients with lower values are not included in the study. If the screening values are also used as baseline values, baseline values will be biased estimates of the true values. Thus, posttreatment values will tend to be lower than pretreatment values even when there is no therapeutic effect. This effect is due to regression toward the mean.
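The regression-toward-the-mean effect described above can be sketched with a small simulation in which no treatment is given at all. Every number here is a hypothetical assumption chosen for illustration (a true value with mean 20 and SD 8, night-to-night variability with SD 6, an enrollment threshold of 15), and the model is deliberately simplified: measurements are normally distributed and not clipped at zero.

```python
import random
from statistics import mean


def simulate(n_patients=20000, threshold=15.0, seed=0):
    """Simulate enrollment on a noisy screening value with NO treatment effect.

    Returns the mean screening, separate-baseline, and follow-up values
    among enrolled patients. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    screening, baseline, follow_up = [], [], []
    for _ in range(n_patients):
        true_value = rng.gauss(20.0, 8.0)           # stable underlying severity
        screen = true_value + rng.gauss(0.0, 6.0)   # one noisy screening night
        if screen < threshold:
            continue                                # not enrolled
        screening.append(screen)
        # A separate baseline night and a follow-up night: same true value,
        # fresh night-to-night noise, no therapeutic effect.
        baseline.append(true_value + rng.gauss(0.0, 6.0))
        follow_up.append(true_value + rng.gauss(0.0, 6.0))
    return mean(screening), mean(baseline), mean(follow_up)


screen_mean, baseline_mean, follow_up_mean = simulate()
print(f"screening used as baseline: {screen_mean:.1f}")
print(f"separate baseline:          {baseline_mean:.1f}")
print(f"follow-up (no treatment):   {follow_up_mean:.1f}")
```

Under these assumptions the screening mean is noticeably higher than the follow-up mean even though nothing changed, whereas the separately measured baseline is close to the follow-up value. This is why a baseline assessment separate from the screening assessment removes the spurious improvement.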
Other areas that need improvement include the definition of OSA and the definition of criteria for treatment success. Most of the articles we reviewed did not define OSA. In addition, there were 7 different definitions for OSA. This suggests that there has been no improvement in this area over the past 10 years. Similarly, there were 17 different definitions of treatment success. These problems are prevalent in the sleep apnea literature and are not limited to the UPPP literature. This highlights the need for consensus on the definition of OSA and the definition of success when evaluating the efficacy of therapeutic modalities for OSA.
This study has its limitations. There were occasional disagreements between the numbers of end points initially reported by each of us on independent review. However, consensus was reached on further review by both of us. To match the report by Schechtman et al,3 we reviewed only UPPP articles that included polysomnography outcomes. There may be high-quality studies focusing on subjective outcomes, which we did not review. Therefore, this review may underestimate the use of patient-based subjective measures in sleep apnea research.
In summary, there has been an improvement in the overall quality of the articles published on UPPP since 1995. However, certain areas still need improvement. Clinical researchers, journal reviewers, and editors should insist on higher methodological and statistical standards in an effort to improve the care of patients with OSA. In addition, experts and thought leaders should exclude from their reviews articles and suggestions that do not demonstrate high standards.
Correspondence: Jay F. Piccirillo, MD, Clinical Outcomes Research Office, Department of Otolaryngology–Head and Neck Surgery, Washington University School of Medicine, Campus Box 8115, 660 S Euclid Ave, St Louis, MO 63110 (firstname.lastname@example.org).
Submitted for Publication: March 27, 2007; final revision received November 27, 2007; accepted December 3, 2007.
Author Contributions: Both authors had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Megwalu and Piccirillo. Acquisition of data: Megwalu and Piccirillo. Analysis and interpretation of data: Megwalu. Drafting of the manuscript: Megwalu. Critical revision of the manuscript for important intellectual content: Megwalu and Piccirillo. Statistical analysis: Megwalu and Piccirillo. Administrative, technical, and material support: Megwalu. Study supervision: Piccirillo.
Financial Disclosure: None reported.
RR The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial: survey of 71 “negative” trials. N Engl J Med. 690-694.
JF Methodological and statistical problems in sleep apnea research: the literature on uvulopalatopharyngoplasty. Sleep. 659-666.
SF Prediction of uvulopalatopharyngoplasty response using cephalometric radiographs. Am J Otolaryngol. 179-184.
DW Uvulopalatopharyngoplasty: the Naval Medical Center, Portsmouth, experience. Am J Otolaryngol. 174-177.
et al. Upper airway changes in snorers and mild sleep apnea sufferers after uvulopalatopharyngoplasty (UPPP). Chest. 1595-1603.
et al. Uvulopalatopharyngoplasty for the obstructive sleep apnoea syndrome: value of polysomnography, Mueller manoeuvre and cephalometry in predicting surgical outcome. Clin Otolaryngol Allied Sci. 504-510.
N Results of uvulopalatopharyngoplasty after diagnostic workup with polysomnography and sleep endoscopy: a report of 136 snoring patients. Eur Arch Otorhinolaryngol. 91-95.
AK A three-centre prospective pilot study to elucidate the effect of uvulopalatopharyngoplasty on patients with mild obstructive sleep apnoea due to velopharyngeal obstruction. Clin Otolaryngol Allied Sci. 95-103.
R Efficacy of uvulopalatopharyngoplasty in unselected patients with mild obstructive sleep apnea. Otolaryngol Head Neck Surg. 179-182.
PI van der Meche FG Long-term results of uvulopalatopharyngoplasty for obstructive sleep apnea syndrome. Laryngoscope. (3, pt 1):469-475.
K Evaluation of uvulopalatopharyngoplasty in treatment of obstructive sleep apnea syndrome. Acta Otolaryngol Suppl. 52-56.
K Comparison of surgery and nasal continuous positive airway pressure treatment for obstructive sleep apnea syndrome. Acta Otolaryngol Suppl. 46-50.
E Effect of UPPP with respect to site of pharyngeal obstruction in sleep apnoea: follow-up at 18 months by overnight recording of airway pressure and flow. Clin Otolaryngol Allied Sci. 38-43.
H Nasal-CPAP, surgery, and conservative management for treatment of obstructive sleep apnea syndrome: a randomized study. Chest. 114-119.
et al. Efficacy of laser-assisted uvulopalatoplasty in obstructive sleep apnea. Otolaryngol Head Neck Surg. 643-647.
C Laser-assisted uvulopalatoplasty for the treatment of mild, moderate, and severe obstructive sleep apnea. Laryngoscope. 79-85.
DJ A cost-effective and rational surgical approach to patients with snoring, upper airway resistance syndrome, or obstructive sleep apnea syndrome. Laryngoscope. 726-734.
P Laser-assisted uvulopalatoplasty and tonsillectomy for the management of obstructive sleep apnea syndrome. Laryngoscope. 1175-1181.
LM Combined temperature-controlled radiofrequency tongue reduction and UPPP in apnea surgery. Ear Nose Throat J. 640-644.
NJ Combined uvulopalatopharyngoplasty and radiofrequency tongue base reduction for treatment of obstructive sleep apnea/hypopnea syndrome. Otolaryngol Head Neck Surg. 611-621.
M Eight years of follow-up: uvulopalatopharyngoplasty combined with midline glossectomy as a treatment for obstructive sleep apnoea syndrome. Acta Otolaryngol Suppl. 175-178.
RB Staged surgical treatment of obstructive sleep apnea syndrome: a review of 35 patients. J Oral Maxillofac Surg. 382-385.
P Obstructive sleep apnea syndrome: fifty-one consecutive patients treated by maxillofacial surgery. Am J Respir Crit Care Med. (2, pt 1):641-649.
A A protocol for uvulopalatopharyngoplasty, mortised genioplasty, and maxillomandibular advancement in patients with obstructive sleep apnea: an analysis of 40 cases. J Oral Maxillofac Surg. 892-897.
A Usefulness of uvulopalatopharyngoplasty with genioglossus and hyoid advancement in the treatment of obstructive sleep apnea. Arch Otolaryngol Head Neck Surg. 435-440.
M The role of the Genial Bone Advancement Trephine system in conjunction with uvulopalatopharyngoplasty in the multilevel management of obstructive sleep apnea. Otolaryngol Head Neck Surg. 73-79.
RL Outcomes of hyoid suspension for the treatment of obstructive sleep apnea. Arch Otolaryngol Head Neck Surg. 440-445.
H The effect of uvulopalatopharyngoplasty without tonsillectomy using local anaesthesia: a prospective long-term follow-up. J Laryngol Otol. 542-547.
ID Lateral pharyngoplasty versus uvulopalatopharyngoplasty: a clinical, polysomnographic and computed tomography measurement comparison. Sleep. 942-950.
HJ Transpalatal advancement pharyngoplasty outcomes compared with uvulopalatopharyngoplasty. Otolaryngol Head Neck Surg. 211-217.
PC Modified uvulopalatopharyngoplasty: the extended uvulopalatal flap. Am J Otolaryngol. 311-316.
JA The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med. 200-206.
AJ Uvulopalatopharyngoplasty for obstructive sleep apnea in adults: clinical correlation with polysomnographic results. Ear Nose Throat J. 63-66.
KB Obstructive sleep apnea treatment outcomes pilot study. Otolaryngol Head Neck Surg. 833-844.
MA Development of a disease-specific health-related quality of life questionnaire for sleep apnea. Am J Respir Crit Care Med. 494-503.
et al. An instrument to measure functional status outcomes for disorders of excessive sleepiness. Sleep. 835-843.