Mortality differences between weight categories have sometimes been described as controversial.1 The appearance of controversy may arise in part because studies of body mass index (BMI; calculated as weight in kilograms divided by height in meters squared) and mortality have used a wide variety of BMI categories and reference categories. This variety can make findings appear more variable than they do when standard categories are used and can make it difficult to compare and synthesize studies. A report2 in 1997 from the World Health Organization Consultation on Obesity defined BMI-based categories of underweight, normal weight, preobesity, and obesity. The same cutoff BMI values were adopted by the National Heart, Lung, and Blood Institute in 1998.3
In this study, we used the National Heart, Lung, and Blood Institute's terminology with categories of underweight (BMI of <18.5), normal weight (BMI of 18.5-<25), overweight (BMI of 25-<30), and obesity (BMI of ≥30). Grade 1 obesity was defined as a BMI of 30 to less than 35; grade 2 obesity, a BMI of 35 to less than 40; and grade 3 obesity, a BMI of 40 or greater. These standard categories have been increasingly used in published studies of BMI and mortality, but the literature reporting these results has not been systematically reviewed.
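The category boundaries above can be expressed as a short Python sketch. The function names are illustrative and are not taken from the study; the cutoffs are those stated in the text, with each lower bound inclusive and each upper bound exclusive.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the standard categories used in this study."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    if value < 35:
        return "grade 1 obesity"
    if value < 40:
        return "grade 2 obesity"
    return "grade 3 obesity"
```

For example, a person weighing 70 kg with a height of 1.75 m has a BMI of about 22.9 and falls in the normal weight category.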
The purpose of this study was to compile and summarize published analyses of BMI and all-cause mortality that provide hazard ratios (HRs) for standard BMI categories. We followed the guidelines in the Meta-analysis of Observational Studies in Epidemiology (MOOSE) statement4 for reporting of systematic reviews.
Articles were identified by searches of PubMed and EMBASE through September 30, 2012. Details of search strategies appear in eTable 1. No language restrictions were applied. All articles were reviewed for inclusion by 1 reviewer (K.M.F.). An independent review of all articles was conducted by a second set of reviewers (B.K.K., H.O., and B.I.G.). The articles were reviewed to identify those that used standard BMI categories in prospective, observational cohort studies of all-cause mortality among adults with BMI measured or reported at baseline. Studies that addressed these relationships only in adolescents, only in institutional settings, or only among those with specific medical conditions or undergoing specific medical procedures were excluded. We included multiple articles from a given data set only when there was little overlap between articles by sex, age group, or some other factor.
In some cases, authors used standard BMI categories for overweight and obesity but had used a slightly broader reference BMI category of less than 25 or a slightly narrower reference BMI category of 20 to less than 25 rather than the standard normal BMI category of 18.5 to less than 25. We included these articles but have noted the cases in which the reference BMI category was less than 25 or 20 to less than 25. We classified studies that included a mix of self-reported and measured weight and height according to the preponderant type.
Abstracted items included sample size, number of deaths, age at baseline, length of follow-up, HRs and 95% confidence intervals, sex, age, type of weight and height data (measured or self-reported), country or region, source of study sample, adjustment factors, exclusion and inclusion criteria, and sensitivity analyses. Authors of screened articles were queried for additional information when necessary. In studies that only presented results stratified by smoking or health condition, we selected results for nonsmokers or never smokers or for those without the health condition. We selected the most complex model available for the full sample and used a variety of sensitivity analyses to address issues of possible overadjustment or underadjustment.
We categorized HRs into 2 age groupings either as limited solely to people aged 65 years or older or as a mixed-age category (eg, aged 25-64 years or 40-80 years). We classified articles as adequately adjusted, possibly overadjusted, or possibly underadjusted. We categorized HRs by adjustment level, by whether the data were measured or self-reported, by whether the analysis was performed separately for men and women or for both sexes combined, and by region (North America, Europe, and other).
We used a random-effects model5 to summarize the results overall and within subgroups and considered heterogeneity statistically significant at a P value of less than .05. We calculated the quantity I² to describe the degree of heterogeneity, with values of 25%, 50%, and 75% considered low, moderate, and high, respectively.6 We also used a sequential approach similar to that described by Patsopoulos et al7 to assess consistency of findings when heterogeneity was reduced. All analyses were performed with SAS version 9.3 (SAS Institute Inc).
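As an illustration of this kind of summary, the following Python sketch pools HRs on the log scale and reports the heterogeneity statistic. It assumes the DerSimonian-Laird moment estimator of between-study variance, which is one common random-effects method; the text states only that a random-effects model was used, and the analyses themselves were run in SAS, so this is not the authors' code. Standard errors are recovered from the reported 95% confidence intervals, and all function and key names are illustrative.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% confidence interval


def log_hr_and_se(hr, ci_lower, ci_upper):
    """Return (log HR, standard error of log HR) recovered from a 95% CI."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z95)
    return math.log(hr), se


def random_effects_summary(studies):
    """Pool studies given as (hr, ci_lower, ci_upper) tuples.

    Assumes the DerSimonian-Laird estimator for the between-study variance.
    Returns the summary HR, its 95% CI, tau-squared, and I-squared in percent.
    """
    effects = [log_hr_and_se(*s) for s in studies]
    y = [log_hr for log_hr, _ in effects]
    w = [1 / se ** 2 for _, se in effects]  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran Q and its degrees of freedom
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I-squared: share of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau-squared to each within-study variance
    w_re = [1 / (1 / wi + tau2) for wi in w]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1 / sum(w_re))

    return {
        "hr": math.exp(mu),
        "ci_lower": math.exp(mu - Z95 * se_mu),
        "ci_upper": math.exp(mu + Z95 * se_mu),
        "tau2": tau2,
        "i2": i2,
    }
```

Calling `random_effects_summary` on a list of `(hr, ci_lower, ci_upper)` tuples for one BMI category returns the pooled estimate together with I², which can then be read against the 25%, 50%, and 75% benchmarks.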
For sensitivity analyses, we examined the effects on HRs of incorporating results from a recent large pooled study for overweight.8 For comparative purposes, we also constructed approximate HRs relative to normal weight from several recent large studies9-14 that had used finer BMI groupings and thus did not meet our inclusion criteria. To do this, we averaged HRs from the finer BMI groupings over groups corresponding to the standard BMI categories, weighting the HRs by the number of deaths.
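The deaths-weighted averaging described above can be sketched as follows. The sketch assumes a weighted arithmetic mean of the HRs themselves, which is the most literal reading of the text; the function name and the example values are hypothetical and are not taken from any of the cited studies.

```python
def approximate_hr(groups):
    """Deaths-weighted average HR across finer BMI groups.

    groups: list of (hr, deaths) pairs for the finer BMI groups that fall
    within one standard BMI category. Assumes a weighted arithmetic mean.
    """
    total_deaths = sum(deaths for _, deaths in groups)
    if total_deaths <= 0:
        raise ValueError("total number of deaths must be positive")
    return sum(hr * deaths for hr, deaths in groups) / total_deaths


# Hypothetical example: two finer groups spanning one standard category
example = approximate_hr([(1.0, 100), (1.2, 300)])
```

The result is a point estimate only; the averaging step carries no information about the precision of the combined HR.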
As shown in the eFigure, the primary search strategy for PubMed yielded 4142 articles, of which 128 met our criteria. A second PubMed search yielded 2892 additional articles, of which 13 met our criteria. A search of EMBASE yielded 2 additional eligible articles. In total, 143 eligible studies were identified.
After exclusion of 41 articles with overlapping data sets and of 5 articles lacking sufficient information, 97 articles remained for analysis, all of which had been identified through systematic search procedures. The selected studies are shown in eTable 2 with additional information in eTable 3 regarding exclusions and adjustment factors. Regions of origin of participants included the United States or Canada (n = 41 studies), Europe (n = 37), Australia (n = 7), China or Taiwan (n = 4), Japan (n = 2), Brazil (n = 2), Israel (n = 2), India (n = 1), and Mexico (n = 1). The tabulated studies included more than 2.88 million participants and more than 270 000 deaths.
Not all studies reported the specific categories of interest. There were 93 studies for the BMI category of 25 to less than 30 (overweight), 61 studies for the BMI category of 30 or greater (obesity), and 32 studies for the BMI categories of 30 to less than 35 (grade 1 obesity) and 35 and greater (grades 2 and 3 obesity).
We considered the results adequately adjusted if they were adjusted for age, sex, and smoking and not adjusted for factors in the causal pathway between obesity and mortality, or if they had reported or demonstrated that adjustments or exclusions to avoid bias had shown little effect on their findings. A number of studies (for example15-29) reported qualitatively that such adjustments had little or no effect without showing quantitative details.
Other studies (for example30-32) demonstrated little effect through a series of sensitivity analyses. We considered the available full sample results from such studies to also be adequately adjusted. Otherwise, we considered studies as possibly overadjusted if they adjusted for factors such as hypertension that are considered to be in the causal pathway between obesity and mortality or as possibly underadjusted if they did not adjust for age, sex, and smoking. We classified 53 studies as adequately adjusted, 34 studies as possibly overadjusted, and 10 studies as possibly underadjusted.
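The classification rules can be summarized in a brief Python sketch. Several details here are assumptions rather than statements from the text: the set of causal-pathway factors lists only hypertension (the one example named), overadjustment is checked before underadjustment when both could apply, and sex-specific analyses are assumed to be recorded as adjusted for sex. The names are illustrative.

```python
REQUIRED_FACTORS = {"age", "sex", "smoking"}
# Assumption: only hypertension is named in the text as a pathway factor;
# a real classification would need a fuller, study-specific list.
PATHWAY_FACTORS = {"hypertension"}


def adjustment_level(adjusted_for, little_effect_shown=False):
    """Classify a study's adjustment level.

    adjusted_for: iterable of factor names the study adjusted for.
    little_effect_shown: True if the study reported or demonstrated that
    adjustments or exclusions to avoid bias had little effect.
    """
    factors = {name.lower() for name in adjusted_for}
    if little_effect_shown:
        return "adequately adjusted"
    # Assumption: overadjustment takes precedence over underadjustment
    if factors & PATHWAY_FACTORS:
        return "possibly overadjusted"
    if not REQUIRED_FACTORS <= factors:
        return "possibly underadjusted"
    return "adequately adjusted"
```

Under these assumptions, a study adjusted for age, sex, smoking, and hypertension would be labeled possibly overadjusted unless it had shown that such adjustments had little effect.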
Summary HRs are shown in Table 1 overall, by age group, and by whether data were measured or self-reported. The summary HRs were 0.94 (95% CI, 0.91-0.96) for overweight, 1.18 (95% CI, 1.12-1.25) for obesity (all grades), 0.95 (95% CI, 0.88-1.01) for grade 1 obesity, and 1.29 (95% CI, 1.18-1.41) for grades 2 and 3 obesity. Plots of HRs for these categories are shown in Figures 1 through 8.33-110 Additional details are shown in eTables 4-7, which show summary HRs by age, sex, region, and measurement type.
Table 1. Summary Random-Effects Hazard Ratios (HRs) of All-Cause Mortality for Overweight and Obesity Relative to Normal Weight
Results for studies that we considered adequately adjusted are shown in Table 2. This more select group showed the same general pattern of overweight associated with reduced mortality, grade 1 obesity not significantly associated with increased mortality, and the higher grades of obesity significantly associated with increased mortality. The summary HRs were 0.94 (95% CI, 0.90-0.97) for overweight, 1.21 (95% CI, 1.12-1.31) for obesity (all grades), 0.97 (95% CI, 0.90-1.04) for grade 1 obesity, and 1.34 (95% CI, 1.21-1.47) for grades 2 and 3 obesity. For overweight, the results from possibly overadjusted studies and from adequately adjusted studies were similar (eTable 8). However, for obesity, the possibly overadjusted studies tended to have lower HRs than the adequately adjusted studies.
Table 2. Summary Hazard Ratios (HRs) of All-Cause Mortality for Overweight and Obesity Relative to Normal Weight From Studies Considered Adequately Adjusted
Between-study heterogeneity was statistically significant in most categories. According to Higgins et al,6 this test may have “excessive power when there are many studies, especially when those studies are large.” Heterogeneity (as indicated by the value of I²) was lower for studies with measured height and weight and for studies limited to those aged 65 years or older. The value of I² was reduced by limiting findings to adequately adjusted studies with measured data.
Higher levels of heterogeneity, however, do not necessarily lead to dissimilar results that would affect the conclusions. For example, the summary HR for overweight for older ages (≥65 years) was identical (0.90) for measured height and weight (I² = 31.2%) and for self-reported height and weight (I² = 71.0%). For adequately adjusted studies, we sequentially excluded HRs within age and measurement categories as needed to reduce the I² value to below 25%. Within the 4 age-measurement groups, this required exclusion of 9% to 22% of studies for measured data and 14% to 39% of studies for self-reported data.
For overweight, excluding these studies led to a uniformly lower HR of 0.89 for both age groups and for both measured and self-reported data. For obesity, the effects of excluding these studies were more variable and led to an overall increase of the summary HR from 1.21 to 1.24. Corresponding values were from 0.97 to 1.05 (neither significantly different from 1) for grade 1 obesity and from 1.34 to 1.39 for grades 2 and 3 obesity. Thus, heterogeneity appeared to have had little effect on the conclusions of the meta-analysis.
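The sequential exclusion step can be sketched in Python as a greedy procedure. This is one plausible reading of an approach "similar to" that of Patsopoulos et al and is an assumption: at each step it drops the study whose removal yields the lowest remaining I², and it stops once I² falls below the threshold. Inputs are (log HR, standard error) pairs, and the names are illustrative.

```python
def i_squared(effects):
    """I-squared (percent) for a list of (log_hr, standard_error) pairs."""
    k = len(effects)
    if k < 2:
        return 0.0
    weights = [1 / se ** 2 for _, se in effects]
    mean = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - mean) ** 2 for w, (y, _) in zip(weights, effects))
    return max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0


def sequential_exclusion(effects, threshold=25.0):
    """Greedily drop studies until I-squared is below the threshold.

    Returns (kept, dropped). Assumes that the study removed at each step is
    the one whose removal gives the lowest remaining I-squared.
    """
    kept = list(effects)
    dropped = []
    while len(kept) > 2 and i_squared(kept) >= threshold:
        best = min(
            range(len(kept)),
            key=lambda i: i_squared(kept[:i] + kept[i + 1:]),
        )
        dropped.append(kept.pop(best))
    return kept, dropped
```

The retained studies can then be pooled again, and the resulting summary HR compared with the summary HR from the full set, as in the comparisons reported above.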
The excluded studies varied across outcome categories; inspection of the excluded studies did not suggest specific reasons why they had contributed to heterogeneity. Taken together, the findings suggest that contributors to heterogeneity across all studies include adjustment levels, type of measurement data, and age group. Some degree of heterogeneity may also result from the variation in BMI levels within the broad BMI categories used, as well as from variations in the type of cohorts studied.
For the overweight category only, we also repeated analyses including the results from a study that pooled data from 19 cohorts. After excluding ever smokers and those with a history of cancer or heart disease, Berrington de Gonzalez et al8 found an HR of 1.11 (95% CI, 1.07-1.16) for men and 1.13 (95% CI, 1.09-1.16) for women with a BMI of 25 to 29.9 relative to those with a BMI of 20 to less than 25 (Amy Berrington de Gonzalez, DPhil, written communication, June 16, 2011).
Our analysis included published studies using 6 of the same cohorts, representing about 60% of the original Berrington de Gonzalez et al8 sample. Excluding those studies from our analysis and substituting the above results from Berrington de Gonzalez et al did not change the summary HR for overweight.
We also repeated the analyses after excluding the studies that had used slightly different reference categories. Excluding studies with a reference BMI category of less than 25 had no effect on the HRs for overweight and decreased the HR for obesity by 0.02. Excluding studies with a reference BMI category of 20 to less than 25 increased the HR for overweight by 0.005 and had no effect on the HR for obesity.
Beyond these slight differences in the reference category, studies that used nonstandard BMI categories were excluded. However, we were able to construct approximate HRs from some recent large studies that had used nonstandard BMI categories (eTable 9). This approach does not allow for construction of appropriate standard errors or confidence intervals. The approximate HRs were consistent with our findings from our analyses of individual studies, showing similar minor variation.
This study presents comprehensive estimates (derived from a systematic review) of the association of all-cause mortality in adults with current standard BMI categories used in the United States and internationally. Estimates of the relative mortality risks associated with normal weight, overweight, and obesity may help to inform decision making in the clinical setting.
The most recent data from the United States show that almost 40% of adult men and almost 30% of adult women fall into the overweight category with a BMI of 25 to less than 30.111 Comparable figures for Canada are 44% of men and 30% of women112 and for England are 42% of men and 32% of women.113
According to the results presented herein, overweight (defined as a BMI of 25-<30) is associated with significantly lower mortality overall relative to the normal weight category with an overall summary HR of 0.94. For overweight, 75% of HRs with measured weight and height and 67% of HRs with self-reported weight and height were below 1. These results are broadly consistent with 2 previous meta-analyses114,115 that used standard categories. In a pooled analysis of 26 observational studies, McGee et al114 found summary relative risks of all-cause mortality for overweight of 0.97 (95% CI, 0.92-1.01) for men and 0.97 (95% CI, 0.93-0.99) for women relative to normal weight.
Recent estimates for the prevalence of obesity (defined as a BMI of ≥ 30) among adults are 36% in the United States,111 24% in Canada,112 and 26% in England.113 Obesity was associated with significantly higher all-cause mortality relative to the normal weight BMI category with an overall summary HR of 1.18. Corresponding estimates for obesity from McGee et al114 were 1.20 (95% CI, 1.12-1.29) for men and 1.28 (95% CI, 1.18-1.37) for women. In the United States and Canada, more than half of those who are obese fall into the grade 1 category (BMI of 30-<35). We did not find significant excess mortality associated with grade 1 obesity, suggesting that the main contribution to excess mortality in obesity comes from higher levels of BMI.
Our findings are consistent with observations of lower mortality among overweight and moderately obese patients.116-119 Possible explanations have included earlier presentation of heavier patients,120 greater likelihood of receiving optimal medical treatment,121-123 cardioprotective metabolic effects of increased body fat,124,125 and benefits of higher metabolic reserves.118
The results presented herein provide little support for the suggestion126 that smoking and preexisting illness are important causes of bias. Most studies that addressed the issue found that adjustments or exclusions for these factors had little or no effect. However, overadjustment for factors in the causal pathway appears to decrease HRs for obesity but not for overweight.
An important source of bias appears to be the errors in self-reported weight and height data. Such errors have been shown to vary by age, sex, race, measured values, and data collection method.127,128 The systematic error in self-reported data, as compared with measured data, can result in substantial misclassification of individuals into incorrect BMI categories,129 create errors that are difficult to correct,130 and lead to upward bias in the estimates.131 We found a generally lower summary HR and less heterogeneity in studies using measured data than in studies using self-reported data. The differences were more pronounced in analyses stratified by sex than in analyses that combined both men and women. Because the errors in self-reported data tend to differ by sex, there may be an offsetting effect when analyses combine men and women.
Publication bias can potentially affect systematic reviews. Studies that find little or no association of overweight or obesity with mortality risk sometimes only mention these results in passing without providing details. For example, He et al132 did not include terms for overweight or obesity in their models, reporting only that overweight and obesity were not associated with increased mortality. Studies of BMI and mortality sometimes selectively report analyses of certain subgroups, an approach that can lead to bias.133,134
The study by Berrington de Gonzalez et al8 and the overlapping study by Adams et al1 found results similar to ours in their full sample but based their final results on a subgroup with less than half of their original sample, arguing that this subgroup provided more valid results than the full sample. The validity of this assertion has not been demonstrated, and such large-scale exclusions may introduce additional bias, particularly when using self-reported data. Other studies15,16,18,19,22-25,27,28,30-32,135 have shown little or no effect of similar exclusions.
Strengths and Limitations
One of the strengths of our study is the large sample size and number of studies included, which make the findings robust to the effects of any single study. Additionally, we used a comprehensive search strategy and prespecified standard categories. Although standard BMI categories were developed by the World Health Organization and by the National Institutes of Health in the 1990s, not all studies of BMI and mortality use standard categories as part of their analyses. The combination of flexible categorization and selective reporting can lead to wide variations in HRs even within a single data set.136 Categorization of BMI has both advantages and disadvantages.137,138 However, the use of predefined standard groupings avoids issues of post hoc and ad hoc selection of categories and reference categories.
Our study also has limitations. It addresses only all-cause mortality and not morbidity or cause-specific mortality. It addresses only findings related to BMI and not to other aspects of body composition such as visceral fat or fat distribution. Our census of these articles may be incomplete. Our coding and data abstraction procedures may have introduced errors. Our information on age was limited. Because of publication bias and selective reporting, null or negative HRs may have been less likely to be published. Geographical coverage was limited.
Relative to normal weight, obesity (all grades) and grades 2 and 3 obesity were both associated with significantly higher all-cause mortality. Grade 1 obesity was not associated with higher mortality, suggesting that the excess mortality in obesity may predominantly be due to elevated mortality at higher BMI levels. Overweight was associated with significantly lower all-cause mortality. The use of predefined standard BMI groupings can facilitate between-study comparisons.