Logistic regression is used frequently in cohort studies and clinical
trials. When the incidence of an outcome of interest is common in the study
population (>10%), the adjusted odds ratio derived from the logistic regression
can no longer approximate the risk ratio. The more frequent the outcome, the
more the odds ratio overestimates the risk ratio when it is more than 1 or
underestimates it when it is less than 1. We propose a simple method to approximate
a risk ratio from the adjusted odds ratio and derive an estimate of an association
or treatment effect that better represents the true relative risk.
Relative risk has become one of the standard measures in biomedical
research. It usually means the multiple of risk of the outcome in one group
compared with another group and is expressed as the risk ratio in cohort studies
and clinical trials. When the risk ratio cannot be obtained directly (such
as in a case-control study), the odds ratio is calculated and often interpreted
as if it were the risk ratio. Consequently, the term relative
risk is commonly used to refer to either the risk ratio or the odds ratio. However,
only under certain conditions does the odds ratio approximate the risk ratio. Figure 1 shows that when the incidence of
an outcome of interest in the study population is low (<10%), the odds
ratio is close to the risk ratio. However, the more frequent the outcome becomes,
the more the odds ratio will overestimate the risk ratio when it is more than
1 or underestimate the risk ratio when it is less than 1.
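The pattern described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `odds_ratio_from_risk_ratio` and the chosen incidences are illustrative assumptions. It computes the odds ratio implied by a given risk ratio and a given incidence in the nonexposed group, using only the definitions of the two measures.

```python
def odds_ratio_from_risk_ratio(rr: float, p0: float) -> float:
    """Odds ratio implied by risk ratio `rr` when the nonexposed incidence is `p0`.

    Hypothetical helper for illustration; p0 and rr * p0 must be valid risks.
    """
    p1 = rr * p0  # incidence in the exposed group
    if not 0 < p0 < 1 or not 0 < p1 < 1:
        raise ValueError("p0 and rr * p0 must both lie strictly between 0 and 1")
    odds_exposed = p1 / (1 - p1)
    odds_nonexposed = p0 / (1 - p0)
    return odds_exposed / odds_nonexposed


if __name__ == "__main__":
    # Illustrative incidences: rare (1%), borderline (10%), common (30%).
    for p0 in (0.01, 0.10, 0.30):
        for rr in (0.5, 2.0):
            odds_ratio = odds_ratio_from_risk_ratio(rr, p0)
            print(f"P0={p0:.2f}  RR={rr:.1f}  OR={odds_ratio:.2f}")
```

With a nonexposed incidence of 1%, a risk ratio of 2 corresponds to an odds ratio of about 2.02; with an incidence of 30%, the same risk ratio corresponds to an odds ratio of 3.5, and a risk ratio of 0.5 corresponds to an odds ratio of about 0.41.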
Logistic regression is a widely used technique to adjust
for confounders, not only in case-control studies but also in cohort studies.1 However, logistic regression yields an odds ratio
rather than a risk ratio, even in a cohort study. Under the same rule, when
the outcome of interest is common in the study population (though it could
be rare in the general population), the adjusted odds ratio from the logistic
regression may exaggerate a risk association or a treatment effect. For instance,
a previous study assessed the performance of neonatal units in Hospital A
and Hospital B by comparing neonatal mortality in very low birthweight neonates
between these 2 hospitals.2 At first glance,
Hospital A had a lower mortality rate than Hospital B (18% vs 24%; crude risk ratio,
18%/24% = 0.75). However, after adjusting for clinical variables and initial
disease severity using logistic regression, the adjusted odds ratio of Hospital
A vs Hospital B was 3.27 (95% confidence interval, 1.35-7.92). Can one therefore
conclude that neonates with very low birthweight in Hospital A had 3 times
the risk of death compared with those in Hospital B? Probably not, because the outcome
(neonatal death) was common in this study population. To provide a measure
that more accurately represents the concept of relative risk, correction of
the odds ratio may be desirable.
A modified logistic regression with special macro functions has been
developed to address this issue.3 However,
it is mathematically complex and uses a General Linear Interactive Modeling
System (Numerical Algorithms Group, Oxford, England). Consequently, this method
is rarely used. Another alternative is to use the Mantel-Haenszel method,4 which can adjust for 1 or 2 confounders and still
provide a risk ratio in a cohort study. However, this method becomes inefficient
when several factors, especially continuous variables, are being adjusted
for simultaneously. We herein propose an easy approximation with a simple
formula that can be applied not only in binary analysis5
but also in multivariate analysis.
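In a cohort study, let P0 indicate the incidence of the outcome
of interest in the nonexposed group and P1 the incidence in the exposed group;
OR, the odds ratio; and RR, the risk ratio. By definition,

$$\mathrm{OR} = \frac{P_1/(1-P_1)}{P_0/(1-P_0)};$$

thus,

$$\frac{P_1}{P_0} = \frac{\mathrm{OR}}{(1-P_0) + (P_0 \times \mathrm{OR})}.$$

Since RR = P1/P0, the corrected risk ratio is

$$\mathrm{RR} = \frac{\mathrm{OR}}{(1-P_0) + (P_0 \times \mathrm{OR})}.$$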
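The algebra between the definition of the odds ratio and the expression for P1/P0 can be written out step by step. This is only a sketch of the intermediate steps, using the same symbols and no additional assumptions:

```latex
\begin{align*}
\mathrm{OR} &= \frac{P_1\,(1-P_0)}{P_0\,(1-P_1)} \\
\mathrm{OR}\, P_0\,(1-P_1) &= P_1\,(1-P_0) \\
\mathrm{OR}\, P_0 &= P_1\,(1-P_0) + \mathrm{OR}\, P_0\, P_1
                   = P_1\left[(1-P_0) + (P_0 \times \mathrm{OR})\right] \\
\frac{P_1}{P_0} &= \frac{\mathrm{OR}}{(1-P_0) + (P_0 \times \mathrm{OR})}
\end{align*}
```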
We can use this formula to correct the adjusted odds ratio obtained from logistic regression and derive
an estimate of an association or treatment effect that better represents the
true relative risk. It can also be used to correct the lower and upper limits
of the confidence interval by applying this formula to the lower and upper
confidence limits of the adjusted odds ratio. In the above example, after
the odds ratio is corrected (where OR=3.27 and P0=0.24), the risk
ratio becomes 2.12 (95% confidence interval, 1.25-2.98), ie, very low birthweight
neonates in Hospital A had approximately twice the risk of neonatal death compared
with those in Hospital B.
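The arithmetic of the example above can be reproduced in a few lines. This is a minimal sketch: the function name `corrected_risk_ratio` is a hypothetical label, and the inputs are the adjusted odds ratio, its confidence limits, and the incidence in the reference group (Hospital B) as reported in the text.

```python
def corrected_risk_ratio(odds_ratio: float, p0: float) -> float:
    """Apply RR = OR / [(1 - P0) + (P0 x OR)].

    `odds_ratio` is the (adjusted) odds ratio and `p0` the incidence of the
    outcome in the nonexposed (reference) group. Hypothetical helper name.
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    if not 0 <= p0 <= 1:
        raise ValueError("p0 must lie between 0 and 1")
    return odds_ratio / ((1 - p0) + p0 * odds_ratio)


# Values from the neonatal mortality example: OR = 3.27 (1.35-7.92), P0 = 0.24.
p0 = 0.24
rr = corrected_risk_ratio(3.27, p0)
lower = corrected_risk_ratio(1.35, p0)  # corrected lower confidence limit
upper = corrected_risk_ratio(7.92, p0)  # corrected upper confidence limit

print(f"Corrected RR = {rr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
# Corrected RR = 2.12 (95% CI, 1.25-2.98)
```

When P0 is close to 0 the denominator is close to 1, so the corrected value stays close to the odds ratio; the correction grows as the outcome becomes more common.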
To examine the validity of this correction method in various scenarios,
we simulated a series of hypothetical cohorts based on predetermined risk
ratios (called true RR). Each cohort consists of 1000 subjects with 1 binary
outcome (0,1), 1 exposure variable (0,1), and 2 confounders. Both confounders
have 3 levels (1,2,3). The true risk ratio is kept constant across strata
of the confounders. As expected, with an increase in incidence of outcome
and risk ratio, the discrepancy between risk ratio and odds ratio increases
(Table 1). The corrected risk
ratio, which is calculated based on the odds ratio from logistic regression
after having adjusted for the confounders, is very close to the true risk
ratio. This procedure can be applied to both unmatched and matched cohort
studies. It can further be used in cross-sectional studies, in which the prevalence
ratio rather than the risk ratio will be generated. It enables us to obtain
a corrected prevalence ratio very close to the one obtained from a complex
statistical model6 (data not shown).
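A much simplified version of such a simulation can be sketched as follows. It is not the simulation reported here: it omits the 2 confounders and the logistic regression step, assumes that half of the subjects are exposed, and uses hypothetical names (`simulate_cohort`, `summarize`) and an arbitrary random seed. It shows only how the crude odds ratio departs from the risk ratio in a cohort of 1000 subjects with a common outcome, and how the correction formula maps the odds ratio back to the risk ratio scale.

```python
import random


def corrected_risk_ratio(odds_ratio, p0):
    """RR = OR / [(1 - P0) + (P0 x OR)]."""
    return odds_ratio / ((1 - p0) + p0 * odds_ratio)


def simulate_cohort(n, p0, true_rr, seed=0):
    """Return counts[exposed][outcome] for a simulated cohort of n subjects.

    Assumption for illustration: exposure is assigned with probability 0.5.
    """
    rng = random.Random(seed)
    counts = [[0, 0], [0, 0]]
    for _ in range(n):
        exposed = int(rng.random() < 0.5)
        risk = p0 * true_rr if exposed else p0
        outcome = int(rng.random() < risk)
        counts[exposed][outcome] += 1
    return counts


def summarize(counts):
    """Return (observed P0, crude risk ratio, crude odds ratio)."""
    (no_event_0, event_0), (no_event_1, event_1) = counts
    risk_0 = event_0 / (event_0 + no_event_0)
    risk_1 = event_1 / (event_1 + no_event_1)
    crude_rr = risk_1 / risk_0
    crude_or = (event_1 * no_event_0) / (event_0 * no_event_1)
    return risk_0, crude_rr, crude_or


counts = simulate_cohort(n=1000, p0=0.30, true_rr=2.0, seed=1)
observed_p0, crude_rr, crude_or = summarize(counts)
corrected = corrected_risk_ratio(crude_or, observed_p0)
print(f"crude RR={crude_rr:.2f}  crude OR={crude_or:.2f}  corrected RR={corrected:.2f}")
```

In this unadjusted setting the corrected value equals the crude risk ratio exactly, because the formula is the algebraic inverse of the odds ratio definition. The sketch therefore says nothing about how well the correction performs after adjustment for confounders, which is what Table 1 addresses.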
Because of the differences in underlying assumptions between the Mantel-Haenszel
risk ratio and the logistic regression odds ratio, some discrepancy between the
Mantel-Haenszel risk ratio and the corrected risk ratio is expected (detailed
discussion of which is beyond the scope of this work). More importantly, the
validity of the corrected risk ratio relies entirely on the appropriateness
of the logistic regression model, ie, only when logistic regression yields an
appropriate odds ratio will the correction procedure provide a better estimate.
Therefore, in a cohort study, whenever feasible, the Mantel-Haenszel estimate
should be used.
In summary, in a cohort study, if the incidence of the outcome is more than
10% and the odds ratio is more than 2.5 or less than 0.5, correction of the
odds ratio may be desirable to more appropriately interpret the magnitude
of an association.
1. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York, NY: John Wiley & Sons Inc; 1989.
2. Tarnow-Mordi W, Ogston S, Wilkinson AR, et al. Predicting death from initial disease severity in very low birthweight infants: a method for comparing the performance of neonatal units. BMJ. 1990;300:1611-1614.
3. Wacholder S. Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol. 1986;123:174-184.
4. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959;22:719-748.
5. Sinclair JC, Bracken MB. Clinically useful measures of effect in binary analyses of randomized trials. J Clin Epidemiol. 1994;47:881-889.
6. Lee J. Odds ratio or relative risk for cross-sectional data? Int J Epidemiol. 1994;23:201-202.