Evidence reviews for the US Preventive Services Task Force (USPSTF) use an analytic framework to visually display the key questions that the review will address to allow the USPSTF to evaluate the effectiveness and safety of a preventive service. The questions are depicted by linkages that relate to interventions and outcomes. A dashed line is used to reflect the natural progression of disease between an intermediate outcome and a health outcome. Further details are available from the USPSTF procedure manual.7
FDA indicates US Food and Drug Administration; HDI, Human Development Index; KQ, key question.
aIncluding hand search of Nelson et al,1 2010 Evidence Report,2 Crandall et al,9 and Marques et al.8
bBecause of overlap in studies across populations and results sections, only article counts are reported. Citation counts by KQ are not unique; studies may contribute to multiple KQs.
cNot included in individual study counts at the bottom level of the diagram.
dKQ1 and KQ3: 1 study (1 article).
eKQ2a: Accuracy of clinical risk assessment tools for identifying osteoporosis, 38 studies (41 articles); accuracy of bone measurement tests used to identify low bone mass and osteoporosis, 11 studies (11 articles); accuracy of fracture risk prediction instruments, 5 systematic reviews supplemented by 13 studies; accuracy of bone measurement tests used to predict fracture, 23 studies (24 articles); calibration of fracture risk prediction instruments, 14 studies (14 articles); reclassification risk, 10 studies (10 articles).
fKQ2b: 2 studies (2 articles).
gKQ4a: Alendronate, 7 studies (7 articles); zoledronic acid, 2 studies (2 articles); risedronate, 4 studies (4 articles); etidronate, 2 studies (2 articles); ibandronate, 0 studies; raloxifene, 1 study (2 articles); estrogen, 0 studies; denosumab, 4 studies (5 articles); parathyroid hormone, 2 studies (2 articles). KQ4b: 4 studies (5 articles).
hAlendronate, 16 studies (16 articles); zoledronic acid, 4 studies (4 articles); risedronate, 6 studies (6 articles); etidronate, 2 studies (2 articles); ibandronate, 7 studies (7 articles); raloxifene, 6 studies (12 articles); estrogen, 0 studies; denosumab, 4 studies (5 articles); parathyroid hormone, 2 studies (2 articles).
Viswanathan M, Reddy S, Berkman N, et al. Screening to Prevent Osteoporotic Fractures: Updated Evidence Report and Systematic Review for the US Preventive Services Task Force. JAMA. 2018;319(24):2532–2551. doi:10.1001/jama.2018.6537
Importance
Osteoporotic fractures cause significant morbidity and mortality.
Objective
To update the evidence on screening and treatment to prevent osteoporotic fractures for the US Preventive Services Task Force.
Data Sources
PubMed, the Cochrane Library, EMBASE, and trial registries (November 1, 2009, through October 1, 2016) and surveillance of the literature (through March 23, 2018); bibliographies from articles.
Study Selection
Adults 40 years and older; screening cohorts without prevalent low-trauma fractures or treatment cohorts with increased fracture risk; studies assessing screening, bone measurement tests or clinical risk assessments, pharmacologic treatment.
Data Extraction and Synthesis
Dual, independent review of titles/abstracts and full-text articles; study quality rating; random-effects meta-analysis.
Main Outcomes and Measures
Incident fractures and related morbidity and mortality, diagnostic and predictive accuracy, harms of screening or treatment.
Results
One hundred sixty-eight fair- or good-quality articles were included. One randomized clinical trial (RCT) (n = 12 483) comparing screening with no screening reported fewer hip fractures (2.6% vs 3.5%; hazard ratio [HR], 0.72 [95% CI, 0.59-0.89]) but no other statistically significant benefits or harms. The accuracy of bone measurement tests to identify osteoporosis varied (area under the curve [AUC], 0.32-0.89). The pooled accuracy of clinical risk assessments for identifying osteoporosis ranged from AUC of 0.65 to 0.76 in women and from 0.76 to 0.80 in men; the accuracy for predicting fractures was similar. For women, bisphosphonates, parathyroid hormone, raloxifene, and denosumab were associated with a lower risk of vertebral fractures (9 trials [n = 23 690]; relative risks [RRs] from 0.32-0.64). Bisphosphonates (8 RCTs [n = 16 438]; pooled RR, 0.84 [95% CI, 0.76-0.92]) and denosumab (1 RCT [n = 7868]; RR, 0.80 [95% CI, 0.67-0.95]) were associated with a lower risk of nonvertebral fractures. Denosumab reduced the risk of hip fracture (1 RCT [n = 7868]; RR, 0.60 [95% CI, 0.37-0.97]), but bisphosphonates did not have a statistically significant association (3 RCTs [n = 8988]; pooled RR, 0.70 [95% CI, 0.44-1.11]). Evidence was limited for men: zoledronic acid reduced the risk of radiographic vertebral fractures (1 RCT [n = 1199]; RR, 0.33 [95% CI, 0.16-0.70]); no studies demonstrated reductions in clinical or hip fractures. Treatments were not consistently associated with reported harms other than deep vein thrombosis (raloxifene vs placebo; 3 RCTs [n = 5839]; RR, 2.14 [95% CI, 0.99-4.66]).
Conclusions and Relevance
In women, screening to prevent osteoporotic fractures may reduce hip fractures, and treatment reduced the risk of vertebral and nonvertebral fractures; there was not consistent evidence of treatment harms. The accuracy of bone measurement tests or clinical risk assessments for identifying osteoporosis or predicting fractures varied from very poor to good.
Screening to prevent osteoporotic fractures may reduce fracture-related morbidity and mortality.1-4 Screening involves clinical fracture risk assessment, bone measurement testing (eg, dual-energy x-ray absorptiometry [DXA]), or both. Pharmacologic treatments for osteoporosis inhibit osteoclastic bone resorption (antiresorptive agents) or stimulate osteoblastic new bone formation (anabolic agents).5
In 2011, the US Preventive Services Task Force (USPSTF) recommended screening for osteoporosis in women 65 years and older and in younger women whose fracture risk is equal to or greater than that of a 65-year-old white woman who has no additional risk factors (B recommendation).6 The USPSTF concluded that the evidence was insufficient to assess the balance of benefits and harms of screening in men.6 To inform an updated recommendation, the evidence about the benefits and harms of screening and treatment to prevent osteoporotic fractures in community-dwelling adults relevant to US primary care was reviewed.
Detailed methods, calibration and reclassification outcomes, evidence tables, sensitivity analyses, and contextual information are available in the full evidence report at https://www.uspreventiveservicestaskforce.org/Page/Document/UpdateSummaryFinal/osteoporosis-screening1. The analytic framework and key questions (KQs) that guided the review are shown in Figure 1.
PubMed, the Cochrane Library, and Embase were searched for English-language articles published from November 1, 2009, through October 1, 2016, with active surveillance through March 23, 2018. ClinicalTrials.gov, Drugs@FDA.gov, HSRProj, Cochrane Clinical Trials Registry, and the World Health Organization International Clinical Trials Registry Platform were also searched. To supplement systematic electronic searches (eMethods 1 in the Supplement), studies included in relevant existing systematic reviews1,8,9 and reference lists of pertinent articles, and studies suggested by reviewers, were reviewed.
Two investigators independently reviewed titles, abstracts, and full-text articles using prespecified inclusion criteria for each KQ (eTable 1 in the Supplement), with disagreements about inclusion resolved by discussion. For KQ1, KQ2, and KQ3 (benefits and harms of screening), studies for which the majority of participants were community-dwelling adults with no known low-trauma fractures or metabolic bone disease were included. For KQ4 and KQ5 (benefits and harms of treatment), studies were included if the majority of participants had an increased fracture risk.
Eligible screening tests included bone tests (eg, DXA, quantitative ultrasound) and clinical risk assessments for osteoporosis or fracture risk if externally validated and publicly available. Eligible treatments included US Food and Drug Administration (FDA)–approved pharmacotherapy (specifically, bisphosphonates, estrogen agonists/antagonists, estrogen- and/or progestin-based hormone therapy, parathyroid hormone, and RANK ligand inhibitors [eg, denosumab]). Eligible outcomes included diagnostic or predictive accuracy (as measured by area under the curve [AUC]), incident fractures, fracture-related morbidity or mortality, all-cause mortality, and harms.
Randomized clinical trials (RCTs) and systematic reviews were eligible for all KQs; observational study designs were also eligible for accuracy of screening (KQ2) and harms of screening and treatment (KQ3 and KQ5). Only studies published in English and conducted in countries categorized as “very high” by the 2015 Human Development Index were included.10
For each included study, 1 investigator extracted information about design, population, intervention, and outcomes, and a second investigator reviewed for completeness and accuracy. Two independent investigators assessed the quality of each study as good, fair, or poor, using predefined criteria developed by the USPSTF (eMethods 2 in the Supplement)7 and others for assessing the risk of bias of diagnostic tests,11 prognostic tests,12 trials,13 observational studies,14 and systematic reviews.11,15 Individual study quality ratings are provided in eTables 2 through 59 in the Supplement.
Findings were qualitatively synthesized for each KQ in tabular and narrative formats. Studies were included if they met all study selection criteria and were fair or good quality; this included studies from the prior review that informed the USPSTF 2011 recommendation that continued to meet the study selection criteria for this update. When at least 3 independent and similar RCTs were available,16 random-effects models using the inverse-variance weighted method of DerSimonian and Laird were used to estimate pooled AUCs or relative risks.17 Statistical heterogeneity was assessed using the I2 statistic.13 All quantitative analyses were conducted using OpenMetaAnalyst or Comprehensive Meta Analysis.18,19 The strength of evidence for each outcome was assessed based on the Agency for Healthcare Research and Quality Methods Guide for Effectiveness and Comparative Effectiveness Reviews,20 which specifies the assessment of study limitations, directness, consistency, precision, and reporting bias for each intervention comparison and major outcome of interest.
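The pooling approach named above can be illustrated with a minimal sketch of DerSimonian and Laird random-effects pooling together with the I2 statistic. This is not the code used by the review (which relied on OpenMetaAnalyst or Comprehensive Meta Analysis); the function name and the example inputs are hypothetical, and effects are assumed to be on an additive scale such as log relative risks.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool study effects (e.g. log relative risks) with a
    DerSimonian-Laird random-effects model.

    Returns (pooled effect, standard error, tau^2, I2 in percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]              # fixed-effect inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of between-study variance, truncated at 0
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: share of total variability attributed to heterogeneity, not chance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, i2

# Hypothetical log relative risks and their variances (not data from the review)
log_rr = [math.log(0.60), math.log(0.85), math.log(0.70)]
var = [0.04, 0.02, 0.05]
pooled, se, tau2, i2 = dersimonian_laird(log_rr, var)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled RR {math.exp(pooled):.2f} (95% CI, {low:.2f}-{high:.2f}); I2 = {i2:.1f}%")
```

When the studies agree closely, the between-study variance estimate is truncated to 0 and the result coincides with a fixed-effect inverse-variance estimate.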
A total of 168 articles of good or fair quality were included (Figure 2). Several cohorts of study participants contributed to multiple publications; as a result, the total number of participants cannot be calculated accurately.
Key Question 1. Does screening (clinical risk assessment, bone density measurement, or both) for osteoporotic fracture risk reduce fractures and fracture-related morbidity and mortality in adults?
The Screening for Osteoporosis in Older Women for the Prevention of Fracture (SCOOP) trial randomized 12 483 women aged 70 to 85 years in the United Kingdom to screening with the Fracture Risk Assessment Tool (FRAX) or usual care (details not reported).21 In this fair-quality trial, participants in the intervention group who were identified as high risk based on FRAX-generated 10-year hip fracture risk were invited to undergo DXA testing. The investigators recalculated the FRAX risk for those who undertook DXA screening and communicated the results to the participant’s general practitioner, who then offered treatment as appropriate.21
At 5 years’ follow-up, comparing the intervention group with usual care, no difference was reported for the primary outcome of any osteoporotic fracture (12.9% vs 13.6%; hazard ratio [HR], 0.94 [95% CI, 0.85-1.03]), for all clinical fractures (15.3% vs 16.0%; HR, 0.94 [95% CI, 0.86-1.03]), or for mortality (8.8% vs 8.4%; HR, 1.05 [95% CI, 0.93-1.19]). However, a statistically significant difference in hip fracture incidence was observed (2.6% vs 3.5%; HR, 0.72 [95% CI, 0.59-0.89]).
Key Question 2a. What is the accuracy and reliability of screening approaches to identify adults who are at increased risk for osteoporotic fracture?
Studies of tests to identify osteoporosis (as defined by bone mineral density [BMD] T-score ≤−2.5), predict osteoporotic fracture, or both, were included. The results below focus primarily on pooled results; nonpooled results are available in the full evidence report.
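As a brief illustration of the T-score threshold mentioned above, the sketch below expresses a BMD measurement as the number of standard deviations from a young-adult reference mean and applies the ≤−2.5 cutoff. The reference mean and SD are hypothetical placeholders, and the "low bone mass" band (between −2.5 and −1.0) follows the conventional World Health Organization categories rather than anything specified in this passage.

```python
def t_score(bmd, ref_mean, ref_sd):
    """Number of standard deviations between a BMD value and the
    young-adult reference mean (negative = below the reference)."""
    return (bmd - ref_mean) / ref_sd

def classify(t):
    """Map a T-score to a category; only the -2.5 cutoff comes from the text."""
    if t <= -2.5:
        return "osteoporosis"
    if t < -1.0:
        return "low bone mass"   # conventional WHO band, assumed here
    return "normal"

# Hypothetical femoral neck reference values in g/cm2 (illustrative only)
REF_MEAN, REF_SD = 0.86, 0.12
for bmd in (0.90, 0.70, 0.50):
    t = t_score(bmd, REF_MEAN, REF_SD)
    print(f"BMD {bmd:.2f} -> T-score {t:.2f} -> {classify(t)}")
```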
Thirty-eight studies reported on the diagnostic accuracy of 16 clinical risk assessment instruments for identifying osteoporosis (eTable 60, eFigures 1-7 in the Supplement). In women, pooled AUC estimates ranged from 0.65 (95% CI, 0.60-0.71; I2 = 97.8%; 10 studies [16 780 participants]) for the Osteoporosis Risk Assessment Instrument (ORAI) to 0.76 (95% CI, 0.63-0.90; I2 = 98.5%; 4 studies [2692 participants]) for the Osteoporosis Self-Assessment Tool for Asians. AUCs from individual studies have a wider range in women (0.32 [reference 22] to 0.87 [reference 23]) than in men (0.62 [reference 24] to 0.89 [reference 25]). In men, the pooled AUC for the Osteoporosis Self-assessment Tool (OST) was 0.76 (95% CI, 0.71-0.80; I2 = 93.2%; 7 studies [7798 participants]); for the Male Osteoporosis Risk Estimation Score, the pooled AUC was 0.80 (95% CI, 0.71-0.88; I2 = 97.6%; 3 studies [4828 participants]). AUCs for FRAX could not be pooled but ranged from 0.58 to 0.82 (reference 26).
AUCs in younger women (<65 years) varied from 0.58 (reference 26) to 0.85 (reference 27). One study found the accuracy of using the FRAX threshold associated with the 2011 USPSTF recommendation (10-year risk of major osteoporotic fracture ≥9.3%) was modestly better than chance (AUC, 0.60) and inferior to accuracy using the OST (AUC, 0.72) and Simple Calculated Osteoporosis Risk Estimation (AUC, 0.75) instruments in identifying women aged 50 to 64 years with osteoporosis (femoral neck T-score ≤−2.5).28 Instruments that assess more clinical risks did not report higher AUCs than instruments measuring fewer risks.
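To make the AUC values in this section easier to interpret, the following sketch computes the AUC as the probability that a randomly chosen person with osteoporosis receives a higher risk score than a randomly chosen person without it, counting ties as one-half. The scores are hypothetical, and the sketch assumes higher scores mean higher risk; for an instrument scored in the opposite direction, the scores would need to be negated first.

```python
def auc(scores_with, scores_without):
    """Area under the ROC curve via pairwise comparison.

    scores_with: risk scores of people who have the condition
    scores_without: risk scores of people who do not
    """
    wins = 0.0
    for s_pos in scores_with:
        for s_neg in scores_without:
            if s_pos > s_neg:
                wins += 1.0      # correctly ranked pair
            elif s_pos == s_neg:
                wins += 0.5      # tie counts as half
    return wins / (len(scores_with) * len(scores_without))

# Hypothetical instrument scores (not data from any included study)
with_osteoporosis = [7, 9, 5, 8]
without_osteoporosis = [4, 6, 5, 3, 7]
print(f"AUC = {auc(with_osteoporosis, without_osteoporosis):.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking, and a value below 0.5 (such as the lowest individual-study estimate reported above) means the instrument ranked people in the wrong direction more often than not in that sample.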
Thirty-five studies reported other measures of diagnostic accuracy (ie, sensitivity, specificity), but the instrument score threshold used to assess diagnostic accuracy varied considerably across studies. eTable 60 in the Supplement presents sensitivity and specificity estimates for the most commonly reported threshold. Even with a common threshold, results for the same instrument varied widely; as an example, the sensitivity of the ORAI instrument ranged from 50%29 to 100%30 and specificity from 10%30 to 75%.29
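Because sensitivity and specificity depend on the score threshold chosen, the same instrument can look quite different across studies. The sketch below, using hypothetical scores and outcomes, shows how both measures are derived from a 2x2 classification at a given threshold; it assumes a score at or above the threshold counts as a positive screen and that both people with and without the condition are present.

```python
def sensitivity_specificity(scores, has_condition, threshold):
    """Return (sensitivity, specificity) when score >= threshold is a positive screen."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score >= threshold
        if diseased and positive:
            tp += 1          # true positive
        elif diseased:
            fn += 1          # false negative
        elif positive:
            fp += 1          # false positive
        else:
            tn += 1          # true negative
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical scores and osteoporosis status (illustrative only)
scores = [3, 5, 6, 8, 9, 10, 12, 14]
status = [False, False, True, False, True, False, True, True]
for cutoff in (6, 9, 12):
    sens, spec = sensitivity_specificity(scores, status, cutoff)
    print(f"threshold {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Lowering the threshold raises sensitivity at the cost of specificity, which is one reason estimates reported at different cutoffs cannot be compared directly.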
Seven studies in women and 3 studies in men compared calcaneal quantitative ultrasound to centrally measured DXA for identifying osteoporosis. Reported AUCs varied from 0.69 to 0.90 (eTable 61 in the Supplement). For women, the pooled AUC estimate was 0.77 (95% CI, 0.72-0.81; I2 = 82.3%; 7 studies [1969 participants]; eFigure 8 in the Supplement). For men, the pooled AUC estimate was 0.80 (95% CI, 0.67-0.94; I2 = 98.2%; 3 studies [5142 participants]) (eFigure 9 in the Supplement). Similar findings were observed for digital x-ray radiogrammetry, peripheral DXA, and radiographic absorptiometry.
One good-quality systematic review of 45 studies supplemented by 13 additional fair- or good-quality studies reported on the accuracy of 12 different clinical risk assessments for predicting incident fracture (Table 1). Pooled results are reported herein.
The discriminative ability of FRAX for predicting future fracture varied by sex, site of fracture prediction, and whether BMD was used in the risk prediction. In men, pooled estimates of AUC from 3 to 4 studies (13 970-15 842 participants) ranged from 0.62 to 0.76 (depending on inclusion of BMD in the prediction model) (eFigures 10-13 in the Supplement). For women, pooled estimates based on 10 to 17 studies with 62 054 to 190 795 participants ranged somewhat higher (0.66-0.79) (eFigures 14-17 in the Supplement). Within that range, pooled estimates were higher for predicting hip fracture than for major osteoporotic fracture and higher when BMD was included in the prediction model. For cohorts of men and women combined, pooled estimates for the prediction of major osteoporotic fracture based on 3 studies (66 777 participants) were similar (AUC without BMD, 0.67 [95% CI, 0.66-0.67; I2 = 47.1%]; AUC with BMD, 0.69 [95% CI, 0.69-0.70; I2 = 70.3%]) (eFigures 18 and 19 in the Supplement). Two studies predicting hip fracture in combined cohorts of men and women reported AUC estimates similar to those for women-only cohorts.54,55
In women, the pooled AUC for risk assessment with BMD was 0.68 (95% CI, 0.64-0.71; I2 = 84.8%; 3 studies [6534 participants]) for predicting major osteoporotic fracture (eFigure 20 in the Supplement) and 0.73 (95% CI, 0.66-0.79; I2 = 97.3%; 4 studies [7809 participants]) for predicting hip fracture (eFigure 21 in the Supplement).
Across 9 fracture risk assessment instruments (the Women’s Health Initiative algorithm,63 OST,65 Simple Calculated Osteoporosis Risk Estimation,67 Fracture and Immobilization Score,68 Fracture Risk Score,70 Fracture Risk Calculator,71 ORAI,74 QFracture,60 and Osteoporosis Index of Risk),75 AUC estimates ranged from 0.53 to 0.82 for major osteoporotic fracture8,36,46,47,66 and from 0.80 to 0.89 for hip fracture.8,63,64,73 A tenth instrument, the Canadian Association of Radiologists and Osteoporosis Canada, did not provide AUC estimates76 but reported a sensitivity for predicting fracture of 0.54 (95% CI, 0.52-0.56) among women and 0.31 (95% CI, 0.24-0.38) among men.77 The reported specificities were 0.75 (95% CI, 0.74-0.75) for women and 0.86 (95% CI, 0.85-0.87) for men.
Twenty-three studies evaluated the accuracy of various bone measurement tests for predicting fracture (Table 2). In general, no meaningful differences in accuracy by type of bone test or by sex were observed. AUC estimates were generally higher for prediction of hip fracture than for prediction of fractures at other sites.
Key Question 2b. What is the evidence to determine screening intervals for osteoporosis and low bone density?
Two studies included participants with widely varying baseline BMD. Both suggest no advantage to repeated bone measurement testing (at 8 years86 and 3.7 years87 apart) (eTable 62 in the Supplement).87 However, 3 studies that developed prognostic models suggested that the optimal screening interval varies by baseline BMD.88-90 Age and use of hormone replacement therapy also influence optimal screening intervals.88,89
Key Question 3. What are the harms of screening for osteoporotic fracture risk?
One trial, SCOOP (previously described in KQ1),21 assessed the effect of screening on anxiety (State-Trait Anxiety Inventory) and quality of life (EuroQol 5-Dimension tool and the Short-Form Health Survey 12 [physical and mental health]) and found no differences between participants allocated to screening vs usual care (variance not reported, P > .10 for all outcomes).
Key Question 4a. What is the effectiveness of pharmacotherapy for the reduction of fractures and related morbidity and mortality?
Eleven RCTs reported outcomes related to the effect of various bisphosphonates on fracture incidence.91-101
Among women, bisphosphonates (as a class) were associated with fewer vertebral fractures compared with placebo (2.1% vs 3.8%; relative risk [RR], 0.57 [95% CI, 0.41-0.78]; I2 = 0.0%; 5 RCTs [5433 participants]) (eFigure 22 in the Supplement).91-95
One RCT in 1199 men reported fewer radiographic vertebral fractures for zoledronic acid compared with placebo (1.5% vs 4.6%; RR, 0.33 [95% CI, 0.16-0.70]).101
Among women, a pooled analysis of RCTs reporting nonvertebral fractures observed an association with fewer fractures in the treatment group compared with placebo (8.9% vs 10.6%; RR, 0.84 [95% CI, 0.76-0.92]; I2 = 0.0%; 8 RCTs [16 438 participants]) (eFigure 23 in the Supplement).91,93-95,99,100,102
The same trial of zoledronic acid in men previously described for vertebral fractures also reported on nonvertebral fractures; no between-group differences in incidence were observed (0.9% vs 1.3%; RR, 0.65 [95% CI, 0.21-1.97]).101
Among women, the pooled estimate suggested no statistically significant association between treatment with bisphosphonates and incidence of hip fracture (0.70% vs 0.96%; RR, 0.70 [95% CI, 0.44-1.11]; I2 = 0.0%; 3 RCTs [n = 8988]) (eFigure 24 in the Supplement). Only 1 (reference 91) of the 3 studies91,102,103 was powered to detect differences in hip fractures.
No studies reported on hip fractures in men.
Raloxifene (60 mg/d) reduced radiographic vertebral fracture (7.5% vs 12.5%; RR, 0.64 [95% CI, 0.53-0.76]) compared with placebo in 1 RCT of 7705 women.104,105 Treatment with raloxifene (60 mg/d or 120 mg/d) did not have an effect on incidence of nonvertebral or hip fracture.
A recently completed systematic review on the benefits and harms of estrogen therapy, with and without progestin, in primary care populations incorporated information from the Women’s Health Initiative and other similar trials.106 Women taking only estrogen had lower risks for total osteoporotic fractures (HR, 0.72 [95% CI, 0.64-0.80]) compared with women taking placebo. Women taking estrogen plus progestin therapy also had lower risks for fractures (RR, 0.80 [95% CI, 0.68-0.94]) compared with women taking placebo.
One large study107 (7868 women) demonstrated a statistically significant difference between denosumab and placebo in incident vertebral fractures (2.3% vs 7.2%; RR, 0.32 [95% CI, 0.26-0.41]), nonvertebral fractures (6.1% vs 7.5%; RR, 0.80 [95% CI, 0.67-0.95]), and hip fractures (0.7% vs 1.1%; RR, 0.60 [95% CI, 0.37-0.97]). Three smaller RCTs of denosumab reported no effect of treatment on incident clinical, osteoporotic, or vertebral fractures.
Among 2061 women without a prevalent fracture at baseline, parathyroid hormone significantly reduced new radiographic vertebral fractures compared with placebo (0.7% vs 2.1%; RR, 0.32 [95% CI, 0.14-0.75]).108 No studies met the inclusion criteria to assess the effects of teriparatide on vertebral fractures in men.
In an RCT of 2532 women with and without prevalent fractures at baseline,108 no significant difference in new nonvertebral fractures was observed between treatment and placebo (5.6% vs 5.8%; RR, 0.97 [95% CI, 0.71-1.33]).
One trial of men reported a reduction in nonvertebral fractures in both treatment groups of teriparatide (doses of 20 μg [the FDA-approved dose] [n = 151 men] or 40 μg [n = 139 men] compared with placebo [n = 147 men]),109 although the results did not reach statistical significance because of a small number of fractures and early termination of the study for safety concerns (20 μg vs placebo: 1.3% vs 2.0%; RR, 0.65 [95% CI, 0.11-3.83]; 40 μg vs placebo: 0.7% vs 2.0%; RR, 0.35 [95% CI, 0.04-3.35]).
Key Question 4b. How does the effectiveness of pharmacotherapy for the reduction of fractures and related morbidity and mortality vary by subgroup?
One trial each offered further analyses on subgroups for alendronate,91 risedronate,103 raloxifene,110 and denosumab.111,112 None reported differences in effectiveness by age, baseline BMD, prior fractures, or a combination of risk factors.
Key Question 5. What are the harms associated with pharmacotherapy?
When comparing medication with placebo, there was no significant association between use of bisphosphonates and discontinuation (RR, 0.99 [95% CI, 0.91-1.07]; I2 = 0.0%; 20 RCTs [17 369 participants]) (eFigure 25 in the Supplement), serious adverse events (RR, 0.98 [95% CI, 0.92-1.04]; I2 = 0.0%; 17 RCTs [11 745 participants]) (eFigure 26 in the Supplement), or upper gastrointestinal events (RR, 1.01 [95% CI, 0.98-1.05]; I2 = 0.0%; 13 RCTs [20 485 participants]) (eFigure 27 in the Supplement) for any individual bisphosphonate drug or overall as a class.
Two studies did not report a statistically significant risk of atrial fibrillation with bisphosphonates compared with placebo. One study was in women (alendronate: 2.5% vs 2.2%; RR, 1.14 [95% CI, 0.83-1.56]),113 and 1 study was in men (zoledronic acid: 1.2% vs 0.8%; RR, 1.45 [95% CI, 0.46-4.56]).101 Two studies of women reported no cases of atrial fibrillation.114,115 A case-control study using a Danish registry reported a relative risk of atrial fibrillation of 0.75 (95% CI, 0.49-1.16; 3.2% vs 2.9%) for new users of bisphosphonates.116
Rare outcomes were not generally observed in the included evidence. Specifically, 3 studies (1 in men and 2 in women) reported that they found no cases of osteonecrosis of the jaw.101,114,115 No studies included in the review reported atypical femur fracture outcomes or kidney failure.
Pooled estimates of women followed up from 1 to 4 years and randomized to raloxifene or placebo found no significant association between raloxifene use and discontinuation of treatment because of adverse events (12.6% vs 11.2%; RR, 1.12 [95% CI, 0.98-1.28]; I2 = 0.0%; 6 RCTs [6438 participants]) (eFigure 28 in the Supplement). The pooled analysis suggested a possible association between raloxifene use and deep vein thromboses (0.7% vs 0.3%; RR, 2.14 [95% CI, 0.99-4.66]; I2 = 0.0%; 3 RCTs [5839 participants]) (eFigure 29 in the Supplement) and an association between raloxifene use and hot flashes (11.2% vs 7.6%; RR, 1.42 [95% CI, 1.22-1.66]; I2 = 0.0%; 5 RCTs [6249 participants]) (eFigure 30 in the Supplement), but no association between raloxifene use and leg cramps (8.0% vs 4.8%; RR, 1.41 [95% CI, 0.92-2.14]; I2 = 67.1%; 3 RCTs [6000 participants]) (eFigure 31 in the Supplement). No significant association between raloxifene and coronary heart disease (1.0% vs 1.1%; HR, 0.88 [95% CI, 0.56-1.40]),117 stroke (0.9% vs 1.2%; RR, 0.69 [95% CI, 0.40-1.18]),118 or endometrial cancer (0.2% vs 0.2%; RR, 1.01 [95% CI, 0.29-3.48])119 was observed.
A recently completed review on the benefits and harms of estrogen therapy, with and without progestin, in primary care populations found that compared with women receiving placebo, women receiving estrogen with or without progesterone experienced a higher rate of gallbladder events, stroke, and venous thromboembolism over 5-year follow-up106 and an increased risk of urinary incontinence during follow-up of 1 year. In addition, women receiving estrogen plus progestin, compared with women receiving placebo, were found to have a higher risk of invasive breast cancer, coronary heart disease, and probable dementia over 5-year follow-up.
Pooled estimates suggested no significant association between denosumab use and discontinuation because of adverse events (2.4% vs 2.1%; RR, 1.14 [95% CI, 0.85-1.52]; I2 = 0.0%; 3 RCTs [8451 participants]) (eFigure 32 in the Supplement) or serious adverse events (23.8% vs 23.9%; RR, 1.12 [95% CI, 0.88-1.44]; I2 = 14.1%; 4 RCTs [8663 participants]) (eFigure 33 in the Supplement). Although treatment groups had higher rates of serious infections than placebo groups, confidence intervals for the pooled estimate spanned the null effect (4.0% vs 3.3%; RR, 1.89 [95% CI, 0.61-5.91]; I2 = 40.1%) (eFigure 34 in the Supplement).
Among 2532 postmenopausal women in 1 study of parathyroid hormone,108 the treatment group had higher rates of discontinuation because of adverse events when compared with the placebo group (30.2% vs 24.6%; RR, 1.22 [95% CI, 1.08-1.40]). Hypercalcemia, hypercalciuria, nausea, and vomiting were more common in the treatment group compared with placebo.
In 1 RCT of teriparatide among 437 men,109 both the 20-µg and 40-µg treatment groups had a higher proportion of withdrawals than the placebo group (9.2% and 12.9%, respectively, vs 4.8%), which was statistically significantly higher in the 40-µg treatment group than in the placebo group (RR, 2.72 [95% CI, 1.17-6.31]) but not in the group receiving the FDA-approved dose of 20 µg (RR, 1.94 [95% CI, 0.81-4.69]). Cancers were reported in 2 groups (3/147 in the placebo group and 3/151 in the 20-µg treatment group), but none were reported as osteosarcomas.
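The relative risks and confidence intervals quoted for this trial can be approximated with the standard log-scale (Wald) method, sketched below. The group sizes (151, 139, and 147 men) are reported in this article, but the withdrawal counts used here (14, 18, and 7) are assumptions back-calculated from the reported percentages, not figures taken from the trial report, and the function name is illustrative.

```python
import math

def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with a 95% CI computed on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial proportions
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high

# ASSUMED counts, inferred from 9.2%, 12.9%, and 4.8% of 151, 139, and 147 men
placebo = (7, 147)
arms = {"20 ug": (14, 151), "40 ug": (18, 139)}
for label, (events, n) in arms.items():
    rr, low, high = relative_risk(events, n, *placebo)
    print(f"{label} vs placebo: RR {rr:.2f} (95% CI, {low:.2f}-{high:.2f})")
```

Under these assumed counts, the computed intervals land within rounding of the published ones: the interval for the 40-µg group excludes 1, whereas the interval for the 20-µg group includes it, which matches the pattern of statistical significance described above.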
Table 3 and Table 4 summarize the strength of evidence and findings from this review. This updated review for the USPSTF incorporates new evidence on the direct link between screening for osteoporosis and health outcomes. One trial (SCOOP) addressed the morbidity, mortality, and harms associated with screening to prevent osteoporotic fractures (KQ1, KQ3). The trial found evidence of benefit for a secondary outcome only, the incidence of hip fractures (low strength of evidence of benefit). For all other outcomes (osteoporosis-related fractures, clinical fractures, mortality, anxiety, quality of life; insufficient strength of evidence), the trial did not report statistically significant differences in benefits or harms. The release of guidelines during trial recruitment122 and observation123 may have changed standards for usual care; differences between study groups may have been attenuated as a result. The use of the 10-year risk of hip fracture rather than the risk of major osteoporotic fracture as the threshold for DXA testing may have increased the likelihood of effectiveness for preventing hip rather than other fractures (given that risks of hip and other fractures are correlated but not identical21). The discrepancy in results between the hip fracture outcomes and other fractures points to the need for caution in interpreting the results.
Although results from studies of accuracy of bone measurement tests or clinical risk assessments for identifying osteoporosis or predicting fractures vary, in general they report no more than moderate accuracy (KQ2), and this evidence was graded as low to moderate. On average, clinical risk assessment tools to identify osteoporosis performed better in men than in women. FRAX performed only modestly better than chance in younger women. Predictions of hip fractures were more accurate than prediction of fractures at other sites or composite fracture outcomes. Sixteen clinical risk assessment tools for the identification of osteoporosis were found, and although these instruments had many risk variables in common (eg, age, weight, hormone therapy use), there was considerable heterogeneity in the patient populations studied and the anatomical sites used to measure bone density. The evidence for clinical risk assessments varies in the incorporation of BMD and number of risks. In general, tools incorporating BMD had higher accuracy than tools without BMD. The accuracy of tools with more clinical variables was similar to the accuracy of tools with fewer risk factors, suggesting that future research could focus on simpler instruments that can be easily incorporated into clinical practice. Future study into the optimal thresholds to trigger further diagnostic evaluation (eg, DXA testing) or to begin treatment is also critical, because valid and reliable cutoffs for high and low risk categories are necessary for clinical decision making.
Pharmacotherapy treatment studies in women show that multiple classes of medications (bisphosphonates, parathyroid hormone, raloxifene, and denosumab) reduce the risk of vertebral and nonvertebral fractures (KQ4); this evidence was graded as low to moderate for reducing fractures. Two of 3 studies of bisphosphonates that reported hip fractures were not powered to detect effects on hip fractures; the pooled evidence did not demonstrate a statistically significant benefit. Evidence for benefit in men is limited to 1 trial of a bisphosphonate, which demonstrated a large reduction in radiographic vertebral fractures. No studies demonstrated reductions in risk of clinical vertebral fractures or nonvertebral fractures for men. No studies reporting on hip fractures, fracture-related morbidity, or mortality were identified.
Although several trials reported on harms (KQ5), they varied substantially in definitions used. No consistent evidence of harms with bisphosphonates (strength of evidence graded as moderate) was identified and no bisphosphonate trials reported rare harms, such as osteonecrosis of the jaw, atypical femur fractures, or kidney failure. The evidence on harms in men was very limited but was consistent with harms for women when available.
This review has several limitations. First, it focused on treatment with prescription medications only; it does not address other interventions that might reduce the risk of osteoporotic fractures, such as functional assessment, safety evaluations, vision examinations, exercise or physical therapy, vitamin supplementation, and diet interventions. Further, this review did not consider comparative effectiveness of pharmacologic treatment.
Second, treatment studies included in this review relied on BMD T-scores to enroll participants into trials. Risk factors beyond bone density, such as microarchitectural deterioration of bone tissue and decline in bone quality, contribute to osteoporotic fractures; therefore, approaches that rely on BMD measurement wholly or in part may not be the most accurate approaches for identifying patients at highest risk for osteoporotic fractures.
Third, included studies on diagnosing osteoporosis or predicting fractures are heterogeneous with respect to prevalence of baseline fractures, baseline BMD, prior treatment, and length of study follow-up (which was sometimes shorter than the time horizon of the risk prediction instrument); most meta-analyses demonstrated high statistical heterogeneity (I² > 80%), suggesting that most of the variability across studies reflects between-study differences rather than chance. Fourth, the evidence base is sparse on screening interval, screening in men and premenopausal women, and long-term studies on the harms of screening and treatment.
In women, screening to prevent osteoporotic fractures may reduce hip fractures, and treatment reduced the risk of vertebral and nonvertebral fractures; there was no consistent evidence of treatment harms. The accuracy of bone measurement tests or clinical risk assessments for identifying osteoporosis or predicting fractures varied from very poor to good.
Corresponding Author: Meera Viswanathan, PhD, RTI International, 3040 E Cornwallis Rd, Research Triangle Park, NC 27709 (firstname.lastname@example.org).
Accepted for Publication: April 25, 2018.
Author Contributions: Dr Viswanathan had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Viswanathan, Berkman, Nicholson, Kahwati.
Acquisition, analysis, or interpretation of data: Viswanathan, Reddy, Berkman, Cullen, Middleton, Kahwati.
Drafting of the manuscript: Viswanathan, Reddy, Berkman, Cullen, Middleton, Nicholson, Kahwati.
Critical revision of the manuscript for important intellectual content: Viswanathan, Reddy, Kahwati.
Statistical analysis: Viswanathan, Kahwati.
Obtained funding: Viswanathan.
Administrative, technical, or material support: Viswanathan, Reddy, Cullen, Middleton, Nicholson, Kahwati.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.
Funding/Support: This research was funded under contract HHSA-290-2012-00015-I, Task Order 6, from the Agency for Healthcare Research and Quality (AHRQ), US Department of Health and Human Services, under a contract to support the USPSTF.
Role of the Funder/Sponsor: Investigators worked with USPSTF members and AHRQ staff to develop the scope, analytic framework, and key questions for this review. AHRQ had no role in study selection, quality assessment, or synthesis. AHRQ staff provided project oversight, reviewed the report to ensure that the analysis met methodological standards, and distributed the draft for peer review. Otherwise, AHRQ had no role in the conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript findings. The opinions expressed in this document are those of the authors and do not reflect the official position of AHRQ or the US Department of Health and Human Services.
Additional Contributions: We acknowledge the following individuals for their contributions to this project: AHRQ staff, Tina Fan, MD, and Tracy Wolff, MD; current and former members of the US Preventive Services Task Force who contributed to topic deliberations; Evelyn Whitlock, MD (formerly at Kaiser Permanente Research Affiliates Evidence-based Practice Center [EPC]); Jennifer S. Lin, MD (Kaiser Permanente Research Affiliates EPC); and RTI International–University of North Carolina EPC staff: Kathleen N. Lohr, PhD, Lynn Whitener, DrPH, Linda Lux, MPA, Andrew Kraska, BA, Janice Handler, BA, Stephanie Scope, BS, Carol Woodell, BSPH, Rachel Weber, PhD, Linda J. Lux, MPA, and Loraine Monroe. USPSTF members, peer reviewers, and federal partner reviewers did not receive financial compensation for their contributions.
Additional Information: A draft version of the full evidence report underwent external peer review from 5 content experts (Rosanne Leipzig, MD, Mount Sinai Medical Center; Diana Petitti, MD, Arizona State University; Margaret Gourlay, MD, University of North Carolina at Chapel Hill; Carolyn Crandall, MD, University of California, Los Angeles; Mary Roary, PhD, National Institutes of Health) and 4 anonymous reviewers. Comments from reviewers were presented to the USPSTF during its deliberation of the evidence and were considered in preparing the final evidence review.
Editorial Disclaimer: This evidence report is presented as a document in support of the accompanying USPSTF Recommendation Statement. It did not undergo additional peer review after submission to JAMA.