Kaplan-Meier curves for the cumulative incidence of the 6 composite safety measures. A, Composite cardiovascular events. B, Upper or lower gastrointestinal tract bleeding. C, Composite fracture. D, Any of the individual safety events resulting in hospitalization. E, Any of the individual safety events leading to immediate death or a hospitalization with death. F, All-cause mortality. P values were determined with the log-rank test. Coxibs indicates selective cyclooxygenase-2 inhibitors; nsNSAIDs, nonselective nonsteroidal anti-inflammatory drugs.
Solomon DH, Rassen JA, Glynn RJ, Lee J, Levin R, Schneeweiss S. The Comparative Safety of Analgesics in Older Adults With Arthritis. Arch Intern Med. 2010;170(22):1968-1978. doi:10.1001/archinternmed.2010.391
The safety of alternative analgesics is unclear. We examined the comparative safety of nonselective NSAIDs (nsNSAIDs), selective cyclooxygenase 2 inhibitors (coxibs), and opioids.
Medicare beneficiaries from Pennsylvania and New Jersey who initiated therapy with an nsNSAID, a coxib, or an opioid from January 1, 1999, through December 31, 2005, were matched on propensity scores. We studied the risk of adverse events related to analgesics using incidence rates and adjusted hazard ratios (HRs) from Cox proportional hazards regression.
The mean age of participants was 80.0 years, and almost 85% were female. After propensity score matching, the 3 analgesic cohorts were well balanced on baseline covariates. Compared with nsNSAIDs, coxibs (HR, 1.28; 95% confidence interval [CI], 1.01-1.62) and opioids (1.77; 1.39-2.24) exhibited elevated relative risk for cardiovascular events. Gastrointestinal tract bleeding risk was reduced for coxib users (HR, 0.60; 95% CI, 0.35-1.00) but was similar for opioid users. Use of coxibs and nsNSAIDs resulted in a similar risk for fracture; however, fracture risk was elevated with opioid use (HR, 4.47; 95% CI, 3.12-6.41). Use of opioids (HR, 1.68; 95% CI, 1.37-2.07) but not coxibs was associated with an increased risk for safety events requiring hospitalization compared with use of nsNSAIDs. In addition, use of opioids (HR, 1.87; 95% CI, 1.39-2.53) but not coxibs raised the risk of all-cause mortality compared with use of nsNSAIDs.
The comparative safety of analgesics varies depending on the safety event studied. Opioid use exhibits an increased relative risk of many safety events compared with nsNSAIDs.
In the United States, 1 in 5 adults received a prescription for an analgesic in 2006, accounting for 230 million prescription purchases1; however, the comparative safety of these drugs is unclear. Although the cardiovascular safety of nonselective nonsteroidal anti-inflammatory drugs (nsNSAIDs) and selective cyclooxygenase-2 inhibitors (coxibs) has been called into question,2 there is little comparable information about the third major analgesic group, opioids. The US Food and Drug Administration recently required manufacturers of major opioids to put in place a Risk Evaluation and Mitigation Strategy because of their uncertain risk to benefit ratio.3 However, whether opioids are riskier than nsNSAIDs or coxibs is unknown.
Questions of comparative safety are of paramount importance to patients and prescribers.4 Most people realize that no drug is absolutely safe but want information to guide them to the safest option. Although much research focuses on the risks of a drug for a given organ system (eg, cardiovascular risk), patients want to know about composite safety, that is, the likelihood that a given drug will cause a severe adverse event of any type. No accepted definitions exist for composite safety, and thus it is difficult to measure. Adverse events that lead to hospitalization or death are considered severe in the setting of a randomized controlled trial and might serve as possible measures of composite safety.
Assessing composite safety of analgesics in randomized controlled trials presents a major challenge. Most trials set target enrollment numbers on the basis of efficacy rather than safety measures and follow up patients only for short periods. Moreover, trials often enroll a relatively healthy cohort, and thus the true safety profile of most drugs in a typical population is not known until after it is marketed. It is unlikely that a head-to-head safety trial that includes opioids and other analgesics will ever be completed. Postmarketing surveillance data from usual care cohorts provide an opportunity to examine comparative safety across a wide range of events and can complement safety data from randomized controlled trials. However, imbalance in baseline population characteristics confounds many postmarketing surveillance studies. Propensity score–matched analyses may provide better balance of confounders and facilitate relatively straightforward comparative safety analyses.
We used propensity score methods to balance baseline characteristics and examined the comparative safety of the 3 most common types of analgesics: nsNSAIDs, coxibs, and opioids. Using a very large health care utilization (claims) database, we were able to identify subjects with very similar measured baseline characteristics but who received different analgesics.
We studied Medicare beneficiaries from Pennsylvania and New Jersey who qualify for pharmaceutical assistance programs for low-income older adults. During the study period (January 1, 1999, through December 31, 2005), these programs provided insurance coverage for all medications without restriction. The study cohort consisted of eligible adults who had recorded diagnoses for osteoarthritis or rheumatoid arthritis on 2 separate visits (eTable 1). After their second diagnosis, eligible subjects entered the cohort at the time of their first new analgesic prescription dispensing (index date).
To identify incident (new) use of an nsNSAID, a coxib, or an opioid, we excluded persons dispensed these drugs in the 180 days before the index date. We further excluded persons with a diagnosis of a malignant neoplasm, use of hospice services in the preceding 365 days, and dispensing of analgesics from 2 categories simultaneously, either as a combination product or 2 separate medications. In addition, we required that subjects demonstrate consistent use of health care system services in the preceding 365 days.
After application of the eligibility and exclusion criteria, we balanced exposure groups using propensity score matching.5 A propensity score is the estimated probability of receiving one treatment exposure vs another. Two separate propensity scores were estimated using multivariate logistic regression models, one predicting the use of coxibs compared with nsNSAIDs and the other predicting use of opioids compared with nsNSAIDs. Because the opioid user group was substantially larger than the other groups, we used only coxib-nsNSAID matched pairs in which the nsNSAID user was also successfully matched to an opioid user. By performing two 1:1 matches with a common referent group (nsNSAIDs), we sought to achieve a cohort balanced among 3 exposure groups. Matching was accomplished using a “greedy” matching routine.6
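The two-stage matching described above can be illustrated with a minimal sketch. It assumes a simple nearest-neighbor greedy rule with a caliper, which may differ from the cited routine; the function names, the caliper value, and the subject identifiers and scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_match(
    treated: Dict[str, float],
    reference: Dict[str, float],
    caliper: float = 0.05,  # hypothetical maximum allowed score difference
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `reference` map a subject ID to an estimated propensity score.
    Returns (treated_id, reference_id) pairs.
    """
    available = dict(reference)  # reference subjects not yet matched
    pairs: List[Tuple[str, str]] = []
    # Fixed processing order keeps the result reproducible.
    for t_id in sorted(treated):
        if not available:
            break
        t_score = treated[t_id]
        # Closest remaining reference subject; ties are broken by ID.
        r_id = min(available, key=lambda r: (abs(available[r] - t_score), r))
        if abs(available[r_id] - t_score) <= caliper:
            pairs.append((t_id, r_id))
            del available[r_id]  # greedy: a match is never revisited
    return pairs


def common_referent_triplets(
    coxib_pairs: List[Tuple[str, str]],
    opioid_pairs: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Keep only nsNSAID users matched in both comparisons.

    Returns (nsnsaid_id, coxib_id, opioid_id) triplets.
    """
    coxib_by_ref = {r: c for c, r in coxib_pairs}
    opioid_by_ref = {r: o for o, r in opioid_pairs}
    shared = sorted(set(coxib_by_ref) & set(opioid_by_ref))
    return [(r, coxib_by_ref[r], opioid_by_ref[r]) for r in shared]


# Hypothetical scores. Each nsNSAID user has one score per comparison,
# because two separate propensity models are estimated.
coxib_ps = {"c1": 0.30, "c2": 0.62}
nsnsaid_ps_vs_coxib = {"n1": 0.31, "n2": 0.60, "n3": 0.90}
opioid_ps = {"o1": 0.45, "o2": 0.20}
nsnsaid_ps_vs_opioid = {"n1": 0.44, "n2": 0.80, "n3": 0.21}

coxib_pairs = greedy_match(coxib_ps, nsnsaid_ps_vs_coxib)
opioid_pairs = greedy_match(opioid_ps, nsnsaid_ps_vs_opioid)
triplets = common_referent_triplets(coxib_pairs, opioid_pairs)
print(triplets)  # [('n1', 'c1', 'o1')]
```

In this toy example only nsNSAID user n1 is matched in both comparisons, so only that subject and the 2 matched partners would remain in the 3-group cohort.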
The propensity scores were estimated on the basis of confounders that we measured in claims data (demographics, diagnoses, surgical procedures, and pharmacy dispensings; see eTable 1 for a listing of variables and codes). These include prior cardiovascular diagnoses and medication use, osteoporosis and fracture diagnoses and medications associated with their risk, gastrointestinal (GI) tract diagnoses and treatments, and diagnoses associated with liver or renal disease. These variables were determined for the 365 days before the index date.
A study protocol was developed before analyses were performed. The study was approved by the Partners Healthcare Institutional Review Board.
The safety events consisted of all clinically significant unintended health effects related to analgesics. We defined them on the basis of claims data diagnoses and procedure codes (see eTable 1). Cardiovascular events included myocardial infarction, stroke, heart failure, revascularization, and out-of-hospital cardiac death.7-10 Gastrointestinal tract outcomes included upper and lower GI tract bleeding and bowel obstruction.11 Acute kidney injury included hospitalizations for acute renal failure requiring dialysis.12 Hepatic toxic effects included inpatient and outpatient events. Fractures included hip, pelvis, wrist, and humerus13 but not spine fractures because new vertebral compression fractures cannot be reliably identified in claims data. Trauma diagnoses denoting a fall were identified, although recorded diagnoses likely represent only a small percentage of all falls.
After identifying each of these specific events, we created the following 3 composite measures: any of the specific cardiovascular events, any of the specified fractures, and upper or lower GI tract bleeding. In addition, we identified the following 3 severe safety events: (1) any of these adverse events leading to a hospitalization, (2) any of these safety events leading to an acute care hospitalization and subsequent death or out-of-hospital cardiac death, and (3) any death (all-cause mortality).
We categorized exposure by class of analgesic (coxib, opioid, or nsNSAID) (see eTable 2). Data on medication use came from pharmacy dispensing claims records, which included the drug name, dosage, and days supplied. Subjects were considered exposed to 1 of the 3 types of analgesics from the day after the first dispensing through 15 days after the last available dose. In sensitivity analyses, this extension period was shortened to 7 days and 0 days. No distinctions in exposure status were based on dosage or duration of use. Combination products with acetaminophen were included. Subjects were allowed to enter the analyses once only. In addition, if a second type of analgesic was received, the subject was censored without any extension.
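The exposure rule above can be sketched as a small function. This is an illustration under stated assumptions, not the study's actual code: it assumes a dispensing on day d with n days supplied covers days d through d + n - 1, it ignores gaps and overlapping refills, and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple


def exposure_window(
    dispensings: List[Tuple[date, int]],
    extension_days: int = 15,
    switch_date: Optional[date] = None,
) -> Tuple[date, date]:
    """Return (start, end) of the exposed period for one subject.

    `dispensings` holds (dispensing_date, days_supplied) for one analgesic class.
    `switch_date` is the date a second analgesic type was received, if any.
    """
    first_dispensing = min(d for d, _ in dispensings)
    # Assumption: the last available dose is the latest day covered by any supply.
    last_dose = max(d + timedelta(days=supply - 1) for d, supply in dispensings)
    start = first_dispensing + timedelta(days=1)  # day after the first dispensing
    end = last_dose + timedelta(days=extension_days)
    # Censor at a switch to another analgesic type, without any extension.
    if switch_date is not None and switch_date < end:
        end = switch_date
    return start, end


fills = [(date(2003, 1, 10), 30), (date(2003, 2, 5), 30)]
print(exposure_window(fills))                    # main analysis, 15-day extension
print(exposure_window(fills, extension_days=7))  # sensitivity analysis
print(exposure_window(fills, extension_days=0))  # sensitivity analysis
```

Setting `extension_days` to 7 or 0 corresponds to the shortened extension periods used in the sensitivity analyses.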
We compared the baseline characteristics of the propensity score–matched cohorts and assessed balance among measured covariates. We calculated incidence rates with 95% confidence intervals (CIs) for all individual and composite safety events in each of the 3 exposure categories. We then constructed Kaplan-Meier event-free survival curves and inspected 2-way log-rank tests. We estimated the relative hazard for each individual and composite safety event using Cox proportional hazards regression models. Because we matched the groups on propensity scores containing potential baseline confounders, the regression models contained only the analgesic exposures of interest, with nsNSAID as the reference exposure. We tested the proportional hazards assumption for each exposure of interest with respect to each of the main safety outcomes via the Kolmogorov supremum test of Lin et al.14
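For readers who want to see how an incidence rate per 1000 person-years and its 95% CI can be computed, the following is a minimal sketch. The method used to derive the CIs is not specified here, so the log-scale normal approximation below is an assumption, as are the function name and the example counts.

```python
import math
from typing import Tuple


def incidence_rate_per_1000(
    events: int, person_days: float, z: float = 1.96
) -> Tuple[float, float, float]:
    """Return (rate, lower, upper) per 1000 person-years.

    Assumption: the CI uses a normal approximation on the log rate,
    with standard error 1 / sqrt(events); this requires events > 0.
    """
    if events <= 0 or person_days <= 0:
        raise ValueError("events and person_days must be positive")
    person_years = person_days / 365.25
    rate = events / person_years * 1000
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical counts: 50 events over 1000 person-years of follow-up.
rate, lower, upper = incidence_rate_per_1000(events=50, person_days=365250)
print(f"{rate:.1f} per 1000 person-years (95% CI, {lower:.1f}-{upper:.1f})")
```

Because the interval is built on the log scale, it is asymmetric around the point estimate and cannot fall below zero.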
To compare the occurrence of different outcomes across analgesic types, we calculated the numbers needed to treat to observe an excess safety event (hereinafter referred to as the numbers needed to harm). The number needed to harm estimates the number of subjects that would be required to use a coxib or opioid, rather than an nsNSAID, to observe 1 excess safety outcome.14,15
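The arithmetic behind a number needed to harm can be sketched as the reciprocal of an absolute risk difference at a fixed time point. The conversion from a hazard ratio shown here relies on the proportional hazards relation between survival functions and is only one possible approach; it is not necessarily the calculation used in this study. All names and input values are hypothetical.

```python
def risk_from_hazard_ratio(reference_risk: float, hazard_ratio: float) -> float:
    """Cumulative risk in the comparison group under proportional hazards.

    Uses S_exposed(t) = S_reference(t) ** HR, where S = 1 - cumulative risk.
    """
    return 1.0 - (1.0 - reference_risk) ** hazard_ratio


def number_needed_to_harm(risk_exposed: float, risk_reference: float) -> float:
    """Subjects treated with the comparison drug per 1 excess event."""
    difference = risk_exposed - risk_reference
    if difference <= 0:
        raise ValueError("no excess risk; the number needed to harm is undefined")
    return 1.0 / difference


# Hypothetical cumulative risks at a fixed time point.
reference_risk = 0.02
exposed_risk = risk_from_hazard_ratio(reference_risk, hazard_ratio=2.0)
print(number_needed_to_harm(exposed_risk, reference_risk))
```

A smaller number needed to harm indicates a larger absolute excess risk, which is why the estimates shrink as cumulative risk accrues over longer follow-up.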
We performed a wide range of sensitivity analyses to examine whether the primary findings were robust. First, we dropped users of rofecoxib and valdecoxib, the 2 coxibs removed from the market, and their matches and reran the Cox proportional hazards regressions. Second, the variables with at least 5% imbalance across exposure groups at baseline were included in the Cox proportional hazards regressions to account for possible residual confounding by these factors. Third, we assessed only outcomes after the seventh day after the index date to reduce the chance that the analgesic was being prescribed for a preexisting condition. Fourth, in a manner similar to an intention-to-treat analysis, we ended outcome assessment after 60 days and considered subjects to be members of their index exposure category for 60 days, even if prescribing had ended or treatment had changed. Finally, we calculated a high-dimensional propensity score, which included 500 empirically derived covariates that are likely confounders from administrative data; this score has been shown to further reduce confounding in pharmacoepidemiology studies.16
The original cohort of subjects with osteoarthritis or rheumatoid arthritis who started analgesic therapy after the arthritis diagnosis included 163 714 potentially eligible subjects. This was reduced to 36 414 (22.2%) after excluding persons without a year of continuous follow-up and those who had used a different analgesic prescription in the preceding 365 days. This was further reduced to 23 647 (14.4%) after applying the requirements for no prior malignant neoplasm, hospice, or nursing home use. Finally, after propensity score matching, the cohort had 12 840 members (7.8%).
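The percentages in this attrition sequence are all expressed relative to the original 163 714 potentially eligible subjects, which the short check below reproduces. The step labels are paraphrased and the function name is hypothetical.

```python
from typing import List, Tuple

# Cohort sizes at each step, as reported in the text.
cohort_steps: List[Tuple[str, int]] = [
    ("potentially eligible", 163714),
    ("after follow-up and prior analgesic exclusions", 36414),
    ("after neoplasm, hospice, and nursing home exclusions", 23647),
    ("after propensity score matching", 12840),
]


def attrition_table(steps: List[Tuple[str, int]]) -> List[Tuple[str, int, float]]:
    """Return (label, n, percent of the first step) for each step."""
    base = steps[0][1]
    return [(label, n, round(100 * n / base, 1)) for label, n in steps]


for label, n, pct in attrition_table(cohort_steps):
    print(f"{label}: {n} ({pct}%)")
```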
The baseline characteristics of the 3 propensity score–matched cohorts were similar (Table 1). Almost 85% of the subjects were women, most were white, and the mean age was 80.0 years. More than 80% in each exposure category had osteoarthritis, but the percentage with rheumatoid arthritis varied from 13.4% of those using coxibs to 9.0% among opioid users. The number of acute care hospital days was higher in the opioid users category than in the other exposure categories. However, the comorbidity index, cardiovascular risk factors, history of GI tract disease, and use of gastroprotective medications were quite similar across exposures. A history of fracture and falls was more common among opioid users than the other exposures. The propensity score–matched cohorts were more similar than the nonmatched cohorts (see eTable 3).
We calculated incidence rates per 1000 person-years for the composite safety measures as well as the component adverse events (Table 2). Follow-up times varied by outcome but were generally similar across exposure groups. For any cardiovascular event, the mean follow-up time was 117 days for nsNSAIDs, 202 days for coxibs, and 137 days for opioids. The incidence rates were high for several of the safety events, with the rate for adverse events resulting in hospitalization being greater than 100 per 1000 person-years for all 3 cohorts. Opioid users experienced higher rates for most of the composite serious adverse events than did the other 2 exposure groups. This difference was extreme for fractures, in which the rate for opioid users was 101 per 1000 person-years (95% CI, 87-117) compared with 19 per 1000 person-years (14-25) for coxib users. The rate of gastrointestinal tract bleeding was reduced for coxib users (12 per 1000; 95% CI, 8-16) compared with 21 per 1000 (14-30) for nsNSAIDs.
The hazard ratios (HRs) for each of the end points and safety events were estimated with a Cox proportional hazards regression in the propensity score–matched cohorts (Table 3). We found no significant violation of the proportional hazards assumption for either exposure variable with respect to any of the main safety events (P > .10 for each). Compared with nsNSAIDs, coxibs (HR, 1.28; 95% CI, 1.01-1.62) and opioids (1.77; 1.39-2.24) exhibited elevated relative risk for cardiovascular events. Gastrointestinal tract bleeding risk was reduced for coxib users (HR, 0.60; 95% CI, 0.35-1.00) but was similar for opioid users (1.07; 0.65-1.76) compared with nsNSAID users. Coxib and nsNSAID use had a similar risk for fracture; however, fracture risk was elevated with opioid use (HR, 4.47; 95% CI, 3.12-6.41) compared with nsNSAID use. Use of opioids (HR, 1.68; 95% CI, 1.37-2.07) but not coxibs (1.12; 0.91-1.38) was associated with an increased risk of safety events requiring hospitalization compared with use of nsNSAIDs. Furthermore, we observed an increased risk of all-cause mortality among opioid users (HR, 1.87; 95% CI, 1.39-2.53) but not coxib users (1.16; 0.85-1.57) compared with nsNSAID users.
The Figure displays the event-free survival curves for the first 12 months of follow-up. The curves reflect the relative risks, with nsNSAID users generally experiencing the fewest adverse events and opioid users the most over time. The rates of the composite cardiovascular outcome were similar for the nsNSAID and coxib users during the first 3 months.
Sensitivity analyses are shown in Table 4. Removing rofecoxib and valdecoxib users reduced the composite cardiovascular risk among celecoxib users, and the risk was no longer elevated compared with nsNSAID use; other results were not substantially changed. Including the baseline covariates that remained imbalanced after matching (denoted in Table 1) in the Cox regression did not alter the resulting HRs substantially. The matched analysis using the high-dimensional propensity score resulted in slightly smaller cohorts than the main propensity score–matched analysis. The HRs from the high-dimensional propensity score analysis comparing opioid with nsNSAID users were closer to the null than in the main analysis. However, the only result for which qualitative interpretation would change is the risk for GI tract bleeding; the main analysis found a reduction in risk with coxibs compared with nsNSAIDs (Table 3), but the high-dimensional propensity score result was closer to the null and nonsignificant (HR, 0.79; 95% CI, 0.47-1.33). Certain sensitivity analyses tested models in which we assumed a shorter extension of the risk window after the end of the prescription period, a longer period between the index date and the start of outcome assessment, or a continuation of the first exposure for 60 days and a fixed 60-day follow-up. These secondary analyses also found significant risk with opioids compared with nsNSAIDs (eTable 4).
The numbers needed to harm (ie, to observe an excess safety event) based on the main analysis are shown in Table 5. These results estimate how many persons would need to be treated with a coxib or an opioid vs an nsNSAID to observe 1 excess adverse event. These figures suggest that if 47 (95% CI, 38-62) people were treated with an opioid vs an nsNSAID for 1 month, 1 extra fracture would be observed; this figure was reduced to 26 (95% CI, 18-42) after 1 year of treatment. For the composite cardiovascular outcome, 27 (95% CI, 17-57) people would need to be treated with a coxib and 17 (95% CI, 12-30) with an opioid to observe 1 extra event.
Analgesics are used daily by millions of people; however, current data do not allow patients or physicians to determine which type of agent is safest. We compared nsNSAIDs, coxibs, and opioids across a wide range of specific safety events and several composite safety events. In a cohort of older adults with arthritis, opioid users experienced the highest risk across most specific and severe safety events and nsNSAID users the lowest risk. The numbers needed to harm that we observed suggest that many of these risks are clinically relevant.
Although we found strong associations between given analgesic types and the risk of safety events, a single epidemiologic analysis cannot prove causation. By excluding patients with cancer and those receiving hospice care or residing in a nursing home and by matching subjects on the basis of a propensity score that included many important potential confounders, we were able to obtain cohorts that were well balanced with respect to observed patient characteristics. However, the exclusions we applied to our cohort generate concerns about generalizability. We drew on trial design and opted for a smaller, more homogeneous cohort in which there was less likelihood of residual confounding. Even so, small imbalances persisted and certain known confounders, including body mass index, tobacco and alcohol use, and use of aspirin and other over-the-counter NSAIDs, were unmeasured.
Confounding bias is the major barrier to using large administrative claims databases for comparative effectiveness research. Data sets that lack important clinical information make it difficult to eliminate bias from such unmeasured confounding; although propensity score matching can achieve remarkable balance among measured covariates, it cannot control for unmeasured factors. This issue may be especially important when examining the relation between opioids and fractures, in which functional status may confound the exposure-outcome association. Under the theory that adjusting for observable proxies for unmeasured confounders may reduce residual confounding bias, we applied high-dimensional propensity score adjustment in sensitivity analyses.17,18 This process led to a movement of all opioid point estimates toward the null, suggesting the possibility of residual confounding in the main analysis. Residual confounding of the opioid-related associations, whereby sicker patients were more likely to receive opioids, cannot be ruled out but is unlikely to account for the full deleterious effects observed.
Other important potential limitations include misclassification of exposures and end points. Although pharmacy claims data are a very useful source of prescription drug information, subjects may not use analgesics consistently as prescribed, and they may use over-the-counter agents not contained in our data set, such as acetaminophen or some NSAIDs. When misclassification of exposure is random, relative risk estimates are typically biased toward the null. We cannot be certain of the direction and extent of bias in our analyses. Furthermore, health care claims data are an imperfect source of end point data. We used algorithms with high specificity whenever available, but some of the end points we examined have no validated algorithms. There is also a possibility that dosages across analgesic type differed and might account for some of the excess risk. Because there is no clear standardized dosing metric, we cannot evaluate this possibility.
The association between opioids and cardiovascular outcomes has not been widely reported, but opioids are known to have many potential effects on the heart, and various cardiovascular adverse effects have been previously reported.19,20 One observational study of outcomes in acute coronary syndromes found that morphine was related to a 41% increase in mortality (95% CI, 26%-57%).21 However, residual confounding in our study cannot be ruled out. As an unexpected finding in this study, the potential relation between opioids and cardiovascular outcomes warrants further study in other databases. Similarly, opioids have been linked only with acute kidney injury in small case series.22,23 The relation between opioids and fractures is not a new finding, but the strength of the association we observed is larger than in previous reports.24-26 The differences in results may be secondary to fundamental differences in methods, including differences in study populations, reference groups, exposure definitions, and covariate adjustment.
Although the links between given agents and safety events are important, patients are typically less interested in an individual safety event risk and more interested in overall safety. Overall indices of benefit and risk have been described in the Women's Health Initiative trial,27 but there are no accepted measures of overall safety used widely in epidemiologic studies. The 3 composite severe safety events that we defined—events leading to hospitalization, events associated with death, and all-cause mortality—mirror serious adverse events as defined by the US Food and Drug Administration.28 As more studies of comparative safety are conducted, more attention should be paid to defining and measuring overall safety. However, studies across multiple outcomes with several exposures raise concerns regarding multiple comparisons; these concerns need to be balanced with the desire to give a more complete picture of safety. The measure of the number needed to harm allows for a side-by-side comparison of risks associated with therapeutic strategies and states the absolute risk in a clinically meaningful metric. However, numbers needed to harm (or treat) are typically reported in the setting of a randomized controlled trial in which causation is clear; their interpretation should be approached more tentatively in an observational study such as ours.
In summary, we compared the safety of the 3 major analgesic groups—nsNSAIDs, coxibs, and opioids—among a group of older adults with arthritis who initiated treatment in a usual care setting. This comparison would likely never be made in the setting of a randomized controlled trial. We defined a wide range of relevant clinically significant specific adverse events and several composite safety measures. After propensity score matching, the cohorts were well balanced across most baseline characteristics. Incidence rate and HR calculations demonstrated that nsNSAIDs were safer than opioids in many respects. Opioid users experienced moderate risk early in treatment. By 1 year, the numbers needed to harm for opioid users were small and thus clinically relevant. Although nsNSAIDs pose certain risks, these analyses support the safety of these agents compared with other analgesics. The recent concerns raised about opioid use in nonmalignant pain syndromes appear warranted on the basis of these data.
This issue of the Archives includes 2 original investigations and 2 commentaries on pain management. We are publishing them together to bring greater attention to the difficult issues that practicing physicians face in treating pain. As readers will note, there has been an exponential increase in the use of opioid drugs, especially as treatment for chronic nonmalignant pain. In the original investigations, Solomon and coauthors demonstrate the serious side effects of these medications, and the commentary by Becker and O’Connor and that by Graf offer help to practicing internists on how to weigh the benefits and risks of these medications.—Mitchell H. Katz, MD
Correspondence: Daniel H. Solomon, MD, MPH, Division of Rheumatology, Brigham and Women's Hospital, Room PBB-B3, 75 Francis St, Boston, MA 02115 (firstname.lastname@example.org).
Accepted for Publication: April 2, 2010.
Author Contributions: Study concept and design: Solomon, Rassen, and Schneeweiss. Acquisition of data: Solomon and Levin. Analysis and interpretation of data: Solomon, Rassen, Glynn, Lee, Levin, and Schneeweiss. Drafting of the manuscript: Solomon, Rassen, Lee, and Schneeweiss. Critical revision of the manuscript for important intellectual content: Solomon, Rassen, Glynn, Levin, and Schneeweiss. Statistical analysis: Solomon, Rassen, Glynn, Lee, Levin, and Schneeweiss. Obtained funding: Solomon and Schneeweiss. Study supervision: Solomon and Schneeweiss.
Financial Disclosure: Dr Solomon reports serving as an unpaid member of the executive committee of a celecoxib trial sponsored by Pfizer and as an unpaid member of the Data Safety Monitoring Board for an analgesic trial sponsored by Pfizer.
Funding/Support: This project was funded under contract No. HHSA290200500161 from the Agency for Healthcare Research and Quality, US Department of Health and Human Services, as part of the Developing Evidence to Inform Decisions about Effectiveness (DEcIDE) program.
Role of the Sponsors: The study protocol and drafts of this manuscript were reviewed by the Agency for Healthcare Research and Quality.
Disclaimer: The authors of this article are responsible for its content. Statements in the article should not be construed as endorsement by the Agency for Healthcare Research and Quality or the US Department of Health and Human Services.