Flowchart of included studies. CKD indicates chronic kidney disease; HQOL, health-related quality of life; RCT, randomized controlled trial.
Forest plot of all reported 36-item short-form instrument domain data. CI indicates confidence interval; WMD, weighted mean difference.
Clement FM, Klarenbach S, Tonelli M, Johnson JA, Manns BJ. The Impact of Selecting a High Hemoglobin Target Level on Health-Related Quality of Life for Patients With Chronic Kidney Disease: A Systematic Review and Meta-analysis. Arch Intern Med. 2009;169(12):1104–1112. doi:10.1001/archinternmed.2009.112
Treatment of anemia in chronic kidney disease (CKD) with erythropoietin-stimulating agents (ESAs) is commonplace. The optimal hemoglobin treatment target has not been established. A clearer understanding of the health-related quality of life (HQOL) impact of hemoglobin target levels is needed. We systematically reviewed the randomized controlled trial (RCT) data on HQOL for patients treated with low to intermediate (9.0-12.0 g/dL) and high hemoglobin target levels (>12.0 g/dL) and performed a meta-analysis of all available 36-item short-form (SF-36) RCT data.
We conducted a search to identify all RCTs of ESA therapy in patients with anemia associated with CKD (1966–December 2006). Inclusion criteria were (1) 30 or more participants, (2) anemic adults with CKD, (3) epoetin (alfa and beta) or darbepoetin used, (4) a control arm, and (5) reported HQOL using a validated measure. All available SF-36 data underwent meta-analysis using the weighted mean difference.
Of 231 full texts screened, 11 eligible studies were identified. The SF-36 was used in 9 trials. Reporting of these data was generally incomplete. Data from each domain of the SF-36 were summarized. Statistically significant changes were noted in the physical function (weighted mean difference [WMD], 2.9; 95% confidence interval [CI], 1.3 to 4.5), general health (WMD, 2.7; 95% CI, 1.3 to 4.2), social function (WMD, 1.3; 95% CI, −0.8 to 3.4), and mental health (WMD, 0.4; 95% CI, 0.1 to 0.8) domains. None of the changes would be considered clinically significant.
Our study suggests that targeting hemoglobin levels in excess of 12.0 g/dL leads to small and not clinically meaningful improvements in HQOL. This, in addition to significant safety concerns, suggests that targeting treatment to hemoglobin levels that are in the range of 9.0 to 12.0 g/dL is preferred.
Anemia is a common complication of chronic kidney disease (CKD) and is associated with adverse clinical outcomes and poor health-related quality of life (HQOL).1-3 Treatment of anemia before the advent of erythropoietin-stimulating agents (ESAs) relied on routine blood transfusions. Although the main advantage of treatment of anemia in CKD with ESAs once focused on preventing the need for blood transfusions, over time, it has shifted to encompass other clinical considerations. Specifically, although the labeling for ESAs in most countries refers to their ability to improve anemia and reduce the need for transfusions,4 many studies have addressed the use of ESAs at varying doses and their effect on survival, cardiovascular morbidity, or HQOL.5
The use of ESAs has become commonplace in most developed countries, and the debate over their use now principally focuses on the optimal target hemoglobin level.1,2,5 This debate has intensified over the past 15 years, with the publication of several large randomized controlled trials (RCTs) testing the effect of using ESAs to achieve various hemoglobin targets.3,6 Many of these trials compared low to intermediate (hereinafter, low/intermediate) hemoglobin target levels, typically in the 9.5 to 11.5 g/dL range, with high hemoglobin target levels, typically higher than 13.0 g/dL. Perhaps unexpectedly, many of these RCTs have raised important safety concerns with high hemoglobin target levels, with findings of higher rates of adverse events, including mortality and vascular access thrombosis.7-11 These concerns have led the US Food and Drug Administration, among other regulatory agencies, to change the labeling for ESAs to target hemoglobin levels lower than 12.0 g/dL.12 (To convert hemoglobin to grams per liter, multiply by 10.0.)
Given high baseline mortality rates and the difficulty in finding therapies that improve survival in patients with end-stage renal disease (ESRD),13 and the findings from the Hemodialysis Study14 and Dialysis Clinical Outcomes Revisited Study,15 HQOL improvement has become a focus of treatment of CKD. Broadly, HQOL is defined as the way a patient feels or functions, aspects of which can improve after successful treatment of a condition.16 Minimally, measurement of HQOL includes assessment of functional status, mental health or emotional well-being, social engagement, and symptom states.16 Although many different scales have been used to measure HQOL, the 36-item short-form instrument (SF-36) is among the most commonly applied instruments. It comprises 36 questions that assess 8 domains of HQOL: physical function, physical role, pain, general health, vitality, social function, emotional role, and mental health.17
Despite the safety concerns that have been associated with normalization of hemoglobin,3,13 knowing that patients with ESRD who are treated with hemodialysis have very poor baseline QOL has led some to suggest that targeting hemoglobin levels higher than 12.0 g/dL might still be reasonable in some patients with ESRD in an effort to improve QOL.2,5 A clearer understanding of the impact of higher-dose ESA on HQOL will enable rational comparison of putative HQOL benefits with potential safety issues. In this study, we aimed to summarize the available data from RCTs on HQOL for patients treated to different hemoglobin target levels and determine the magnitude of HQOL differences. Given that most studies that have measured HQOL in this area have used the SF-36, we also conducted a meta-analysis of SF-36 data for all RCTs comparing low/intermediate and high hemoglobin target levels.
We conducted a comprehensive and exhaustive search to identify all RCTs of ESA therapy in adults with anemia associated with CKD. No language restrictions were applied. MEDLINE (1966 through December 12, 2006), EMBASE (1988 through December 12, 2006), all evidence-based medicine reviews, and a variety of other gray literature sources (n = 48) were searched using different search terms for ESA combined with CKD search terms (eTable 1 provides search strategy used for MEDLINE; strategy adapted for other search engines). Each citation or abstract was independently screened by a subject specialist and 1 other reviewer. Any trial considered relevant by at least 1 reviewer was retrieved for further review. The reference lists of included trials and relevant reviews were also scanned for pertinent trials. In addition, we contacted ESA manufacturers in Canada and the authors of included studies for information about further studies.
The full text of each potentially relevant article was independently assessed by 2 reviewers for inclusion in the review using predetermined eligibility criteria. Studies were eligible for inclusion if they met the following 5 criteria: (1) the study was a parallel-design RCT with at least 30 participants in each treatment group, (2) the population was limited to anemic adults (age ≥ 18 years) with CKD (including hemodialysis-dependent and non-hemodialysis-dependent patients), (3) epoetin (alfa and beta) or darbepoetin was used, (4) the control arm was either a different agent or hemoglobin target, or no active treatment (eg, placebo), and (5) the study reported HQOL using a validated measure, defined as one whose reliability, internal consistency, and responsiveness had been published in the peer-reviewed literature. Because small studies are unlikely to publish long-term outcomes or HQOL, studies with fewer than 30 participants were excluded to improve the efficiency of the literature search. Initial disagreements on study selection were resolved through consensus.
From all included studies, data were extracted on trial characteristics (country, design, sample size, duration of follow-up, objective, score according to the criteria of Jadad et al18 [hereinafter, Jadad score]), participants (age, sex, diabetic status, cardiovascular status, previous hematopoietic hormone use), illness severity (hemoglobin levels, renal function, dialytic modality), therapeutic regimens (type, dose, schedule, route of administration, target hemoglobin level), control regimens and cointerventions, and HQOL outcomes. Trial quality was assessed using the Jadad score.18
Given that the RCTs were conducted over an 18-year period, several different HQOL measures were used by the different studies. Moreover, some studies used disease-specific measures, and some used generic HQOL measures. As such, we first report qualitatively the results of our systematic review.
We planned a priori to combine the results of HQOL measures for trials that compared the use of ESAs targeting either a low/intermediate hemoglobin target level (ie, a hemoglobin level of 9.0-12.0 g/dL) or high hemoglobin target level (ie, a hemoglobin level >12.0 g/dL). The a priori analysis plan was to combine all available HQOL data. The primary outcome was the change from baseline HQOL. Because most HQOL instruments report continuous data, the weighted mean difference (WMD) was, a priori, to be used for each instrument separately. However, owing to the lack of standardized reporting and the fact that few studies used similar HQOL scales aside from the SF-36, only the SF-36 data were suitable for meta-analysis.
For this analysis, each domain of the SF-36 was summarized separately. When combining SF-36 domains for base case analyses, we included information from all studies in which any information for any SF-36 domain was reported. Given that several studies selectively reported information for only 1 domain, we completed 2 sensitivity analyses as follows: (1) we combined data only for studies reporting all domains of the SF-36, and (2) we included all studies in which any SF-36 data were reported; for domains that were either not reported or reported as “P > .05,” we assumed no change over time. If there were multiple time points reported per outcome, we included only the last time point. Owing to the differences expected between trials, we decided a priori to combine results using a random-effects model. Statistical heterogeneity was quantified using the I² statistic. The I² statistic approximates the percentage of total variation (within- and between-study) resulting from between-study variation.19 Where possible, changes in HQOL for each outcome measure were also compared with the minimal clinically important difference (MCID). For the SF-36, we conservatively considered 5 points to be the MCID.20,21
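As an illustration of the pooling step described above, the following is a minimal sketch of a random-effects weighted mean difference with an I² estimate. It assumes the DerSimonian-Laird estimator of between-study variance, which is a common choice but is not named in the text; the function name and the input values in the usage note are hypothetical and are not trial data.

```python
import math


def random_effects_wmd(effects, std_errors):
    """Pool per-study mean differences with a DerSimonian-Laird model.

    effects    -- per-study mean differences (assumed: high minus
                  low/intermediate target arm), one per study
    std_errors -- standard errors of those differences
    Returns (pooled WMD, 95% CI lower, 95% CI upper, I2 in percent).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("effects and std_errors must be non-empty and equal in length")

    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)

    # Cochran's Q measures the observed dispersion around the fixed estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    # I2: share of total variation attributable to between-study variation.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, i2
```

For example, with two hypothetical studies reporting differences of 0.0 and 4.0 points, each with a standard error of 1.0, `random_effects_wmd([0.0, 4.0], [1.0, 1.0])` gives a pooled difference of 2.0 and an I² of 87.5%.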
Figure 1 shows the flowchart for trial selection. From a total of 2289 citations, 231 potentially relevant articles were retrieved for review. Of these, 220 did not meet the selection criteria and were excluded. In particular, 23 excluded articles did not report HQOL, and 3 were excluded because they did not use a validated instrument.7,9,10 There was substantial initial agreement for study inclusion (κ = 0.86). A total of 11 primary articles met all 5 of the inclusion criteria.3,6,8,11,13,22-27 Of these, 10 compared a low/intermediate target level with a high hemoglobin target level, and 1 compared placebo with a target level of 9.5 to 11.0 g/dL and a target level of 11.5 to 13.0 g/dL.22 Although these target levels overlap our definition of low/intermediate (9.0-12.0 g/dL) and high (>12.0 g/dL), we classified the 9.5 to 11.0 g/dL level as low/intermediate and the 11.5 to 13.0 g/dL level as high.
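The agreement statistic reported above can be illustrated with a short sketch. It assumes Cohen's κ for 2 reviewers making include/exclude decisions, which the text does not specify; the function name and the decision lists in the usage note are hypothetical and do not reproduce the review's screening data.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    rater_a, rater_b -- equal-length lists of labels (e.g. 1 = include,
                        0 = exclude), one entry per screened article
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )

    if expected == 1.0:
        # Both raters used a single identical label; agreement is complete.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

With hypothetical decisions `[1, 1, 0, 0]` from both reviewers the function returns 1.0, and with `[1, 1, 0, 0]` against `[1, 0, 1, 0]` it returns 0.0, that is, agreement no better than chance.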
Table 1 summarizes the characteristics of included trials. The trials ranged in size from 78 to 1432 patients. Five trials considered patients with CKD undergoing hemodialysis, whereas 6 considered those with CKD who were not. Health-related quality of life was the specified primary end point in 2 studies.8,22 Other primary end points considered were time to first cardiovascular event or death,3,6,13 change in left ventricular function,23-27 and rate of glomerular filtration rate decline.11 The mean Jadad score was 4, indicating generally high-quality studies. Of note, 3 of the 11 trials were terminated early owing to futility (ie, the perception that use of the high hemoglobin target level was unlikely to have led to better outcomes if the trial were completed). Patients were blinded to the allocated hemoglobin target level in only 2 of 11 studies.24,25
A total of 10 instruments were used to report HQOL in the 11 included trials. An overview of the measures used and the minimal clinically important difference cited for each measure is provided in eTable 2. The most commonly used instrument was the SF-36, administered in 9 of the 11 trials. Reporting of these data was generally poor. There was a wide variety of reporting formats, with some trials reporting baseline data and others reporting only change from baseline. Only 5 trials reported data on all domains and scales measured.3,8,11,22,27
The findings of each trial are summarized in Table 2. Most trials compared the mean change from baseline in the measured HQOL instrument or domain between the treatment groups. For studies comparing low/intermediate and high target level groups, the differences reported between groups were generally small, were noted on a minority of measured domains, and were consistently below the cited minimal clinically important differences (eTable 2). In particular, the magnitude of the between-group differences rarely exceeded the 5-point threshold commonly considered clinically important for the SF-36.20,21
Given that the SF-36 was the most commonly used instrument, we undertook a WMD meta-analysis of each domain, comparing low/intermediate and high target level hemoglobin groups. To calculate a WMD, either both baseline and final scores or an overall mean change is required. As such, only 4 of the 9 trials that measured SF-36 reported useable data (Table 3). However, only the Correction of Hemoglobin and Outcomes in Renal Insufficiency (CHOIR) study group3 reported all 8 domains; the Cardiovascular Risk Reduction in Early Anemia Treatment With Epoetin Beta (CREATE) study group6 published 6 domains, whereas the Anemia Correction in Diabetes (ACORD) study group26 and Parfrey et al25 reported only 1 SF-36 domain. All authors were contacted to obtain additional data; 2 responded and provided unpublished data24,27 (Table 3). Thus, 6 studies, including 3 with complete data on all SF-36 domains, were included in the meta-analysis.
Figure 2 shows the forest plot of all reported SF-36 domain data. The WMDs ranged from −3.0 (favoring the low/intermediate target level arm) in the emotional role domain to 3.2 (favoring the high target level arm) in the vitality domain. A change that would be considered clinically significant was not found in any of the domains (Table 4). Significant heterogeneity was found in the physical role (I² = 68.6%; P = .02), social function (I² = 72.6%; P = .01), and mental health (I² = 70.5%; P = .02) domains.
When only studies reporting all domains of the SF-36 were included, smaller differences between arms were noted, with the low/intermediate target level arm being favored in more domains. When no change was assumed for the unreported domains, the mean change in domain scores was similar to the base case.
Although we found small improvements in HQOL in 4 of 8 SF-36 domains, the changes were well below the threshold for a minimal clinically important difference of a 5-point change (none exceeded 3.2 points on a 100-point scale).21 It is often argued that targeting a high hemoglobin target level would be most likely to have an impact on vitality scores, the domain that captures the sentiment of “having more energy.” However, vitality scores were only 3.2 (95% confidence interval [CI], 1.87-4.44) points higher in the high target level group compared with the low/intermediate target level group. Higher scores were observed in the high hemoglobin target level arm in 3 domains (vitality, physical function, and general health) across all studies. However, again, the improvements were very small and would not be regarded as clinically meaningful (2.9 for physical function, 2.7 for general health, and 3.2 for vitality).
Although improvements in HQOL are commonly cited as a reason to target a higher hemoglobin level, numerous limitations of the existing literature should be noted. Health-related quality of life was not reported in 26 of 37 identified RCTs. Among the 11 RCTs that did measure HQOL, 9 used the SF-36, but only 1 study3 reported full data on the SF-36 (2 additional authors provided full SF-36 data24,27). Given that patient blinding is potentially difficult in this type of study, and that 9 of 11 studies were open label, it is possible that lack of blinding of patients to their assigned group might have contributed to the small changes noted in these measures.
Our findings highlight the importance of distinguishing statistical significance from clinical significance. The minimal clinically important difference is defined as the smallest amount of benefit that patients can recognize and value.28 Thus, a clinically important change is relevant and important to patients, whereas a statistical change is simply a detectable change that may or may not be noticeable to patients. Given that most of these trials were designed to measure differences in clinical outcomes, including survival, these RCTs may have been powered to detect statistically significant changes in measurements that are of no clinical relevance to patients.
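The distinction can be made concrete with a short sketch that applies both tests to the 3 domains in which higher scores were observed in the high target level arm. The WMDs and 95% CIs are those reported in this article, and the 5-point MCID is the threshold used for the SF-36; the function and variable names are illustrative, and treating a 95% CI that excludes 0 as statistical significance is a simplifying assumption of the sketch.

```python
MCID_SF36 = 5.0  # minimal clinically important difference used in this review

# Domain -> (WMD, 95% CI lower, 95% CI upper), as reported in the article.
reported = {
    "physical function": (2.9, 1.3, 4.5),
    "general health": (2.7, 1.3, 4.2),
    "vitality": (3.2, 1.87, 4.44),
}


def classify(wmd, ci_lower, ci_upper, mcid=MCID_SF36):
    """Return (statistically significant, clinically important) flags."""
    # Assumption: significance is read off the 95% CI excluding zero.
    statistically_significant = ci_lower > 0 or ci_upper < 0
    # A change is clinically important only if it reaches the MCID.
    clinically_important = abs(wmd) >= mcid
    return statistically_significant, clinically_important


summary = {domain: classify(*values) for domain, values in reported.items()}

for domain, (significant, important) in summary.items():
    print(f"{domain}: statistically significant={significant}, "
          f"clinically important={important}")
```

Each of the 3 domains is flagged as statistically significant but not clinically important, because the interval excludes 0 while the point estimate remains below 5 points.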
If multiple time points were measured, our analysis considered the latest time point reported. For some domains of the SF-36, particularly vitality, a short-term improvement may be observed at earlier time points, such as 6 or 12 months. However, the clinical relevance of short-term small improvements that are not sustained over longer periods of follow-up is unknown.
The use of ESAs has considerably altered the treatment of patients with anemia related to CKD. Although the initial use of ESAs was aimed at preventing the requirement for transfusion, much of the subsequent research around them has focused on determining whether using them to achieve a high hemoglobin target level is associated with clinical benefit. Despite concern regarding the safety of ESAs when targeted to a high hemoglobin level, some guidelines continue to advocate that hemoglobin target levels in excess of 12.0 g/dL may still be reasonable for selected patients; this seems to be the result of expected improvements in HQOL.2,5 Our analyses do not substantiate the basis for these guidelines. Our results continue to be very relevant because recent reports confirm that nearly 30% of American patients who undergo hemodialysis have a hemoglobin level in excess of 12.0 g/dL.29
There were several strengths and limitations of our study. We systematically reviewed the literature, including contacting manufacturers, to ensure that we identified all relevant RCTs. Despite this, it should be noted that many of the RCTs did not measure HQOL, and among those that did, selective reporting occurred, resulting in an incomplete data set with which to rigorously assess HQOL. To account for this, we conducted 2 sensitivity analyses, the results of which suggest that our base case analysis may overestimate the actual treatment effect.
Our findings and the strength of our conclusions are limited by the available evidence. Given that HQOL is one of the major theoretical benefits of ESA, the selective reporting of certain domains, the variation in instruments used to measure HQOL, and the lack of directly measured data on utility are major weaknesses of the available literature. However, we were able to include some unpublished data supplied by the authors of the original studies, thereby minimizing some of the reporting bias that may be present.
Last, there are methodological limitations with meta-analyses. However, we conducted the review according to a prespecified protocol, used a well-defined comprehensive literature search strategy designed by an expert librarian, performed quality assessment and data extraction with duplicate reviewers, and used rigorous statistical methodology that would be expected to reduce the extent of any bias.
Erythropoietin-stimulating agents are an important aspect of treatment for patients with CKD, especially for those undergoing hemodialysis or who are at risk of requiring blood transfusion without treatment. The cited goal of treatment to a higher hemoglobin target level is often to improve HQOL. However, our systematic review revealed a weak evidence base in support of that strategy, with poorly reported data and only 2 RCTs that examined HQOL as the primary study end point. Our study suggests that targeting a hemoglobin level in excess of 12.0 g/dL leads to small and not clinically meaningful improvements in HQOL. This, in addition to considerable safety concerns, would suggest that targeting treatment to hemoglobin levels that are in the range of 9.0 to 12.0 g/dL is reasonable.
Correspondence: Braden J. Manns, MD, MSc, Department of Nephrology, Foothills Medical Centre, 1403 29th St NW, Calgary, AB T2N 2T9, Canada (firstname.lastname@example.org).
Accepted for Publication: February 7, 2009.
Author Contributions: Study concept and design: Clement, Klarenbach, Tonelli, Johnson, and Manns. Analysis and interpretation of data: Clement, Klarenbach, Tonelli, and Manns. Drafting of the manuscript: Clement and Manns. Critical revision of the manuscript for important intellectual content: Clement, Klarenbach, Tonelli, Johnson, and Manns. Statistical analysis: Clement and Tonelli. Obtained funding: Manns. Administrative, technical, and material support: Clement and Johnson. Study supervision: Klarenbach, Tonelli, and Manns. Methodologic HQOL expertise: Johnson.
Financial Disclosure: None reported.
Funding/Support: Dr Clement is supported by a postdoctoral fellowship award from the Canadian Health Services Research Foundation and from the Alberta Heritage Foundation for Medical Research (AHFMR). Dr Johnson holds a Canada Research Chair in Diabetes Health Outcomes and holds a Health Scholar Award from the AHFMR. Drs Manns and Tonelli are supported by a Canadian Institutes for Health Research (CIHR) New Investigator Award. Drs Tonelli and Klarenbach are supported by Population Health Investigator awards from the AHFMR. Dr Klarenbach is supported by a Scholarship Award from the Kidney Foundation of Canada.
Additional Contributions: Tamara Durec, MLIS, designed the comprehensive literature search strategy. We gratefully acknowledge the researchers who provided unpublished SF-36 data.