eMethods. Additional Methodological Details
eTable. Drug-Biomarker Combinations Evaluated in the Study
Wang B, Canestaro WJ, Choudhry NK. Clinical Evidence Supporting Pharmacogenomic Biomarker Testing Provided in US Food and Drug Administration Drug Labels. JAMA Intern Med. 2014;174(12):1938-1944. doi:10.1001/jamainternmed.2014.5266
Genetic biomarkers that predict a drug’s efficacy or likelihood of toxicity are assuming increasingly important roles in the personalization of pharmacotherapy, but concern exists that evidence that links use of some biomarkers to clinical benefit is insufficient. Nevertheless, information about the use of biomarkers appears in the labels of many prescription drugs, which may add confusion to the clinical decision-making process.
To evaluate the evidence that supports pharmacogenomic biomarker testing in drug labels and how frequently testing is recommended.
Publicly available US Food and Drug Administration databases.
Main Outcomes and Measures
We identified drug labels that described the use of a biomarker and evaluated whether the label contained or referenced convincing evidence of its clinical validity (ie, the ability to predict phenotype) and clinical utility (ie, the ability to improve clinical outcomes) using guidelines published by the Evaluation of Genomic Applications in Practice and Prevention Working Group. We graded the completeness of the citation of supporting studies and determined whether the label recommended incorporation of biomarker test results in therapeutic decision making.
Of the 119 drug-biomarker combinations, only 43 (36.1%) had labels that provided convincing clinical validity evidence, whereas 18 (15.1%) provided convincing evidence of clinical utility. Sixty-one labels (51.3%) made recommendations about how clinical decisions should be based on the results of a biomarker test; 18 (29.5%) of these contained convincing clinical utility data. A full description of supporting studies was included in 13 labels (10.9%).
Conclusions and Relevance
Fewer than one-sixth of drug labels contained or referenced convincing evidence of clinical utility of biomarker testing, whereas more than half made recommendations based on biomarker test results. It may be premature to include biomarker testing recommendations in drug labels when convincing data that link testing to patient outcomes do not exist.
The ability to target and tailor drug therapies based on genetic information has created much hope that personalization of pharmacotherapy will revolutionize health care. This enthusiasm is supported by a vast literature that evaluates a variety of biomarkers with polymorphisms that have predictive significance and can potentially improve a drug’s efficacy or safety profile. On the basis of this literature, the US Food and Drug Administration (FDA) has included pharmacogenomic information in the physician prescribing information (drug labels) of more than 100 drugs. For example, the label for the oral anticoagulant warfarin suggests dosage adjustments based on a patient’s genotype for CYP2C9 and VKORC1, whereas the label for abacavir sulfate, an antiretroviral, recommends avoiding its use in patients who screen positive for the HLA-B*5701 allele.
Prescription drug labels are an important source of information about drug therapies for many health care professionals,1 and the information contained in them also appears in other frequently consulted references, such as UpToDate,2 Micromedex,3 and the Physicians' Desk Reference.4 Because only 1 in 10 US physicians reports being adequately informed about the appropriate use of pharmacogenomic biomarkers,5 the information and recommendations included in labels should be not only evidence based but also directly relevant to clinical decision making. However, despite their inclusion in drug labels, the use of many biomarkers does not appear to be clearly associated with health benefits. For example, although the label for warfarin contains a recommended dosing algorithm based on CYP2C9 and VKORC1 polymorphisms, the clinical utility of genotype-based dosing for this medication (ie, the ability to improve clinical outcomes compared with management without genetic testing6) remains unclear.7-10 Furthermore, inclusion of potentially tenuous recommendations or associations within a drug’s label may encourage health care professionals to order tests or lead them to change therapies based on limited evidence, even if the label does not recommend explicit action based on biomarker status. This outcome is particularly relevant given the increasing cost associated with genetic tests, which is projected to increase at more than twice the rate of overall health care spending and reach $10 billion in the United States by 2015.11,12 We sought to determine the level of evidence that supports the use of pharmacogenomic biomarker testing in drug labels and how frequently their testing is recommended, as well as to examine how completely these supporting studies are cited in the labels.
Because all data came from public sources, this study was not human subject research and did not require IRB review or informed consent. We identified medications that contained biomarker information in their labels from the FDA Table of Pharmacogenomic Biomarkers in Drug Labeling, a publicly accessible database.13 This database contains all FDA-approved drugs with pharmacogenomic information in their labels and is updated on a regular basis by the agency. For each medication, we gathered the earliest available drug label that contained mention of its associated biomarker(s) from Drugs@FDA, another publicly accessible FDA database available through the agency's website that lists regulatory actions, including initial approvals and drug labeling changes.14 We focused on this particular label to evaluate the quality of the cited evidence supporting testing recommendations and accessibility of these supporting studies at the time when physicians could have first become aware of this pharmacogenomic information through the drug label. We could not gather the drug labels from this FDA database for 4 of the drugs in our study (nefazodone, propafenone, protriptyline, and thioridazine) and instead gathered their labels from other databases.15,16 Because information about a single biomarker may have been included in the label of several different drugs and because several drug labels contained information about more than one biomarker, the drug-biomarker combination was our unit of analysis.
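The choice of the drug-biomarker combination as the unit of analysis can be illustrated with a short sketch. The record layout and the function name `to_pairs` are hypothetical and are not taken from the study's actual data files; the example drugs and biomarkers are ones named elsewhere in this article.

```python
# Hypothetical label records: one entry per drug, listing the biomarkers
# mentioned in its label. This layout is an assumption for illustration only.
labels = [
    {"drug": "warfarin", "biomarkers": ["CYP2C9", "VKORC1"]},  # one label, two biomarkers
    {"drug": "cetuximab", "biomarkers": ["EGFR"]},             # one biomarker shared...
    {"drug": "panitumumab", "biomarkers": ["EGFR"]},           # ...by two drugs
]


def to_pairs(records):
    """Expand label records into unique (drug, biomarker) combinations."""
    pairs = set()
    for record in records:
        for biomarker in record["biomarkers"]:
            pairs.add((record["drug"], biomarker))
    return sorted(pairs)


pairs = to_pairs(labels)
n_drugs = len({drug for drug, _ in pairs})
n_biomarkers = len({biomarker for _, biomarker in pairs})

print(len(pairs), "combinations from", n_drugs, "drugs and", n_biomarkers, "biomarkers")
```

Counting pairs rather than drugs or biomarkers means that a label with several biomarkers, or a biomarker that appears in several labels, contributes one observation per combination.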
Two of the authors (B.W. and W.J.C.) independently evaluated the label(s) for the quality of cited clinical evidence, the completeness of the citation of supporting studies, and the presence of recommendation for use of biomarker testing, with disagreements resolved by consensus (eMethods in the Supplement).
Using guidelines from the Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Working Group,17 we categorized the robustness of evidence for each biomarker to support its clinical validity (ie, the ability to predict phenotype) and clinical utility (ie, the improvement of clinical outcomes as a result of the use of the biomarker compared with management without genetic testing) as convincing, adequate, or incomplete (based on the criteria that the EGAPP Working Group uses to define data as inadequate) (Table 1). The EGAPP Working Group was formed by the Centers for Disease Control and Prevention and is charged with establishing a systematic, evidence-based approach to assessing genetic tests and other applications of genomic technology. To mirror how practicing health care professionals are likely to interpret the evidence cited within a label, we graded the quality of evidence that supports a biomarker’s clinical validity and utility based on the information presented within the drug labels and evaluated other relevant studies only when a citation was included in the label, allowing for their location on the PubMed/MEDLINE database.18 We did not gather or evaluate other sources of information physicians may use to inform their practice.
If the drug label contained evidence that addressed the clinical validity of a drug-biomarker association, including pharmacokinetic studies, but provided an insufficient description and citation to identify its design, we rated the evidence of clinical validity as incomplete. For example, we considered a label for drug X that listed clinical studies as having demonstrated a decreased hemoglobin level in patients with G6PD deficiency but that did not provide or cite information to identify the study design as incomplete evidence for clinical validity of the drug X–G6PD association. We also rated the evidence of clinical validity as incomplete if the drug label did not contain or cite any evidence that addressed the clinical validity of the drug-biomarker association.
For evidence in drug labels to support the clinical utility of biomarker testing, we required that it demonstrated improved patient outcomes when biomarker testing was incorporated into treatment decisions compared with when no testing was performed. For targeted therapies in which the medication was developed specifically to interfere with or manipulate a certain biomarker to have its intended effect (eg, trastuzumab was developed specifically to target the ERBB2 [formerly HER2 or HER2/neu] biomarker), we did not require the presence of a nontesting group for the evidence to support clinical utility but instead focused our assessment on studies that evaluated the drug’s efficacy in the intended patient population (ie, those who screened positive for the targeted biomarker). We erred on the side of considering evidence as convincing when characterizing the quality of a particular study design or when determining whether the evidence addressed the clinical validity and/or utility for each drug-biomarker combination (Table 1).
We graded the completeness of citation of supporting studies in drug labels as full, partial, or none. A grade of full was conferred for labels that included a sufficient citation of supporting studies in a separate references section or by name (eg, PREDICT-1 study for abacavir) such that the studies can be located in the PubMed/MEDLINE database. A grade of partial was given for labels that did not include citations for supporting studies but described the study design and/or results. For clinical validity, these results include variations in levels of a drug based on status of the associated biomarker. A grade of none was reserved for labels that made no description of the supporting studies. When the completeness of description of supporting studies in drug labels differed between clinical validity and clinical utility, we used the higher grade for our analysis.
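The grading rule described above can be sketched in a few lines. The type and function names (`Completeness`, `grade_citation`, `label_grade`) and the two boolean inputs are hypothetical simplifications of the reviewers' judgment, not the study's actual instrument.

```python
from enum import IntEnum


class Completeness(IntEnum):
    """Ordered grades, so the higher grade can be chosen with max()."""
    NONE = 0
    PARTIAL = 1
    FULL = 2


def grade_citation(locatable_citation: bool, describes_design_or_results: bool) -> Completeness:
    """Grade one evidence domain (clinical validity or clinical utility).

    locatable_citation: the label cites supporting studies in a references
        section or by name, so that they can be found in PubMed/MEDLINE.
    describes_design_or_results: the label describes the study design
        and/or results without giving such a citation.
    """
    if locatable_citation:
        return Completeness.FULL
    if describes_design_or_results:
        return Completeness.PARTIAL
    return Completeness.NONE


def label_grade(validity: Completeness, utility: Completeness) -> Completeness:
    """When the two domains differ, the higher grade is used for analysis."""
    return max(validity, utility)


# Example: validity evidence is only described, utility evidence is cited by name.
example = label_grade(grade_citation(False, True), grade_citation(True, False))
print(example.name)
```

Encoding the grades as an ordered enumeration keeps the "use the higher grade" rule explicit and avoids string comparisons.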
We evaluated whether the label recommended incorporation of the biomarker test result in therapeutic decision making. These recommendations were categorized as being based directly on a drug's mechanism of action, as indicated by mention in the Indications or Mechanism of Action sections of the drug label (eg, trastuzumab's targeting of ERBB2 overexpression or busulfan's treatment of chronic myelogenous leukemia, a condition characterized by a translocation in the Philadelphia chromosome biomarker, were interpreted as implicit recommendations for biomarker testing before treatment initiation), being based on drug-biomarker associations (dosage adjustment, contraindication or avoidance, and follow-up laboratory testing), or being absent. A recommendation was considered to be present even when it advocated for alterations in therapy of another drug taken concurrently as a result of drug-drug interactions as long as the interaction was due to polymorphisms in the given biomarker.
We used descriptive statistics to summarize the supporting evidence and recommendations contained in the drug labels for each drug-biomarker combination. We then performed a Fisher exact test to determine whether biomarkers for targeted therapies (ie, medications developed specifically to interfere with or manipulate a biomarker to have its intended effect) demonstrated a different proportion of convincing data that supported clinical utility compared with nontargeted therapies and to compare the quality of cited evidence in labels with and without testing recommendations. Because numerous cancer agents were specifically developed to be targeted therapies, whereas most neuropsychiatric drug-biomarker combinations are discovered after approval and are related to adverse events, we performed a prespecified Fisher exact test to determine whether biomarkers for oncology drugs and neuropsychiatry drugs demonstrated a different proportion of convincing evidence supporting clinical utility in their labels compared with other biomarkers.
We identified 119 drug-biomarker combinations, representing 107 drugs and 39 unique biomarkers (eTable in the Supplement). Most of these drug-biomarker combinations (75 [63.0%]) are intended to reduce the occurrence of adverse drug events, whereas the remainder (44 [37.0%]) relate to the drugs’ efficacy. The most common clinical specialties covered by these biomarkers were oncology (37 [31.1%]), neuropsychiatry (33 [27.7%]), gastroenterology (9 [7.6%]), infectious disease (9 [7.6%]), and cardiovascular disease (8 [6.7%]) (Table 2).
Forty-three drug labels (36.1%) provided convincing evidence of the clinical validity of the biomarker (ie, the ability of the biomarker to predict the phenotype of interest), whereas 18 (15.1%) provided convincing evidence that the use of the biomarker has clinical utility (ie, the biomarker’s ability to improve clinical outcomes). Table 3 contains examples of our grading approach. Seventy-six drug labels (63.9%) did not provide convincing evidence of clinical validity or utility. Biomarkers for cancer drugs were much more likely to demonstrate convincing evidence supporting clinical utility in their labels compared with all other biomarkers (14 of 37 [37.8%] vs 4 of 82 [4.9%], P < .001), whereas neuropsychiatry biomarkers were less likely to demonstrate convincing clinical utility evidence in their labels than the remaining biomarkers (0 of 33 vs 18 of 86 [20.9%], P = .003). Targeted therapies consisted mainly of oncology drugs (26 of 34 [76.5%]). Seventeen targeted therapies (50.0%) contained convincing data that supported clinical utility in their labels compared with 1 of 85 (1.2%) nontargeted therapies (P < .001). Abacavir, whose label recommends it not be used in patients who screen positive for the HLA-B*5701 allele, was the only nontargeted therapy whose biomarker was supported by convincing clinical utility evidence.
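The Fisher exact comparisons can be re-derived from the counts reported above. The following is a minimal sketch that assumes a two-sided test defined as the sum of the probabilities of all 2x2 tables (with fixed margins) that are no more likely than the observed one; the article does not state which two-sided convention or software was used, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups being compared; columns are
    (convincing evidence, no convincing evidence).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x "convincing" labels in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so that ties with the observed table are included.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Targeted (17 of 34) vs nontargeted (1 of 85) therapies.
p_targeted = fisher_exact_two_sided(17, 17, 1, 84)
# Oncology (14 of 37) vs all other biomarkers (4 of 82).
p_oncology = fisher_exact_two_sided(14, 23, 4, 78)

print(f"targeted vs nontargeted: P = {p_targeted:.2e}")
print(f"oncology vs other:       P = {p_oncology:.2e}")
```

Working with exact integer binomial coefficients from the standard library avoids any dependence on a statistics package; the same function can be applied to the other comparisons in this section by substituting their counts.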
Thirteen labels (10.9%) contained a full citation of supporting studies, whereas 36 (30.3%) made no mention of the scientific literature that supported the biomarker it discussed.
Sixty-one labels (51.3%) made recommendations about how clinical decisions should be based on the results of a biomarker test, with 34 (28.6%) being based directly on the drug's mechanism of action and 27 (22.7%) being based on drug-biomarker associations. Among biomarkers with neither convincing clinical utility nor validity data, 24 of 76 labels (31.6%) still contained testing recommendations. The drug label for the psychotropic drug iloperidone, for instance, recommended consideration of dose adjustments based on patients' CYP2D6 status despite lack of clinical utility and validity data. Similarly, among labels that made recommendations, only 18 of 61 (29.5%) provided convincing clinical utility data. Labels that made testing recommendations were more likely to provide convincing clinical utility data compared with labels that made no recommendations (18 [29.5%] vs 0 [0%], P < .001).
Overall, our analysis revealed deficiencies in the evidence provided in drug labels that supports the use of many pharmacogenomic biomarkers, with fewer than one-sixth of labels containing or citing convincing evidence for clinical utility and almost two-thirds even lacking convincing data for clinical validity. This deficiency was especially prominent among biomarkers of nontargeted therapies, which constituted more than 70% of our study sample.
The limited amount of convincing evidence for the clinical utility of pharmacogenomic biomarkers in the labels we reviewed is perhaps not surprising given the difficulty of demonstrating that a test alters clinical outcomes rather than simply predicts a disorder or phenotype. Nevertheless, the primary goal and potential of pharmacogenomics and personalized medicine is to change patient outcomes. In our opinion, it is premature to include testing recommendations in labels when such utility data are neither described nor cited in this resource, even if they exist elsewhere. It could be argued that high-quality data indicating that individuals with certain polymorphisms metabolize a drug differently may be relevant to patient care and therefore should be included in drug labels as the basis for biomarker testing. However, other than predictive value, the inclusion of this type of information does not provide guidance as to what health care professionals should actually do when they get the test results. The case of CYP2C19 testing for clopidogrel illustrates this well. Polymorphisms of this gene are associated with poorer drug metabolism and a higher risk of thrombotic events, prompting testing recommendations for these variants to be added to the drug’s label in 2010. However, the consequences of the clinical actions that might rationally be taken based on this information have still not been convincingly evaluated. If this drug’s dose were increased or an alternative drug with a less favorable risk-benefit profile were chosen on the basis of what is actually an uninformative biomarker result, it may lead to worse treatment outcomes while contributing to increasing expenditures at a time when our health care system is least able to afford it.
As a result, we believe that testing recommendations supported by clinical validity alone add confusion, not clarity, to the clinical decision-making process, especially if the evidence is not clearly explicated or cited alongside the guidance. There may also be legal implications regarding the inclusion of testing recommendations in drug labels. Could prescribers be liable for adverse events that may have been predicted by biomarker testing, regardless of the evidence base for the recommendations? If not, at what level of evidence should the reasonableness standard to test apply?
A multipronged approach should be used to address the current situation. At minimum, an explicit statement about the quality of clinical utility evidence for each testing recommendation should be presented in the labels rather than charging health care professionals with the task of extrapolating quality based on the data presented. Alongside this statement should be complete references to help patients and health care professionals easily access the full studies for further assessment. More stringently, the FDA could issue regulations to only include information about pharmacogenomic biomarkers if compelling clinical utility information has been generated, although exceptions should be made for drug-biomarker combinations associated with efficacy or safety end points of particular significance, such as clopidogrel and CYP2C19.
The perceived lack of incentive for biomarker test developers and pharmaceutical manufacturers to conduct robust studies that describe the clinical utility of biomarker testing, which may demonstrate the lack of need for biomarker testing or reduce the number of patients eligible for a particular drug, poses another challenge. However, establishing the clinical utility of a biomarker to target a particular high-risk patient population with high-quality studies may increase drug sales and biomarker test orders, as was the case with abacavir.19,20 To provide further incentive, the FDA could waive user fees and/or prioritize the review of a subsequent drug application for manufacturers who conduct robust clinical utility trials in support of their approval applications. In addition, governmental agencies can directly support trials that investigate drug-biomarker combinations of particular clinical significance, as has been done for drugs for which manufacturers have lacked specific incentive to do so.21
In the meantime, physicians and other prescribers should be aware of the relative lack of evidence to support many treatment recommendations pertaining to biomarkers that are contained within the labels and should instead scrutinize the primary literature that supports these recommendations before taking clinical action. Our finding that more than two-thirds of labels that make testing recommendations do not contain convincing evidence to support clinical utility of the biomarker further reinforces the need for skepticism.
Our study has several limitations. For each medication, we evaluated the earliest accessible label that contained mention of each associated biomarker's polymorphisms to assess the evidence available to physicians at the time the biomarker was first included in the label. The initial labels of some of the drugs in our study were not accessible, which may have resulted in our evaluation of subsequent labels with updated pharmacogenomic evidence and recommendations. This discrepancy may have resulted in an overestimate of the strength of the cited evidence. Moreover, studies that found a lack of clinical validity or utility of drug-biomarker associations may not be included in the drug labels, which are drafted by manufacturers. However, because the FDA reviews and regulates the content of the labels, it should still require that major studies of such nature be included in subsequent labeling revisions. In addition, we assessed the quality of clinical evidence only of information presented within the drug labels and additional full studies adequately referenced in these documents; we did not evaluate other sources of evidence, including additional primary literature or FDA medical reviews. Our rationale for doing this was 2-fold: drug labels should be self-sufficient in presenting time-constrained prescribers with the evidence base and references that support incorporation of biomarker testing into clinical practice, and the content presented in the labels is often used to inform other widely used tertiary-level drug information resources. The results of our analysis are consistent with the 3 systematic reviews of drug-biomarker combinations conducted by the EGAPP Working Group, whose evaluative framework we applied: selective serotonin reuptake inhibitors and CYP450s,22 UGT1A1 and irinotecan,23 and EGFR and cetuximab and panitumumab.24
In addition, although the FDA has generally required less robust evidence for approval of oncology drugs with the goal of accelerating access for patients, we applied the same methods in our evaluation of labels for biomarkers associated with these drugs to generate a standardized description of the evidence base available. Attention to the wide variation of clinical trial evidence submitted to the FDA to support the successful approval of novel agents,25 coupled with calls for more rigorous oncology trials to better reflect developments in treatment paradigms and clinical outcomes of different cancer types,26 further supports the need to take a uniform approach to understanding the existing scientific landscape. Despite the decreased rigor required for approval of certain cancer medications, our analysis revealed that biomarkers for oncology drugs still had a much higher level of convincing evidence base in their labels than biomarkers for drugs that treat other conditions. Future discussions and analyses should explore whether the current EGAPP Working Group rating criteria should be customized for different types of diseases or different prevalence of adverse effects. For example, drug-biomarker combinations that are designed to address rare adverse effects may be better evaluated with well-designed observational studies to generate evidence of clinical utility rather than relying exclusively on randomized clinical trials.
There is reason to be enthusiastic about the potential of biomarkers to enhance clinical care, and our analysis identified many examples of tests that have convincing evidence of their ability to meaningfully improve health care outcomes in patients with common conditions. However, other less evidence-based labeling recommendations highlight the need for clearer guidance on their optimal use. Until this problem is addressed, physicians are left with the challenging task of navigating a sea of guidance with varying foundations of clinical support in pursuit of practicing clinically sound and cost-conscious medicine, a challenge that will likely increase in cadence with the growth of the pharmacogenomics field.
Accepted for Publication: July 29, 2014.
Corresponding Author: Niteesh K. Choudhry, MD, PhD, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont St, Ste 3030, Boston, MA 02120 (firstname.lastname@example.org).
Published Online: October 13, 2014. doi:10.1001/jamainternmed.2014.5266.
Author Contributions: Dr Choudhry had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Wang.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Wang, Canestaro.
Administrative, technical, or material support: Wang.
Study supervision: Choudhry.
Conflict of Interest Disclosures: None reported.