Defining The Balance Of Risk And Benefit In The Era Of Genomics And Proteomics
The ability to measure the function of genes and proteins has spawned the construct of personalized medicine, in which patients' own risks and preferences are used to choose diagnostic and therapeutic strategies. The complexity of clinical data required to guide personalized medicine calls for improvements in our system of clinical research, including (1) overhauling it to produce networks that can do adequate-size pragmatic trials; (2) synchronization of regulatory and payment systems to encourage adequate studies; and (3) an investment in education of providers and patients to improve the understanding of the probabilistic predictions forming the basis of personalized medicine.
Health care is caught between two unremitting forces: a growing array of new technologies with associated expectations about effects on life expectancy and freedom from disability, and the constantly increasing cost of delivering these technologies.1 Nobody wants to ration highly effective technology as a means of controlling costs. The preferred approach is to refrain from using ineffective technology, concentrating our spending on technology that improves longevity, quality of life, or productivity, coupled with a delivery system that maximizes the benefit and minimizes the risk to patients. As methods of measuring the benefit and risk of therapeutics have improved, the argument has shifted to a consideration of how a therapy's marginal benefit/risk balance relates to its cost relative to other alternatives. The concept of personalized, prospective medicine improves the potential for successful, sophisticated evaluation of the balance of risk and benefit because it embodies the belief that technology is not simply effective or ineffective, but that it is likely to be more effective in some people and less effective or even harmful in others.2 In theory, the attributes of technology can be tailored to the needs of the individual.3 The anticipated effect of personalized medicine also fits with the increasing belief that people should be responsible for their own decision making to the greatest extent possible; if they are given information about the benefits and risks of medical products tailored to their biology and their preferences, one could reasonably hope that they would make more rational health care choices. As a consequence of recent technological advances, chronic disease increasingly drives health care costs and decisions. The construct of personalized health care is built around prevention of the consequences of chronic disease.
The belief has been promulgated that if one can understand the composition of one's genes, proteins, and metabolites, in the era of sophisticated molecular imaging, knowledge of the presence of disease or propensity for its development, its prevention, and its treatment will be individualized and much more potent.4 Although this concept is alluring, predicting the impact of an intervention over many years is a different endeavor than measuring its short-term effect. The large increase in the combinations of personal characteristics and associated treatments will increase rather than decrease the need for clinical outcome data to determine whether potential variations in preventive efforts or treatment produce good or bad health outcomes. Indeed, our current knowledge about generalizable principles for assessing technologies and therapies suggests that we need to improve the system of evaluation of medical technologies (drugs, devices, and procedures) in three critical ways: (1) increase the amount of outcome-based clinical data from comparative randomized trials; (2) develop more harmonious approaches that encourage and incentivize several steps in approval and labeling of technologies for chronic disease; and (3) invest in educating providers, payers, and consumers to better understand the probabilistic nature of therapeutic decisions. In essence, as outlined more than a decade ago, personalized medicine means more need for technology assessment, not less.5
To a large extent, the current system of medical product approval by the Food and Drug Administration (FDA) and much of medical marketing are built upon an antiquated view of clinical therapeutics. This perspective holds that since the rational development of medical products is based on identification and characterization of biological targets derived from pathophysiologic reasoning, the health consequences of using those products should be predictable based on similar reasoning. In theory, given targets that are within a disease pathway and that can be modified, therapies that modify those targets can be developed and tested in specific subgroups, concentrating the power of the treatment effect on those with the most to gain. The therapy's effect on biomarkers can be measured quickly with relatively small sample sizes; one can prove, for example, whether a drug lowers blood glucose, LDL cholesterol, or blood pressure or improves a psychological scale of depression in hundreds rather than thousands of patients. Based on pathophysiological reasoning and a few simplifying assumptions, one can create a rational process for estimating which treatments are most effective for specific "types" of patients identified through clinical or genetic characteristics and then for developing progressively more efficient ways to deliver therapies to them. As our knowledge of the evidence needed to understand the balance of risks and benefits with technology has advanced, the need to reexamine this construct has become apparent. Of course, many technical business issues come into play when a decision is made to develop a drug or device for marketing, including intellectual property rights, market size, competing products, pricing controls, and distribution channels.
If these issues are resolved, as payers and consumers become increasingly sophisticated and our knowledge of the principles of therapeutics improves, the industry has three hurdles to consider in planning research and development: the FDA's decision to allow a product on the market, payers' decisions to provide coverage, and the beliefs of consumers/patients and providers about whether the product should be used, given that it is available and covered.6
Eleven principles. In a recent review, a series of lessons from cardiovascular clinical trials pointed out the complexity of providing convincing evidence about personalized medicine.7 These lessons are summarized as follows: (1) Treatment effects are modest; (2) qualitative interactions are uncommon, but quantitative interactions are usual; (3) unintended targets are common; (4) interactions are unpredictable; (5) long-term effects deserve evaluation; (6) class effects can be uncertain; (7) most therapies produce a combination of helpful and harmful effects; (8) most beneficial therapies do not save money, but they are incrementally cost-effective; (9) applying the results of clinical trials is beneficial; (10) some areas of cardiovascular medicine are underserved; and (11) participation is imperative.8 The case of hormone replacement therapy (HRT) has been instructive in this regard. After years of increasingly strong recommendations for women to use HRT, the world was startled by the revelation that the most commonly prescribed combination of estrogen and progestin not only failed to protect women from vascular disease but actually increased the risk of vascular events.9 Perhaps more surprising to veteran clinical trialists was the demonstration that HRT did not improve quality of life except in the relatively small cadre of women with active hot flashes. 
Despite the widespread belief to the contrary and many (likely selectively published) clinical trials with small sample sizes, two adequate-size trials showed HRT to have no greater effect than placebo on broad and specific measures of quality of life, including cognitive functioning; most recently, a modest detrimental effect on cognitive functioning was documented.10 Despite evaluation of multiple subgroups, based on both biological and clinical findings, the investigators failed to find a subgroup with benefit for cardiovascular risk, but they did identify women with active hot flashes as benefiting with regard to quality of life. Because most technologies, particularly drugs, affect both intended and unintended biological targets, it is hazardous to depend on biological measurements to reliably predict clinical outcomes.11 A litany of therapies showing a beneficial effect on one aspect of disease-related biology have turned out to be neutral or detrimental (such as HRT, troglitazone, cisapride) when overall effects on risk and benefit were measured; in other cases, the long-term benefits of treatment far outstripped the predicted benefit, which was based on biological insight (such as statins, angiotensin-converting enzyme, or ACE, inhibitors, aspirin).12 Importantly, these findings do not detract from the importance of biomarkers to enhance our understanding of biology; treatments that lower blood pressure will almost always reduce stroke, and those that reduce LDL cholesterol will almost always reduce atherosclerosis. It is not usually the intended pathways that create the problem with surrogates in predicting the integrated balance of risk and benefit. Instead, the culprits are the unintended or previously unknown biological pathways.
Modest effect on outcomes. While rare cases of major curative technologies can be cited, for the most part new therapies add incrementally if at all to the clinical outcome of patients with, or at risk of developing, disease.
Since treatment effects on clinical outcomes are generally quite modest (less than 25 percent event-rate reductions on a relative scale), the sample sizes required to measure them are much larger than traditionally believed, and these relative benefits translate into small effects on an absolute scale.13 This issue is exacerbated because common chronic diseases increasingly have many available therapies, making small, marginal differences highly likely. Because of the treachery of the play of chance, there is simply no substitute for enrolling adequate numbers of patients into randomized controlled trials to give reliable, reproducible answers about a technology's health outcomes. A prevalent approach to these issues has been to assume that breakthroughs in our understanding of biology will allow us to segregate patients with (or at risk of) a disease into two groups: those who will and those who will not benefit from a new therapy. The promise of genomics and proteomics is largely built on the theory that these technologies will enable us to judge the risks and benefits of therapies in much smaller, focused groups of patients.
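The sample-size argument in this passage can be made concrete with a minimal sketch, which is not taken from the article. It uses the standard normal-approximation formula for comparing two event proportions; the 10 percent control event rate, the two-sided alpha of 0.05, the 90 percent power, and the function name patients_per_arm are all illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(control_rate, relative_reduction, alpha=0.05, power=0.90):
    """Approximate patients needed in each arm of a two-arm trial.

    Normal-approximation formula for comparing two event proportions.
    All default values are illustrative assumptions, not figures from the article.
    """
    p_control = control_rate
    p_treated = control_rate * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    difference = p_control - p_treated             # absolute risk reduction
    return ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


# Hypothetical scenario: 10 percent of control patients have an event.
for reduction in (0.50, 0.25, 0.10):
    absolute = 0.10 * reduction
    n = patients_per_arm(0.10, reduction)
    print(f"relative reduction {reduction:.0%}: "
          f"absolute reduction {absolute:.1%}, about {n} patients per arm")
```

Under these assumptions, a 25 percent relative reduction corresponds to an absolute reduction of only 2.5 percentage points and calls for a few thousand patients per arm, and the requirement grows steeply as the relative effect shrinks.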
While a few notable successes have occurred, progress has been slow, and many heralded advances have not been replicated in validation studies.14 Indeed, a high proportion of genomic findings about disease susceptibility have been false positive, and given that most subgroup findings in clinical trials have been false positive, the use of multiple subgroups based on genomic or proteomic findings is cause for great concern.15 In two recent examples that could portend the future of personalized medicine, subgroups based on clinical characteristics have been used to make major decisions regarding coverage by the Centers for Medicare and Medicaid Services (CMS).16 In one case, implantable cardioverter defibrillators were funded by the CMS only if the duration of the QRS complex on the electrocardiogram (a measure of the duration of depolarization of the left ventricle) exceeded 120 milliseconds; in the other case, lung volume reduction surgery (LVRS) was funded only for patients with particular clinical characteristics. The latter case is particularly interesting because the CMS actually funded the multicenter clinical trial in a laudable effort to evaluate the technology in the interest of patients covered by the CMS. In both of these cases, there is little doubt that adequate randomized trials gave us much more confidence about who should be treated (and, accordingly, who should be reimbursed for treating them). However, an important question is raised by these examples: Would the subgroup findings be replicated if the trials were repeated? Without large, confirmatory studies, we run a significant risk of making policy decisions based on findings that occurred simply through random variation.
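The concern about subgroup findings arising through random variation can be illustrated with a short calculation, which is not taken from the article. The sketch assumes that each subgroup is tested independently at the same significance level and that the therapy truly has no differential effect in any subgroup; the function names and the subgroup counts are hypothetical.

```python
def chance_of_spurious_finding(n_subgroups, alpha=0.05):
    """Probability that at least one of n independent subgroup tests is
    'significant' by chance alone when no true subgroup effect exists."""
    return 1 - (1 - alpha) ** n_subgroups


def bonferroni_threshold(n_subgroups, alpha=0.05):
    """Per-test significance level that keeps the overall chance of any
    spurious finding at or below alpha (a deliberately conservative rule)."""
    return alpha / n_subgroups


# Hypothetical numbers of subgroups examined in a single trial.
for k in (1, 5, 10, 20, 50):
    print(f"{k:>2} subgroups: "
          f"chance of a spurious finding {chance_of_spurious_finding(k):.0%}, "
          f"Bonferroni per-test level {bonferroni_threshold(k):.4f}")
```

Under these simplifying assumptions, the chance of at least one spurious "significant" subgroup rises quickly with the number of subgroups examined, which is why replication in an independent sample matters before a subgroup result drives a coverage decision.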
Given the residual uncertainty with these relatively simple examples, if multiple genomic or proteomic variations are found that could explain clinical response variability, the amount of clinical data required to validate and assure the findings will likely increase by more than an order of magnitude.
Similar effects. Another critical theory about therapeutics is that drugs within the same "class" essentially have the same effect. This approach allows the shortcut of developing follow-on drugs in the same class as the first that can be presumed to have the same effect on the clinical outcomes of interest. Unfortunately, multiple examples dispel the notion that we can simply accept the class effect without more detailed evaluation.17 There are two basic reasons for this finding: First, different molecules aimed at the same target can have different effects on that target and on other targets that could alter the efficacy of the drug; second, the differences in structure and function of different molecules could alter the balance of risk and benefit through either greater or lesser toxic effects. In addition to antibiotics and thiazide diuretics, beta-blocking drugs and statins are among the oldest classes of dramatically beneficial drugs. Except for their obvious effect in blocking beta receptors and inhibiting the pathway involving HMG-CoA reductase in multiple tissues, the biology of the human response to beta blockade and statins is complex and poorly understood.
However, not all of the trials comparing beta-blockers with placebo have shown a reduction in mortality; a large trial funded by the National Institutes of Health (NIH) failed to demonstrate a reduction in mortality with bucindolol.18 Most recently, a head-to-head trial of two beta-blockers, both of which had been shown independently to reduce the risk of death, showed carvedilol to be superior to short-acting metoprolol in reducing mortality.19 Despite the fact that one statin was removed from the market because of an unacceptably high rate of rhabdomyolysis (muscle breakdown that can cause kidney damage) in the absence of evidence of a superior effect on clinical outcomes, no head-to-head comparison of the statins has been completed.20
Assessment of therapies in combination. The basis of the development of each medical product has been an assessment of the effect of each therapy in isolation, and this "purified" assessment is obviously critical. However, we know that when two therapies are combined, the total effect is not easily predictable, yet the number of drugs taken concurrently by patients continues to expand. Sorting out whether combinations are additive, synergistic, neutral, or detrimental requires sample sizes more than double the number required to determine the effect of an individual treatment. This issue will become especially important as drug/device combinations become more common and as the industry begins to package multiple drugs in the same pill to deal with people's difficulty with taking multiple pills.21
Long-term effects. In the current state of therapeutics, many therapies are administered for decades. In the future, as individual disease susceptibility is defined by genomics and proteomics, therapies might be given for a lifetime. We now have many examples of surprising beneficial or detrimental effects in the long term that were not predicted based on short-term results.
The recent reports of a reduction in prostate cancer by finasteride, balanced by an increase in the severity of the cancer in those who develop it, point out both the importance of and uncertainty about the long-term effects of drugs taken for chronic disease.22 In the case of nonsteroidal anti-inflammatory drugs (NSAIDs), it remains unclear whether the long-term effects are beneficial or detrimental and whether different NSAIDs or COX-2 inhibitors might yield different long-term outcomes.23
Development of imaging. One of the most dramatic effects of our national investment in technology has been the proliferation of imaging technology and diagnostic tests aimed at stratifying risk, which is a critical element of the strategy of personalized medicine. Intuitively, better and more accurate tests should be beneficial. At least in the short run, however, the story is not so clear. More accurate diagnostic tests that lead to ill-advised intervention can be both expensive and detrimental to health.24 The use of anti-arrhythmic drugs based on Holter monitoring provides a classical example; the Holter monitor provided an accurate report of ventricular dysrhythmias, but the drugs used to treat the dysrhythmias increased the risk of death rather than decreasing it.25 Similar issues have been raised with the Swan-Ganz catheter, which measures intracardiac pressures and cardiac output but could lead to increased mortality because of the excessive use of inotropic drugs in monitored patients.26 As we enter an era of risk stratification using genomic and proteomic arrays as well as high-resolution imaging using multiarray computed tomography (CT) and magnetic resonance imaging (MRI), often combined with molecular imaging, appropriate methods for understanding the downstream effects of such powerful technology have not yet been developed to a refined state.
Complexity of decision making.
The complexity of decision making in an environment of multiple effective therapies with staggering costs and potential downstream effectiveness or harm should not be underestimated. Studies of human decision making have underscored a point known to marketers for years: Decision making is influenced, and in many cases dominated, by factors other than a rational weighing of the probability that decisions will be aligned with stated values.27 Yet little effort has been expended to ensure that decisionmakers are able to grasp scientific results. Labels on medical products contain far more information than providers can process, and common communication strategies do a poor job of portraying the balance of risk and benefit in an unbiased manner.28 A large segment of the provider community has poor numeracy skills, as do even highly educated patients.29
The essential message when the promise of personalized medicine is combined with our improved understanding of principles of therapeutics is as follows: If our goal is to offer patients/consumers choices for preventing and treating chronic disease that are tailored to their individual needs based on the best scientific assessment of the value of new therapies, our current system of technology evaluation is outdated and inadequate. In fact, instead of decreasing the need for large clinical outcome studies, the demands of personalized medicine will further exacerbate the current state of affairs in which science outdistances policy. To provide the data needed for rational choices based on genetic, proteomic, and clinical data, the FDA, payers, and consumers can no longer be regarded as relatively independent links in the chain of therapeutics; efforts must be made to join them without sacrificing the ingenuity and creativity that have been the hallmark of American medicine. The balance of risk and benefit of therapies that have been marketed based on biomarker findings will need to be confirmed or refuted by adequately powered clinical outcome trials. Subgroup findings will require replication in independent samples. Head-to-head comparisons will be needed to discern which therapies truly have a more favorable balance of risk and benefit, and these comparisons will require large sample sizes evaluated over prolonged time periods.
Overhauling U.S. clinical research. To meet this critical need, as discussed in detail by others, we need to overhaul our national system of clinical research.
Recent deliberations of the Clinical Research Roundtable have pointed out the shortage of personnel, funding, and efficient systems to deliver research results that can inform the choices that individuals and delivery systems must make among technologies.30 The NIH has put considerable energy into a "Roadmap" that promises to reengineer the clinical research infrastructure to develop large networks of providers, patients, and families who have the education and the capacity to increase, by an order of magnitude, the amount of research being done.31 Such an improvement in efficiency will only be achieved if these networks use common data standards and nomenclature via electronic data exchange networks that allow interoperability of clinical, billing, and research data, to reduce cost per unit of research data.32
Need for pragmatic trials. To obtain the kind of evidence that modern clinical research demands to reliably support decisions, many more trials aimed at providing generalizable comparative clinical outcome results will be needed. The funding for these "pragmatic" trials now is woefully inadequate, as opposed to "explanatory" trials intended to elucidate pathophysiological principles.33 Regulatory and payment systems must be better synchronized so that the discovery and implementation of strategies that truly improve health are adequately rewarded, while technologies and strategies that produce harm or inadequate benefit compared with alternatives are weeded out. Given the increasing dominance of chronic disease, these systems need to consider mechanisms that will encourage early marketing of technology (to produce incentives for continued development) followed by ongoing assessment (to ensure accurate depiction of true effects). One example of such an approach should be the increased use of provisional approvals for chronic disease therapies.
The first phase of approval might be granted based on biomarkers and preliminary evidence of safety and efficacy, but it might be accompanied by the requirement to perform longer-term studies with clinical outcome measures as endpoints. Such a system is in place for priority drugs that provide hope for a disease that has no effective treatment, but it might stimulate more appropriate technology development as a more routine system. Some have argued that postmarketing surveillance will be adequate to understand the longer-term implications of therapeutic choices, but the experience with previously mentioned examples such as HRT, anti-arrhythmic drugs, beta-blockers, and COX-2 inhibitors points out that this approach will not succeed. While postmarketing surveillance, particularly if improved from its current version, can detect adverse effects of drugs that are not part of the natural history of the disease under treatment or device failures, it cannot answer the complex questions needed to provide decisionmakers with the information they need for comparative decisions.34 Especially when the effects are related to a chronic disease in and of themselves, such as cardiovascular effects of anti-inflammatory drugs, only long-term randomized comparisons will be sufficient.
Two-track system. Conceptually, it is reasonable to expand this construct of provisional approvals followed by longer-term studies to all therapies given chronically or used to treat a chronic disease. The evaluation of a chronic disease therapy might then be thought of as being a two-track system. The first track provides short-term evidence about whether the therapy is on balance beneficial in a relevant population. Since the true effects of the therapy cannot have been measured over a long period of time, the assessment must rely on biomarkers, short-term outcomes, and proof of absence of unusual adverse reactions.
Labeling and promotion would reflect the uncertainty inherent at this level of measurement, and payment would be at a lower level or limited to providers and systems with a high level of expertise. The second track, which could be started simultaneously with the first track, would measure the broad swath of outcomes, including survival, freedom from adverse events, quality of life, and cost. The populations studied would be broad, including all relevant age ranges with the mixture of comorbidities likely to be seen in routine practice. Success in this arena would lead to broad labeling, higher rates of payment, and incorporation into performance measures as part of the delivery of high-quality health care. This latter step would set up a reward system that could reduce marketing costs for successful chronic disease therapies and technologies, since failure to meet a professionally defined performance measure could lead to reduced payment and labeling of the omission as a medical error.
Educating the public. A massive effort is needed to educate the public about how to understand when a therapy or diagnostic strategy is likely to provide a positive balance of risk and benefit. This effort must begin by reshaping providers' ability to understand and transmit quantitative information about the risks and benefits of technologies and approaches. Although our understanding of decision making and the human processing of quantitative information is evolving rapidly, little of this subject is taught in health professions schools, nor has the system evolved to enhance the degree to which information truly informs consumers and decisionmakers of what they need to know to make complex, prospective decisions.35
Professional standards.
The growing effort to create professional standards through clinical practice guidelines and performance measures has provided a route to actualize a more rational use of diagnostic and therapeutic technology.36 Early efforts to pay differentially for adherence to performance measures provide an approach to focusing providers and industry on proving value, since quantitative demonstration of outcome benefit in adequate randomized trials provides a route to improve outcomes and an impetus to fund and use beneficial products. The system is not broken. People are living longer and experiencing much greater freedom from disability than ever before. Yet if our regulatory and delivery systems can catch up with the rapid pace of technology evaluation, even greater progress can be made. Eschewing unhelpful or harmful technologies will leave room for rewarding those who produce and deliver technologies that are beneficial. In this new era of efforts to personalize medical decision making, it is critical to understand that knowledge of genes and proteins only serves as a substrate for much greater production of clinical outcomes data and a renewed national priority to improve the ability of health care providers and the public to deal with the derivative knowledge.
Robert Califf is director of the Duke Clinical Research Institute and the Donald F. Fortin Professor of Cardiology, Division of Cardiology, Department of Medicine, at the Duke University Medical Center in Durham, North Carolina. This work was supported in part by Agency for Healthcare Research and Quality (AHRQ) Centers for Education and Research on Therapeutics (CERTs) Cooperative Grant no. U18HS10548. The author thanks Penny Hodgson and John Daniel for their editorial assistance in preparing this paper.