Health Affairs, 24, no. 1 (2005): 128-132
doi: 10.1377/hlthaff.24.1.128
© 2005 by Project HOPE
 

Evaluating Evidence

PERSPECTIVE

Comparative Effectiveness: Asking The Right Questions, Choosing The Right Method

Steven M. Teutsch, Marc L. Berger and Milton C. Weinstein

   Abstract
 
The Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003 has placed renewed focus on assessing the comparative effectiveness of various therapeutic options. Unfortunately, all of the evidence needed to fully assess these options is rarely available to drug formulary decisionmakers. Comparative randomized trials frequently fail to find differences when there indeed are some, while decision-modeling approaches are more likely to identify differences where there are none. We consider the consequences of these strategies. This paper proposes a framework for using different methods to assess available evidence. We contend that choosing the appropriate method can occur only when there are clear policy goals.


The escalating cost of medical care in general, and prescription drugs in particular, has fostered the demand for mechanisms to assure that money is being spent wisely. The Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003, the advent of the Academy of Managed Care Pharmacy’s Format for Formulary Submissions, and Oregon Medicaid’s evidence-based formulary decision process are manifestations of the increasing demand for good comparative information on effectiveness. We discuss the information that is needed for effective decision making and how decisionmakers’ needs should inform priorities and the choice of scientific method.

   Methods For Assessing Evidence
 
While evidence-based decision making as currently practiced may seem like an approach that should be universally acclaimed, it has important limitations. Perhaps of greatest importance is the relative scarcity of randomized controlled trials (RCTs) to answer critical questions about comparative effectiveness.

Randomized controlled trials. When assessing efficacy, RCTs are considered to be the gold standard. Even though typically at least two pivotal RCTs are performed to obtain Food and Drug Administration (FDA) clearance to market a new drug, these rarely provide all of the information practitioners need; they usually compare the new drug to placebo and frequently use surrogate or intermediate measures of efficacy, such as blood pressure or low-density lipoprotein (LDL) cholesterol, rather than outcomes such as cardiovascular mortality. The comparison against placebo is based on regulatory requirements and the desire to minimize the uncertainty surrounding efficacy assessments. Post-launch, comparative efficacy studies of alternative treatments employing intermediate measures of efficacy are more common, since their completion requires shorter time frames and smaller sample sizes and since they may address competitive marketplace issues. Larger, longer-term RCTs using true health outcomes (such as mortality) have been increasingly performed during the past decade. The importance of these trials has been dramatically illustrated by their impact on clinical practice and evidence-based guidelines; examples include the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT), the Women’s Health Initiative, and the Heart Protection Study.

Limitations. Even so, groups developing guidelines that demand evidence of the highest quality often find that a sufficient number of studies of important clinical questions are not available. This situation will never disappear, as the full range of benefits and risks associated with therapeutic decisions across the range of potential clinical applications is not known until long after the technologies have been widely adopted. In addition, high-quality RCTs are frequently limited by their focus on carefully selected, adherent populations. This focus maximizes the opportunity to demonstrate benefits, but it may underestimate the harms that arise when treatments are used by patients and in settings more typical of real-world practice.

Moreover, the majority of studies are designed with intermediate measures as the trial endpoints and do not provide information on the critical health outcomes that really interest decisionmakers. Thus, extrapolation from one population to another, from efficacy studies to effectiveness (that is, results that would be expected in more typical practice settings), or from one technology to another is usually required. This introduces additional uncertainty, which belies the aura of rigor associated with evidence-based decision making. The reality is that it will never be feasible to perform enough RCTs to assess the relative effectiveness of all important treatment strategies for different target populations. Moreover, often RCTs cannot provide other critical information. For example, harms of treatment are frequently assessed by long-term comprehensive surveillance, and information on impact among groups at different risks can be assessed from observational studies.

Value of certainty. A salient characteristic of RCTs is their ability to reduce the uncertainty surrounding estimates of efficacy. Thus, RCTs are most needed where decisionmakers require a high level of certainty, and they provide the greatest value where the burden of illness (including cost) as well as potential benefits, risks, and costs of interventions are high. In other circumstances, observational studies, systematic reviews, and decision models may provide enough certainty for decision making, and results may be available more rapidly if new studies are needed.1 The level of certainty required for particular efficacy judgments must be determined by decisionmakers and stakeholders. Methodologists can then assess the strategies that can most efficiently provide the requisite information.

The subjectivity factor. Subjectivity is introduced into all decision making, even when it is based on evidence. This reflects in part differences among various stakeholders as to the level of certainty that should be required for particular decisions—differences based on variations in preferences and values. These legitimate differences of opinion can be addressed only by assuring that both deliberations and decision processes are transparent, issues are fully vetted, and conflicts of interest are addressed.2 In doing so, the assumptions underlying projections of risks, benefits, and costs should be clearly identified, so that all stakeholders can provide appropriate input.

When best available evidence suffices. Although this is not ideal, important decisions—such as coverage decisions—must be based on the best available information. Delaying decisions until definitive information is available is usually not an option. Indeed, Mark McClellan, administrator of the Centers for Medicare and Medicaid Services (CMS), has said that the government needs to find alternative approaches to obtaining information in a timely fashion, including use of systematic reviews and data from real-world clinical practice.3

Observational studies. Although RCTs are not available for many key effectiveness questions, a wealth of useful information does exist in observational studies and other sources. It can be assembled through rigorous literature synthesis, including meta-analysis. The full spectrum of information can then be incorporated into a formal analytic framework, such as a decision tree or Markov model, which can be used to assess the benefits, harms, and costs of alternative interventions.4 Thus, analysts can extrapolate from one population, time frame, or technology to another. They can take advantage of expert opinion and make reasonable assumptions to link diverse information. Organizing and combining this information allows the net benefit of alternatives to become apparent. The risks associated with this approach arise from assumptions that are not well founded and from combining information that turns out to be inaccurate or inappropriate.
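
As a minimal sketch of the kind of formal analytic framework described above, the following Python fragment runs a three-state Markov cohort model (well, sick, dead) for two hypothetical treatments. The state names, transition probabilities, costs, cycle count, and the helper names `run_cohort` and `make_transitions` are all assumptions invented for illustration; none of them come from this paper or from any study.

```python
# Illustrative three-state Markov cohort model. Every number below is a
# made-up assumption, not data from the article or from any trial.

STATES = ("well", "sick", "dead")


def make_transitions(p_well_to_sick):
    """Build a yearly transition table; only the well->sick probability varies."""
    return {
        "well": {"well": 1.0 - p_well_to_sick - 0.01,
                 "sick": p_well_to_sick,
                 "dead": 0.01},
        "sick": {"well": 0.0, "sick": 0.85, "dead": 0.15},
        "dead": {"well": 0.0, "sick": 0.0, "dead": 1.0},
    }


def run_cohort(transitions, cost_per_cycle, cycles=20):
    """Return (life-years, total cost) per person for a cohort that starts well."""
    dist = {"well": 1.0, "sick": 0.0, "dead": 0.0}
    life_years = 0.0
    total_cost = 0.0
    for _ in range(cycles):
        # Advance the whole cohort by one cycle (here, one year).
        new_dist = {state: 0.0 for state in STATES}
        for src in STATES:
            for dst in STATES:
                new_dist[dst] += dist[src] * transitions[src][dst]
        dist = new_dist
        # Accumulate the share of the cohort still alive and the cycle's costs.
        life_years += dist["well"] + dist["sick"]
        total_cost += sum(dist[s] * cost_per_cycle[s] for s in STATES)
    return life_years, total_cost


# Hypothetical drug A: slower progression, higher drug cost while well.
drug_a = run_cohort(make_transitions(0.05),
                    {"well": 500.0, "sick": 4000.0, "dead": 0.0})
# Hypothetical drug B: faster progression, lower drug cost while well.
drug_b = run_cohort(make_transitions(0.08),
                    {"well": 200.0, "sick": 4000.0, "dead": 0.0})

extra_life_years = drug_a[0] - drug_b[0]
extra_cost = drug_a[1] - drug_b[1]
print(f"Drug A vs. B: {extra_life_years:.2f} life-years gained, "
      f"{extra_cost:.0f} extra cost per person")
```

The output is only as sound as the assumed inputs. If the well-to-sick probabilities were extrapolated from a surrogate marker that does not track outcomes, the model would report a difference that does not exist, which is the risk the paragraph above describes.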

Transparency and conflicts of interest. The issues of transparency and conflicts of interest may be more acute for decision models because of the many obvious and less-than-obvious choices that are made. For example, extrapolation from one therapeutic agent to another because of similar effects on surrogate markers (for example, LDL cholesterol, based on assumptions of analogous mechanisms of action) may ignore effects on other important outcomes. The harms associated with cerivastatin (Baycol) led to its recall in 2001, even though it was in the same class (statins) as other highly effective drugs with proven outcomes.5

Comparison with RCTs. When RCTs are available, the specificity of an evidence-based standard is very high, and false-positive conclusions are uncommon. However, this highly specific standard carries a price of reduced sensitivity. Adhering to a policy that technology will be adopted only when there is unambiguous evidence that it has an appropriate benefit, risk, and cost profile would delay the adoption of new technologies by many years and, while it might stimulate some additional studies, would likely discourage continued medical innovation and public health benefit. Use of systematic evidence reviews can fill the information gap and support more rapid adoption and dissemination of medical innovations, but with additional risk.

A strict requirement for good-quality RCTs minimizes the risk of a "Type I error" or "error of commission." If multiple high-quality RCTs find that "drug A" is safer and more effective than "drug B," one can have high confidence that this is so. Because detailed data are often not available on drug A and drug B for specific subpopulations, however, advantages of drug B for those patients may not be identified by RCT-level evidence. Thus the probability of a "Type II error," or "error of omission," may be large. Real differences among alternatives may not be found because there are insufficient studies available. Conversely, by admitting more varied information, observational studies and decision modeling can often elucidate real differences even though head-to-head RCTs are not available or have limitations. This information comes at a price, however. While the risk of a Type II error may be small, there is a very real possibility of making Type I errors, that is, finding differences when none exists.
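
The Type II risk can be made concrete with a rough power calculation. The sketch below uses the standard normal approximation for a two-sided comparison of two proportions; the event rates and sample sizes are hypothetical, chosen only to show how quickly power falls when a subgroup is small, and the function names are illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_two_proportions(p1, p2, n_per_arm, z_crit=1.96):
    """Approximate power of a two-sided z-test comparing two proportions.

    z_crit=1.96 corresponds to a Type I error rate (alpha) of about 0.05.
    Uses the unpooled standard error, so this is an approximation.
    """
    se = math.sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z_effect = abs(p1 - p2) / se
    # Probability that the test statistic falls beyond either critical value.
    return normal_cdf(z_effect - z_crit) + normal_cdf(-z_effect - z_crit)


# Hypothetical event rates: 10% on drug A versus 8% on drug B.
for n in (200, 2000, 20000):
    power = power_two_proportions(0.10, 0.08, n)
    print(f"n per arm = {n:>6}: power = {power:.2f}, "
          f"Type II error = {1 - power:.2f}")
```

Holding the Type I error rate fixed, the smallest sample in this sketch leaves a real two-percentage-point difference undetected most of the time. That is the situation faced by subpopulations that make up only a small share of a trial.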

   Toward A Coherent Policy
 
A coherent policy for addressing the lack of critical evidence on comparative effectiveness requires a more nuanced approach, tailored to the situation. Patients with a serious health condition such as cancer, for which long-term comparative treatment trials are not available, are generally willing to choose treatment based on imperfect information. Conversely, for a preventive service in the asymptomatic population, such as screening for the breast cancer (BRCA) gene followed by treatment with tamoxifen, one would expect a high degree of certainty that benefits outweigh harms before subjecting the population to the service. In such a case, a strict requirement of good-quality RCT evidence may make sense.

Whether more evidence should be obtained at any stage ought to depend on whether the value of the information obtained, in terms of getting people on or off treatment, outweighs the cost and time of conducting the necessary studies. It will be important to recognize the trade-offs among different methodological approaches to informed decision making, so that choices made will be as closely aligned with intended goals as possible. This will require an open dialogue with all stakeholders, as others have described.6
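
One common way to frame this comparison is the expected value of perfect information (EVPI): the gain from deciding with the truth known, minus the best that can be done on current evidence. The sketch below is a deliberately simple two-scenario version. The probabilities, net-benefit figures, population size, and study cost are all invented assumptions, and the function name is illustrative.

```python
# Hypothetical scenarios: (probability, net benefit per patient if the
# treatment is covered, net benefit per patient if it is not covered).
# All figures are invented for illustration.
SCENARIOS = [
    (0.6, 1000.0, 0.0),   # the treatment truly works
    (0.4, -800.0, 0.0),   # the treatment does net harm
]


def expected_value_of_perfect_information(scenarios):
    """EVPI = E[best choice once the truth is known] - best expected choice now."""
    ev_cover = sum(p * cover for p, cover, _ in scenarios)
    ev_decline = sum(p * decline for p, _, decline in scenarios)
    best_now = max(ev_cover, ev_decline)
    best_with_info = sum(p * max(cover, decline)
                         for p, cover, decline in scenarios)
    return best_with_info - best_now


evpi_per_patient = expected_value_of_perfect_information(SCENARIOS)
patients_affected = 50_000    # assumed size of the affected population
study_cost = 5_000_000.0      # assumed cost of a definitive study

population_evpi = evpi_per_patient * patients_affected
print(f"EVPI per patient: {evpi_per_patient:.0f}")
if population_evpi > study_cost:
    print("More evidence may be worth its cost")
else:
    print("Decide on current evidence")
```

EVPI is an upper bound, because no real study removes all uncertainty. A study that costs more than the population EVPI is not worth funding on these terms, while one that costs less still has to be judged on how much uncertainty it would actually resolve and how long it would take.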

When decisions are made on imperfect information, processes need to be in place to reevaluate those decisions as new information becomes available, and decisionmakers need to consider the costs and benefits of changing those policies once adopted. The recent decision by the CMS on narrow coverage of positron emission tomography (PET) scans for Alzheimer’s disease represents a provisional decision, which is based on available data and which will be revisited and revised after a clinical trial has been completed.

Comparative effectiveness of drugs. Over the past few years, interest in the comparative effectiveness of drugs has increased. The Oregon Medicaid program has contracted with the Oregon Health and Science University (OHSU) Evidence-based Practice Center (EPC) to conduct evidence reviews for use in Medicaid formulary decisions. The EPC uses analytic frameworks and evidence-based methods adapted from the U.S. Preventive Services Task Force (USPSTF).7 Additional states will be using these reviews for their Medicaid formularies. The MMA directs the CMS to work with the Agency for Healthcare Research and Quality (AHRQ) on conducting studies of comparative effectiveness.

Because outcomes trials often take many years to complete and existing resources allow for only a limited number of head-to-head comparative outcomes studies, there has been increasing interest in using observational data from cohort studies, claims data, and a hoped-for electronic medical record to complement RCTs. This call is tempered by the understanding that even well-done observational studies can lead to erroneous conclusions. The experience with combination hormone therapy is a recent example, in which observational studies suggested substantial cardiovascular benefits. Only with the recent Women’s Health Initiative, a large randomized trial, was the overall net increase in harms apparent.

One policy goal should be to ensure that adequate data are developed to assess the true benefit-risk-cost profiles of new therapies. Given the growing number of examples where "true outcome" RCTs have yielded results unfavorable to the manufacturers that were the study sponsors, it is an open question whether such studies will routinely be performed in the future. To encourage such studies, incentives could be provided to manufacturers, such as giving formulary preference to drugs with endpoint outcome data or extensive drug experience, or both. There could also be more stringent requirements for information regarding effectiveness or safety, or both, from new entrants into an established class. Alternatively, these studies will need to be funded through the National Institutes of Health (NIH), AHRQ, or other governmental sources.

Criteria for government priority list. The government will need a systematic approach to prioritizing what studies need to be done. MMA calls for AHRQ to develop a priority listing of where additional research is needed; AHRQ has conducted hearings and is considering various criteria.

We propose the following criteria: (1) What is the value of gaining additional information? (2) What do we really need to know to make a good policy decision regarding the use of one technology or another in the treatment of a particular health condition? Aligned with this is the corollary question: (3) How certain do we need to be about what we know? How these questions are answered can permit researchers to decide upon the appropriate methods to assess comparative effectiveness.

Choosing the "right" method. It is important that methodological choices be driven by policy goals. Strict reliance on good-quality RCTs may be most appropriate where there are treatment options with excellent benefit-risk-cost profiles or where a high level of confidence in the ability to identify real differences among therapies is important. For example, for potentially hazardous interventions among patients at low risk of poor outcomes, demanding long-term outcomes data is an incentive that rewards innovators who conduct the required RCTs to demonstrate the true benefit-risk-cost profile of a new treatment option.

Alternatively, comparative effectiveness studies that use observational data and modeling may be preferable when addressing therapies for which there are no available treatments with acceptable benefit-risk-cost profiles or where less certainty in efficacy estimates is required. This may commonly apply to the development of novel cancer chemotherapeutic agents for patients for whom no therapeutic options exist.

When a decision must be made regardless of the quality of data available or when it can be made only when high-quality data are available, "choosing the right method" to develop the evidence (and the appropriate level of certainty) is relatively simple. Of course, most technologies and interventions lie between these extremes, and in these instances the importance of stakeholders’ preferences and values in making decisions increases.

We propose that a taxonomy of types of decisions should be developed. The taxonomy would define what level of evidentiary certainty and generalizability should be required for a category of decision and how that level was determined. It would make explicit the influence of stakeholder values and preferences that established the parameters of each category. Such a taxonomy across the spectrum of decisions would promote greater consistency than would probably emerge from individual decisions over time. This process would increase the likelihood of good decision making, raise public confidence in the decision process, and provide guidance for the use or conduct of the appropriate types of studies.

There is no single "right" answer on which approach along the continuum from observational data to strict evidence-based decisions is correct, but rigid adherence to one approach or another will clearly lead to suboptimal decision making. The proper choice of method requires that stakeholders clearly assess the purposes, harms, and benefits of alternative approaches and establish criteria against which different technologies should be evaluated.

   Editor's Notes
 
Steven Teutsch (steven_teutsch{at}merck.com) is executive director, outcomes research and management, at Merck and Company Inc., in West Point, Pennsylvania. Marc Berger is vice president in that department. Milton Weinstein is the Henry J. Kaiser Professor of Health Policy and Management in the Department of Health Policy and Management, Harvard School of Public Health, in Boston, Massachusetts.

   NOTES
 

  1. Observational studies, unlike RCTs, do not involve any intervention to study participants; they measure health outcomes as they naturally occur in real-world populations. Systematic evidence reviews are structured analyses of available evidence from a comprehensive literature search with a detailed evaluation of the quality of studies found and a summary of their findings. Decision models are formal analytic frameworks, which may incorporate costs, the value of outcomes, and the probabilities for particular benefits and harms to assess the overall benefits and costs of treatment alternatives.
  2. N. Daniels and J. Sabin, "Limits to Health Care: Fair Procedures, Democratic Deliberation, and the Legitimacy Problem for Insurers," Philosophy and Public Affairs 26, no. 4 (1997): 303–350.
  3. J.D. Kleinke, "Think Globally, Protect Locally: A Conversation with Mark McClellan," Health Affairs 23, no. 3 (2004): 177–185.
  4. S.J. Goldie and P.S. Corso, "Decision Analysis," in Prevention Effectiveness: A Guide to Decision Analysis and Economic Evaluation, 2d ed., ed. A.C. Haddix, S.M. Teutsch, and P.S. Corso (New York: Oxford University Press, 2003), 103–126.
  5. See Food and Drug Administration, "Baycol Information," August 2001, www.fda.gov/cder/drug/infopage/baycol/default.htm (23 November 2004).
  6. Daniels and Sabin, "Limits to Health Care."
  7. Methods, reviews, recommendations, and reports of the USPSTF are available at Agency for Healthcare Research and Quality, "Preventive Services," www.preventiveservices.ahrq.gov (21 October 2004).

