Snapshot Of Hospital Quality Reporting And Pay-For-Performance Under Medicare
PROLOGUE: Early in the new millennium, a new buzzword entered the health care lexicon: pay-for-performance, or P4P. The concept gained a strong foothold when the Centers for Medicare and Medicaid Services (CMS) announced in July 2003 a major initiative to test the use of financial incentives "to encourage hospitals to provide high quality inpatient care," according to a press release. Known as the Premier Hospital Quality Incentive Demonstration, the project uses financial bonuses to reward hospitals that are members of Premier, a nationwide organization of nonprofit hospitals and health systems, for their performance in selected clinical areas. Rewards are based on quality measures validated extensively by the work of the Agency for Healthcare Research and Quality (AHRQ), the Joint Commission on Accreditation of Healthcare Organizations (JCAHO), the National Quality Forum (NQF), and other collaborators. As of late 2005, more than 270 hospitals nationwide were voluntarily taking part. In May 2005, Premier released results from the first four quarters showing a trend toward improved quality among all participants. Fifth-quarter results released in July 2005 show even greater improvement. In this paper, Charles N. "Chip" Kahn III and his coauthors probe the P4P universe by examining quality performance activities taking place under the aegis of the Hospital Quality Alliance (HQA). They also examine the P4P approaches of the Premier Demonstration and the Medicare Payment Advisory Commission (MedPAC). Kahn (ckahn{at}fah.org) is president of the Federation of American Hospitals (FAH) in Washington, D.C. He served as a top health adviser on Capitol Hill before moving to the industry side as president of the Health Insurance Association of America, then taking over at the FAH in 2001. Thomas Ault is a principal at Health Policy Alternatives Inc. in Washington, D.C.
Howard Isenstein is a vice president at the FAH. Lisa Potetz, director, public policy research, at the March of Dimes, in Washington, D.C., contributed as an independent consultant. Susan Van Gelder is a senior vice president at the FAH.
This paper examines the impact that Medicare pay-for-performance (P4P) might have upon hospital payment. It uses the initial two quarters of a national quality database to model financial gains or losses using the Premier Hospital Quality Incentive Demonstration rules, as well as the P4P approach recommended by the Medicare Payment Advisory Commission (MedPAC). Findings reveal variation among all types of hospitals and across all measures within each of the three conditions studied: heart attack, heart failure, and pneumonia. Initially, hospitals' financial gains and losses likely will be marginal using the Premier demonstration payment rules and somewhat larger under the MedPAC recommendations as modeled.
AMONG HOSPITAL QUALITY IMPROVEMENT initiatives under way are efforts of the public-private Hospital Quality Alliance (HQA) collaborative, Medicare payment incentives for hospital quality data reporting, and Medicare's Premier hospital demonstration program. Created in 2002, the HQA is a public-private collaborative intended to make accessible to the public critical information about hospital quality performance and to inform and invigorate efforts to improve quality. The HQA includes the Centers for Medicare and Medicaid Services (CMS); the Agency for Healthcare Research and Quality (AHRQ); and key national hospital groups, health care quality organizations, and consumer groups.1 The HQA provided data for Hospital Compare, a U.S. Department of Health and Human Services (HHS) database that included, at the time of the analysis, seventeen measures of clinical quality among three conditions: heart attack, heart failure, and pneumonia. The database is maintained by the CMS, which receives the data voluntarily from about 4,200 hospitals, based on valid and reliable measures that have been shown to reflect quality of care.2 This analysis uses the first two quarters of available data. The CMS recently released an additional two quarters of data. Preliminary analysis confirms the results in this paper. New measures will continue to be added over time. For example, in late 2006 to early 2007, hospitals are scheduled to publish data about patients' perspectives of care.
Medicare, for its part, has sought to further connect payment with performance, or at least with the reporting of certain HQA measures. As part of the Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003, Congress encouraged hospitals to participate in the public reporting of quality information; those that do not report on ten measures of quality receive a 0.4 percent reduction in their annual Medicare payment update for inpatient hospital services.
It should be noted, however, that hospitals were signing up to report quality scores voluntarily even before MMA was passed. As of 12 December 2003, 2,338 hospitals had signed up with the CMS to report their quality data.3 In spring 2005 the Medicare Payment Advisory Commission (MedPAC) presented a series of recommendations that would explicitly link part of Medicare's payment to providers, including inpatient hospitals, to performance.4 In a further contribution to the development of pay-for-performance (P4P) for hospitals, the CMS has undertaken the Premier Hospital Quality Incentive Demonstration. This demonstration includes 268 hospitals that are members of Premier Inc., an alliance of nonprofit hospitals. The demonstration is designed to examine the effect on hospital care of financial rewards and penalties linked with performance on a set of common medical conditions. Under this demonstration, the highest-performing hospitals receive bonuses, while the lowest-performing hospitals might be subject to penalties, based on their performance on certain evidence-based quality measures for patients with heart attack, heart failure, pneumonia, coronary artery bypass graft (CABG), and hip and knee replacements. Participation in the demonstration is voluntary and began in 2003. Notably, public policy changes often have been based on experiments like the Premier demonstration. Diagnosis-related groups (DRGs), payment methods for Medicare managed care, and changes in payment for home health, for example, were first tested in agency-sponsored demonstrations. This study examines hospital quality and financial performance under two P4P approaches: Premier and the one envisioned within MedPAC's spring 2005 recommendations.
Data. The primary source of data is a public use file released in April 2005 by the CMS containing Hospital Compare data reported by each hospital. Overall, data on 4,203 hospitals are available through Hospital Compare. The quality of care provided to patients treated for the reported conditions (heart attack, pneumonia, and heart failure) clearly is important clinically. These conditions also account for 16 percent of Medicare discharges from acute care hospitals as well as 16 percent of Medicare hospital payments.5 The Hospital Compare data encompass the original ten quality measures for which reporting hospitals received a full inflation update under MMA as well as the seven additional measures reported voluntarily by many hospitals. The ten measures cover all patients, not solely Medicare patients, treated from January through June 2004. The seven optional measures are from the period April through June 2004. Both the clinical definitions of the quality measures and the methodology for computing composite scores follow the specifications of the Joint Commission on Accreditation of Healthcare Organizations (JCAHO).6
Methods. To examine performance by type of hospital, we used hospitals' Medicare provider numbers to match the Hospital Compare quality data to characteristics contained in the fiscal year 2005 Final Rule Impact File for Medicare's hospital prospective payment system (PPS).7 Data about the number of patients discharged and the level of Medicare payments for each of the three conditions come from the FY 2003 Medicare Provider Analysis and Review (MEDPAR) file. We calculated composite scores reflecting the collection of measures included for each of the three conditions. To the extent possible, the calculation methodology adheres to the specifications of the Premier demonstration, which in turn follows the JCAHO definitions and rules that define each quality measure precisely.
For example, the measure "aspirin at arrival" in the heart attack condition applies to all patients in ten identified International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) principal diagnosis codes who are not contra-indicated for aspirin and who are not described by several exclusion criteria.8 A hospital's score on a measure equals the percentage of patients subject to the measure for which the hospital fulfilled the indicated action: for example, administered aspirin on arrival. JCAHO also prescribes how scores on the individual measures are converted into composite scores for the three conditions. Because the individual quality measures are given equal weight in determining the composite score, the score is simply the percentage of instances across measures in which the hospital performed the required action (for example, gave aspirin on arrival or communicated discharge instructions) compared with the cumulative number of actions that should have been performed. For example, a composite heart attack score of 90 percent means that the hospital failed to perform 10 percent of the actions that should have been performed for heart attack patients, assuming that all actions that were performed were recorded. As in Premier, composite scores were not calculated for a hospital that did not have at least thirty patients for at least one of the measures included in the condition.9
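The composite calculation described above can be illustrated with a minimal Python sketch. It is not the JCAHO or Premier specification itself: the function name, the measure labels, and the patient counts are hypothetical, and the thirty-patient rule is implemented as literally stated in the text (at least one measure with thirty or more eligible patients).

```python
from typing import Dict, Optional, Tuple

MIN_PATIENTS = 30  # minimum-sample rule as described in the text


def composite_score(measures: Dict[str, Tuple[int, int]]) -> Optional[float]:
    """Return a condition composite score in percent, or None if not scorable.

    `measures` maps a measure name to (actions performed, eligible patients).
    The names and counts used with this function are illustrative only.
    """
    # No composite unless at least one measure has thirty or more eligible patients.
    if not any(eligible >= MIN_PATIENTS for _, eligible in measures.values()):
        return None
    performed = sum(done for done, _ in measures.values())
    opportunities = sum(eligible for _, eligible in measures.values())
    # Share of all required actions, pooled across measures, that were performed.
    return 100.0 * performed / opportunities


# Hypothetical heart attack counts for one hospital.
heart_attack = {
    "aspirin_at_arrival": (45, 50),
    "beta_blocker_at_discharge": (36, 40),
    "ace_inhibitor_for_lvsd": (9, 10),
}
print(composite_score(heart_attack))  # 90 of 100 required actions performed
```

In this hypothetical case the hospital performed 90 of 100 required actions, which corresponds to the 90 percent composite score used as an example in the text.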
Heart attack had the fewest number of hospitals with a condition score: 2,008 hospitals out of 4,203 hospitals in the Hospital Compare universe (Exhibit 1
To examine hospital performance, we determined the distribution of composite scores for each condition by decile, and we compared how different types of hospitals performed against these national performance levels.11 We also looked at how hospitals would fare under two different P4P scenarios. Under Premier's three-year demonstration, the best-performing hospitals are eligible for bonus payments in all three years of the project: The top 10 percent performing hospitals receive a bonus equal to 2 percent of payments made for discharges of patients with the corresponding condition, and those between the eightieth and ninetieth percentiles get a 1 percent bonus. Penalties (2 percent applicable to hospitals below the tenth percentile and 1 percent for hospitals above that but below the twentieth percentile) come into play only in the third year. The penalty thresholds are established in the first year of the demonstration project and remain fixed. In effect, hospitals have two years to surpass this level and avoid a penalty. Bonus payments, on the other hand, are based on annually determined top deciles; that is, the thresholds are moving targets. To be eligible for a bonus payment, a hospital must be in the top 10 or 20 percent of hospitals' composite scores for the reporting year. The bonus and penalty results presented here are based on six months or less of data (all available data at the time of this analysis) and do not incorporate a time period for hospitals to improve to avoid a penalty. Thus, the results show only what level of dollars might be at risk for each of the conditions, by type of hospital.
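As a rough illustration of the Premier bonus and penalty rules as modeled here, the following Python sketch applies the 2 percent and 1 percent rates to hospitals ranked by composite score for one condition. It is a simplification under stated assumptions: hospitals are assigned to deciles by rank position within a single reporting period, ties are broken arbitrarily, and penalties apply immediately, as in this paper's modeling rather than in the demonstration's third year. The function and variable names are hypothetical.

```python
from typing import Dict


def premier_adjustments(scores: Dict[str, float],
                        payments: Dict[str, float],
                        apply_penalties: bool = True) -> Dict[str, float]:
    """Dollar bonus (+) or penalty (-) per hospital for one condition.

    `scores` holds composite scores; `payments` holds base payments for
    discharges with the condition. Both are keyed by a hospital identifier.
    """
    ranked = sorted(scores, key=lambda h: scores[h], reverse=True)  # best first
    n = len(ranked)
    adjustments = {}
    for position, hospital in enumerate(ranked):
        decile = position * 10 // n  # 0 = top tenth, 9 = bottom tenth (assumed rank-based)
        if decile == 0:
            rate_pct = 2    # top 10 percent of hospitals
        elif decile == 1:
            rate_pct = 1    # eightieth to ninetieth percentile
        elif apply_penalties and decile == 9:
            rate_pct = -2   # below the tenth percentile
        elif apply_penalties and decile == 8:
            rate_pct = -1   # tenth to twentieth percentile
        else:
            rate_pct = 0
        adjustments[hospital] = payments[hospital] * rate_pct / 100
    return adjustments


# Ten hypothetical hospitals, each with $1 million in payments for the condition.
scores = {f"H{i}": 99.0 - i for i in range(10)}
payments = {h: 1_000_000.0 for h in scores}
print(premier_adjustments(scores, payments))
```

Setting `apply_penalties=False` mimics the first two years of the demonstration, when only bonuses are paid.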
As in Premier, bonuses and penalties are applied to base DRG payments in the applicable ICD-9 codes for each condition.12
The March 2005 MedPAC recommendations call for 1 to 2 percent of hospital payments to be set aside to create a pool for rewarding quality performance, with bonuses to be paid both for attaining a specified level of performance and for achieving improved performance over time. According to MedPAC, details of the P4P policy would be established in regulations by the HHS secretary. To model how the MedPAC recommendations might compare with Premier, the paper assumes that payments in the affected ICD-9 codes are reduced 1 percent to establish a pool and that all money in the pool is distributed based upon achieving a high level of performance; since available data are from a single time period, the paper does not model bonuses based on improvement over time.13 Bonus payments are modeled so that all pool dollars are expended, and bonus rates for hospitals above the ninetieth percentile are twice the bonus rates paid to hospitals between the eightieth and ninetieth percentiles. Only hospitals in the top 20 percent receive bonus payments. Although MedPAC does not recommend imposing penalties for low performance, hospitals can experience lower Medicare revenues because payments are reduced 1 percent at the outset.
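The withhold-and-payback logic modeled for MedPAC can be sketched in Python as follows. This is an illustrative reading of the modeling assumptions, not MedPAC's or the authors' actual computation: the sketch assumes that bonus rates are applied to each hospital's pre-withhold payments for the condition, that deciles are assigned by rank position, and that the base bonus rate is solved so the pool is fully spent. All names are hypothetical.

```python
from typing import Dict


def medpac_net_effects(scores: Dict[str, float],
                       payments: Dict[str, float],
                       withhold_pct: float = 1.0) -> Dict[str, float]:
    """Net dollar change per hospital: bonus received minus amount withheld."""
    ranked = sorted(scores, key=lambda h: scores[h], reverse=True)  # best first
    n = len(ranked)
    pool = sum(payments.values()) * withhold_pct / 100

    # Top decile earns twice the bonus rate of the second decile; others earn none.
    weights = {}
    for position, hospital in enumerate(ranked):
        decile = position * 10 // n
        weights[hospital] = 2 if decile == 0 else 1 if decile == 1 else 0

    # Solve for the base rate r so that sum(weight * r * payment) equals the pool.
    weighted_base = sum(weights[h] * payments[h] for h in ranked)
    base_rate = pool / weighted_base if weighted_base else 0.0

    return {
        h: weights[h] * base_rate * payments[h] - payments[h] * withhold_pct / 100
        for h in ranked
    }


# Ten hypothetical hospitals, each with $1 million in payments for the condition.
scores = {f"H{i}": 99.0 - i for i in range(10)}
payments = {h: 1_000_000.0 for h in scores}
net = medpac_net_effects(scores, payments)
print(round(sum(net.values()), 2))  # budget-neutral: net effects sum to about zero
```

Because the pool is funded by every hospital but paid out only to the top 20 percent, the bottom 80 percent in this sketch lose the full withhold, which is why the modeled dollar impact exceeds that of the Premier rules.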
As measured by the three condition composite scores, performance varied among hospitals overall and by hospital type. Although fewer hospitals had scores for heart attack than for the other two conditions, the mean composite score was higher for this condition than for the others: 90.0 percent, compared with 74.4 percent for heart failure and 76.2 percent for pneumonia (Exhibit 2
For each condition, results varied by type of hospital, with heart attack and heart failure showing a somewhat different mix of top and bottom performers than pneumonia. Average heart attack composite scores for major teaching hospitals exceeded those for nonteaching hospitals (Exhibit 3
Rural hospitals were more likely than others to be among the lowest scorers for heart attack: 38 percent were in the bottom 20 percent, but only 17 percent were in the top 20 percent of scores. At 86.4 percent, however, the average score for rural hospitals was only slightly below the 90.6 percent average of urban hospitals. Tax-exempt hospitals were slightly more likely than others to be among the top performers (22 percent), with 15 percent in the bottom 20 percent of heart attack scores. Although investor-owned hospitals were overrepresented among the lowest scorers for this condition, average scores for this group were not much below those of tax-exempt hospitals (87.6 percent compared with 90.6 percent).
Results for heart failure show a similar pattern to heart attack (Exhibit 4
Pneumonia represents a different story, however (Exhibit 5
Sources of variation. Variation in results for individual measures within the composite scores, summarized in Exhibit 6
For heart failure, variation in scores was affected most by the discharge instructions and left ventricular assessment measures. The latter measure applied to the greatest number of heart failure cases and contributed greatly to poor performance for this condition overall as well as to the wide variation in condition scores. Hospitals with heart failure scores in the bottom 20 percent averaged only 62.2 percent on this measure, compared with 95.3 percent for hospitals with scores in the top 20 percent. For heart attack, performance on individual measures was better and varied less than for the other conditions. The beta-blocker at discharge and aspirin at discharge measures were most important. Scores were lowest and differed most on the measure for angiotensin-converting enzyme (ACE) inhibitor for left ventricular systolic dysfunction (LVSD), but this measure involved fewer patients. Poor performance on this measure, which also affects heart failure scores, as discussed above, might be due in part to clinical controversy over the appropriateness of ACE inhibitors for these patients.15 Performance varied least on the aspirin at arrival measure.
P4P results.
Our comparison of the potential impact of the MedPAC and Premier approaches for varying payment using quality performance showed that the MedPAC method, which redistributes a pool of funds from a 1 percent set-aside, has a greater dollar impact than the Premier approach, which applies bonuses and penalties of 1 percent and 2 percent to top and bottom performers.16 As Exhibits 7
The difference in aggregate effects is a fully anticipated result of the design differences between these P4P approaches. The MedPAC approach sets out to have no effect on aggregate payments to hospitals. It would redistribute a set amount of funds collected from all hospitals and pay this amount back to good performers as bonuses. The Premier approach, however, pays bonuses and imposes penalties, and the aggregate effect depends on the level of payments made to the "winning" and "losing" hospitals. In addition, under the Premier approach, hospitals are given a period of time to improve performance to a known threshold and thereby avoid penalties. Despite relying on the same quality measures, the two approaches would also vary in their effects by hospital type. Notably, rural hospitals would have a better result under the MedPAC approach, while urban hospitals would gain under the Premier approach. Under the budget-neutral MedPAC approach, rural hospitals would contribute a relatively small amount to the pool of funds (about 13 percent), while a relatively strong performance on the pneumonia condition would yield more than that in bonus payments. Urban hospitals, which would contribute the vast majority to the pool, do not earn enough in bonus payments to make up those contributions. Under the Premier approach, however, bonus payments earned by urban hospitals on the heart attack and heart failure measures would offset the penalties for poor performance on pneumonia.
Given that only two quarters of quality data were available in the initial release, this analysis is inherently preliminary, and implications should be considered provisional. Additionally, because Medicare payment is tied only to hospital reporting but not to actual performance, the results presented might understate how well hospitals would perform within an incentive-based system like Premier or the withhold-and-payback approach recommended by MedPAC. The limited time frame means that for most measures, the average number of cases upon which composite scores were computed is not large. Hospitals' performance can be expected to improve over time as they become more accustomed to complying with the measures and reporting procedures. Given the formative stage of reporting data publicly, both performance measures and reporting procedures likely are still evolving and improving, so the extent to which hospitals may be complying with the measures but not properly documenting it is unclear. In addition, during this period certain measures were undergoing refinement, which likely resulted in both lower overall scores for these measures and increased variation in hospital performance.17 A further limitation is that results reflect only process measures and could change if patient outcomes were evaluated as well. Future analysis of quality measures will benefit from additional data and more hospital experience with the measures and reporting requirements; it also will provide an opportunity to examine how well hospitals improve performance over time beyond this initial snapshot. Multivariate analysis also would further our understanding of which hospital characteristics contribute most to variation in hospital performance. In particular, in this univariate analysis, variation by region is difficult to interpret or to distinguish from differences by bed size or teaching status.
Finally, examination of variation in patient mix and clinical practices could shed light on the strong performance of rural hospitals in caring for pneumonia patients relative to other hospitals and other conditions. Despite current interest in using payment-based incentives to improve quality of care, considerable additional research and further demonstrations could increase their effectiveness. A demonstration, for example, might study quality improvement over several years using payment-based incentives compared with a variety of reporting-only and management improvement approaches. Some interpret early results from Premier to suggest that marginal payment incentives can be very effective in improving quality performance.18 The Premier results, however, do not distinguish the effect of payment incentives from improvements spurred by the scrutiny and public relations impact of public reporting. Demonstrations also could explore what levels of bonuses and penalties are necessary to stimulate investment in quality improvement.
Any pay-for-performance program inherently involves a number of policy choices and judgments. Although the quality measures are evidence-based and supported by clinical science, collapsing the measures into composite scores and specifying bonus and penalty formulas require policy choices for which there is no scientific foundation. Many questions arise: Should a P4P program be redistributive of current Medicare funds, or should it be designed along the Premier model, which makes available new funds to pay bonuses but could still be budget-neutral, depending on the magnitude of hospital penalties? Should the measures be weighted equally in calculating the composite scores? What is the relative role of outcome and process measures, and how should the outcome measures be risk-adjusted? By what formula will the bonus payments or penalties be calculated? As our simple simulation of Premier and MedPAC recommendations shows, these are not trivial questions because their answers will determine how payments are redistributed and who wins and loses. Ultimately, there is the potential that any P4P program could be seen as arbitrary and not directly linked to, or predictive of, high-quality care. More troubling, rather than creating a culture of quality, P4P could lead to distortions in reporting or misplaced quality improvement efforts driven primarily by bonus payments. Such developments could lead to the unintended consequence of hospitals narrowly focusing their interventions at the expense of broadly improving care. Although initial hospital performance is highly variable, our analysis reveals patterns among similar hospitals and provides clear benchmarks for improving quality. By focusing on the biggest quality gaps and how they relate to possible future Medicare financial incentives, hospitals can improve care for their patients and potentially benefit financially from P4P at the same time.
The authors thank Christopher Hogan of Direct Research LLC for his invaluable data assistance, and Steven Speil of the Federation of American Hospitals for his help in the preparation of this paper.