The Predictive Accuracy Of The New York State Coronary Artery Bypass Surgery Report-Card System
Ashish K. Jha and Arnold M. Epstein
We examined the impact of New York State's public reporting system for coronary artery bypass surgery fifteen years after its launch. We found that users who picked a top-performing hospital or surgeon from the latest available report had approximately half the chance of dying as did those who picked a hospital or surgeon from the bottom quartile. Nevertheless, performance was not associated with a subsequent change in market share. Surgeons with the highest mortality rates were much more likely than other surgeons to retire or leave practice after the release of each report card.
MANY RESEARCHERS BELIEVE THAT quality-performance data can aid consumers selecting physicians, help purchasers contracting with health plans and hospitals, and catalyze quality improvement efforts.1 The Centers for Medicare and Medicaid Services (CMS) now provides financial incentives to all hospitals caring for Medicare beneficiaries to publicly release quality-performance data. Critics worry that methodologies used in risk adjustment are imperfect; reported performance may disadvantage providers with severely ill patients and mislead purchasers and consumers.2 Initial studies have found that consumers and referring physicians rarely use published quality data in choosing providers, although advocates believe that consumers will increasingly use quality "report cards" as they gain experience with reporting systems.3
The New York State Cardiac Surgery Reporting System (CSRS) is arguably the gold standard for the public reporting of hospital and physician quality. New York began publicly reporting risk-adjusted coronary artery bypass graft (CABG) mortality rates in 1991. These reports have become a model for other states.
Although the New York State system is influential, knowledge of its function and impact is limited. Preliminary data through 1995 show that the system had little impact on hospital volume, although one report suggested that low-volume surgeons stopped practicing in the state.4 Despite the importance of this system, most of the existing studies focus on the early years; two recent studies suggest short-lived changes in hospital market share for outlier hospitals after publication, and another suggests that lower mortality trends in New York continued through 1999.5
We used data collected by the state from 1989 to 2002, the last year for which data were available, to answer four questions: (1) does high or low performance by surgeons or hospitals predict performance in the period when the data are most likely to be used; (2) is there evidence that hospitals' or surgeons' performance affects patient market share subsequently; (3) is there an association between a surgeon's performance and the likelihood that he or she will cease practicing in the state; and (4) if so, do the surgeons who cease practicing in the state convey that the reporting system had an impact on their decision?
Data.
We used publicly available data from the New York State Department of Health.6 These annual reports from the CSRS provide data on performance (risk-adjusted mortality rate) and volume (number of cases) by hospital (in a given year) and by surgeon (in three-year periods). The reports typically become available one to three years after the observation period. We also obtained unpublished data on annual surgeon volume from the New York State Department of Health.
We obtained data on hospitals' academic affiliation (belonging to the Council of Teaching Hospitals, or COTH), profit status, and location (within New York City) from the American Hospital Association (AHA) database of U.S. hospitals. Data on surgeons' age, sex, years in practice, and medical school location (domestic versus foreign) were obtained from a combination of state medical society Web sites, hospital Web sites, the American Medical Association (AMA) Web site, and a national Web site for cardiothoracic surgeons.
Finally, we identified all surgeons who dropped out of the reporting system and attempted to contact them by telephone or electronic mail. We used the AMA Physician Masterfile, the New York State Physician Profile, Internet-based people searches, and contacts with surgeons' last known place of employment or residence to identify their whereabouts.
Variables.
Performance was defined as each hospital's or surgeon's risk-adjusted mortality rate (RAMR). High-performing providers were those with the lowest RAMRs, while low-performing providers had high RAMRs. Market share was defined as the number of cases of isolated CABG surgeries performed by a given surgeon or hospital in a given time period, divided by the total number of isolated CABG surgeries performed by all surgeons (or hospitals) in New York during that period. Finally, any surgeon who did not perform a single surgery in a given calendar year was assumed to have left the system during the prior year. We validated our assumption by asking delisted surgeons if they had stopped practicing in New York.
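The two definitions above (market share and the rule for dating a surgeon's departure) can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the case counts are hypothetical and are not taken from the CSRS data. The sketch also assumes that a year with no surgeries is recorded explicitly as zero cases rather than omitted.

```python
def market_share(cases_by_provider):
    """Return each provider's share of all isolated CABG cases in one period.

    cases_by_provider maps a provider label to its case count (hypothetical layout).
    """
    total = sum(cases_by_provider.values())
    if total == 0:
        raise ValueError("no cases recorded in this period")
    return {name: n / total for name, n in cases_by_provider.items()}


def departure_year(annual_cases):
    """Return the year a surgeon is assumed to have left the system.

    Following the rule in the text, a surgeon with zero cases in a calendar
    year is assumed to have left during the prior year. Returns None if the
    surgeon never drops to zero after being active.
    """
    seen_active = False
    for year in sorted(annual_cases):
        if annual_cases[year] > 0:
            seen_active = True
        elif seen_active:
            return year - 1
    return None


# Illustrative (made-up) counts, not study data.
shares = market_share({"Surgeon A": 300, "Surgeon B": 100})
left = departure_year({1995: 120, 1996: 80, 1997: 0})
```

With these made-up inputs, Surgeon A holds three-quarters of the cases, and the surgeon with no cases in 1997 is dated as leaving in 1996.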
Analysis.
To examine how well report-card ratings predicted subsequent performance, we performed two sets of analyses. First, because users typically focus on top or bottom performers, we created four hospital and surgeon performance groups: top decile, top quartile, bottom quartile, and bottom decile. We calculated a mean risk-adjusted mortality rate for each category of performance (for hospitals and surgeons) in the year (for surgeons, three years) that followed the release of the report card. For example, data on 1993 hospital performance were released in mid-1995. Therefore, for hospitals in the top decile of performance in 1993, we calculated the weighted risk-adjusted mortality rate during 1996 (when a patient would likely use the report). Similarly, for surgeons, data on performance for 1991-93 were released in 1995. We calculated risk-adjusted mortality rates for those performers during 1996-98 and again weighted those calculations by caseload. We summed all years' data into a summary result for both hospitals and surgeons. Next, we used Pearson correlation coefficients to evaluate the statistical association between a performance report and subsequent performance. We also graphically examined the relationship between a performance report and the subsequent report.
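As a rough illustration of the steps just described, the following Python sketch selects a performance group from an index-year report, computes a caseload-weighted mean RAMR for that group in the later period, and computes a Pearson correlation between two reports. All names and numbers are hypothetical. The way the group size is rounded (truncating, with a minimum of one provider) is an assumption, since the text does not say how decile and quartile boundaries were set.

```python
import math


def top_fraction(ramr_by_provider, fraction):
    """Return the providers with the lowest index-year RAMR.

    Group size is int(n * fraction), at least 1 (an assumption, not from the text).
    """
    ranked = sorted(ramr_by_provider, key=ramr_by_provider.get)
    size = max(1, int(len(ranked) * fraction))
    return ranked[:size]


def weighted_mean_ramr(providers):
    """Caseload-weighted mean RAMR for a list of (ramr, cases) pairs."""
    total_cases = sum(cases for _, cases in providers)
    if total_cases == 0:
        raise ValueError("performance group has no cases")
    return sum(ramr * cases for ramr, cases in providers) / total_cases


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up index-year RAMRs and later-period (RAMR, cases) pairs.
index_year = {"H1": 1.0, "H2": 2.0, "H3": 3.0, "H4": 4.0}
later = {"H1": (1.5, 200), "H2": (2.0, 100), "H3": (2.5, 100), "H4": (3.5, 300)}

best = top_fraction(index_year, 0.25)
best_later_ramr = weighted_mean_ramr([later[h] for h in best])
```

Weighting by caseload means the group figure reflects the mortality risk faced by an average patient in the group rather than by an average provider, which matches the patient-choice framing of the analysis.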
We measured the impact of ratings on subsequent market share by comparing each hospital's (or surgeon's) market share in the year subsequent to the report-card release to the market share in the year prior to its release and calculated the percentage change. For each report, we assessed change in market share for each performance category. We summed data in the different performance categories from all available reports to measure overall changes in market share based on performance. Further, because consumers who used report cards would likely choose among providers in the same region, we analyzed the impact on market share by regions. We looked at three regions with the largest number of hospitals: New York City, Long Island, and western New York (Buffalo, Rochester, and Syracuse). We measured whether the change in market share in the two top-performing hospitals in any given region was different from the change in market share in the two bottom-performing hospitals in that region. Finally, because the report card identifies the few hospitals and surgeons whose performance is statistically significantly different from the state average, we examined whether being in one of these groups affected subsequent market share.
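The prerelease versus postrelease comparison can be sketched as follows. This is only one reading of "percentage change": the sketch computes a relative change in share, and the text does not state whether a relative change or a percentage-point difference was used. Function names and figures are hypothetical.

```python
def pct_change_in_share(share_before, share_after):
    """Relative change in market share, in percent (assumed interpretation)."""
    if share_before <= 0:
        raise ValueError("prerelease market share must be positive")
    return 100.0 * (share_after - share_before) / share_before


def group_change(pre, post, members):
    """Percentage change in the combined market share of a performance group.

    pre and post map provider labels to market share in the year before and
    the year after a report-card release.
    """
    before = sum(pre[m] for m in members)
    after = sum(post[m] for m in members)
    return pct_change_in_share(before, after)


# Made-up shares for two hospitals in one region.
pre = {"H1": 0.20, "H2": 0.30}
post = {"H1": 0.25, "H2": 0.25}

h1_change = pct_change_in_share(pre["H1"], post["H1"])
combined = group_change(pre, post, ["H1", "H2"])
```

In this made-up example one hospital gains share while the group's combined share is unchanged, which shows why per-provider and per-category results need to be read separately.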
To statistically test the association between performance and change in market share, we built separate linear regression models for surgeons and hospitals for each set of reports in which we had both prerelease and postrelease market share data. In the hospital models, we used postrelease market share as the outcome, and predictors included performance rating, baseline market share, number of beds, COTH membership, urban versus nonurban location, and profit status. For surgeon data, we used postrelease market share as the outcome and included performance rating, baseline (prerelease) market share, age, sex, number of years' experience, and graduation from a domestic medical school as predictors. Performance score was specified in separate analyses by rank and by quartiles. Because the results were qualitatively similar, we present data from performance only by rank. To assess the impact of performance on surgical market share, we studied surgeons who were present in both the prerelease and postrelease years.
We used the first five report cards available to assess the relationship between a surgeon's performance (categorized in quartiles) and subsequent discontinuation of practice in New York State within the subsequent year. Because we were concerned with unstable estimates due to small numbers, we also examined discontinuation of practice over two years after the release of a report card. The analyses were qualitatively similar; therefore, we present data for discontinuation over a two-year period. We used logistic regression models with surgeon characteristics (age, years of experience, and graduation from a domestic medical school) as well as volume of surgeries performed in the year prior to the release of the report card as covariates to determine the association between surgical performance and decision to cease practicing in New York. We also analyzed models that excluded surgical volume in the prior year. Because the results were qualitatively similar, we present only the numbers from the analysis that included prior volume. We analyzed the data for each report card separately but also analyzed aggregate data for all the years. Since the same surgeon was usually present in more than one report card and these observations were therefore not independent, we used a repeated measures logistic regression with an unstructured correlation matrix to determine the independent association between performance in the bottom quartile and leaving practice in New York. Finally, after identifying surgeons who had dropped out of the report system, we attempted to contact each one and ask whether he or she was still practicing and whether his or her decision to cease practicing in New York was influenced by the reporting system.
CABG surgery was performed in thirty-three hospitals in New York State between 1989 and 2002. We excluded two hospitals that performed these surgeries for three years or less. Hospitals that performed CABG surgery were more likely to be COTH members (61 percent versus 14 percent, p < .001), to be in New York City (39 percent versus 25 percent, p = .05), and to be larger (mean number of beds: 644 versus 315, p < .001) compared with hospitals that did not perform CABG surgery.
One hundred sixty-eight surgeons were listed as having performed CABG surgery in adequate volume within New York State sometime between 1989 and 2002. Their average age was fifty-three years, with seven years of postgraduate training and 17.3 years of subsequent practice as of 2000 in New York.
Accuracy in predicting subsequent performance for hospitals.
We found that the hospital ratings reliably predicted subsequent risk-adjusted mortality rates. For example, patients undergoing surgery in 1996 who selected a hospital in the top decile using the latest available data (from 1993) had an average RAMR of 1.82, whereas patients who chose a hospital from the bottom decile had an average RAMR of 2.89 (Exhibit 1). Summing the data across all years, we found that patients in hospitals in the top decile during the index year subsequently had an average risk-adjusted mortality rate of 1.59, while patients in hospitals in the bottom decile in the index year subsequently had an average RAMR of 2.78. Performance of hospitals in the index year and performance in the postrelease year were significantly correlated for each pair of report cards.7 Exhibit 2 graphically depicts one set of these relationships.

EXHIBIT 2 Hospital Performance In 1997 (Released In 1999), Versus Performance In 2000, When Data Were Likely To Be Used
Accuracy in predicting subsequent performance for surgeons.
We found that patients could reliably use performance data to choose surgeons with lower CABG mortality. For example, a patient undergoing surgery in 1994 who used the latest available data (1989-1991) to choose a top-decile surgeon would have had an average RAMR of 1.71, while a patient who had chosen a bottom-decile provider would have had an average RAMR of 3.80 (Exhibit 3).
In our aggregated data, we found that choosing a surgeon from the top decile led to an average RAMR of 1.58 during the period when the report card was likely to be used, while choosing a surgeon from the bottom decile led to an average RAMR of 3.20 (Exhibit 3).8 We graphically depict one set of these relationships in Exhibit 4.

EXHIBIT 4 Surgeon Performance In 1993-95 (Released In 1997), Versus Performance In 1998-2000, When Data Were Likely To Be Used
Changes in market share associated with performance.
We found no evidence that performance was associated with subsequent change in hospitals' market share (Exhibit 5). For example, hospitals performing in the top decile and top quartile in a report released in 1995 experienced 0.3 percent and 1.0 percent decreases in market share, respectively, while hospitals performing in the bottom decile and bottom quartile in the same report experienced 0.3 percent and 2.1 percent increases in market share, respectively. There was no consistent association between performance rank and change in market share using linear regression (Exhibit 5). When we examined changes in market share based on performance within regions, we found that changes in market share for the best performers within a region were similar to those for the worst performers within that region for the four time periods we examined. Data among surgeons who were in practice before and after the release of any given report were qualitatively similar, with no association between performance and subsequent change in market share (Exhibit 6).
Discontinuing surgical practice.
Surgeons with poor performance on report cards were more likely than others to leave the practice of CABG surgery in New York State within two years after the release of the report. Overall, more than 20 percent of bottom-quartile surgeons stopped practicing CABG surgery in New York within this period, whereas only about 5 percent of surgeons in the top three quartiles did so (Exhibit 7). The association was evident in all five reports we examined but was statistically significant only in the combined data.
We identified thirty-one surgeons who left practice, according to the reporting system, between 1989 and 1999. They had a median age of sixty-one (range, 43-82 years old) and a median of twenty-four years of practice experience (range, 6-45 years) when they left practice. We could not obtain follow-up information for four of them, despite an exhaustive search. Two others had died. Of the twenty-five remaining surgeons, nine were still performing bypass surgeries outside New York (median age fifty-four; range, 49-70), while nine had retired (median age sixty-nine; range, 59-82), and seven were working in nonclinical positions (median age sixty; range, 43-63).
We received survey responses from eighteen of these twenty-five surgeons. Three of the seven surgeons who did not respond had been in the bottom quartile of performance in the year prior to leaving practice. Ten of the eighteen responding surgeons stated that the CABG reporting program had no impact on their decision. Four of these ten had been in the bottom quartile of performance prior to leaving. Finally, two of the eighteen said that it had a minimal impact, and six reported a moderate or substantial impact on their decision. Of these eight surgeons, four were in the bottom quartile prior to departure. Two low-mortality surgeons reported leaving because pressure to reject high-risk patients and focus on documentation made practicing surgery less enjoyable.
We found that New York's publicly released mortality reports had good, although not perfect, predictive value for those who might use the data to select providers. However, there was no evidence that they affected hospitals' or surgeons' market share after they were released. Perhaps most important, their publication seemed to have a critical impact on practicing surgeons' livelihood. With the release of each report card, approximately one in five bottom-quartile surgeons relocated or ceased practicing within two years.
Reporting outcome data by individual physicians is controversial. Proponents argue that surgeon performance has a large and measurable impact on surgical mortality and other measures of quality. Critics counter that publicly reporting individual-level data can be inaccurate when sample size is inadequate and that it can increase physicians' resistance to quality improvement efforts. Further, they argue that professional ethos is sufficient to catalyze meaningful improvement in performance with confidential release of performance data.9 Our study is the first to thoroughly evaluate the impact of reporting individual surgeons' performance data on their decision to discontinue practice or relocate; our results suggest that public release of performance data can have a profound impact on physicians and their livelihood. These results underscore the importance of adequate risk adjustment and sufficient sample size to permit accurate assessments.
Professional groups and organizations have focused on practice volume as a proxy for quality and suggested standards. One early description of the New York State system noted that twenty-seven low-volume (fewer than fifty cases annually) surgeons stopped practice in the state between 1989 and 1992.10 The surgeons identified in our study as having left practice were a different group: All of them performed more than 150 operations over three years.
A reporting system's ability to predict performance after it becomes publicly available is central to its usefulness in guiding consumers' decisions. Therefore, it is puzzling that we could find only limited information about this issue from prior evaluations of this and other report-card systems. Our results suggest that bottom-decile hospitals or surgeons subsequently have approximately twice the risk-adjusted mortality rate of the top-performing hospitals or surgeons. If the collection and analysis of performance data can be expedited and the length of time between collection and reporting shortened, we expect predictive accuracy to improve.
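The "approximately twice" comparison can be checked against the aggregate figures reported in the results (1.59 versus 2.78 for hospitals, 1.58 versus 3.20 for surgeons). The short Python sketch below does only that arithmetic; the function name is hypothetical, and the ratios are simple quotients of the published group means, not a reanalysis.

```python
def mortality_ratio(bottom_decile_ramr, top_decile_ramr):
    """Ratio of bottom-decile to top-decile subsequent RAMR."""
    return bottom_decile_ramr / top_decile_ramr


# Aggregate subsequent RAMRs as reported in the results section.
hospital_ratio = mortality_ratio(2.78, 1.59)  # about 1.75
surgeon_ratio = mortality_ratio(3.20, 1.58)   # about 2.03
```

On these figures the ratio is close to two for surgeons and somewhat lower for hospitals.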
Given the expense, morbidity, and mortality associated with CABG, one might assume that performance data would greatly affect patients' decisions or cardiologists' referral practices. Although the procedure is sometimes done urgently, past surveys of patients getting CABG surgery suggest that the large majority believe they have sufficient time to choose a provider.11 Indeed, the proliferation of these public reporting systems is testimony to policymakers' perception that patients, referring physicians, and health plans will choose based on reported quality performance. A number of previous studies have not borne this out in practice.12 However, other studies suggest that performance reports can affect patients' choice of providers.13
This study has important limitations. Because the risk-factor models used to calculate risk-adjusted mortality rates changed each year, we could not accurately assess changes in patient mortality over time. Many factors affect patient market share, and confounders might have reduced our ability to find an impact of performance on subsequent market share. The relationship between surgeons' performance and subsequent decision to retire or relocate might not be caused by the reporting system but could be a marker for illness or other factors that led to poor performance and also led surgeons to change their livelihood. Percutaneous coronary intervention (PCI) became much more common during the time period we studied, and its presence might have affected our results. Finally, New York is a highly regulated state, and patterns of care there might not generalize to other states.
WE FOUND THAT REPORTS ON PERFORMANCE can reliably predict better-than-average performance for both surgeons and hospitals and can help patients and payers avoid low-performing providers. However, despite a decade of experience, we found no evidence that purchasers or patients are using these reports to drive market share to higher-performing providers. Perhaps more importantly, we found a strong association between poor performance and a surgeon's decision to change his or her profession. Advocates of publicly reported data might see this as a benefit of the system. The large impact on practicing physicians underscores the need for highly accurate reporting.
Ashish Jha (ajha{at}hsph.harvard.edu) is an assistant professor in the Department of Health Policy and Management, Harvard School of Public Health, in Boston, Massachusetts. Arnold Epstein is the John H. Foster Professor and chair of that department.
The authors thank Ed Hannan and Mark Chassin for their extremely thoughtful comments on an earlier version of the manuscript. They also thank E. John Orav for his input on the statistical analyses.
1. See, for example, A. Epstein, "Performance Reports on Quality - Prototypes, Problems, and Prospects," New England Journal of Medicine 333, no. 1 (1995): 57-61; A.M. Epstein, "Public Release of Performance Data: A Progress Report from the Front," Journal of the American Medical Association 283, no. 14 (2000): 1884-1886; and R. Galvin and A. Milstein, "Large Employers' New Strategies in Health Care," New England Journal of Medicine 347, no. 12 (2002): 939-942.
2. L.I. Iezzoni, "The Risks of Risk Adjustment," Journal of the American Medical Association 278, no. 19 (1997): 1600-1607.
3. See, for example, E.C. Schneider and A.M. Epstein, "Influence of Cardiac Surgery Performance Reports on Referral Practices and Access to Care: A Survey of Cardiovascular Specialists," New England Journal of Medicine 335, no. 4 (1996): 251-256; and J.H. Hibbard et al., "Strategies for Reporting Health Plan Performance Information to Consumers: Evidence from Controlled Studies," Health Services Research 37, no. 2 (2002): 291-313.
4. M.R. Chassin, "Achieving and Sustaining Improved Quality: Lessons from New York State and Cardiac Surgery," Health Affairs 21, no. 4 (2002): 40-51; and M.R. Chassin, E.L. Hannan, and B.A. DeBuono, "Benefits and Hazards of Reporting Medical Outcomes Publicly," New England Journal of Medicine 334, no. 6 (1996): 394-398.
5. P.S. Romano and H. Zhou, "Do Well-Publicized Risk-Adjusted Outcomes Reports Affect Hospital Volume?" Medical Care 42, no. 4 (2004): 367-377; E.L. Hannan et al., "Provider Profiling and Quality Improvement Efforts in Coronary Artery Bypass Graft Surgery: The Effect on Short-Term Mortality among Medicare Beneficiaries," Medical Care 41, no. 10 (2003): 1164-1172; and D.M. Cutler, R.S. Huckman, and M.B. Landrum, "The Role of Information in Medical Markets: An Analysis of Publicly Reported Outcomes in Cardiac Surgery," American Economic Review no. 94 (2004): 342-346.
6. This system is accessible online at http://www.health.state.ny.us/nysdoh/heart/heart_disease.htm.
7. Pearson correlation coefficients: 0.10 for 1993 with 1996 reports, p = .60; 0.12 for 1994 with 1997 reports, p = .53; 0.37 for 1995 with 1998 reports, p = .04; 0.38 for 1996 with 1999 reports, p = .04; 0.30 for 1997 with 2000 reports, p = .10; and 0.36 for the 1998 and 2002 reports, p = .04.
8. Pearson correlation coefficients for the five sets of reports depicted in Exhibit 3 were as follows: r = .34 for the reports from 1989-91 with 1994-96, p = .005; r = .42 for the reports from 1991-93 with 1996-98, p < .001; r = .61 for the reports from 1992-94 with those from 1997-99, p < .001; r = .42 for the reports from 1993-95 with those from 1998-2000, p = .0001; and r = .14 for the reports from 1994-96 with those from 1999-2001, p = .17.
9. W.A. Ghali et al., "Statewide Quality Improvement Initiatives and Mortality after Cardiac Surgery," Journal of the American Medical Association 277, no. 5 (1997): 379-382; and G.T. O'Connor et al., "A Regional Intervention to Improve the Hospital Mortality Associated with Coronary Artery Bypass Graft Surgery: The Northern New England Cardiovascular Disease Study Group," Journal of the American Medical Association 275, no. 11 (1996): 841-846.
10. Chassin et al., "Benefits and Hazards."
11. E.C. Schneider and A.M. Epstein, "Use of Public Performance Reports: A Survey of Patients Undergoing Cardiac Surgery," Journal of the American Medical Association 279, no. 20 (1998): 1638-1642.
12. Ibid.; D.W. Baker et al., "The Effect of Publicly Reporting Hospital Performance on Market Share and Risk-Adjusted Mortality at High-Mortality Hospitals," Medical Care 41, no. 6 (2003): 729-740; J.H. Hibbard, J. Stockard, and M. Tusler, "Hospital Performance Reports: Impact on Quality, Market Share, and Reputation," Health Affairs 24, no. 4 (2005): 1150-1160; Romano et al., "Do Well-Publicized?"; and Cutler et al., "The Role of Information."
13. D.B. Mukamel and A.I. Mushlin, "Quality of Care Information Makes a Difference: An Analysis of Market Share and Price Changes after Publication of the New York State Cardiac Surgery Mortality Reports," Medical Care 36, no. 7 (1998): 945-954; and D.B. Mukamel et al., "Quality Report Cards, Selection of Cardiac Surgeons, and Racial Disparities: A Study of the New York State Cardiac Surgery Reports," Inquiry 41, no. 4 (2004-2005): 435-446.
