Jul 2013

Pay for Performance: What We Measure Matters

Laura A. Petersen, MD, MPH

Mr. Ozonoff arrived at Dr. Mehta’s office for his annual checkup. His blood pressure had been in the normal range until a few months ago, when it had started to creep up, according to the blood pressure machine he sometimes used outside his workplace cafeteria. At Dr. Mehta’s office, it registered 145/90—just into the hypertensive range.

Dr. Mehta wanted to get Mr. Ozonoff’s blood pressure back into the normal range and thought the goal could be achieved by changes in his eating and exercising habits. At the same time she recognized that her practice received a financial bonus every quarter from several of the health plans they contracted with when a certain percentage of the patient panel maintained blood pressures within the normal range, and medication was the surest and simplest way to accomplish the goal quickly.

Because Mr. Ozonoff’s blood pressure was only slightly above the 140/90 cutoff for hypertension, Dr. Mehta began to discuss lifestyle changes with him, such as regular exercise and a healthier, lower-salt diet. These changes would help not only with his blood pressure but also with other health problems; his weight, for example, had been edging upward over the past few years.

Mr. Ozonoff seemed uninterested in Dr. Mehta’s suggestions that he alter his lifestyle in any way. “I’m too busy right now to change anything,” he said. “But I know I can’t continue with my blood pressure going up and up. Just write me a prescription and we’ll see how that works.”

Writing a prescription is a quick fix that’ll leave him dependent on medication and not change his poor eating habits for the better, Dr. Mehta reasoned to herself. Moreover, she thought, does having a certain percentage of blood pressures under 140/90 really indicate that we’re doing a good job clinically?


There is a growing realization that financial incentives are powerful influences on the amount and type of health care provided to patients. The fee-for-service payment model is associated with greater use of (well-reimbursed) services, which does not necessarily entail any attention to their indications or quality [1]. Capitated and salary payments are associated with use of fewer expensive services and therefore poorer access to those that are needed. Such observations about the relationship between financing methods and use of services have influenced approaches to the financing of health care under the Affordable Care Act (ACA). The provisions of the ACA seek to make health care more affordable for patients, control rising health care costs, and ensure high-quality care. Value-based payment systems, such as those being advocated by the Centers for Medicare and Medicaid Services (CMS) and other payers, are intended to align incentives with high-quality health care [2]. As one example, the New York City Health and Hospitals Corporation, the nation’s largest public health system, recently announced a performance-based pay plan for physicians [3].

Despite the face validity of pay-for-performance programs, evaluations of their effectiveness have shown contradictory results [4-6]. Furthermore, many questions have been raised about how they should be implemented. In particular, the way that the quality of care is measured can have profound influences upon how hospitals and clinicians are ranked, rated, and rewarded.

How We Measure

In general, many of the “first-generation” performance measures, such as the Healthcare Effectiveness Data and Information Set (HEDIS) [7], do not necessarily account for the complexity of patients’ conditions. A single patient with multiple chronic diseases may therefore be part of the denominator for a number of performance metrics (e.g., proportion of patients screened for colorectal cancer; proportion of patients receiving aspirin after acute myocardial infarction), with no consideration given to the relative benefit or relevance of those treatments to the specific patient. For example, risk factor control for a particular patient who is at risk for cardiovascular disease might be more urgent during a specific primary care visit than colorectal cancer screening. Yet the patient is still in the denominator when the percentage of patients who receive colorectal cancer screening is calculated.

Also, HEDIS-type measures incorporate only a “cross-sectional” approach: there is a yes-or-no answer to the question of whether a certain threshold is met. This approach does not account for patient preferences about trying lifestyle modifications, or for visits at which the patient is returning after a lapse in medication adherence or merely for a repeat measurement. Measures that incorporate a follow-up assessment period would capture the results of treatment intensification (i.e., addition or dose titration of a medication) as well as the results of longitudinal chronic disease care [8-11].
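The contrast between a cross-sectional measure and one with a follow-up assessment period can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient panel, the `intensified` flag, and both scoring rules are hypothetical and do not come from HEDIS or any payer's specification. The 140/90 cutoff is the one used in the case.

```python
from dataclasses import dataclass

SYSTOLIC_CUTOFF = 140   # mm Hg, cutoff used in the case
DIASTOLIC_CUTOFF = 90


@dataclass
class Visit:
    systolic: int
    diastolic: int
    intensified: bool = False  # hypothetical flag: medication added or titrated


def is_controlled(visit):
    """True when the reading is below both cutoffs."""
    return visit.systolic < SYSTOLIC_CUTOFF and visit.diastolic < DIASTOLIC_CUTOFF


def cross_sectional_pass(visits):
    """Assumed snapshot rule: judge only the first (index) reading, yes or no."""
    return is_controlled(visits[0])


def follow_up_pass(visits):
    """Assumed follow-up rule: pass if any reading in the assessment period is
    controlled, or if treatment was intensified in response to a high reading."""
    return any(is_controlled(v) or v.intensified for v in visits)


def pass_rate(panel, rule):
    """Share of patients in the panel who satisfy the given rule."""
    return sum(rule(visits) for visits in panel.values()) / len(panel)


# Made-up panel of four patients, each with a list of visits in time order.
panel = {
    "A": [Visit(132, 84)],                                      # controlled at index visit
    "B": [Visit(145, 90), Visit(136, 86)],                      # controlled at follow-up
    "C": [Visit(150, 95, intensified=True), Visit(146, 92)],    # treated, not yet controlled
    "D": [Visit(148, 94)],                                      # elevated, no response recorded
}

snapshot_rate = pass_rate(panel, cross_sectional_pass)
follow_up_rate = pass_rate(panel, follow_up_pass)
```

With these invented readings, the same clinical care scores very differently depending on the rule: only patient A passes the snapshot rule, while patients A, B, and C pass the follow-up rule.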

What We Measure

What is measured also has a significant effect on how performance is rated. Process measures, such as ordering a test or providing tobacco cessation counseling, can be easily achieved in only a single encounter. Conversely, intermediate outcome measures (e.g., blood pressure or glucose control) may require many visits involving several medication adjustments and counseling regarding lifestyle modifications [8, 9, 12]. We have shown that diabetic patients with life-limiting chronic conditions are less likely to have standard “good” outcomes despite frequent monitoring [13]. For such patients, comfort control should take precedence over glucose control or retinal screening. However, patients with life-limiting conditions are rarely excluded from the denominator when glucose control and retinal screening are assessed [13]. Few measures, if any, reflect patient preferences or inform clinicians specifically about how they might improve their care.
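To make the denominator point concrete, the following Python sketch computes a glucose control rate for a small panel with and without excluding patients who have life-limiting conditions. All of it is hypothetical: the panel is invented, the field names are illustrative, and the exclusion rule is an assumption rather than any measure's actual specification.

```python
def control_rate(patients, exclude_life_limiting=False):
    """Proportion of eligible patients whose glucose is controlled.

    When exclude_life_limiting is True, patients flagged with a life-limiting
    condition are removed from the denominator (an assumed exclusion rule).
    """
    eligible = [
        p for p in patients
        if not (exclude_life_limiting and p["life_limiting"])
    ]
    if not eligible:
        return None  # no one left to measure
    return sum(p["glucose_controlled"] for p in eligible) / len(eligible)


# Invented panel of 10 diabetic patients.
patients = (
    [{"glucose_controlled": True, "life_limiting": False}] * 6     # controlled
    + [{"glucose_controlled": False, "life_limiting": False}] * 2  # not controlled
    + [{"glucose_controlled": False, "life_limiting": True}] * 2   # comfort-focused care
)

rate_all = control_rate(patients)
rate_excluding = control_rate(patients, exclude_life_limiting=True)
```

In this made-up panel the clinician's rate is 6 of 10 when every patient counts and 6 of 8 once the two patients receiving comfort-focused care are excluded, even though the care delivered is identical.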

Given these methodological problems, physician skepticism about the motivation for and accuracy of performance measurement programs is understandable [14, 15]. While physicians overwhelmingly believe that financial incentives should be given for high-quality care, fewer than one-third think that current performance measures are accurate, and only slightly more endorse the statement that those responsible for designing quality measures will work to ensure their accuracy [16]. Those who are being profiled expect rigorous statistical methods and approaches for performance measurement that are reproducible and robust. Failure to design methodologically rigorous performance measurement programs may limit physician buy-in and hinder quality improvement.

Poorly designed measures may lead to unintended consequences, including erroneously identifying physicians as poor performers and the even more concerning possibility that physicians may avoid seriously ill patients to prevent negative impacts on their individual or hospital ratings. Professionalism is what keeps physicians from weighing their personal and practice financial welfare ahead of that of their patients, and these programs must be designed so that they do not overwhelm professionalism.

Why might financial incentives work to improve guideline adherence, above and beyond other interventions such as computerized reminders or audit and feedback? Of course, there are myriad reasons, including professionalism and intrinsic motivation, for physicians to do a good job. But financial incentives for individual effort and task performance might amplify the effects of educational interventions and performance feedback reports. According to Bandura’s self-efficacy theory, incentives work by piquing an individual’s interest in a task, leading to greater effort at performing the task and ultimately to an increased sense of self-efficacy [17]. The goal of the incentive is to ignite motivation rather than to coerce or to overcome professionalism.

This case illustrates some of the pitfalls of performance measures and pay-for-performance programs. In this hypothetical case, the practice is rewarded for the proportion of patients who have achieved an arbitrarily bounded threshold blood pressure goal. As clinicians, we know that there are multiple reasons that patients do not achieve a given blood pressure threshold, many having little to do with the clinician and more to do with the patient’s adherence or preferences and medication efficacy, side effects, affordability, and so on. Therefore, the best measures of quality of care should reward clinicians for “doing the right thing,” regardless of whether the patient meets a particular blood pressure goal.

In this case, despite the clinician’s best intentions, the patient does not wish to pursue weight loss and lifestyle modifications. Ideally, there should be a way to reward the doctor for having the discussion, educating the patient about lifestyle modifications, and then documenting that the care provided followed patient preferences. But it appears that Dr. Mehta feels she is left with a choice between prescribing medication and having the practice forgo the reward. The case raises the issue of whether the physicians in this practice can put the patient’s well-being ahead of the personal or practice group financial implications of treatment decisions, suggesting that a different performance metric and reward system are needed to properly align incentives.

Ratings of the quality of care at the hospital level (e.g., Hospital Compare, Consumer Reports, and others), at the practice group level (by health plans such as UnitedHealth and others), and at the level of individual clinicians (on websites such as Angie’s List) are becoming ubiquitous. And changes in the way that clinicians are rated and reimbursed are inevitable under the ACA [18]. But as in anything else, what we measure matters. The challenge is to create measures and performance pay plans that enhance quality, support professionalism, and align incentives to promote delivery of high-quality care. Involving physicians in the design and execution of these programs may help achieve these goals.


