In response to the pressures of health care cost inflation, the Centers for Medicare & Medicaid Services (CMS) and other payers have been investigating "value-based purchasing" initiatives. These initiatives, including pay-for-performance, assume that the entities from which payers buy health care have a sufficiently large number of patients with specific conditions to support statistically valid comparative measurement.
In reality, however, most primary care physicians provide a wide variety of services to a wide variety of patients. Consequently, they do not see enough eligible patients in any given category to produce statistically reliable performance measurements. In essence, individual primary care physicians are often as indistinguishable as Tweedledee and Tweedledum for performance measurement purposes.
For proponents of accountable care organizations and similar arrangements, the answer to this conundrum is primary care physician aggregation. That is, if you put enough primary care physicians together and measure their collective performance at the level of the practice in which they work, you can overcome the sampling problem posed at the level of the individual primary care physician.
That may be true, but unfortunately, you have to do a whole lot of aggregating to get to that point, and most practices aren't that big. This point was driven home to me by a study by Nyweide et al. in the Dec. 9, 2009, edition of the Journal of the American Medical Association. The authors concluded that relatively few primary care practices are large enough to reliably measure even 10 percent relative differences in common measures of quality and cost performance (at least among Medicare fee-for-service patients). For instance, they calculated that it would take a caseload of 2,526 Medicare fee-for-service patients to detect a 10 percent difference in ambulatory costs. However, only practices with 11 or more primary care physicians are likely to reach that threshold, and fewer than 2 percent of practices have that many primary care physicians.
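To get a feel for why the required caseloads run so high, here is a rough sketch of a textbook sample-size calculation. It is not the method Nyweide et al. used; it assumes a simple two-sided z-test on an average, and the function name, the 5 percent significance level, the 80 percent power, and the coefficient-of-variation values (how much patients vary relative to the average) are all illustrative assumptions of mine, not figures from the study.

```python
import math
from statistics import NormalDist


def required_caseload(cv, rel_diff, alpha=0.05, power=0.80):
    """Approximate patients needed to detect a relative difference in a mean.

    Textbook one-sample, two-sided z-test approximation (an assumption,
    not the study's method):
        n = ((z_alpha + z_beta) * cv / rel_diff) ** 2
    cv       -- coefficient of variation (std. dev. / mean) across patients
    rel_diff -- relative difference to detect, e.g. 0.10 for 10 percent
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return math.ceil(((z_alpha + z_beta) * cv / rel_diff) ** 2)


if __name__ == "__main__":
    # Hypothetical variability levels, chosen only for illustration.
    for cv in (0.5, 1.0, 2.0):
        n = required_caseload(cv, rel_diff=0.10)
        print(f"cv={cv}: about {n} patients to detect a 10 percent difference")
```

Under these assumptions, the needed caseload grows with the square of patient-to-patient variability and shrinks with the square of the difference you want to detect, so a highly variable measure such as cost quickly demands more patients than a small practice sees.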
According to the study, no groups had enough patients to detect 10 percent differences in preventable hospitalizations or congestive heart failure readmissions. This implies that a group of Tweedledees is as indistinguishable from a group of Tweedledums as the individuals are from each other.
Does this mean that CMS and other payers are likely to abandon pay-for-performance, tiering and steering, and other efforts to grade physicians relative to each other? I would suggest that you would be as Mad as a Hatter to think so. As for the payers, they're not saying, just grinning like the Cheshire Cat.