Glance and colleagues1 provide a damning assessment of the association between physician performance in the Merit-Based Incentive Payment System (MIPS) and the measured outcomes of patients in the associated hospitals. Figure 1 of their study1 illustrates the results most clearly. It shows no association between MIPS percentile performance and a composite patient safety and adverse event score, based on a very large sample size. Secondary results by physician specialty and by specific outcomes, such as in-hospital mortality and failure-to-rescue rates, were largely indistinguishable from random chance.
The Centers for Medicare & Medicaid Services (CMS) have spent more than a billion dollars developing quality measures, and hospitals and clinicians have spent many billions more attempting to meet them. Why? Because society values health care, spends a lot of money on it, and wants reassurance that it is effective. However, health care is also extremely complex, and quality is difficult to define. Like all quality measurement systems, MIPS is a balance between multiple, often conflicting, goals. The public, and CMS, want measures that reflect real outcomes of health care, such as preventable death, and they want them at the level of an individual physician because that is the face of health care that they see. However, health care is provided by teams of people across multiple environments, even for straightforward episodes of care, and must encompass a wide range of variability in patient acuity and comorbidities. Attribution of responsibility for a bad outcome to any single member of the team is challenging, and risk adjustment for patient factors is required to enable fair comparisons. Applying financial incentives to the process motivates participation but also inspires gamesmanship at multiple levels.
Development and approval of a new MIPS measure, eg, through the National Quality Forum (NQF), takes many years and many dollars. A natural desire for methodological rigor has led to a process so bureaucratic that few organizations are willing to take it on; a candidate measure submission to the NQF can run hundreds of pages and cost tens of thousands of dollars to produce. And those that do take on the task of measure development—typically physician specialty associations—are conflicted at the outset. What professional association is going to advance a MIPS measure that makes its members look bad? The result is a set of measures that might be reassuring to the public because performance is uniformly high but do nothing to demonstrate variations in care that might enable quality improvement. Glance et al1 have demonstrated the resulting disconnect between intention and results.
Failure to report MIPS data to CMS in 2021 will result in a 9% penalty to reimbursement in 2023, a substantial enough incentive to motivate more than 90% of eligible physicians to participate. As noted in the study,1 most physicians will do so through group reporting mechanisms. These have evolved to achieve MIPS-reporting compliance with the least possible administrative burden, including the common mechanism of quality registries that have no function but to move topped-out process performance data to CMS. It is not surprising that most clinicians in private practice regard MIPS as a regulatory tax to be met with the least possible investment of either money or time. Academic clinicians can be even more cynical because they are usually part of a multispecialty physician practice plan under a common taxpayer identification number. They can share a group score based on a few carefully selected measures that have nothing to do with their individual skills or competence. It might be reassuring to a surgeon to know the practice plan’s benchmarked performance for blood pressure screening or medication reconciliation, but this information is unlikely to improve their work. For some specialists, such as anesthesiologists, bad outcomes are so rare as to be nearly random, meaning that any meaningful outcome measure will have little chance of achieving statistical validity at the individual level. MIPS measures for intraoperative death, cardiac arrest under anesthesia, and postoperative reintubation have all been retired by CMS over the years due to uniformly high performance and despite their obvious importance to the public.
Some glimmers of hope can be seen in MIPS and in the work of Glance et al.1 By endorsing evidence-based process measures, CMS has accelerated the uptake of clinical best practices that might have been slow to move into the community if not financially incentivized; current examples in anesthesia include screening and mitigation of obstructive sleep apnea and use of multimodal techniques to manage perioperative pain. In the study by Glance et al,1 the only positive association that stood out was between MIPS scores for cardiac surgeons and the hospital’s cardiac surgery outcomes. This is partly because the relevant MIPS measures cover coronary artery bypass surgery and valve replacement, which represent a large portion of all cardiac surgery (as opposed to knee replacement surgery, which is common but still only a fraction of all orthopedic procedures). But it is also because the Society of Thoracic Surgeons (STS) has invested decades of work in building an inclusive and very detailed national quality registry that supports MIPS reporting as a secondary benefit, rather than a primary purpose, and that enables appropriate risk adjustment for benchmarking of hospital and clinician outcomes. Unfortunately, the cost and focus of the STS registry have proven difficult to replicate in other specialties.
CMS should rethink their pay-for-performance strategy for clinicians. What has worked well for hospitals does not work for physicians. As presently constructed, MIPS does little but contribute to the 34% of US health care dollars spent on administrative activities, with only marginal gains in quality improvement.2 The public would be better served by investment in high-quality clinical registries—perhaps enabled by mandatory interoperability of electronic medical records—or by a system that considers clinicians as one part of a facility-based team, with high-level clinical outcomes attributed to all participants equally. It is time to buy the emperor some new clothes—and make sure they are visible to all.
Published: August 3, 2021. doi:10.1001/jamanetworkopen.2021.19334
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2021 Dutton RP. JAMA Network Open.
Corresponding Author: Richard P. Dutton, MD, MBA, US Anesthesia Partners, 12222 Merit Dr, Dallas, TX 75251 (firstname.lastname@example.org).
Conflict of Interest Disclosures: None reported.
Dutton RP. The Merit-Based Incentive Payment System and Quality—Is There Value in the Emperor’s New Clothes? JAMA Netw Open. 2021;4(8):e2119334. doi:10.1001/jamanetworkopen.2021.19334