Calculation of Overall Hospital Quality Star Ratings With and Without Inclusion of the Peer Grouping Step

Key Points

Question What are the implications of applying a peer grouping step on hospitals’ Overall Star Ratings?

Finding In this cross-sectional study of 3076 hospitals that received a star rating in 2023, presence of the peer grouping step resulted in 585 hospitals (19.0%) being assigned a different star rating than if the peer grouping step were absent, including considerably more hospitals having a higher star rating (517 hospitals) than a lower star rating (68 hospitals).

Meaning These findings suggest that the inclusion of a star ratings peer grouping allows an updated comparison of quality between hospitals and better supports the ability of patients to assess overall hospital quality.



Introduction
Care Compare is the health care comparison website created by the Centers for Medicare & Medicaid Services (CMS) to offer patients and caregivers information on the quality of health care systems and practitioners [1]. CMS introduced the Overall Hospital Quality Star Rating (hereafter, Overall Star Rating) in 2016 as a summary score of the quality measures reported on Care Compare. The Overall Star Rating assigns hospitals 1 to 5 stars based on their overall performance and is intended to be easily interpreted by patients and consumers [2]. The visibility of the Overall Star Rating has been shown to contribute to patient decision-making and has garnered high levels of attention from hospitals seeking to improve scores to attract patients [3,4]. Prior reports [7-11] noted that certain hospitals may have benefited by being scored only on measures related to procedures they commonly performed or conditions they most often treated, while not being scored on measures for other conditions or procedures in which the hospital was less experienced due to low patient volumes. With all hospitals scored together in prior versions of the Overall Star Rating, these hospital measure exclusions may have advantaged or disadvantaged certain hospitals based on heterogeneous hospital characteristics, such as patient mix and service lines. These differences raised the possibility that some hospitals would be classified with a different star rating had they been compared only with hospitals scored on a similar type and number of measures [12]. A proposed solution was peer grouping, in which hospitals receiving star ratings would be compared with peers with similar patterns of reported quality measures.
In January 2021, CMS published version 4.1 of the Overall Star Ratings method, with several significant methodological updates intended to address prior concerns and improve comparability between hospitals [13]. Included in these changes was the incorporation of a peer grouping step, acknowledging that hospitals may be scored on different types and numbers of measures based on their size, patient volume, services provided, and patient mix. This update was meant to allow fair comparison among hospitals included in CMS programs, reflecting the overall quality of like hospitals. To our knowledge, there are no published data on how hospital star ratings are influenced by the presence of a peer grouping step. Therefore, we sought to assess how the incorporation of the peer grouping step changed 2023 hospital star ratings.

Methods
This cross-sectional study was approved by the Yale University institutional review board. The study was deemed exempt from informed consent because it was covered by the Common Rule exemption described at 45 CFR 46.104(d)(4)(i). We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

Data Sources
We used the January 2023 Care Compare results as the primary dataset for analyses [14]. Updated on a quarterly basis, Care Compare is a publicly available website that reports numerous process, structural, and outcome measures for acute inpatient and outpatient hospitals [1]. Using the hospital-specific CMS Certification Number, we linked the Care Compare dataset to several additional sources to incorporate data on hospital characteristics. Specifically, we used the 2018 Hospital Provider Cost Report to obtain hospital specialty status, bed size, urban or rural designation, and disproportionate share percentage [15]. We used the FY 2022 IPPS Final Rule Impact file to obtain hospital teaching status [16].

Overall Star Ratings Peer Grouping
To directly assess the peer grouping step, we calculated star rating assignments in 2 ways: with and without peer grouping. Our calculations followed the published methods report [13] and were otherwise identical between the 2 approaches. In brief, we included all 46 underlying measures within the 5 measure groups that comprise Overall Star Ratings: mortality (7 measures), readmission (11 measures), safety of care (8 measures), patient experience (8 measures), and timely and effective care (12 measures) (eMethods in Supplement 1). We calculated a measure group score as the simple mean of the standardized measure scores within the group for each hospital and then determined a hospital summary score as the weighted mean of each hospital's measure group scores. For example, if a hospital had Care Compare scores for 8 measures within the safety of care measure group, each standardized measure score would contribute 12.5% toward the safety of care measure group score. Similarly, for a hospital with standardized measure scores for all 5 measure groups, the safety of care measure group would contribute 22% toward the hospital summary score, as would each of the mortality, readmission, and patient experience measure groups, with the timely and effective care measure group contributing the remaining 12%.
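The two-stage scoring described above can be illustrated with a minimal Python sketch. It is not the CMS or study implementation: the function names and measure group keys are hypothetical, and the proportional rescaling of weights for hospitals missing a measure group is an assumption, since the text states the weights only for hospitals with all 5 measure groups.

```python
from statistics import mean

# Measure group weights for a hospital with all 5 groups, as stated in the text.
GROUP_WEIGHTS = {
    "mortality": 0.22,
    "readmission": 0.22,
    "safety_of_care": 0.22,
    "patient_experience": 0.22,
    "timely_effective_care": 0.12,
}


def measure_group_score(standardized_scores):
    """Simple mean of a hospital's standardized measure scores in one group.

    With 8 reported measures, each measure contributes 1/8 = 12.5%.
    """
    return mean(standardized_scores)


def summary_score(group_scores, weights=GROUP_WEIGHTS):
    """Weighted mean of a hospital's measure group scores.

    ASSUMPTION: if a hospital lacks a measure group, the weights of the
    remaining groups are rescaled proportionally so that they sum to 1.
    """
    total_weight = sum(weights[group] for group in group_scores)
    weighted_sum = sum(weights[group] * score for group, score in group_scores.items())
    return weighted_sum / total_weight
```

For a hospital with all 5 measure group scores, `summary_score` reduces to 0.22 times the sum of the mortality, readmission, safety of care, and patient experience scores plus 0.12 times the timely and effective care score.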
We then applied CMS's public reporting thresholds, which require hospitals to have at least 3 measures in each of at least 3 measure groups (1 of which must be the mortality or safety of care group) to receive a star rating. For the approach with the peer grouping step, the remaining hospitals were assigned to a peer group based on the number of measure groups in which the hospital had 3 or more reported measures, resulting in a 3-measure group peer group (peer group 3), a 4-measure group peer group (peer group 4), and a 5-measure group peer group (peer group 5). For the approach without the peer grouping step, all hospitals meeting the threshold were compared in a single group; this was the only difference between the approaches. Independently within each peer group (when peer grouping was performed) or across all hospitals (when it was not), we then used a k-means clustering algorithm to classify hospitals into 5 star rating categories based on the hospital summary score, with 1 star assigned to the lowest-performing hospitals and 5 stars to the highest.
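The threshold, peer group assignment, and clustering steps can be sketched as follows. This is an illustration under stated assumptions, not the CMS or SAS implementation used in the study: the function names and measure group keys are hypothetical, the k-means routine is a plain one-dimensional Lloyd's algorithm, and the quantile-based starting centers are an arbitrary choice made for reproducibility.

```python
def peer_group(measures_per_group):
    """Return the peer group (3, 4, or 5), or None if below the reporting threshold.

    measures_per_group maps a measure group name to the number of reported measures.
    """
    qualifying = {g for g, n in measures_per_group.items() if n >= 3}
    has_required = bool(qualifying & {"mortality", "safety_of_care"})
    if len(qualifying) < 3 or not has_required:
        return None
    return len(qualifying)


def kmeans_1d(values, k=5, max_iter=100):
    """One-dimensional k-means; returns a rank from 1 (lowest) to k per value.

    ASSUMPTIONS: len(values) >= k, and starting centers are taken at evenly
    spaced quantiles of the sorted values.
    """
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    # Rank clusters by center so that 1 star is the lowest-scoring cluster.
    order = sorted(range(k), key=lambda j: centers[j])
    star_of = {j: rank + 1 for rank, j in enumerate(order)}
    return [star_of[lab] for lab in labels]


def assign_stars(scores, groups, use_peer_grouping=True):
    """scores and groups are dicts keyed by hospital id; returns id -> stars."""
    pools = {}
    for hospital, group in groups.items():
        if group is None:
            continue  # hospital does not meet the reporting threshold
        pools.setdefault(group if use_peer_grouping else "all", []).append(hospital)
    stars = {}
    for members in pools.values():
        stars.update(zip(members, kmeans_1d([scores[h] for h in members])))
    return stars
```

Calling `assign_stars` twice on the same summary scores, once with `use_peer_grouping=True` and once with `False`, mirrors the comparison in this study: the clustering step is identical, and only the pool of hospitals being clustered changes.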

Statistical Analysis
To better understand the composition of hospitals within each peer group, we described characteristics for each peer group of hospitals, including hospital specialty status, teaching status, safety-net status, critical access status, bed size, geography, and disproportionate share hospital (DSH) patient percentage (calculated as [Medicare Supplemental Security Income days / total Medicare days] + [Medicaid, non-Medicare days / total patient days]). We classified teaching status as nonteaching if no residents were present, minor teaching if fewer than 100 residents were present, and major teaching if at least 100 residents were present. We classified hospitals as safety net hospitals if they were publicly funded or if the hospital's number of inpatient Medicaid discharges was 1 SD above the state mean.
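The disproportionate share percentage formula and the teaching status rule above translate directly into code. The following Python sketch uses hypothetical function and parameter names; only the formula and the resident-count cutoffs come from the text.

```python
def dsh_patient_percentage(medicare_ssi_days, total_medicare_days,
                           medicaid_non_medicare_days, total_patient_days):
    """Sum of the two fractions given in the text, expressed as a proportion."""
    return (medicare_ssi_days / total_medicare_days
            + medicaid_non_medicare_days / total_patient_days)


def teaching_status(resident_count):
    """Classify teaching status from the number of residents present."""
    if resident_count == 0:
        return "nonteaching"
    if resident_count < 100:
        return "minor teaching"
    return "major teaching"
```

As an illustration with made-up inputs, a hospital with 200 Medicare Supplemental Security Income days out of 1000 Medicare days and 300 Medicaid, non-Medicare days out of 2000 total patient days would have a percentage of 0.20 + 0.15 = 0.35.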
We then identified the distribution of star ratings with and without the peer grouping step. We also identified the number of hospitals with a higher, lower, or identical star rating with the use of the peer grouping step compared with its nonuse to understand how the star ratings of hospitals with certain characteristics were influenced by the peer grouping step.
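The hospital-level comparison described above amounts to a tally of rating differences. A minimal Python sketch, with a hypothetical function name and dictionaries keyed by hospital identifier, might look like this:

```python
from collections import Counter


def compare_ratings(stars_with, stars_without):
    """Count hospitals rated higher, lower, or identically with peer grouping.

    Both arguments map a hospital id to its star rating (1 to 5).
    """
    counts = Counter()
    for hospital, with_pg in stars_with.items():
        difference = with_pg - stars_without[hospital]
        if difference > 0:
            counts["higher"] += 1
        elif difference < 0:
            counts["lower"] += 1
        else:
            counts["identical"] += 1
    return counts
```

Stratifying the same tally by peer group or by hospital characteristic only requires filtering the hospital identifiers before calling the function.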
We further sought to describe the measure group reporting patterns that drove changes in star ratings, identifying which measure groups were not frequently reported by hospitals in peer groups 3 and 4. Finally, we calculated the mean score within each measure group, stratified by peer group and star rating. P values were 2-sided, and statistical significance was set at P < .05. Analyses were conducted using SAS version 9.4 (SAS Institute). Data were analyzed from April 2023 to December 2023.

Results
Of 4654 hospitals with quality measures reported on Care Compare in January 2023, 3076 were assigned a star rating. Of hospitals that were assigned a star rating, most were nonspecialty (Table 1).
Application of peer grouping resulted in 585 hospitals (19.0%) being assigned a different star rating compared with no peer grouping; no hospital's star rating differed by more than 1 star. In peer group 3, a total of 24 hospitals had a higher star rating and 20 hospitals had a lower star rating with the use of peer grouping than without its use. In peer group 4, 48 hospitals had a lower star rating while none had a higher star rating with the use of peer grouping than without its use. Conversely, in peer group 5, all 493 hospitals whose rating differed had a higher star rating with the use of peer grouping than without its use, and none had a lower star rating (Table 2).
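The peer group counts reported above can be reconciled with the overall totals using a short Python check. All figures are taken from the text; the variable names are illustrative.

```python
# Hospitals with a different star rating under peer grouping, by peer group.
changes = {
    3: {"higher": 24, "lower": 20},
    4: {"higher": 0, "lower": 48},
    5: {"higher": 493, "lower": 0},
}
rated_hospitals = 3076

higher = sum(group["higher"] for group in changes.values())  # 517
lower = sum(group["lower"] for group in changes.values())    # 68
changed = higher + lower                                     # 585
percent_changed = round(100 * changed / rated_hospitals, 1)  # 19.0
```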
With peer grouping applied, we identified modest shifts in Overall Star Ratings, stratified by key hospital characteristics. Of 68 hospitals receiving a lower star rating with peer grouping, most were rural, non-safety net hospitals and hospitals with low bed volume (1-99 beds). Of 517 hospitals receiving a higher star rating with peer grouping, most were urban, non-safety net hospitals and hospitals with a higher bed volume (Table 3).
By definition, all 2420 hospitals in peer group 5 reported at least 3 measures in each of the measure groups. Most hospitals (71.6%) in peer group 3 did not meet reporting requirements for the patient experience measure group but almost universally were scored on at least 3 measures within the readmission and timely and effective care measure groups (Table 4). Within each peer group, hospitals receiving a greater number of stars tended to have higher mean measure group scores (eTable 2 in Supplement 1).

Discussion
In this cross-sectional study, we describe the application of a peer grouping step and the associated changes in the assignment of CMS's overall hospital star rating. Because 19% of hospitals receiving a 2023 star rating would have received a different rating without peer grouping (by no more than 1 star), our results show a modest change in hospitals' assigned star ratings from the peer grouping step, which groups hospitals with similar characteristics together. Incorporating peer grouping within the star rating method allows comparison of hospitals for which a similar amount of measure information is reported and which therefore likely have similar patient volumes and patient mixes. Overall, the peer grouping step improves the star ratings assignment method by enhancing the comparability of hospitals receiving a star rating.
A prior "Rating the Raters" assessment of 4 hospital quality rating systems identified that an earlier version of the star rating method, without peer grouping, placed a heterogeneous collection of hospitals into a single group [12]. Stakeholders identified that larger hospitals with a more diverse patient mix and service mix, such as large urban teaching hospitals, frequently report a greater number of measures and should therefore be compared within a more directly comparable group [17]. Stratification by hospital characteristics showed noticeable differences in the makeup of each of the 3 peer groups, particularly for bed size, geography, and critical access status. This finding is aligned with and expands on the work by Chung et al [11], who assessed the 2017 star ratings program, which predated peer grouping, and used a cluster analytic approach to outline a method that grouped hospitals "that take similar tests," as they report similar measures and similar numbers of measures. The 3 clusters had patterns of unique characteristics, including teaching status, size, and patient mix index, yet the work was limited by potential complexity and difficulty for stakeholders to understand, as the number and composition of clusters could change every year. The current approach to peer grouping in the Overall Star Rating version 4.1 method captures the benefit of a considerable degree of proportional consistency between hospital characteristics and the number of measure groups with multiple measures, while also resulting in an easily interpretable 3 to 5 measure group peer group stratification.

Table 3 footnote a: An example interpretation is that 351 (67.9%) of the 517 hospitals that had a higher star rating after implementation of the peer grouping step were in an urban geography.

Presence of the peer grouping step resulted in 3 noteworthy trends. First, a review of the overall outcome found that a considerably greater number of hospitals received a higher star rating (517 hospitals) than a lower rating (68 hospitals) when ratings were calculated using peer grouping.

JAMA Network Open | Health Policy
Second, all hospitals in peer group 4 with a star rating difference received a lower rating with the use of peer grouping, while all hospitals in peer group 5 with a star rating difference received a higher rating. Based on hospital characteristics, the lower ratings were observed for hospitals that were predominantly small (low number of beds), rural, or teaching. Prior literature speculated that some hospitals with these characteristics may have had an advantage in earlier versions of Overall Star Ratings due to not being scored on measures with low patient volumes [5]. Based on our findings, use of the peer grouping step may result in a more appropriate comparison among similar hospitals.
Third, the more granular assessment of measure group reporting frequencies can help answer the question of what is driving changes in star ratings with the addition of the peer grouping step. In peer group 4, the mortality measure group was most commonly (7.4%) omitted from public reporting due to insufficient patient volumes; in peer group 3, the safety of care and patient experience measure groups were most commonly omitted (25.8% and 71.6%, respectively). This provides evidence of greater homogeneity in the submitted measure groups when comparisons among hospitals are made with the peer grouping step.

Limitations
We recognize several limitations of our study. First, the Overall Star Rating only includes hospitals that meet the reporting requirements for a sufficient selection of the underlying measures reported on Care Compare. Thus, our findings do not account for hospitals that did not meet inclusion criteria to receive a star rating. However, because our focus was only on hospitals that were included in publicly reported Overall Star Ratings, this is a minor limitation. Second, we recognize some incongruity between the date of the measurement results (January 2023) and the date at which some hospital characteristics were defined (2018). It is possible that characteristics of included hospitals changed between 2018 and 2023. Third, our analyses did not assess whether the new Overall Star Rating method is more or less accurate; rather, they assessed agreement between the use and nonuse of the peer grouping step. Fourth, in accordance with the overall hospital star ratings methods, hospitals were assigned to a peer group based on the number of measure groups in which the hospital had 3 or more reported measures. While this approach was determined through several years of stakeholder engagement discussions, other methodologic approaches may be considered to explicitly group peer hospitals together, such as grouping by hospital characteristics (eg, specialty status, teaching status, critical access status). However, recent CMS final rules have shown that the number of measure groups reported by a hospital is closely associated with certain characteristics and is therefore a valid approach [18]; as an example, urban teaching hospitals with a greater number of beds are more frequently within peer group 5 compared with rural nonteaching hospitals with a smaller number of beds.

As a result, large academic medical centers that reported nearly all of the measures in the star ratings method were compared with critical access hospitals reporting considerably fewer. As part of discussions with stakeholders regarding the comparability of hospital star ratings, peer grouping was discussed and gained face validity in technical expert panels, patient advocate workgroups, organizational leadership workgroups, and public comment periods.

Table 1. Overall Star Ratings and Hospital Characteristics Stratified by Peer Group, January 2023
a DSH quintiles are defined among the 2411 hospitals with a star rating and an operating DSH adjustment. Quintile 1 includes hospitals with the lowest DSH percentage.

Table 2. Overall Star Rating Distribution With and Without the Peer Grouping Step for January 2023 Care Compare Reporting Hospitals
a An example interpretation is that among the 2420 hospitals in peer group 5, 209 hospitals would have received a star rating of 3 if peer grouping were not used but would have received a star rating of 4 if peer grouping were used.

Table 4. Measure Group Reporting by Hospitals Within Peer Groups

[...] development of the peer grouping step and methods used in calculations and valuable contributions to this study. Their work was compensated under the Centers for Medicare & Medicaid Services contract.