eTable 1. Baseline Characteristics for Medicare Patients In the Intervention Group Who Did and Did Not Receive Care Coordination Services
eFigure 1. Mean (Unadjusted) All-Cause Hospitalization Rates by Quarter, Intervention Group, and Risk Status
eFigure 2. Mean (Unadjusted) Ambulatory Care Sensitive Hospital Admissions, by Quarter, Intervention Group, and Risk Status
eFigure 3. Mean (Unadjusted) Outpatient Emergency Department Visit Rates by Quarter, Intervention Group, and Risk Status
eFigure 4. Mean (Unadjusted) Medicare Part A and B Spending by Quarter, Intervention Group, and Risk Status
eFigure 5. Mean (Unadjusted) Rates of Ambulatory Care Follow-up Visit Within 14 days of Hospital Discharge, by Quarter and Intervention Group
Peterson GG, Geonnotti KL, Hula L, et al. Association Between Extending CareFirst’s Medical Home Program to Medicare Patients and Quality of Care, Utilization, and Spending. JAMA Intern Med. 2017;177(9):1334–1342. doi:10.1001/jamainternmed.2017.2775
Does CareFirst’s medical home program, which provides financial incentives to primary care practices and care coordination for high-risk patients, improve quality of care and reduce hospitalizations, emergency department visits, and spending for Medicare patients?
In a difference-in-differences analysis with 52 intervention practices and matched comparison practices, the program was not associated with outcome improvements for Medicare patients. Hospitalizations declined by 10%, but this was matched by similar changes in the comparison group, suggesting that outside market factors drove the decline in the treatment group.
This medical home model needs further adaptation and testing before being scaled broadly to Medicare patients.
CareFirst, the largest commercial insurer in the mid-Atlantic Region of the United States, runs a medical home program focusing on financial incentives for primary care practices and care coordination for high-risk patients. From 2013 to 2015, CareFirst extended the program to Medicare fee-for-service (FFS) beneficiaries in participating practices. If the model extension improved quality while reducing spending, the Centers for Medicare and Medicaid Services could expand the program to Medicare beneficiaries broadly.
To test whether extending CareFirst’s program to Medicare FFS patients improves care processes and reduces hospitalizations, emergency department visits, and spending.
Design, Setting, and Participants
This difference-in-differences analysis compared outcomes for roughly 35 000 Medicare FFS patients attributed to 52 intervention practices (grouped by CareFirst into 14 “medical panels”) with outcomes for roughly 69 000 Medicare patients attributed to 42 matched comparison panels during a 1-year baseline period and a 2.5-year intervention period at Maryland primary care practices.
Main Outcomes and Measures
Hospitalizations (all-cause and ambulatory-care sensitive), emergency department visits, Medicare Part A and B spending, and 3 quality-of-care process measures: ambulatory care within 14 days of a hospital stay, cholesterol testing for those with ischemic vascular disease, and a composite measure for those with diabetes.
CareFirst hired nurses who worked with patients’ usual primary care practitioners to coordinate care for 3656 high-risk Medicare patients. CareFirst paid panels rewards for meeting cost and quality targets for their Medicare patients and advised panels on how to meet these targets based on analyses of claims data.
On average, each of the 14 intervention panels had 9.3 primary care practitioners and was attributed 2202 Medicare FFS patients in the baseline period. The panels’ attributed Medicare patients were, on average, 73.8 years old, 59.2% female, and 85.1% white. The extension of CareFirst’s program to Medicare patients was not associated with statistically significant improvements in any outcomes, either for the full Medicare population or for a high-risk subgroup in which impacts were expected to be largest. For the full population, the difference-in-differences estimates were 1.4 hospitalizations per 1000 patients per quarter (P = .54; 90% CI, −2.1 to 5.0), −2.5 outpatient ED visits per 1000 patients per quarter (P = .26; 90% CI, −6.2 to 1.1), and −$1 per patient per month in Medicare Part A and B spending (P = .98; 90% CI, −$40 to $39). For hospitalizations and Medicare spending, the 90% CIs did not span CareFirst's expected impacts. Hospitalizations for the intervention group declined by 10% from the baseline year to the final 18 months of the intervention, but this was matched by similar declines in the comparison group.
Conclusions and Relevance
The extension of CareFirst’s program to Medicare did not measurably improve quality-of-care processes or reduce service use or spending for Medicare patients. Further program refinement and testing would be needed to support scaling the program more broadly to Medicare patients.
Payers and primary care practitioners (PCPs) (physicians, nurse practitioners, and physician assistants) have embraced the patient-centered medical home as a way to improve health system performance.1 By promoting care that is team-based, focused on the whole person, accessible, and coordinated across the health care system and community, the model aims to improve quality of medical care while reducing overall spending.2,3 Some studies have found that medical home interventions have improved care quality, patient and practitioner satisfaction, and/or reduced service use and spending modestly while others have not.4-9 The findings are mixed, in part, because the medical home concept is broad and the studies test different interventions.10
CareFirst BlueCross BlueShield, the largest commercial insurer in the mid-Atlantic region, runs a medical home program for its commercial members focused on care coordination for high-risk patients and strong financial incentives for PCPs to meet quality and cost targets.11,12 In 2012, the Centers for Medicare and Medicaid Services (CMS) awarded CareFirst a $20 million Health Care Innovation Award to extend its medical home program to Medicare fee-for-service (FFS) patients. CareFirst hypothesized that combining Medicare and CareFirst into a single program would create powerful incentives for PCPs to change referral patterns and better coordinate care for their high-risk patients.12 This, in turn, would reduce hospital admissions, emergency department (ED) visits, and Medicare spending. If the program proved successful, CMS could expand it to other practices, potentially even nationwide.
In this independent evaluation of CareFirst’s program for Medicare patients, we describe the program and test 2 prespecified hypotheses.13 Our primary hypothesis was that, after a 1-year ramp-up period, the program would improve quality-of-care processes while reducing use of hospital services and medical spending. Our secondary hypothesis was that any overall impacts would be concentrated among patients at high risk of hospitalizations and other acute care.
The performance unit for CareFirst’s medical home program, both commercial and Medicare, is the “medical panel.” Panels are groups of 5 to 15 PCPs who agree to participate as a unit for quality measurement and share incentive payments. Panels can be solo or small practices that work together, group practices within the size range, or subgroups of large practices.12
In 2012, CareFirst selected 14 of the 450 panels in the commercial program to participate in the expansion to Medicare patients. These 14 panels represented a range of practice types (in size and ownership) and performed well on cost and quality measures for commercial patients.14 The 14 panels included 149 PCPs in 52 primary care practices in Maryland who see roughly 35 000 Medicare patients.12
The program’s extension to Medicare FFS patients ran from August 2013 to December 2015 and focused on 3 components, modeled after the commercial program.14
CareFirst hired registered nurses to work with panels’ PCPs to coordinate care for high-risk Medicare patients with multiple chronic and unstable conditions. Using the Diagnostic Cost Grouper classification model, CareFirst grouped patients into risk bands based on their predicted future spending and encouraged PCPs and nurses to select patients in the top bands.12 Working with the patients and their PCPs, nurses developed and implemented care plans over several months to a year, contacting patients about once per week by telephone. Care plans were designed to help control chronic conditions and avoid acute exacerbations by focusing on medication reconciliation, coordination with specialists, self-management support, and responses to early warning signs.
CareFirst paid rewards to panels that kept the total cost of care for their attributed Medicare FFS patients below a prespecified target, with the reward size scaled to the panel’s performance on 60 measures of PCP engagement with the program, clinical quality, patient access to care, and structural capabilities. The PCPs also received $200 for developing each new care plan and $100 for each care plan update.
CareFirst hired 5 program consultants who analyzed cost and quality performance data for each panel and met with panels’ PCPs at least quarterly to help identify candidates for care plans, increase PCP engagement, address gaps in care, and refer patients to more cost-effective practitioners or care settings.
CareFirst provided names and dates of Medicare patients receiving care coordination services. We also extracted the following from reports that CareFirst submitted to CMS: number and type of Health Care Innovation Award–funded practitioners; mode, frequency, and content of nurse contacts with patients receiving care coordination; frequency of program consultant meetings with panels; and number and size of rewards that CareFirst paid in each of the 3 performance years (2013-2015). Institutional review board approval was not sought; federal common rule (section 45 CFR 46.101[b]) provides an exemption from the institutional review board requirements when the purpose of the research is to study, evaluate, or otherwise examine a public benefit or service program.
To measure associations of CareFirst’s program with quality, service use, and spending, we compared outcomes for Medicare FFS patients attributed to 14 intervention panels with those attributed to 42 comparison panels in a 1-year baseline period and 2.5-year intervention period. Following CareFirst’s attribution rules, we attributed patients to panels monthly if a panel’s PCP provided the plurality of primary care services to the patient in the past 12 months (or 24 months if no services in the prior 12). Following an intent-to-treat design, we assigned patients to the intervention and comparison panels based on the first panel they were attributed to in the baseline or intervention period, and continued to assign them to that panel throughout the period. Patients remained in the sample as long as they were alive, observable in claims data, and not enrolled in Medicaid (excluded from CareFirst’s target population).
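The plurality-of-services attribution rule described above can be sketched in a few lines. The following is an illustrative Python version on hypothetical visit data (the field names, data layout, and tie handling are assumptions of ours, not CareFirst's actual implementation):

```python
from collections import Counter

def attribute_patient(visits, month, lookback=12):
    """Attribute a patient to the panel whose PCPs provided the plurality of
    primary care services in the prior `lookback` months; if there were no
    services, retry over 24 months (mirroring the rule described in the text)."""
    window = [v for v in visits if month - lookback <= v["month"] < month]
    if not window:
        if lookback == 12:
            return attribute_patient(visits, month, lookback=24)
        return None  # no qualifying services: patient remains unattributed
    counts = Counter(v["panel"] for v in window)
    return counts.most_common(1)[0][0]

# Hypothetical visit history: panel B provides the plurality of services.
visits = [{"month": 3, "panel": "A"},
          {"month": 5, "panel": "B"},
          {"month": 9, "panel": "B"}]
print(attribute_patient(visits, month=12))  # -> B
```

Under the study's intent-to-treat design, a patient would then stay assigned to the first panel this rule produces for the baseline or intervention period.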
Because we aimed to estimate the marginal effect of extending the commercial program to Medicare patients, we selected the 42 comparison panels from the 450 participating in the commercial program in 2013, but not its extension to Medicare patients. We first limited this pool of potential comparison panels to those that, like the intervention panels, were in Maryland, joined the commercial program when it began in 2011, and served at least 1000 CareFirst patients in 2012. We then used propensity scores to match panels based on their cost and quality performance in the commercial program (data supplied by CareFirst), their size and ownership, and the demographics, service use, and spending of their Medicare patients.15
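The panel-matching step can be illustrated with a small propensity-score sketch. Everything below is simulated and simplified: the covariates, sample sizes, and 3:1 nearest-neighbor ratio are hypothetical, and the authors' actual procedure matched on commercial cost and quality performance, size, ownership, and Medicare patient characteristics:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated panel-level covariates (number of PCPs, attributed patients,
# mean patient age); purely illustrative, not the study data.
n_treat, n_pool = 14, 60
X_treat = rng.normal([9.0, 2200.0, 74.0], [2.0, 400.0, 2.0], size=(n_treat, 3))
X_pool = rng.normal([8.5, 2000.0, 74.0], [2.0, 400.0, 2.0], size=(n_pool, 3))

# Fit a logistic propensity model (intervention vs candidate comparison
# panels) by Newton's method.
X = np.vstack([X_treat, X_pool])
X = (X - X.mean(0)) / X.std(0)                  # standardize covariates
X = np.column_stack([np.ones(len(X)), X])       # add intercept
z = np.concatenate([np.ones(n_treat), np.zeros(n_pool)])
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    beta += np.linalg.solve((X.T * (p * (1 - p))) @ X, X.T @ (z - p))
score = X @ beta                                 # linear propensity score

# Pair each intervention panel with its nearest comparison panels on the
# propensity score (3:1 with replacement here, a hypothetical ratio).
matches = {i: np.argsort(np.abs(score[n_treat:] - score[i]))[:3]
           for i in range(n_treat)}
```

The key design point survives the simplification: comparison panels come from the same commercial program and are chosen to resemble the intervention panels before the Medicare extension began.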
We constructed 6 outcomes from Medicare claims and enrollment data. Three outcomes measured quality-of-care processes: (1) whether patients hospitalized in a quarter had all of their stays followed by an ambulatory care visit within 14 days, (2) whether patients with diabetes received 4 recommended processes of care in a year (eye examination, hemoglobin A1c test, lipid test, and nephropathy screen), and (3) whether patients with ischemic vascular disease received a recommended lipid test in a year. We selected these quality-of-care measures because CareFirst incentivized improvements in chronic illness care through payments and because they are consistent with recommendations for medical home evaluations.16 The other 3 measures, defined quarterly, were (1) hospital admissions (all-cause and ambulatory-care sensitive), (2) outpatient ED visits, and (3) total Medicare Part A and B spending.
We implemented the difference-in-differences design using multivariable linear regressions for all outcomes. The independent variables were indicators for each measurement interval (quarter or year, depending on the outcome); fixed effects (dummy variables) for each panel; interactions between each interval during the intervention period and panel participation in the program (these interactions were the difference-in-differences estimates); and beneficiary characteristics defined at the start of either the baseline period (for observations in the 1-year baseline) or the intervention period (for observations in the 2.5-year intervention period). Beneficiary characteristics included age, sex, original reason for Medicare entitlement (disability or old age), Hierarchical Condition Category (HCC) risk score, and the presence of select chronic conditions. We averaged the quarterly difference-in-differences estimates over specific quarters to test the study hypotheses. The regressions used robust standard errors clustered at the patient level, with panel fixed effects accounting for panel-level clustering. We conducted all analyses in Stata statistical software (version 14.1; StataCorp Inc). To avoid falsely concluding that the program had no effects, we considered P < .10 to be statistically significant.
We estimated the models for high-risk patients, whom we defined as those with HCC scores in the top third among intervention panels’ Medicare patients.17 We expected impacts to be larger for this group because (1) their risk of acute care was higher, creating more room for improvement, and (2) a larger fraction (16.0%) of these patients received CareFirst’s care coordination services than other patients (4.6%).
On average, each of the 14 intervention panels had 9.3 primary care practitioners and was attributed 2202 Medicare FFS patients in the baseline period (Table 1). The panels' attributed Medicare patients were, on average, 73.8 years old, 59.2% female, and 85.1% white. The intervention panels were similar (within 0.25 standardized differences) to comparison panels on most baseline measures (Table 1). For admissions, outpatient ED visits, ambulatory follow-up care after discharge, and spending, the panels also showed similar trends during the 4 baseline quarters (Figure and eFigures 1-5 in the Supplement). However, intervention panels, on average, had more PCPs (9.3 vs 8.5) and more Medicare patients (2202 vs 1342). The intervention and comparison panels also differed by more than 0.25 standardized differences on the percentage of patients with selected chronic conditions. However, these differences were small in percentage-point terms and not statistically significant.
All 14 intervention panels participated throughout the 2.5-year intervention. CareFirst hired 44 nurses who worked with PCPs to coordinate care for 3656 Medicare patients, about 8% of all attributed Medicare patients (Table 2). Patients receiving these services were 1.2 to 2.5 times more likely to have major chronic illnesses, like congestive heart failure, than those who did not, and also had higher predicted risk (mean HCC score of 1.77 vs 1.06) (eTable in the Supplement). Nurses coordinated care for patients for 9.5 months, on average, contacting patients about 3 times per month (almost always by telephone). CareFirst paid incentives to panels in each of the 3 performance years (2013-2015). In 2015, 13 of the 14 panels received awards, averaging $170 144 (or $18 280 per PCP). The 5 program consultants provided technical assistance to panels, on average, 3 times per month.
Participation in the intervention was statistically significantly associated with a decrease in the likelihood of receiving the 4 recommended diabetes processes of care for all Medicare patients (P = .007 in intervention year 1 and P = .07 in year 2), but not for the high-risk subgroup (P = .28 in year 1 and P = .80 in year 2) (Table 3). For all other outcomes, for all Medicare patients as well as the high-risk subgroup, the intervention was not associated with any statistically significant changes in outcomes (Table 3 and Table 4). The difference-in-differences estimates for all Medicare patients in months 13 to 30 were 1.4 all-cause hospitalizations per 1000 patients per quarter (P = .51; 90% CI, −2.1 to 5.0), −2.5 outpatient ED visits per 1000 patients per quarter (P = .26; 90% CI, −6.2 to 1.1), and −$1 per beneficiary per month for Medicare Part A and B spending (P = .98; 90% CI, −$40 to $39). The 90% CIs did not span CareFirst's expected impacts for hospitalizations (−4.4 or 6%) or for Medicare spending (−$49 or 5%) but did for outpatient ED visits (−5.4 or 6%), the only other study outcome for which CareFirst set a target. All-cause hospitalizations declined by 10% from the baseline period to the last 18 months of the intervention for Medicare patients in the intervention group, but similar declines occurred in the comparison group (Figure). The means for the treatment and comparison groups tracked each other closely for other outcomes as well (eFigures 1-5 in the Supplement).
This study assessed the impacts of a medical home initiative focused on care coordination for high-risk patients and strong financial incentives for panels that meet cost and quality targets. Primary care practitioners and CareFirst-hired nurses coordinated care for high-risk patients, and CareFirst provided technical assistance and paid incentive awards, all as planned.
The difference-in-differences estimates show that the program did not measurably improve quality-of-care processes or reduce service use, either for all Medicare patients or for the high-risk subgroup in which larger impacts were expected. As a result, the program did not produce Part A and B savings to offset its cost.
Hospitalization rates for the intervention group declined by an average annual rate of 6% from the baseline year through the 2.5-year intervention period, and Medicare spending grew by only 0.3% per year, well below historical rates.18 However, the comparison group showed similar trends, suggesting they were driven by outside forces. Medicare hospitalization rates in Maryland have declined in the past decade, as they have nationally. Furthermore, cost growth has been modest in the past 5 years. These trends may reflect a combination of improved patient health, hospital responses to incentives to reduce readmissions, and a shift of hospital services from inpatient to outpatient settings.18-21 Maryland’s global hospital budget program, which began in 2013 and pays hospitals a fixed revenue not tied to patient volume, may have also reduced hospital admissions in Maryland.22
These trends help explain why our independent evaluation reached different conclusions from CareFirst’s. While CareFirst estimated that its program for Medicare patients saved $65 million, these estimates assumed cost growth without the program of 2.5% per year—much higher than the cost growth observed among matched comparison panels.
Our results contrast with 2 recent studies4,23 that estimated the program’s impacts on commercial patients. Cuellar et al4 found that the program reduced hospitalizations and total spending for commercial patients by 3% in the third year. Using the same time period and data but different sample definitions and regression specifications, Afendulis et al23 found that the program did not generate net savings but did reduce medical spending enough to fully offset the fees and bonuses CareFirst paid to participating panels. These payments ranged from 1% to 3% of total medical spending depending on the year.
Our study differs from these earlier studies in 3 ways. First, we estimated impacts on the Medicare FFS population, not the commercial population. Second, we estimated impacts for the 14 of the 450 commercial panels that CareFirst selected for extension to Medicare, whereas the earlier studies examined effects for all commercial panels. Afendulis et al23 cited lack of practitioner engagement as 1 likely explanation for smaller than expected impacts for the commercial patients, but this is unlikely to be the explanation in our study because CareFirst selected 14 panels that were among the most engaged for extension to Medicare. Indeed, our finding that 90% of PCPs in the 14 panels worked with nurses to develop care plans suggests they were engaged. Finally, the care coordination component reached a larger fraction of the Medicare population than of the commercial population in earlier studies. Specifically, about 8% of panels’ Medicare patients received care coordination services, compared with less than 1% of commercial patients.
Three features of the program’s design may help explain why the CareFirst program did not measurably reduce medical spending for Medicare patients but did for commercial patients.
First, CareFirst used the same algorithm for identifying high-risk Medicare patients as it does for identifying high-risk commercial patients. That algorithm segments the commercial population well; only 11% of commercial patients fall within CareFirst’s top 2 risk bands, and these patients account for 60% of total commercial spending. Therefore, targeting these top 2 bands gives clear direction to nurses about where to focus their efforts. In contrast, 60% of Medicare patients fall in the top 2 risk bands, so the algorithm provides less clear direction to nurses about whom to target. Furthermore, the care coordination services may not be sufficiently tailored to Medicare patients. For example, nurses contacted patients almost exclusively by telephone, whereas previous reviews have found that, for Medicare patients, frequent in-person contact may be critical for reducing hospitalizations.24,25
Second, CareFirst used commercial claims data to classify practitioners as high, medium, or low cost, and program consultants encouraged PCPs to refer Medicare patients more often to lower-cost specialists. However, physicians considered low cost based on commercial claims may in fact be medium or high cost for Medicare patients, given that price differences can drive variation in commercial spending while volume differences drive spending variation in Medicare (where prices are set administratively).26
Finally, by using a benchmark (2.5% per year) that exceeded actual spending growth, CareFirst calculated and paid large incentives to participating panels. Using a benchmark closer to the observed spending growth in Maryland may have signaled to panels the need to continue to adapt their interventions for Medicare patients to meet program aims. McWilliams and Song27 raised similar concerns for other payment reforms, stressing the importance of accurate benchmarks to reward true improvements in care, not forces outside of practitioners’ control.
Our difference-in-differences estimates also found the program was associated with a statistically significant reduction in the percentage of people with diabetes receiving recommended care. While surprising, this may be due to PCPs shifting their attention from lower-risk to higher-risk patients. Rosenthal et al9 also found that a medical home initiative reduced diabetes processes of care, noting that a possible cause was diverting attention away from screening. However, it is also possible that the treatment panels performed unusually well in the baseline year and would have regressed closer to mean performance across all panels in the intervention period even without the intervention.
Our study has 3 main limitations. First, because the design is not experimental, unobservable differences between the intervention and comparison panels may mask program impacts. Second, our quality measures are limited to those measurable in claims and capture only a small set of the quality measures targeted in CareFirst’s incentive payments. Finally, because we estimated the marginal effect of extending CareFirst’s program to Medicare patients, our estimates may miss some positive spillover of the commercial program to Medicare patients. However, we expect any spillover to be small because the core of the intervention is care coordination for individual patients and care for 1 patient is unlikely to substantively change care for others.
This study’s null findings do not support scaling the current version of CareFirst’s program to Medicare patients broadly. The contrast with more favorable results for commercial patients suggests several ways the program could be further adapted to the Medicare population. These include refining the targeting algorithm to better identify those who could benefit from care coordination, adopting care coordination strategies (like in-person contacts) shown to be effective for Medicare patients, and tiering specialists on episode costs for Medicare, rather than commercial, patients. Furthermore, using local benchmarks of actual spending growth to calculate panel performance would improve signals to panels about when they need to refine their approaches. Additional testing would be needed to determine whether these or other changes would lead to a more successful medical home program for Medicare patients.
Corresponding Author: G. Greg Peterson, PhD, MPA, Mathematica Policy Research, 1100 1st St NE, Washington, DC 20002 (email@example.com).
Accepted for Publication: May 10, 2017.
Published Online: July 31, 2017. doi:10.1001/jamainternmed.2017.2775
Author Contributions: Dr Peterson had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Peterson, Geonnotti, Day, Blue, Kranker, Gilman, Stewart, Hoag, Moreno.
Acquisition, analysis, or interpretation of data: Peterson, Geonnotti, Hula, Kranker, Gilman, Stewart, Hoag, Moreno.
Drafting of the manuscript: Peterson, Geonnotti, Hula, Day, Hoag.
Critical revision of the manuscript for important intellectual content: Peterson, Geonnotti, Day, Blue, Kranker, Gilman, Stewart, Hoag, Moreno.
Statistical analysis: Peterson, Blue, Kranker, Gilman, Stewart, Moreno.
Obtained funding: Moreno.
Administrative, technical, or material support: Peterson, Geonnotti, Day, Blue, Gilman, Stewart, Hoag, Moreno.
Study supervision: Moreno.
Conflict of Interest Disclosures: None reported.
Funding/Support: Evaluation was funded by the Centers for Medicare & Medicaid Services, Center for Medicare & Medicaid Innovation, contract No. HHSM-500-2010-000261/HHSM-500-T0015.
Role of the Funder/Sponsor: The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation of the manuscript. The manuscript was approved for submission through a standard CMS communications clearance process.
Disclaimer: The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the US Department of Health and Human Services or any of its agencies.
Additional Contributions: Many individuals contributed to this study. We acknowledge the important contributions made by Sandi Nelson, MPP, Ken Peckham, AA, Andrew McGuirk, BA, and Huihua Lu, PhD, of Mathematica Policy Research, who skillfully processed and helped analyze Medicare claims data, Medicare enrollment data, and CareFirst data. They were all compensated for their work under the same contract that funded the work overall. We also thank CareFirst for providing us with data on its program.