Comparison of Quality Performance Measures for Patients Receiving In-Person vs Telemedicine Primary Care in a Large Integrated Health System

Key Points
Question: Is there a difference in standardized quality performance measures for primary care patients exposed to telemedicine compared with patients with office-only (in-person) care?
Findings: In this cohort study of 526 874 patients, telemedicine exposure was associated with significantly better performance or no difference in 13 of 16 comparisons, mostly in testing-based and counseling-based quality measures. Patients with office-only visits had modestly better performance in 3 of 5 medication-based quality measures.
Meaning: Findings suggest that telemedicine exposure in primary care poses a low risk for negatively affecting quality performance, highlighting its potential to suitably augment care capacity.


SlicerDicer Data Mining
Using a standardized, general approach to measure numerators and denominators in SlicerDicer provided consistent results. This required 32 unique data sessions to ensure accurate numerators and denominators: 16 for office-only patients and 16 for telemedicine-exposed patients. First, we built the HEDIS-specified denominator criteria and noted the total number of patients. Next, we added the numerator criteria (distinct from the denominator criteria), which yielded a subset of the denominator. These numerator filters ensured that we were working within the same population between cohorts. We then exported deidentified patient data into Excel to automate the calculation of differences between cohorts and subgroups. We compared the Excel-calculated percentages with the SlicerDicer percentages to ensure comparability. Importantly, this approach avoided reliance on unreliable or inaccurate data representation within SlicerDicer.
To ensure unique denominators for patients in the divided cohort, we used a patient data model that counted the number of patients (rather than the number of visits). Importantly, this avoided redundant inclusion (cross-contamination) of the same patient between denominator groups. Moreover, since we built separate data sessions for the divided cohort, redundant numerator counts between groups (office-only versus telemedicine-exposed) were not possible. This avoided double counting quality performance between groups, which was especially important for patients who had both office and telemedicine encounters (these patients, for example, would be counted only in the telemedicine-exposed group, since having a telemedicine encounter would exclude them from the office-only group).
Patient data models were used in 16 separate sessions to ensure accuracy in data mining. We used three primary data filters in every session to ensure consistency (these filters were all "linked"): (1) "primary care service line" (limiting the total number of patients to only primary care encounters), (2) "face-to-face encounter" (limiting to only live encounters with clinicians, filtering out any telephone triage or medication refill encounters), and (3) "encounter type" (office-only, telemedicine-only, or office and telemedicine). For the "encounter type" filter, 3 sequences were run for each of the 16 measures to capture cohort numerators and denominators between groups: first, an office subfilter with exclusion of telemedicine; second, a telemedicine subfilter with exclusion of office; third, office and telemedicine combined.
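The patient-level counting described above can be sketched in a few lines. This is an illustrative sketch only, not the SlicerDicer workflow itself: the function names, the encounter labels ("office", "telemedicine"), and the sample patient identifiers are all hypothetical. It shows how assigning each patient to exactly one cohort prevents the same patient from appearing in two denominators.

```python
from collections import defaultdict

VALID_TYPES = {"office", "telemedicine"}  # hypothetical encounter labels


def assign_cohorts(encounters):
    """Map each patient to exactly one cohort based on all of their encounters.

    `encounters` is an iterable of (patient_id, encounter_type) pairs.
    """
    seen = defaultdict(set)
    for patient_id, encounter_type in encounters:
        if encounter_type not in VALID_TYPES:
            raise ValueError(f"unknown encounter type: {encounter_type!r}")
        seen[patient_id].add(encounter_type)

    cohorts = {}
    for patient_id, types in seen.items():
        if types == {"office"}:
            cohorts[patient_id] = "office-only"
        elif types == {"telemedicine"}:
            cohorts[patient_id] = "telemedicine-only"
        else:  # both encounter types were seen for this patient
            cohorts[patient_id] = "office and telemedicine"
    return cohorts


def measure_rate(cohorts, denominator_ids, numerator_ids, cohort):
    """Return (numerator, denominator, rate) for one cohort, counting patients."""
    denom = {p for p in denominator_ids if cohorts.get(p) == cohort}
    numer = denom & set(numerator_ids)  # numerator is always a subset of the denominator
    rate = len(numer) / len(denom) if denom else None
    return len(numer), len(denom), rate


# Made-up example: p1 has two office visits but is counted once.
encounters = [
    ("p1", "office"), ("p1", "office"),
    ("p2", "office"), ("p2", "telemedicine"),
    ("p3", "telemedicine"),
]
cohorts = assign_cohorts(encounters)
office_result = measure_rate(cohorts, {"p1", "p2", "p3"}, {"p1", "p2"}, "office-only")
```

Because cohort membership is decided from the full set of a patient's encounters, a patient with any telemedicine encounter (p2 here) can never fall into the office-only denominator.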

Historical (pre-pandemic) baseline
This section describes the methodological approach to constructing the historical baseline for comparison with the study timeframe. As shown in Table 1, quality performance was largely comparable from the pre-pandemic to the pandemic timeframe among patients with identical selection criteria (see manuscript Methods, Figure 1). Further, the majority of differences between timeframes were less than 5%. It was important to avoid case-mix adjustments or timeframe selections in our data set since the NQF states that these can impair the validity and reliability of results 1,2. Evaluating the population over a nearly 4-year timeframe therefore provided further reassurance about the results of the study.

Quality measure selection
Of the 23 total measures in the original CQMC data set for primary care 3, we ultimately chose only 16 measures, since obtaining all 23 was not possible given the limitations of our EMR data extraction methods. We excluded measures that were either (1) unable to be tracked within the EMR (for example, diabetic eye exams that were done outside the health system), (2) difficult to obtain (for example, BMI and alcohol use counseling, medication reconciliation, or patient experience surveying), or (3) too complex to obtain within the study timeframe (for example, HbA1c <9 or asthma control, which involves complex calculations of medication ratios). As mentioned in the methods of the manuscript, measures from CMS were also added. Appendix Table 2 lists these details under "measure steward," along with any adjustments or additions made to each of NQF's quality measures 2.
A notable adjustment was made to the nephropathy measure for patients with diabetes. For nephropathy evaluation as a HEDIS measure, the literature has described overestimation, and subsequent overreporting, of nephropathy screening compliance when the measure numerator includes the presence of an ACEi/ARB prescription (inferring microalbuminuria from the presence of these medications may not be reliable) 4. Thus, the authors chose not to include the presence of an ACEi/ARB in the numerator and measured only nephropathy tests (microalbuminuria, UA, or protein/creatinine ratio).
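The effect of this numerator adjustment can be illustrated with a small sketch. The event labels and function name below are hypothetical and do not reflect the actual EMR fields or HEDIS value sets; the point is only that a patient whose sole qualifying event is an ACEi/ARB prescription counts toward the numerator under the broader definition but not under the adjusted one used here.

```python
# Hypothetical event labels; real measures rely on coded value sets.
NEPHROPATHY_TESTS = {"microalbuminuria", "urinalysis", "protein_creatinine_ratio"}
ACEI_ARB = "acei_arb_prescription"


def meets_nephropathy_numerator(patient_events, count_acei_arb=False):
    """Return True if the patient satisfies the nephropathy numerator.

    With count_acei_arb=False (the adjusted definition), only an actual
    nephropathy test qualifies. With count_acei_arb=True, an ACEi/ARB
    prescription alone also qualifies.
    """
    events = set(patient_events)
    if events & NEPHROPATHY_TESTS:
        return True
    return count_acei_arb and ACEI_ARB in events


# Made-up patients for illustration.
tested = ["urinalysis", ACEI_ARB]
medication_only = [ACEI_ARB]

adjusted = [meets_nephropathy_numerator(p) for p in (tested, medication_only)]
broader = [meets_nephropathy_numerator(p, count_acei_arb=True) for p in (tested, medication_only)]
```

In this made-up example the broader definition counts both patients, while the adjusted definition counts only the patient with a documented test.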
Technical Appendix
EHR data from a large integrated non-profit healthcare system was used to compare in-office and telemedicine outpatient encounters. Primary care quality measures from 3/1/20 to 4/30/21 were evaluated and analyzed separately for each of these cohorts so that measures were linked to corresponding encounter types.
Of the 16 quality metrics, four measures were used as outcomes in regression analyses (statin therapy, flu vaccination, blood pressure control, and depression screening). We selected these measures from each quality domain based on the highest encounter count. Receipt of statin therapy among adults aged 40-75 years with diabetes and without clinical ASCVD was coded as '1' if the patient received and adhered to statin therapy and as '0' otherwise. For the high blood pressure control measure, adults 18-85 years of age who had a diagnosis of hypertension (HTN) and whose blood pressure was adequately controlled (<140/90 mm Hg) during the measurement year were coded as '1', and those with uncontrolled blood pressure were coded as '0'.
Patients aged 6 months and older seen for a visit between October 1 and March 31 were coded as '1' if they received an influenza immunization or reported previous receipt of one, and as '0' otherwise. For the depression screening measure, patients aged 12 years and older were coded as '1' if they were screened for depression on the date of the encounter or up to 14 days prior using an age-appropriate standardized depression screening tool and, if the screen was positive, had a follow-up plan documented on the date of the eligible encounter; they were coded as '0' otherwise.
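As a rough illustration of the binary coding described above, the sketch below codes two of the four outcomes. The function names and argument layout are assumptions made for this example, not the study's extraction logic; patients outside a measure's denominator are returned as None rather than 0 so that they are not mistaken for failures.

```python
from datetime import date


def code_bp_control(age_years, has_htn_diagnosis, systolic, diastolic):
    """Code blood pressure control: 1 = controlled (<140/90 mm Hg), 0 = uncontrolled.

    Returns None when the patient is outside the denominator
    (no hypertension diagnosis, or age outside 18-85 years).
    """
    if not (has_htn_diagnosis and 18 <= age_years <= 85):
        return None
    return 1 if (systolic < 140 and diastolic < 90) else 0


def code_flu_vaccination(age_months, visit_date, received, reported_prior):
    """Code influenza immunization: 1 = received or reported prior receipt, 0 = neither.

    Returns None when the patient is under 6 months old or the visit falls
    outside October 1 through March 31.
    """
    in_season = visit_date.month >= 10 or visit_date.month <= 3
    if age_months < 6 or not in_season:
        return None
    return 1 if (received or reported_prior) else 0


# Made-up examples.
bp_controlled = code_bp_control(50, True, 132, 84)
bp_uncontrolled = code_bp_control(50, True, 142, 84)
flu_reported = code_flu_vaccination(24, date(2020, 11, 15), False, True)
flu_out_of_season = code_flu_vaccination(24, date(2020, 7, 1), True, False)
```

Both blood pressure components must fall below their thresholds for a patient to be coded '1'; a reading of 142/84 is therefore coded '0'.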
The type of visit (telehealth or office-based encounter) was used as the explanatory variable. Age in years as a continuous measure, binary gender (male or female), race (White, Black or African American, Other, Asian, American Indian or Alaska Native, Native Hawaiian or Other Pacific Islander), ethnicity (Hispanic, or Not Hispanic or Latino), and overall adult risk scores were used as controls. Binary logistic regression models were fitted separately for each of the selected outcomes using Stata 16.0 (Stata Corp). The significance level (alpha) was set at 0.05.
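For reference, the general form of such a model can be written out for a single outcome. The notation is an assumption introduced here for illustration: $Y_i$ is the binary outcome for patient $i$, $T_i$ equals 1 for a telehealth visit and 0 for an office-based visit, and $\mathbf{x}_i$ collects the controls (age, gender, race, ethnicity, and adult risk score). The source does not state the reference categories or the exact parameterization, and the interval shown is the usual Wald form.

```latex
\[
\log\frac{\Pr(Y_i = 1)}{1 - \Pr(Y_i = 1)}
  = \beta_0 + \beta_1 T_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i,
\qquad
\mathrm{OR}_{\text{visit type}} = e^{\beta_1},
\qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\hat{\beta}_1 \pm 1.96\,\mathrm{SE}(\hat{\beta}_1)\bigr)
\]
```

Under this assumed coding, an odds ratio above 1 would indicate higher odds of meeting the measure for telehealth visits than for office-based visits, after adjusting for the controls.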
For all the regression models, odds ratios, 95% confidence intervals, and p-values were reported (Appendix Table 3).
Abbreviations: BP = blood pressure, HTN = hypertension, CVD = cardiovascular disease, MI = myocardial infarction, HF = heart failure, DM = diabetes mellitus, CC = colorectal cancer.
For column headers: "office" includes patients seen only in the office setting (excludes patients with any telemedicine encounters), "blended" includes only patients with office and telemedicine encounters (having at least one of each encounter type during the timeframe), and "telemedicine" includes only patients seen via video telemedicine (excludes patients with any office encounters).
Overall, in the pre-COVID-19 timeframe (historical baseline), interpretation of blended and telemedicine-only HEDIS performance should be in the context of the very small volume of telemedicine encounters throughout the health system. HEDIS numerators (not shown) were calculated according to the measure steward specifications (see appendix). Bolded values in the "total" column during the study timeframe indicate where quality performance increased compared to the historical baseline (improved in 9 of 16 measures).
Percentage of patients aged 18 years and older with a diagnosis of heart failure (HF) with a current or prior left ventricular ejection fraction (LVEF) < 40% who were prescribed beta-blocker therapy either within a 12-month period when seen in the outpatient setting or alternatively at each hospital discharge 9
Numerator Statement: Patients who were prescribed beta-blocker therapy within a 12-month period when seen in the outpatient setting
Denominator Statement: All patients aged 18 years and older with a diagnosis of heart failure with a current or prior LVEF < 40%.
Numerator Statement: Patients who were dispensed antibiotic medication on or three days after the index episode start date (a higher rate is better). The measure is reported as an inverted rate (i.e., 1-numerator/denominator) to reflect the number of people that were not dispensed an antibiotic.

Diabetes
Denominator Statement: All patients 18 years of age as of January 1 of the year prior to the measurement year to 64 years as of December 31 of the measurement year with an outpatient or ED visit with any diagnosis of acute bronchitis during the Intake Period (January 1-December 24 of the measurement year).
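The inverted rate described for the antibiotic measure (1 - numerator/denominator) can be computed as in the sketch below. The function name and the counts in the example are made up for illustration and are not study data.

```python
def inverted_rate(numerator, denominator):
    """Return 1 - numerator/denominator, the share of patients NOT dispensed an antibiotic.

    `numerator` is the count of patients dispensed an antibiotic and
    `denominator` is the count of eligible acute bronchitis episodes.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= numerator <= denominator:
        raise ValueError("numerator must be between 0 and the denominator")
    return 1 - numerator / denominator


# Made-up counts: 30 of 120 eligible patients were dispensed an antibiotic.
example_rate = inverted_rate(30, 120)  # 0.75
```

Because the rate is inverted, a higher value means fewer patients received an antibiotic, which is why a higher rate is better for this measure.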