A, Recommended clinical care composites. Comparison of 2002 and 2013: recommended cancer screening (P < .01), recommended diagnostic and preventive testing (P = .05), recommended diabetes care (P = .21), recommended counseling (P < .01), and recommended medical treatment (P < .01). B, Avoidance of inappropriate clinical care composites. Comparison of 2002 and 2013: inappropriate medical treatment avoidance (P < .01), inappropriate imaging avoidance (P = .64), inappropriate cancer screening avoidance (P = .02), and inappropriate antibiotic avoidance (P < .01). C, Patient experience measures were dichotomized as follows: response of 4 on a Likert scale of 1 to 4 (physician communication and access) or responses of 8, 9, or 10 on a Likert scale of 0 to 10 (global care) were counted as positive. Comparison of 2002 and 2013: global care (P < .01), physician communication (P < .01), and access (P < .01). Error bars indicate 95% CIs. See the eFigure in the Supplement for linear representation on a 0 to 10 scale.
eTable 1. Detailed characteristics of the Medical Expenditure Panel Survey Participants, 2002 – 2013
eTable 2. Selected comparisons of quality of care studies
eTable 3. Comparison between McGlynn et al and MEPS measure sets
eFigure 1. Trends in patient experience, scaled 0 to 10, 2002 – 2013
Levine DM, Linder JA, Landon BE. The Quality of Outpatient Care Delivered to Adults in the United States, 2002 to 2013. JAMA Intern Med. 2016;176(12):1778-1790. doi:10.1001/jamainternmed.2016.6217
How has the quality of outpatient care delivered to adults in the United States changed from 2002 to 2013?
Analyses of a nationally representative cross-sectional survey show that the quality of outpatient care inconsistently improved. From 2002 to 2013, 4 clinical quality composites improved, 2 worsened, and 3 were unchanged. Patient experience improved. Most composites continued to demonstrate disappointingly low absolute rates, even when improvement occurred.
Deficits in care continue to pose serious hazards to the health of the American public.
Widespread deficits in the quality of US health care were described over a decade ago. Since then, local, regional, and national efforts have sought to improve quality and patient experience, but there is incomplete information about whether such efforts have been successful.
To measure changes in outpatient quality and patient experience in the United States from 2002 to 2013.
Design, Setting, and Participants
We analyzed temporal trends from 2002 to 2013 using quality measures constructed from the Medical Expenditure Panel Survey (MEPS), a nationally representative annual survey of the US population that collects data from individual respondents as well as respondents’ clinicians, hospitals, pharmacies, and employers. Participants were noninstitutionalized US adults 18 years or older (range, 20 679-26 509 individuals each year).
Outpatient quality measures were compiled through a structured review of prior studies and measures endorsed by national organizations. Outcomes were 9 clinical quality composites (5 “underuse” composites, eg, recommended medical treatment; 4 “overuse” composites, eg, avoidance of inappropriate imaging) based on 39 quality measures; an overall patient experience rating; and 2 patient experience composites (physician communication and access) based on 6 measures.
From 2002 to 2013 (MEPS sample size, 20 679-26 509), 4 clinical quality composites improved: recommended medical treatment (from 36% to 42%; P < .01), recommended counseling (from 43% to 50%; P < .01), recommended cancer screening (from 73% to 75%; P < .01), and avoidance of inappropriate cancer screening (from 47% to 51%; P = .02). Two clinical quality composites worsened: avoidance of inappropriate medical treatments (from 92% to 89%) and avoidance of inappropriate antibiotic use (from 50% to 44%; P < .01 for both comparisons). Three clinical quality composites were unchanged: recommended diagnostic and preventive testing (76%), recommended diabetes care (68%), and inappropriate imaging avoidance (90%). The proportion of participants highly rating their care experience improved for overall care (from 72% to 77%), physician communication (from 55% to 63%), and access to care (from 48% to 58%; P < .01 for all comparisons).
Conclusions and Relevance
Despite more than a decade of efforts, the clinical quality of outpatient care delivered to American adults has not consistently improved. Patient experience has improved. Deficits in care continue to pose serious hazards to the health of the American public.
Over a decade ago, McGlynn and colleagues1 reported that adults in the United States received just over half of recommended health care services. Since then, there have been local, regional, and national efforts to improve the quality of health care, including expanded quality measurement and public reporting programs2,3; increased adoption of pay-for-performance4; increased adoption of value-based purchasing by private and public payers; broad encouragement of electronic health record use5; improved coverage for recommended services6; and expansion of patient-centered medical homes.7 In recent years, these efforts have been complemented by an increasing focus on overuse through programs such as the American Board of Internal Medicine Foundation’s Choosing Wisely initiative8 and increasing attention to patient-reported outcomes.9-11
Despite these efforts, there are few national data to gauge whether the quality of care in the United States is improving. Studies to date have been limited by reliance on a small number of quality measures,12,13 attention to specific diseases,14-17 modest measurement of overuse,18 or reduced generalizability by focusing on Medicare19-22 or only those with a usual source of care.1,23 Most have yielded only year-long snapshots of quality, precluding a broad appraisal of change.24 Some harbor cautious optimism that care is improving, while others express frustration at the perceived slow pace of improvement.25,26
To determine whether efforts to improve outpatient quality have been successful, we measured 46 indicators of the quality of outpatient care delivered to adults in the United States over the past decade in the areas of recommended care, inappropriate care, and patient experience. Evaluation of care quality performance may enable policymakers, clinicians, and health system leaders to target key areas for attention and improvement.
The Harvard Medical School institutional review board determined this study to not be human subject research and therefore exempt from approval.
We analyzed data from the 2002 to 2013 Medical Expenditure Panel Survey (MEPS), a nationally representative annual survey of repeated cross sections of the noninstitutionalized United States civilian population.27 The MEPS sample is drawn from respondents to the annual National Health Interview Survey. The MEPS uses a complex survey design that delivers English or Spanish computer-assisted personal interviews to collect detailed data on demographic characteristics, health conditions, health status, medical services utilization, medications, cost, source of payments, health insurance coverage, income, employment, experience with care, and access to care. Overall annual response rates ranged from 50% to 65% (mean, 59%).
The MEPS then supplements self-reported information by contacting respondents’ clinicians (mean response rate, 86%), hospitals (91%), pharmacies (75%), and employers (86%). Of those contacted, the MEPS records any and all encounters with these entities. Clinicians specify details regarding office visits (eg, diagnostic tests, cost); hospitals specify admissions (eg, cost); pharmacies specify individual medications dispensed (eg, dose, formulation, cost); and employers specify details about insurance coverage. For each encounter, the clinician, hospital, or pharmacy has the option to complete a computer-assisted telephone interview (CATI) or can send medical and billing records that are abstracted via the same CATI procedure. For example, if a patient reported visiting a physician for back pain and receiving an opioid prescription, the physician would corroborate that a medication was prescribed, and the pharmacy would corroborate the formulation, dose, and frequency of the opioid prescription. In cases of discrepancy between respondents’ self-report and other sources, the MEPS uses the other sources, although the MEPS does not reveal how frequently this occurs.
The MEPS includes 2 additional mail-back surveys: the adult self-administered questionnaire (SAQ) and the diabetes care survey (DCS). The SAQ, administered to all adult respondents, includes items from the Consumer Assessment of Healthcare Providers and Systems (CAHPS) survey, the 12-item Short Form (SF-12), and additional items measuring respondents’ attitudes about health care (annual response rate range, 91%-94%). The DCS, administered only to respondents with self-reported diabetes, includes items related to care for diabetes (annual response rate range, 88%-97%).
We restricted our analyses to the adult population ages 18 years or older. Sample sizes ranged from 20 679 to 26 509 respondents per year.
We reviewed all Healthcare Effectiveness Data and Information Set (HEDIS) measures23; all ambulatory process measures endorsed by the National Quality Forum (representing measures from numerous advisory/governing bodies)28; all ambulatory measures from McGlynn and colleagues1 (based on more than expert opinion and not related to cancer treatment or pregnancy); all ambulatory process measures in the National Healthcare Quality and Disparities Report29; all recommendations of the US Preventive Services Task Force (USPSTF)30; and all measures from work on overuse by Schwartz et al,20,31 Colla et al,32 and Choosing Wisely33 (eAppendix in the Supplement).
We excluded duplicate measures and measures that could not be accurately and reliably constructed and assessed using the MEPS. We also excluded a small number of measures that were controversial (n = 5), such as prostate cancer screening (USPSTF grade I [“insufficient evidence”] for most of our study period)34,35 or that had major changes over time (n = 22), such as statin use for patients with diabetes (which became an official recommendation in 2013)36 or avoiding β-blocker use for patients with concomitant airway disease.37 For measures with only minor changes over time (n = 4), we used consistent measure definitions to ensure valid comparison. For example, recommendations for influenza vaccination changed between 2002 and 2013.38,39 However, the recommendation to vaccinate persons 50 years or older has remained consistent; thus, our measure examines vaccination only among persons 50 years or older. Similarly, to reconcile most breast cancer screening recommendations, we applied less stringent criteria over the time period: screening once every 2 years for women 50 to 74 years old. Thus, our measures are broadly applicable to care delivered over time and currently.
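The consistent measure definitions described above can be illustrated with a short sketch. It is not the authors' code (the analyses were run in SAS), and the function names, argument names, and the way a respondent record is represented are hypothetical; only the age and interval thresholds come from the text.

```python
from typing import Optional


def flu_vaccine_eligible(age: int) -> bool:
    """Influenza vaccination measure: restricted to persons 50 years or older,
    the part of the recommendation that stayed constant from 2002 to 2013."""
    return age >= 50


def breast_screening_eligible(age: int, sex: str) -> bool:
    """Breast cancer screening measure: women 50 to 74 years old.
    The 'F' coding for sex is an assumption made for this sketch."""
    return sex == "F" and 50 <= age <= 74


def breast_screening_met(years_since_last_mammogram: Optional[float]) -> bool:
    """Less stringent criterion applied over the whole period:
    screening at least once every 2 years. None means never screened."""
    return years_since_last_mammogram is not None and years_since_last_mammogram <= 2
```

Fixing one definition for all survey years means that a change in the measured rate cannot be produced by a change in who counts as eligible.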
After applying the exclusion criteria, we evaluated performance over time on 39 clinical quality measures, including 25 underuse measures and 14 overuse measures (Table 1; eAppendix in the Supplement).
From these measures, we constructed 5 clinically meaningful underuse composites (eg, recommended cancer screening), where delivery of the service is likely of benefit to the patient, and 4 overuse composites (eg, avoidance of imaging in specific clinical situations), where delivery of the service is considered either inappropriate or of little to no benefit (Table 1). To calculate performance for each measure, we first identified those respondents who were eligible for the measure (eg, those with diabetes) and then whether or not they received the particular care (eg, eye examination). To calculate composites, we divided all instances in which recommended care was delivered (for underuse measures) or avoided (for overuse measures) by the number of times participants were eligible for care in the category, as others have done.1 Theoretically, composites could range from 0 to 100%.
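As a minimal sketch of the composite calculation described above, the following pools instances across the measures in a category: all instances in which care was delivered (underuse) or avoided (overuse), divided by all instances of eligibility. The `MeasureTally` structure and the counts are hypothetical, and the sketch ignores the survey weights that the actual analyses applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MeasureTally:
    """Counts for one quality measure (hypothetical structure)."""
    name: str
    eligible: int  # times respondents were eligible for the measure
    met: int       # times care was delivered (underuse) or avoided (overuse)


def composite_rate(tallies: List[MeasureTally]) -> float:
    """Pooled composite, in percent: total met instances / total eligible instances."""
    total_eligible = sum(t.eligible for t in tallies)
    if total_eligible == 0:
        raise ValueError("no eligible instances in this composite")
    total_met = sum(t.met for t in tallies)
    return 100.0 * total_met / total_eligible


# Made-up counts, for illustration only
example = [
    MeasureTally("eye examination", eligible=200, met=120),
    MeasureTally("foot examination", eligible=100, met=90),
]
print(composite_rate(example))  # 70.0
```

Because instances are pooled, measures with more eligible respondents carry more weight in the composite than measures with few.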
We evaluated MEPS measures of patient experience drawn from the CAHPS questionnaire that referred to a patient’s overall health care experience in the past year (Table 1). A global rating measure asked about patient experience with all health providers (0, “worst health care possible,” to 10, “best health care possible”). The physician communication composite asked 4 items (eg, “How often did the physician spend enough time with you?”), and the access to care composite included 2 items (eg, “How often did you get a medical appointment as soon as wanted?”). Responses for each item were coded from “never” (1) to “always” (4). To better discriminate changes over time, we assessed high ratings in overall care, physician communication, and access to care, similar to HEDIS analyses.23 We dichotomized all measures such that a positive response included 8, 9, or 10 for the items scored from 0 to 10 and 4 for the items scored from 1 to 4 (Table 1). As a sensitivity analysis, we also rescaled 1 to 4 measures to a 0 to 10 scale, as others have done (eFigure in the Supplement).68 We calculated each composite by first computing the mean for each respondent and then taking the mean for all respondents.
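The dichotomization and composite steps above can be sketched as follows. The function names are hypothetical, and the linear map from the 1 to 4 scale to a 0 to 10 scale is an assumption; the text does not state the exact rescaling formula used in the sensitivity analysis.

```python
from typing import List


def is_positive(response: int, scale_max: int) -> bool:
    """Top-box coding: 8, 9, or 10 on the 0 to 10 item; 4 on the 1 to 4 items."""
    if scale_max == 10:
        return response >= 8
    if scale_max == 4:
        return response == 4
    raise ValueError("unsupported scale")


def rescale_1_4_to_0_10(response: int) -> float:
    """Assumed linear rescaling: 1 maps to 0 and 4 maps to 10."""
    return (response - 1) * 10.0 / 3.0


def composite_top_box(respondents: List[List[int]]) -> float:
    """Mean per respondent across items, then mean across respondents, in percent.
    Each inner list holds one respondent's answers on the 1 to 4 scale."""
    per_person = [
        sum(is_positive(r, 4) for r in items) / len(items)
        for items in respondents
        if items  # skip respondents with no answered items
    ]
    if not per_person:
        raise ValueError("no respondents with answered items")
    return 100.0 * sum(per_person) / len(per_person)


# Made-up responses, for illustration only
answers = [[4, 4, 4, 4], [4, 3], [3, 2, 1, 3], [4, 4]]
print(composite_top_box(answers))  # 62.5
```

Averaging within each respondent first gives every respondent equal influence on the composite, regardless of how many items they answered.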
In all analyses we accounted for the complex design of the MEPS, applying survey estimation weights, primary sampling unit clusters, and sampling strata to allow for national estimates adjusted for nonresponse, as recommended by the Agency for Healthcare Research and Quality.69,70 We present weighted percentages. To examine whether performance was improved at the end of the study period compared with the beginning, we compared composites in 2002 and 2013 with χ2 tests adjusting for the complex survey design.71 We performed all analyses with SAS statistical software (version 9.4). We considered 2-sided P < .05 to be significant.
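To illustrate what a weighted percentage means here, the sketch below computes a point estimate as the sum of survey weights for respondents who met a measure divided by the sum of weights for all eligible respondents. The function name and data are hypothetical. The sketch covers only the point estimate; it does not reproduce the design-based variance estimation (sampling strata and primary sampling unit clusters) or the design-adjusted χ2 tests, which were performed in SAS.

```python
from typing import Sequence


def weighted_percentage(outcomes: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted percent of eligible respondents with a positive outcome."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    positive_weight = sum(w for y, w in zip(outcomes, weights) if y)
    return 100.0 * positive_weight / total_weight


# Made-up data: two of three respondents met the measure, but the one who
# did not represents more people in the population.
met = [True, False, True]
wts = [1000.0, 3000.0, 1000.0]
print(weighted_percentage(met, wts))  # 40.0
```

In this made-up case the unweighted rate would be about 67%, which shows why weighted and unweighted estimates can differ substantially.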
From 2002 to 2013, the US adult population aged (mean age increased from 45 to 47 years; P < .01), became less white (from 71% to 65%; P < .01), more Hispanic (from 12% to 15%; P < .01), more likely to have graduated college (from 15% to 18%; P < .01), and less likely to smoke cigarettes (from 21% to 16%; P < .01) (Table 2; eTable 1 in the Supplement). There were decreases in private insurance coverage (from 74% to 66%; P < .01), having a usual source of care (from 77% to 74%; P < .01), and employment (from 72% to 69%; P < .01). Rates of chronic diseases increased. In 2002, 8% of Americans had 3 or more chronic diseases compared with 18% in 2013 (P < .01). No changes occurred in perceived health status (27% reported their overall health as excellent) or activities of daily living.
Rates of recommended medical treatment delivery improved from 36% in 2002 to 42% in 2013 (P < .01) (Table 3; Figure, A). The most pronounced improvements in this composite were improvements in the use of β-blockers for heart failure (41% to 65%; P < .01) and statins for stroke (34% to 57%; P < .01). Declines occurred in the use of an angiotensin-converting enzyme inhibitor or angiotensin receptor blocker (ACEi/ARB) in patients with concomitant diabetes and hypertension (from 64% to 58%; P < .01) and controller medications among patients with poorly controlled asthma (from 71% to 59%; P < .01). Rates of recommended counseling delivery improved from 43% in 2002 to 50% in 2013 (P < .01), driven most by smoking cessation counseling (from 49% to 61%; P < .01). Recommended cancer screening improved minimally (from 73% to 75%; P < .01), with marked improvement in colorectal cancer screening (from 48% to 63%; P < .01), offset by decreasing rates of breast cancer screening (from 81% to 77%; P < .01) and cervical cancer screening (from 90% to 86%; P < .01).
Other underuse composites (diagnostic and preventive testing [76%; P = .05] and diabetes care [68%; P = .21]) were unchanged (Table 3; Figure, A).
Avoidance of inappropriate cancer screening improved from 47% in 2002 to 51% in 2013 (P = .02) (Table 3; Figure, B). Avoidance of inappropriate cervical cancer screening in women older than 65 years improved from 38% to 51% (P < .01), but avoidance of inappropriate colorectal cancer screening in those older than 75 years worsened from 70% to 61% (P < .01).
Avoidance of inappropriate antibiotic prescribing worsened from 50% to 44% (P < .01), as did avoidance of inappropriate medical treatments (from 92% to 89%; P < .01). For instance, avoidance of inappropriate medications in older adults (from 93% to 91%; P < .01), opioids for back pain (from 98% to 95%; P < .01), and nonsteroidal anti-inflammatory drugs in hypertension, heart failure, or kidney disease (from 88% to 85%; P < .01) all worsened. Avoidance of inappropriate imaging was unchanged (90%; P = .64).
The percentage of respondents rating their global experience with care an 8, 9, or 10 out of 10 increased from 72% in 2002 to 77% in 2013 (P < .01; Table 3; Figure, C). On a 0 to 10 scale, global rating improved from 8.1 in 2002 to 8.3 in 2013 (P < .01; eFigure in the Supplement). The percentage reporting a highly positive experience (a 4 out of 4) related to physician communication improved from 55% in 2002 to 63% in 2013 (P < .01). Respondents’ rating of physician communication improved from 8.1 out of 10 in 2002 to 8.5 out of 10 in 2013 (P < .01). Respondents increasingly noted their physician “always” spent enough time with them (46% to 55%; P < .01). The percentage who “always” had access to care (a 4 out of 4) increased from 48% in 2002 to 58% in 2013 (P < .01). On a scale of 0 to 10, access to care improved from 7.6 in 2002 to 8.0 in 2013 (P < .01).
Despite local, regional, and national efforts to improve care, we found inconsistent improvements in the quality of outpatient care delivered to adults over the past decade in this large, nationally representative study. Although there were areas of improvement, including provision of recommended medical treatments, recommended counseling, and avoidance of inappropriate cancer screening, there were also areas of decline, including avoidance of inappropriate antibiotic prescribing and avoidance of inappropriate medical treatments. Several composites continued to demonstrate disappointingly low absolute rates and small absolute changes, even when improvement occurred, such as for recommended medical treatment and inappropriate cancer screening. Patient experience with care showed consistent, significant improvements. All of this occurred in the context of an American population that became, on average, 1.6 years older, slightly poorer, and accrued more health conditions, although with unchanged self-rated general health status.
Continued deficits in recommended care have important implications for the health of Americans. About 1 in 4 eligible Americans failed to receive recommended cancer screening, diagnostic and preventive testing, or diabetes care. About 60% of eligible Americans did not receive beneficial cardiovascular and pulmonary therapies. For example, in heart failure, only 57% and 65% of eligible Americans took recommended ACEi/ARB and β-blocker medications, despite a 16% and 4% absolute reduction in mortality, respectively.73,74 Similarly, in poorly controlled chronic obstructive pulmonary disease, only 35% of eligible Americans took a recommended controller medication in 2013, a treatment that reduces exacerbations by 25% and hospitalizations by 17%.75
Waste and possible harm from overuse of care are also substantial. About half of older Americans received cancer screening when it was unlikely to prolong life. About half of Americans who made a visit for viral illnesses received inappropriate antibiotics, which exposes patients to adverse drug events, increases costs, and increases the prevalence of antibiotic resistant bacteria.76 Almost 1 in 6 Americans who made a visit for back pain received an inappropriate lumbar radiograph—the largest radiation dose of any plain film examination (equivalent to 70 chest radiographs).77 Importantly, many of these trends worsened. When considered in the context of increasing health care spending (approximately 15% of gross domestic product [$5700 per person] in 2002 vs approximately 17% [$9100 per person] in 2013),78 these areas represent prime targets for efforts to improve the value of care delivered by eliminating services that have a neutral or negative impact on health. Although early research suggests that initiatives such as Medicare’s accountable care organization (ACO) programs may be impacting overuse of low-value services, we see little evidence of widespread improvements during our study.20
Substantial and consistent improvements were seen in patient experience. From 2002 to 2013, 5% more Americans reported excellent global experience with care and 8% more thought highly of their physician’s communication, relatively large effects.68 When first introduced, there was considerable controversy over whether experience measures were valid measures of health care quality,10,79 yet over our study period, measurement and reporting of patient experience has become standard. In addition to being publicly reported beginning in 2007, patient experience is becoming increasingly important for reimbursement. For instance, patient experience measures are components of quality measured and rewarded in Medicare’s managed care programs.68,80 Our data likely demonstrate that health care systems have responded to these and other incentives and invested to improve patient experience. Whether other areas of care could be influenced with a similar system of reporting remains to be seen.
Similarly, Americans who reported always having access to care improved by 10%. Access improved despite increased proportions of adults who were uninsured or used public insurance, key factors associated with worse access to primary care.81 However, we specifically did not examine primary care access measures, but instead examined access to any medical care. Therefore, use of emergency departments, retail clinics, and other nontraditional medical establishments may have improved Americans’ perception regarding access.82
For quality of care, several possibilities exist for the inconsistent gains and disappointing absolute rates observed. First, despite recent efforts to transition to alternative payment models such as value-based purchasing, fee-for-service remains dominant in the US.83 Although data are mixed, fee-for-service incentives may run counter to efforts to improve the global delivery of recommended care and to avoid inappropriate care, as payments are not linked to appropriateness or quality.84 Moreover, most standalone pay-for-performance programs used to mitigate fee-for-service incentives have had little to no impact on quality. In contrast, integrated care networks that operate on a global budget and take a population health approach to achieving quality,85,86 programs that include integrated pay-for-quality,87-89 and institutions that engage in substantive quality improvement,90,91 have shown improvements in the quality of outpatient care.
Second, most quality measures, including many of those we examined, are focused on the provision of primary care, but the United States underinvests in primary care.92 Comprehensive primary care is associated with lower costs, improved health outcomes, greater efficiency, and reduced disparities.93 Primary care spending in the United States accounts for only 6% to 8% of total medical expenditures, a number likely unchanged for over a decade.94 Current efforts to improve the delivery of primary care through the patient-centered medical home have yielded inconsistent effects on quality, though these efforts may evolve and strengthen over time.95
Third, broad policy changes and reform efforts may be necessary but not sufficient to improve quality. Many inputs affect, for example, whether an individual obtains colorectal cancer screening. Perhaps social determinants of health create barriers, or perhaps personal health beliefs impact the decision. The complexity and interconnectedness of health care mean that a single policy or national improvement initiative may be insufficient to bolster quality.
Fourth, changes resulting from the Affordable Care Act (ACA) are not reflected in our data, as key insurance provisions began in October 2013.96 The ACA may be crucial to the requisite multipronged effort to align payers, clinicians, and patients. It has encouraged organizational changes through ACOs and renewed investment in primary care through programs such as the Comprehensive Primary Care Initiative and National Health Service Corps. The ACA has also accelerated efforts to move away from fee-for-service toward pay-for-value and bundled payment programs. Perhaps most importantly, 30 million more Americans now have health insurance. Whether these efforts will lead to measurable improvements in national quality remains to be seen.
Our study has limitations. First, we could not always measure all potential relevant exclusions when calculating quality measures (eg, bilateral mastectomy for breast cancer screening). However, others have shown only small differences when accounting for multiple exclusions with billing data.31 The MEPS does not rely on administrative data but instead on a rich combination of self-report and clinical data, and our estimates are often comparable with others (eTable 2 in the Supplement). We consciously chose not to adjust for population characteristic changes in an effort to demonstrate actual care delivery. Most important, our measures are internally consistent over time so changes in rates are likely to reflect true changes.
Second, our quality measures do not address all outpatient care. However, to our knowledge, the MEPS represents one of the largest nationally representative sets of consistently collected quality measures available for more than a decade. The study by McGlynn and colleagues1 from over a decade ago was seminal work covering 18 outpatient categories (excluding cancer treatment and pregnancy) with 81 measures (excluding those based only on expert opinion). However, it was a single point in time, included only people seeking medical care in 12 metropolitan areas, had a response rate of 34%, and had denominators with fewer than 30 individuals for half of the 81 measures (eTable 3 in the Supplement). Our present analysis builds on the work of McGlynn and colleagues1 with a nationally representative population-based sample; 15 outpatient categories encompassing 46 measures; additional measures of medication safety, overuse, and patient experience; 50% to 65% response rates; and higher sample sizes (most measures with denominators >1000; just 1 measure with denominator <30). Like McGlynn et al,1 we focused on process and structure measures to identify portions of the health care system in need of improvement. These measures are most often under the control of clinicians and the health care system.97 Because of our structured search and exclusion criteria, our measures capture clinically important care that has been the focus of improvement efforts and have remained relevant for over a decade.
Third, although our composites capture change in the aggregate, they are not meant to enable comparison between, for example, diabetes care and cardiovascular care, because some individual measures might be easier to achieve than others. Similarly, the complexity of health care limits our ability to determine exact reasons for such inconsistent improvements in quality.
Fourth, whereas most measures were corroborated by a second or third source, 16 clinical measures relied only on self-report (Table 1). Although self-report has been shown to have reasonable concordance with electronic medical records and administrative sources, this can vary by item.98,99 Any bias would be internally consistent over time.
Despite more than a decade of efforts to improve the quality of health care in the United States, the quality of outpatient care delivered to adults has not consistently improved. There have been improvements in patient experience. Current deficits in care continue to pose serious hazards to the health of the American public in the form of missed care opportunities as well as waste and potential harm from overuse. Ongoing national efforts to measure and improve the quality of outpatient care should continue, with a renewed focus on identifying and disseminating successful improvement strategies.
Corresponding Author: David M. Levine, MD, MA, Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital and Harvard Medical School, 1620 Tremont St, Third Floor, Boston, MA 02120 (firstname.lastname@example.org).
Accepted for Publication: August 22, 2016.
Published Online: October 17, 2016. doi:10.1001/jamainternmed.2016.6217
Author Contributions: Dr Levine had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Drs Linder and Landon contributed equally to this manuscript.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Levine.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Levine.
Administrative, technical, or material support: Levine.
Study supervision: Linder, Landon.
Conflict of Interest Disclosures: None reported.
Funding/Support: Dr Levine received funding support from an Institutional National Research Service Award from the National Institutes of Health (NIH) (T32HP10251) and from the Ryoichi Sasakawa Fellowship Fund.
Role of the Funder/Sponsor: The NIH had no role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; or the preparation, review, or approval of the manuscript.