Jencks SF, Cuerdon T, Burwen DR, Fleming B, Houck PM, Kussmaul AE, Nilasena DS, Ordin DL, Arday DR. Quality of Medical Care Delivered to Medicare Beneficiaries: A Profile at State and National Levels. JAMA. 2000;284(13):1670-1676. doi:10.1001/jama.284.13.1670
Author Affiliations: Health Care Financing Administration, Baltimore, Md.
Context. Despite condition-specific and managed care–specific reports,
no systematic program has been developed for monitoring the quality of medical
care provided to Medicare beneficiaries.
Objective. To create a monitoring system for a range of measures of clinical performance
that supports quality improvement and provides repeated, reliable estimates
at the national and state levels for fee-for-service (FFS) Medicare beneficiaries.
Design, Setting, and Participants. National study of repeated, cross-sectional observational data collected
in 1997-1999 on all Medicare FFS beneficiaries or on a representative sample
of beneficiaries with a particular condition. Data were collected using medical
record abstraction for inpatient care, analysis of Medicare claims for some
ambulatory services, and surveys for immunization rates. Separate samples
were drawn for each topic for each state.
Main Outcome Measures. Beneficiary patients' receipt of 24 process-of-care measures related
to primary prevention, secondary prevention, or treatment of 6 medical conditions
(acute myocardial infarction, breast cancer, diabetes mellitus, heart failure,
pneumonia, and stroke) for which there is strong scientific evidence and professional
consensus that the process of care either directly improves outcomes or is
a necessary step in a chain of care that does so.
Results. Across all states for all measures, the percentage of patients receiving
appropriate care in the median state ranged from a high of 95% (avoidance
of sublingual nifedipine for patients with acute stroke) to a low of 11% (patients
with pneumonia screened for pneumococcal immunization status before discharge).
The median performance on an indicator is 69% (patients discharged with heart
failure diagnosis who received angiotensin-converting enzyme inhibitors; diabetic
patients having an eye examination in the last 2 years). Some states (particularly
less populous states and those in the Northeast) consistently ranked high
in relative performance while others (particularly more populous states and
those in the Southeast) consistently ranked low.
Conclusions. It is possible to assemble information on a diverse set of clinical
performance measures that represent performance on the range of services in
a health insurance program. These findings indicate substantial opportunities
to improve the care delivered to Medicare beneficiaries and urgently invite
a partnership among practitioners, hospitals, health plans, and purchasers
to achieve that improvement.
As concern grows that attempts to control the cost of health care will
crowd out quality, evidence has also emerged that quality of care is and has
been far more uneven than previously recognized. The public health report
entitled Healthy People 20101
showed wide gaps between public health performance goals and actual achievements
on many measures, including some delivered by the fee-for-service (FFS) health
care system. Reviews, most notably by Schuster et al,2
showed that there were major gaps in acute, chronic, and preventive care almost
everywhere that studies have been done. More recently, a report from the Institute
of Medicine showed serious problems of harm to patients from medical errors.3 This kind of evidence was reflected in the recommendation
of a recent presidential commission that quality of health care should become
a major national priority.4 Despite condition-specific
and managed care–specific reports, there has been no systematic program
for monitoring the quality of medical care provided to FFS Medicare beneficiaries.
Except for the clinical measures of the Health Plan Employer Data and
Information Set (HEDIS)5 and the Diabetes Quality
Improvement Project (DQIP)6 there is no clinical
quality measure set in general national use. About 4 years ago, the Health
Care Financing Administration (HCFA) began to implement a program to measure
and track the quality of the care for which Medicare pays. Simultaneously,
HCFA committed to using its peer review organization (PRO) contractors to
systematically promote improved performance on the quality measures tracked
under this program using a voluntary, collaborative, and nonpunitive educational approach.
This article describes the 24 initial measures used in this program
and reports the baseline values measured in 1997-1999. The Medicare measurement
system we developed includes most of the HEDIS clinical measures, but it addresses
more conditions, measures more elements of care, and measures the care delivered
to the 85% of Medicare beneficiaries who are covered under FFS. The sampling
frame provides state-level results to target PRO activities, evaluate PRO
and HCFA effectiveness in improving care, and create a national picture of
care under Medicare FFS.
Even though purchasers and beneficiaries are primarily interested in
outcomes, we focused on measuring processes of care critical to outcomes rather
than on measuring outcomes themselves. Five reasons drove this choice: (1)
in comparison to outcomes of care, there is more consensus on appropriate
processes of care and the target rates (nearly 100%); (2) measuring processes
of care generally does not require the risk adjustment that has been so controversial
in comparisons of outcomes; (3) it is easier for providers, practitioners,
and plans to identify and fix the reasons why critical processes of care were
not carried out than to determine why outcomes are not optimal; (4) many important
outcomes take years; and (5) because significant, achievable improvements
in outcomes are generally much smaller in relative terms than improvements
in processes, unrealistic sample sizes are necessary to measure significant
improvements in outcomes. While we report only process measures here, HCFA
intends to track outcomes, risk-adjusted when possible, at the national level
for the targeted conditions.
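Reason (5) can be illustrated with a standard sample-size approximation for comparing two proportions. The sketch below is not part of the study's methods; the function name `n_per_group`, the significance level, the power, and the example rates are all illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group needed to detect a change from
    p1 to p2 with a two-sided test of two independent proportions
    (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical rates, chosen only for illustration (not study results):
process_n = n_per_group(0.70, 0.80)  # a process rate rising from 70% to 80%
outcome_n = n_per_group(0.20, 0.19)  # an outcome rate falling from 20% to 19%
print(process_n, outcome_n)
```

Under these assumed rates, the process change needs a few hundred patients per group, whereas the outcome change needs tens of thousands.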
The clinical topics were selected using 5 criteria: (1) the disease
is prevalent and a major source of morbidity or mortality in the Medicare
population; (2) there is strong scientific evidence and practitioner consensus
that there are processes of care that can substantially improve outcomes;
(3) reliably measuring the delivery of these processes is feasible; (4) there
is a substantial "performance gap" between current performance and desirable
performance; and (5) there is at least anecdotal evidence that PROs can intervene
effectively to improve performance on the measures. Using these criteria,
we adopted or developed 24 process-of-care measures (Table 1) relating to primary prevention, secondary prevention, or
treatment of acute myocardial infarction (AMI), breast cancer, diabetes mellitus,
heart failure, pneumonia, and stroke.
Each measure is based on professionally developed, widely accepted practice
guidelines that were translated into measures either as part of a larger partnership
(HEDIS and DQIP) or national public health surveillance effort (Behavioral
Risk Factor Surveillance System [BRFSS]) or by HCFA staff in consultation
with experts and relevant professional groups. Whenever possible, we used
measures that have wide acceptance and have been used and tested. The detailed
measure specifications and the scientific evidence supporting each of these
measures are summarized on the HCFA Web site.8
Acute Myocardial Infarction. We updated and/or expanded measures that had been used for the Medicare
Cooperative Cardiovascular Project.9,10
Heart Failure. We created measures based on treatment recommendations from the American
College of Cardiology/American Heart Association and the Agency for Healthcare
Research and Quality, which were reviewed by clinical expert technical advisory
panels and extensively field tested by PROs.
Stroke. We adapted measures based on treatment recommendations from the American
College of Chest Physicians, the American Heart Association, the National
Stroke Association, and the American Academy of Neurology; the measures were
reviewed by clinical expert technical advisory panels and extensively field
tested by PROs.
Treatment of Pneumonia. We used measures developed in collaboration with the American Thoracic
Society, the Infectious Diseases Society of America, and the Centers for Disease
Control and Prevention; the measures were reviewed by clinical expert technical
advisory panels and extensively field tested by PROs.
Prevention of Pneumonia. We used outpatient immunization measures in the BRFSS, which correspond
both to the HEDIS system and to commitments that HCFA has made to Congress
under the Government Performance and Results Act, and inpatient measures corresponding
to recommendations of the Advisory Committee on Immunization Practices.
Breast Cancer. We adopted the breast cancer screening measure used in HEDIS,5 which measures the percentage of women aged 52 to
69 years who have received a mammogram in the past 2 years.
Diabetes. We selected those measures developed by the DQIP that can be computed
from claims data. Indicators based on chart abstraction were not included
because a representative sample of office records is not currently available.
In all measures except immunization status, the denominator or sampling
frame is patients enrolled in FFS Medicare, and Medicare+Choice (managed care)
plan members are excluded. All states in the United States were sampled, plus
the District of Columbia and Puerto Rico.
Inpatient Measures (AMI, Heart Failure, Atrial Fibrillation, Stroke,
Treatment of Pneumonia). We sampled from Medicare hospital claims data in each state for each
condition. The discharges were eligible for selection only if the principal
diagnosis met the criteria for the target condition, except for stroke prevention,
for which we accepted any diagnosis of atrial fibrillation. We sampled the
discharges for a 6-month period within each state. For a third of the states,
this period was from April to September 1998; for another third of the states,
July to December 1998; and for the remaining states, October 1998 to March
1999. We sampled up to 850 discharges for AMI, pneumonia, and stroke, and
up to 900 discharges for heart failure and used a census of all discharges
for states with fewer than the targeted number of discharges during the period.
The universe of eligible claims was first sorted by age, race, sex, and hospital,
and cases were then sampled systematically from a random starting point. Data
for the performance measures were abstracted from the hospital medical records
by 2 clinical data abstraction centers (which are administratively independent
of individual PROs) using computerized abstraction tools with explicit criteria
that were developed and tested specifically for these measures. The abstraction
tools collected information on contraindications to the treatment process
being studied. Informed consent was not required because the data were collected
for administration of the Medicare program, not for research, and access to
these data is given to the program by law.
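The sort-then-sample step described above can be sketched as follows. This is a minimal illustration, not the program's actual sampling code; the function name `systematic_sample` and the dictionary keys (`age`, `race`, `sex`, `hospital`) are hypothetical.

```python
import random

def systematic_sample(claims, target, seed=None):
    """Sort eligible claims, then take every k-th claim from a random start.
    Returns all claims (a census) when there are no more than `target`."""
    if len(claims) <= target:
        return list(claims)
    ordered = sorted(
        claims, key=lambda c: (c["age"], c["race"], c["sex"], c["hospital"])
    )
    step = len(ordered) / target                 # fractional sampling interval
    start = random.Random(seed).random() * step  # random start in [0, step)
    last = len(ordered) - 1
    return [ordered[min(int(start + i * step), last)] for i in range(target)]
```

Sorting before sampling spreads the sample across the sort variables, so each age, race, sex, and hospital group is represented roughly in proportion to its share of eligible discharges.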
Influenza and Pneumococcal Immunization Rates. We used the BRFSS,11 which is coordinated
by the Centers for Disease Control and Prevention and carried out by state
health departments, to estimate statewide vaccination coverage. The BRFSS
is a random-digit-dialed telephone survey of the noninstitutionalized adult
population, and the estimates are for all persons aged 65 years or older; the
national sample is 26,469 for this age group, with a median state sample of
430 in 1997 (estimated from the 1997 BRFSS Public Use Data File12).
The estimates therefore differ from those for other samples by including beneficiaries
who are enrolled in managed care and excluding persons younger than 65 years
old. Rate estimates reported here are from the 1997 survey. Screening for
or administration of influenza and pneumococcal vaccine for inpatients with
pneumonia was ascertained from nursing and physician notes and other information
in the medical record.
Breast Cancer (Mammography). The denominator was all women aged 52 to 69 years who were enrolled
in Medicare FFS in both 1997 and 1998. Whether a mammogram had been performed
in those 2 years was determined by whether Medicare had paid a claim for a diagnostic
or screening mammogram in that period.
Diabetes. The denominator was all FFS beneficiaries aged 18 to 75 years who had
2 outpatient claims or 1 inpatient claim with a diagnosis of diabetes mellitus
during a 1-year period starting between January 1998 and July 1998, with the start date
determined by the date when the PRO's contract began in that state. Whether
a service had been provided was determined by whether Medicare had paid a
claim for the service.
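The diabetes denominator rule (2 outpatient claims or 1 inpatient claim, ages 18 to 75 years) can be expressed as a small filter. This is a sketch under assumptions: the record layout and the keys `bene_id`, `age`, `setting`, and `diabetes_dx` are hypothetical, and all claims passed in are assumed to fall within the 1-year measurement window.

```python
from collections import Counter

def diabetes_denominator(claims, min_age=18, max_age=75):
    """Return beneficiary IDs with >=2 outpatient or >=1 inpatient
    claims carrying a diabetes diagnosis (hypothetical record layout)."""
    outpatient = Counter()
    inpatient = Counter()
    for c in claims:
        if not c["diabetes_dx"] or not (min_age <= c["age"] <= max_age):
            continue
        if c["setting"] == "inpatient":
            inpatient[c["bene_id"]] += 1
        elif c["setting"] == "outpatient":
            outpatient[c["bene_id"]] += 1
    ids = set(outpatient) | set(inpatient)
    return {b for b in ids if inpatient[b] >= 1 or outpatient[b] >= 2}
```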
For the inpatient measures, patients found to have a clinical contraindication
to the process of care were either included as having received appropriate
care (heart failure measures) or excluded from both the numerator and denominator
(other appropriateness of care measures). Reliability was calculated as the
percentage agreement on an indicator for 2 blinded, independent abstractions
at different abstraction centers. Performance was calculated at the state
level for each of the measures. For 22 measures, results were calculated as
the percentage of patients receiving appropriate care; for time to angioplasty
or thrombolytic therapy, the result was calculated as the median number of
minutes from arrival at the hospital to beginning of angioplasty or thrombolytic
agent instillation. We primarily direct our attention to variation among states
(including the District of Columbia and Puerto Rico). We therefore calculated,
for each measure, performance of the median state rather than a national average.
We also calculated the rank of each state on each performance measure and
then calculated the average rank for each state across the 22 measures (we
excluded time to angioplasty and time to thrombolytic therapy from this calculation
because the sample size was too small in many states) and the SD of the 22
ranks for each state. We mapped the distribution of average ranks to display
geographic patterns.
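The median-state and average-rank calculations described in this paragraph can be sketched as follows. The function names and data layout are hypothetical, and the handling of ties (tied states share the average of their positions) is an assumption, since the text does not say how ties were resolved.

```python
from statistics import mean, median, stdev

def rank_states(rates):
    """Rank states on one measure (1 = highest rate).
    Assumption: tied states share the average of their positions."""
    ordered = sorted(rates, key=rates.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and rates[ordered[j + 1]] == rates[ordered[i]]:
            j += 1
        for state in ordered[i:j + 1]:
            ranks[state] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def summarize(performance):
    """performance maps measure -> {state: rate}.
    Returns (median-state rate per measure,
             {state: (average rank, SD of ranks)})."""
    median_state = {m: median(r.values()) for m, r in performance.items()}
    ranks = {m: rank_states(r) for m, r in performance.items()}
    states = next(iter(performance.values())).keys()
    summary = {}
    for s in states:
        state_ranks = [ranks[m][s] for m in performance]
        summary[s] = (mean(state_ranks), stdev(state_ranks))
    return median_state, summary
```

The sketch needs at least 2 measures per state, because a sample SD is undefined for a single rank.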
Across the 4 inpatient conditions we obtained 94.3% to 99.2% of sampled
records (median, 95.3%). The reliability of measures based on medical record
abstraction ranged from 80% to 95% with a median interrater reliability of
90%. The number of charts in the denominator of each rate is shown in 2 ways:
in Table 2, each individual rate or time is set in a typeface that reflects
the number of charts used; Table 1 also provides the
median number of charts across all states. Even though more than 700 records
were obtained for each condition in most states, the number of patients who
qualified for a particular indicator was rarely even half that number and
sometimes much less. Table 2 shows
3 kinds of results: (1) the performance of the median state on each measure,
(2) the average of each state's performance ranks across the 22 measures,
and (3) the rank of each state among all states based on this average rank.
More detailed results are available at the HCFA Web site.8
The performance rates in the median state for each of the 22 rate measures
range from a high of 95% (avoidance of sublingual nifedipine in acute stroke)
to a low of 11% (patients with pneumonia screened for pneumococcal immunization
status before discharge). When performance indicators are ranked by the rate
in the median state, the median performance is 69% (patients discharged with
heart failure diagnosis who received angiotensin-converting enzyme inhibitors;
diabetic patients having an eye examination in the last 2 years). The range
of rates for each measure also varies widely across the states, from a low
of a 13-percentage point range for avoidance of sublingual nifedipine for
patients with acute stroke (Nevada, 86%; Wyoming, 100%) to a high of a 54-percentage
point range for antibiotic administered within 8 hours of hospital arrival
to patients with an admission diagnosis of pneumonia (Puerto Rico, 38%; Montana,
93%). The median of the ranges for performance indicators (other than time
to angioplasty and thrombolytic therapy) is 33 percentage points and the median
interquartile range is 8 percentage points. Table 2 shows the performance of each state on each quality measure.
Table 2 also shows the average
of the ranking of each state compared with other states on all of the performance
measures (except time to angioplasty and thrombolytic therapy) and the SD
of these rankings; these averages of rankings range from 10 to 48 because
no state is consistently at the top or bottom. Based on the average of the
rankings, Table 2 shows the state's
rank among all states and areas (range, 1-52). Figure 1 shows that the rankings
tend to follow a geographic pattern with northern and less populous states
more likely to rank high than southern and more populous states.
Previous studies have reported results using some of the individual
measures reported here,1-4,10
and HEDIS provides a picture (albeit more limited) of care in Medicare managed
care, but we believe that this is the first study to provide a broad picture
of quality of care in FFS Medicare and the first to include data that have
been verified by chart abstraction of a national sample for several conditions.
This study provides strong evidence of a substantial opportunity to improve
the care delivered to Medicare beneficiaries. Available data suggest that
providing the services measured here could each save hundreds to thousands
of lives a year, but more precise estimates of the effect of such improvement
on beneficiary health are beyond the scope of this study.
The differences in average performance among states and regions are
modest compared with the overall need for improvement. Nevertheless, the data
suggest real underlying geographic differences in the way care is delivered
to the Medicare FFS population. They also suggest that variations among states
on individual measures are part of a larger pattern and not simply local variation.
We do not yet understand the reasons for these differences or whether aspects
of the systems in high-performing states can be easily replicated in low-performing
states.
These measures give a somewhat unbalanced picture of Medicare services.
They overrepresent inpatient and preventive services, underrepresent ambulatory
care, and scarcely represent interventional procedures at all.
This article is generally limited to care delivered in FFS Medicare.
Nationally, about 85% of Medicare beneficiaries are cared for under FFS and
about 15% under managed care, but in Arizona, California, Florida, and Pennsylvania
more than 25% of beneficiaries are enrolled in managed care. Comparing HEDIS
data from managed care with these FFS data presents technical problems that
we have not yet solved because denominators and/or measure definitions differ
in the 2 systems. However, the data reported here for FFS do not differ dramatically
from the HEDIS data reported for Medicare managed care.13
This article is limited to national- and state-level information. Information
on individual practitioners and providers requires a different and more efficient
data collection and reporting system designed to collect such voluminous data.
Even with practitioner- and provider-level data, many practitioners and providers
treat too few patients with particular conditions to generate a meaningful
sample size, and it will remain difficult to determine which practitioner
is responsible for delivering the process of care that is measured.
We must also consider the extent to which these measures fairly represent
quality of care for the services and population addressed. There are 2 concerns:
the validity of the measures as representations of quality of care and the
accuracy of the data.
Each of the measures is based on both strong science and professional
consensus that delivering the service would either improve outcomes or be
necessary to services that would improve outcomes. Nevertheless, for almost
all of the services, there are circumstances in which delivering them would
be inappropriate. For the inpatient measures, we included the major contraindications
in our abstraction and computational algorithms, but there are likely to be
unusual circumstances that account for a few cases of undelivered care. The
measures are designed to credit care as appropriate if there is doubt, and
we know from PRO field experience with the measures that valid, unmeasured
contraindications are not frequent.
Small numbers are a problem for some inpatient measures, such as time
to angioplasty and thrombolytic therapy, because a relatively small number
of the beneficiaries in our sample received these services in some states.
However, the effect of small denominators is to increase the variation among
states, not to bias the median downward. We use surveys for influenza and
pneumococcal immunization rates because many influenza immunizations are delivered
without claims being submitted to Medicare, and because there is no immediately
feasible way to accurately determine pneumococcal immunization status from
existing Medicare claims data files. Surveys, of course, may have recall and
sampling bias, but this does not appear to be a major problem for the other
If interrater reliability is 90%, the accuracy of the individual abstractor
is about 95% (each rater accounts for about half of disagreements between
raters). The range of reliabilities is about 80% to 95%, suggesting that,
even for the most unreliable measure, abstraction errors would not account
for a performance level below 90%.
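The arithmetic behind this estimate can be written out directly. The sketch assumes, as the text does, that each abstractor accounts for about half of the disagreements; it further assumes that the two abstractors almost never make the same error on the same item. The function name is hypothetical.

```python
def abstractor_accuracy(interrater_agreement):
    """Approximate accuracy of one abstractor, assuming the two abstractors
    err equally often and almost never err on the same item."""
    disagreement = 1.0 - interrater_agreement
    return 1.0 - disagreement / 2.0

# Agreement values taken from the text: the median (90%) and the lowest (80%).
print(abstractor_accuracy(0.90))
print(abstractor_accuracy(0.80))
```

At 90% agreement the estimated individual accuracy is about 95%, and at the lowest reported agreement of 80% it is about 90%.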
We believe that this article and the tracking system behind it establish
a mechanism for HCFA to move beyond its historical focus on individual cases
and providers and to take responsibility as a purchaser for the care delivered
to the population of Medicare beneficiaries. Although it is customary to speak
of holding providers, practitioners, and health plans accountable for the
care they provide, it is at least as important to hold purchasers, whether
Medicare or Medicaid or commercial or government employers, accountable for
the quality of the care they purchase, because they are making continual and
important decisions that potentially balance quality against expenditures.
As required by the Government Performance and Results Act, HCFA is beginning
to assume this responsibility by reporting some of these measures to Congress
as part of its annual budget submission.
HCFA intends to extend the Medicare clinical performance tracking system
in 3 ways. First, for those measures based on medical record abstraction,
we are now collecting a continuous sample large enough to provide accurate
trending of national data every few months, although too small to provide
state-level estimates more than every few years. Second, we will collect enough
data to make accurate state-level estimates every 3 years (synchronous with
PRO contract cycles). This will allow us to evaluate the success of each PRO
in meeting its major contractual requirement, which is to improve statewide
performance on the measures. Third, we will extend the system to include other
settings, such as nursing homes, home health agencies, and other providers
and to include other clinical priorities.
Obviously, pervasive gaps between what is being done and what could
be done invite us to consider what policies might lead to improvements. A
future article will describe the quality improvement strategy that HCFA is
pursuing to improve performance on these and other measures. Recent reports3,4 have emphasized the importance of focusing
on system failure rather than practitioner failure in working to close these
performance gaps. The United States has poured enormous resources into practitioner
training and very little into improving processes in the systems within which
those practitioners work, and it is time to redress that balance. Available
evidence suggests that, at least for preventive services, systems changes
are more effective than either provider or patient education in improving
provision of services.14
The data should also remind us of the need for partnership among HCFA,
beneficiaries, practitioners, providers, and health plans to achieve improvements.
The HCFA PROs are charged with promoting improvement. They now have performance-based
contracts with more than $200 million a year for improving performance on
the measures reported. Their contracts hold them accountable for successful
promotion of improvement, and there is good evidence that they can contribute
to significant improvement in care.10 Nevertheless,
neither HCFA nor PROs deliver care. They can only provide technical assistance
to practitioners, providers, and plans; take steps that will make it easier
for practitioners and providers to deliver and for beneficiaries to receive
needed care; and serve as conveners for partnerships among local stakeholders.
Only practitioners and providers can make such systems changes as putting
appropriate standing orders in place, installing failure-resistant information
systems, and designing processes that deliver critical services within the
optimum window of time. Segmenting improvement efforts according to payment
source is inefficient and counterproductive. Partnerships among all of the
stakeholders, regardless of source of payment, can make improvement possible
and are urgently needed.