Background
Satisfaction with health plan performance has been assessed frequently, but assessment of physician group performance is rare.
Objective
To present ratings of the care provided by physician groups to enrollees in a variety of capitated health maintenance organization plans.
Methods
A random sample was drawn of adult enrollees receiving managed health care from 48 physician groups in a group practice association. Each individual in the sample was mailed a 12-page questionnaire; 7093 questionnaires were returned (59% response rate). The mean age of those returning the questionnaire was 51 years; 65% were women.
Results
Reliability estimates for 6 multi-item satisfaction scales were excellent, and noteworthy differences in ratings among groups were observed. In particular, ratings of overall quality ranged from a low of 28 to a high of 68 (mean, 50; SD, 10). Average scores for physician groups were strongly correlated across all scales, but no single group scored consistently highest or lowest on the different scales. Negative ratings of care were significantly related to the following: intention to switch to another physician group, difficulty in getting appointments, lengthy waiting periods in the reception area and examination room, the inability to get consistent care from one physician for routine visits, and not being informed by the office staff when there was a delay in seeing the primary care provider.
Conclusions
Monitoring of health care quality at the physician group level is possible and could be used for benchmarking, for internal quality improvement, and for providing the public with information about how well these physician groups are likely to meet their needs.
THERE ARE 3 major components of quality of care—member ratings of care, appropriateness of care, and excellence of care. This study focuses on member ratings of the quality of care provided. Although consumer ratings are subjective, they are important because dissatisfaction leads to physician shopping,1-3 turning to nonmedical healers,4 the willingness to initiate malpractice litigation,5 noncompliance,6,7 and termination of enrollment in prepaid health plans.8
Satisfaction with health plan performance has been measured frequently,9 but assessment of physician group performance is rare. Health plans with staff model health maintenance organizations (HMOs) and complex physician networks are expanding rapidly, and are often composed of distinct physician groups that can be miles and hours apart. In addition, physician groups may contract with multiple health plans. Consumers are often more interested in how different physicians compare with one another than in how plans perform. Thus, comparisons of health plans need to be supplemented, or in some cases replaced, with information about the physician groups providing direct patient care. This study reports on the quality of care provided by 48 independent physician groups that care for more than 1 million enrollees in a variety of capitated HMO plans. The study was conducted at the request of the association of physician groups to initiate benchmarking and internal quality improvement efforts.
This study, which involved examining the HMO care provided by the Unified Medical Group Association, was sponsored by The Medical Quality Commission. The Unified Medical Group Association included 63 multispecialty medical groups when this study was initiated in 1994. Two thirds of the association's member groups were in California, and health care was provided to HMO enrollees and other patients at approximately 350 separate sites: 32 member groups with more than 240 sites in southern California and 10 member groups with more than 90 sites in northern California. There were also 21 member groups with more than 90 sites in other states. Participation in the study was voluntary with 48 of the 63 physician groups enrolling. Participating groups were assured that the confidentiality of the data would be preserved.
A random sample of adults (18 years and older) was drawn from members who had made a health care visit within the last 365 days to 1 of the 48 physician groups. Each of these members was mailed a 12-page questionnaire (provided in both English and Spanish), a cover letter, $2 in cash, and a return envelope. One week later, each individual in the sample was mailed a reminder and thank-you postcard. Two weeks later, nonrespondents received a second packet of materials and a reminder telephone call was attempted (up to 6 calls per person). The questionnaire included 153 items that assessed the following: (1) intention to switch to another physician group; (2) intention to switch to another health plan; (3) ratings of care: overall quality of care, quality and convenience of care, access to care, office waiting time, choice of primary care provider, and coverage for mental health care; (4) reports about care: appointment waiting time, office waiting time, continuity of care, and health promotion and disease prevention services; (5) utilization: number of visits in last 4 weeks, time since last visit, and use of urgent care services; (6) health status10,11; (7) presence of chronic conditions: hypertension, myocardial infarction, congestive heart failure, diabetes, angina, cancer, migraines, cataracts, glaucoma, macular degeneration, chronic allergies or sinus trouble, seasonal allergies, arthritis, sciatica or chronic back problems, vision trouble, chronic lung disease, liver trouble, dermatitis or other chronic skin rash, stomach trouble, deafness or other trouble hearing, kidney problems, limitation in use of an arm or leg, blurred vision, epilepsy, and thyroid problems; and (8) background information: age, sex, race, education, marital status, income, type of health insurance, and time enrolled in health plan.
A total of 7093 questionnaires were returned (59% response rate, adjusting for undeliverable questionnaires and death). Response rates across physician groups ranged from 46% to 73%, and were not significantly associated with ratings of health care. Eighty percent of those returning a questionnaire were white, 10% Hispanic, 4% Asian, 3% black, 1% Native American, and 2% other. Seventy percent of respondents were married and 93% were high school graduates.
To assess whether the random sample of members completing the questionnaire was representative of the entire population, a comparison was made between all adult members in the sampling frame who had visited a physician within the last 365 days (n=1203001) and those members who returned the questionnaire (n=7093). The following variables were compared: age, sex, time since last visit, and last diagnosis recorded. The Medical Outcomes Study 36-Item Short Form Health Survey (SF-36) scores of the sample were compared with those of the US general population.12 Internal consistency reliability13 and reliability at the physician group level14 were estimated for the multi-item scales. Product-moment correlations between ratings of health care and variables hypothesized to be related to these ratings were computed. We hypothesized that negative ratings of care would correlate with the intention to switch physician groups, difficulty in getting appointments, and lengthy waiting periods in the reception area and examination room. We also postulated that positive ratings of care would correlate with continuity of care and with being informed by staff when there was a delay in seeing the primary care provider. Mean scores by physician group were then computed and presented as standardized T scores (overall physician group mean, 50; SD, 10). The common mean and SD of T scores facilitate the interpretation of scale scores.
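To make these scoring and analysis steps concrete, the sketch below standardizes a rating scale to the T-score metric described above and computes a product-moment correlation with an intention-to-switch item. It is a minimal illustration on invented data; the column names, coding, and sample are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: hypothetical data, not the study's analysis code.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical member-level data: a raw satisfaction scale score, the
# physician group each member belongs to, and a 1-4 "intention to switch"
# item (higher = more likely to switch).
members = pd.DataFrame({
    "group": rng.integers(1, 49, size=1000),           # 48 physician groups
    "overall_quality": rng.normal(70, 15, size=1000),  # raw scale score
    "intend_to_switch": rng.integers(1, 5, size=1000),
})

# Standardize the raw scale to a T-score metric: mean 50, SD 10.
raw = members["overall_quality"]
members["overall_quality_t"] = 50 + 10 * (raw - raw.mean()) / raw.std(ddof=1)

# Product-moment (Pearson) correlation between ratings and intention to switch.
r, p = stats.pearsonr(members["overall_quality_t"], members["intend_to_switch"])
print(f"r = {r:.2f}, P = {p:.3f}")

# Mean T score by physician group (the kind of group-level comparison in Table 5).
group_means = members.groupby("group")["overall_quality_t"].mean()
print(group_means.sort_values().head())
```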
Forty-nine health care rating items were included in the questionnaire (Table 1). The majority of the items were administered using a 7-point response scale (very poor, poor, fair, good, very good, excellent, and the best) along with an option for "does not apply to me." The other items were administered using a 3-point response scale (yes, a big problem; yes, a small problem; no, not a problem) along with an option for "don't know." The 153-item questionnaire took respondents an average of 27 minutes to complete.
The following 6 multi-item scales were constructed based on prior research and results of an exploratory factor analysis15: (1) overall quality of care provided by the physician group; (2) quality and convenience of care provided; (3) access to care; (4) office waiting time; (5) choice of primary care provider; and (6) coverage for mental health care. Each scale was scored by averaging the items on the scale.
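A minimal sketch of this scoring rule is shown below, assuming items are coded 1 through 7 and that responses of "does not apply to me" are set to missing; the item names are hypothetical, and the missing-data handling is an assumption, since the article does not describe it.

```python
# Illustrative sketch only: hypothetical item names and coding.
import numpy as np
import pandas as pd

# Hypothetical responses to three items of an "overall quality" scale,
# coded 1 ("very poor") through 7 ("the best"); NaN marks "does not apply."
items = pd.DataFrame({
    "q1_overall_quality": [5, 6, np.nan, 7],
    "q2_quality_of_md":   [4, 6, 5, 7],
    "q3_outcome_of_care": [5, 7, 5, np.nan],
})

# Each scale is scored by averaging its items; pandas skips NaN by default,
# so a respondent's scale score uses whatever items he or she answered.
items["overall_quality_scale"] = items.mean(axis=1)
print(items)
```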
The mean age of those returning the questionnaire was 51 years (median, 49 years) compared with the mean age of the sampling frame that was 46 years (median, 43 years). Sixty-five percent of the respondents were women, whereas only 58% in the sampling frame were women. The last medical visit for study participants was, on average, 119 days (median, 88 days) before the beginning of the study. For those in the sampling frame, the average was 130 days (median, 112 days). Four percent of the respondents and 3% of the sampling frame had hypertension as the last diagnosis recorded (according to International Classification of Diseases, Ninth Revision, code).16 Mean SF-36 scores for the respondents in the sample were similar to those of the US general population, after adjusting the general population values to the age and sex distribution of the sample (Table 2).
Internal consistency reliability estimates for the multi-item scales in the study were excellent. The α coefficients were 0.81 or higher for 7 of the 8 SF-36 scales (0.60 for social functioning) and ranged from 0.77 to 0.98 for the 6 health care rating scales. Reliability of ratings of care at the physician group level was 0.85 or higher, with one exception (Table 3).
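For reference, coefficient α (the internal consistency statistic reported above) and one standard Shrout-Fleiss intraclass correlation14 for the reliability of group mean ratings can be written as follows; these are the conventional formulas rather than equations reproduced from the article.

```latex
% Coefficient alpha for a k-item scale with item variances \sigma_i^2
% and total-score variance \sigma_T^2
\[
  \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^{2}}{\sigma_T^{2}}\right)
\]

% Reliability of group mean ratings (intraclass correlation for group means),
% from the between-group and within-group mean squares of a one-way ANOVA
\[
  R_{\text{group}} = \frac{\mathrm{MS}_{\text{between}} - \mathrm{MS}_{\text{within}}}{\mathrm{MS}_{\text{between}}}
\]
```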
Mean scores on the 6 health care rating scales are presented in a T-score metric (overall mean for patients, 50; SD, 10) in Table 4. Ratings by members who reported they would definitely or probably switch physician groups were, on average, 1 SD or 10 points below the mean of 50. Ratings for those who reported that they would probably not switch were about 3 points below the mean. Members reporting that they would definitely not switch were approximately 5 points above the mean.
The construct validity of the health care ratings is supported by the study results. Negative ratings of care were related to the intention to switch physician groups (product-moment correlations ranging from 0.41 to 0.62 across scales; P<.001 for each), excessive office waiting time (r ranging from −0.22 to −0.58; P<.001 for each), and difficulty getting appointments for routine care, sick care, and urgent care (r ranging from −0.13 to −0.42; P<.001 for each). Positive ratings of care were correlated with continuity of care for routinely scheduled health needs (eg, r=0.32 with overall quality; P<.001) and with the relaying of information by the office staff regarding delays in seeing the physician (eg, r=0.46 with overall quality; P<.001). Table 5 provides T scores at the physician group level for the 6 health care rating scales. Scores followed by a dagger symbol differ significantly (P<.05) from the mean for all other physician groups. Because a number of respondents chose "not applicable" for the coverage of mental health care items, the group-level estimates for this domain are less precise than those for the other domains (Table 3, reliability estimates). For the other 5 domains, 4 groups were significantly above average on all 5 and 8 groups were significantly below average on all 5.
Scores for physician groups ranged from 28.4 to 68.4 for overall quality, 28.3 to 70.9 for quality and convenience, 26.6 to 70.0 for perceived access, 24.6 to 69.5 for office waiting time, 30.1 to 69.3 for choice of primary care provider, and 24.5 to 82.8 for coverage of mental health care. The differences between the highest- and lowest-scoring physician groups were large. For example, the percentage of members rating the waiting time in the reception area and in the examination room as excellent or the best differed by more than 30 percentage points (57% and 59% for the best group vs 25% and 22% for the worst group, respectively).
Average scores for physician groups were strongly correlated across all scales (Spearman ρ range, 0.75-0.96; n=48). However, no single group scored consistently highest or lowest on the different health care rating scales. One group scored at the top on 3 of the 6 scales (group 15 on overall quality, quality and convenience, and office waiting time), and 2 groups each scored at the bottom on 2 scales (group 11 on overall quality and choice of primary care provider; group 22 on quality and convenience and office waiting time). Scores adjusted for differences among groups in patient characteristics (age, sex, race, educational attainment, income, physical health, and mental health) and in the presence of chronic diseases were similar to the unadjusted scores, with some slight variations (results not shown but available on request).
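The article does not detail how the adjusted scores were produced. One common approach, sketched below purely as an assumed illustration, is to regress the rating on the patient characteristics and then compare group means of the recentered residuals with the unadjusted group means; the data and covariates here are invented.

```python
# Illustrative sketch only: a generic regression-based case-mix adjustment,
# not the procedure used in the article (which is not described in detail).
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000

# Hypothetical member-level data: T score, group membership, and two covariates.
df = pd.DataFrame({
    "group": rng.integers(1, 49, size=n),
    "t_score": rng.normal(50, 10, size=n),
    "age": rng.integers(18, 90, size=n),
    "female": rng.integers(0, 2, size=n),
})

# Regress the rating on patient characteristics (ordinary least squares).
X = np.column_stack([np.ones(n), df["age"], df["female"]])
beta, *_ = np.linalg.lstsq(X, df["t_score"].to_numpy(), rcond=None)
residuals = df["t_score"].to_numpy() - X @ beta

# Adjusted group means: residuals recentered to the overall mean rating.
df["adjusted"] = residuals + df["t_score"].mean()
comparison = df.groupby("group")[["t_score", "adjusted"]].mean()
print(comparison.head())
```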
This study provides support for the reliability and validity of the ratings of health care. Reliability estimates tended to be adequate at the physician group level. Construct validity of the scales was supported by the predicted correlations between negative ratings of care and the intention to switch physician groups, difficulty in getting appointments, and lengthy waiting periods in the reception area and examination room. Construct validity was also supported by the correlation of positive ratings of care with continuity of care and with patients being informed by staff when there was a delay in seeing the primary care provider.
This study was conducted with support from The Medical Quality Commission, an accreditation, education, and research organization, and represented its first attempt to conduct a standardized member study to compare performance among physician groups. Both clinically important and statistically significant differences in ratings among groups were observed. The results were summarized in a report for the physician groups that participated in the study. A few months after the groups received the summary report, a subset of 18 groups provided written feedback in response to a brief survey administered by The Medical Quality Commission. A majority of the responding groups indicated that the report was easy to understand (10 of 18) and that the information was useful (14 of 18). However, fewer than half (7 of 18) implemented changes as a result of the information. This response confirms the recognized importance of member feedback in the improvement of quality of care, as well as the difficulty involved in implementing change.17,18
Table 5 shows the mean scores by physician group, which are presented by group number rather than by name. Although steps are already in place to release consumer ratings of health maintenance organization plans, we believe that data indicating the quality of care provided by physician groups may be more valuable than data comparing one million-member health plan with another. The differences between groups are large and meaningful, as demonstrated by their ability to predict who will leave a physician group, and they are consistent enough among domains to allow consumers to select a group that is rated higher rather than lower. Public release of results listing the physician groups by name should be considered, especially because little improvement was made by the physician groups after the results were made available to them. This lack of response was also seen when quality information was fed back to academic general internal medicine physician groups.19 However, if such data are to be publicly released with physician groups listed by name, they must be extremely accurate. This would require knowing who is in each group, having the members' address information, and obtaining a high completion rate. To accomplish this, patients must recognize that the success of such improvement studies depends in part on prompt, thorough completion of the related questionnaires. Up to now, this responsibility has not been assumed by the public. Until this changes, it will be difficult to provide accurate information on member ratings of health care.
Accepted for publication August 12, 1997.
This survey was supported by an unrestricted research grant from The Medical Quality Commission to RAND.
The opinions expressed are those of the authors and do not necessarily reflect the views of The Medical Quality Commission or RAND.
We express our appreciation to the patients who participated in the study and Gail Della Vedova (Greater Valley Medical Group Inc, San Fernando Valley, Calif) and staff members of The Medical Quality Commission and the Unified Medical Group Association (now known as the American Medical Group Association, Seal Beach, Calif), namely Diane Bailey, Lori Bloomfield, Jim Hillman, and Alan Zwerner, MD, JD, for their constructive input. We also thank RAND colleagues Eve Kerr, Joan Buchanan, and Arleen Leibowitz for suggestions about the questionnaire used in the study. The editorial assistance provided by Deborah Kutnik is especially appreciated.
Corresponding author: Ron D. Hays, PhD, RAND, 1700 Main St, Santa Monica, CA 90407-2138 (e-mail: Ronald_Hays@rand.org).
References
2. Kasteler J, Kane R, Olsen D, Thetford C. Issues underlying prevalence of "doctor-shopping" behavior. J Health Soc Behav. 1976;17:329-339.
3. Marquis MS, Davies AR, Ware JE. Patient satisfaction and change in medical care provider: a longitudinal study. Med Care. 1983;21:821-829.
4. Cobb B. Why do people detour to quacks? Psychiatr Bull. 1954;3:66-69.
6. Korsch BM, Gozzi EK, Francis V. Gaps in doctor-patient communication. Pediatrics. 1968;42:855-869.
7. Sherbourne CD, Hays RD, Ordway L, DiMatteo MR, Kravitz RL. Antecedents of adherence to medical recommendations: results from the Medical Outcomes Study. J Behav Med. 1992;15:447-468.
8. Ware JE, Davies AR. Behavioral consequences of consumer dissatisfaction with medical care. Eval Program Plann. 1983;6:291-297.
9. Allen HM, Rogers WH. The employee health care value survey: round two. Health Aff (Millwood). 1997;16:156-166.
11. Ware JE, Sherbourne CD. The MOS 36-Item Short-Form Health Survey (SF-36), I: conceptual framework and item selection. Med Care. 1992;30:473-483.
12. McHorney CA, Kosinski M, Ware JE. Comparisons of the costs and quality of norms for the SF-36 health survey collected by mail versus telephone interview: results from a national survey. Med Care. 1994;32:551-567.
14. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420-428.
15. Rummel RJ. Applied Factor Analysis. Evanston, Ill: Northwestern University Press; 1970.
16. World Health Organization. International Classification of Diseases, Ninth Revision (ICD-9). Geneva, Switzerland: World Health Organization; 1977.
17. Berwick DM. The year of "how": new systems for delivering health care. Qual Connect. 1996;5:1-4.
18. Meterko M. Overview: the evolution of customer feedback in health care. Joint Commission J Qual Improv. 1996;22:307-310.
19. Brook RH, Fink A, Kosecoff J. Educating physicians and treating patients in the ambulatory setting: where are we going and how will we know when we arrive? Ann Intern Med. 1987;107:392-398.