The relationship between community preceptor and student summary scores on checklist items for the anemia case. The maximum score attainable was 26. The mean±SD score for the preceptors was 17.4±3.8 and, for the students, 15.5±3.7. The correlation coefficient was 0.19 (P=.15).
The relationship between community preceptor and student summary scores for checklist items on the growth-delay case. The maximum attainable score was 21. The mean±SD preceptor score was 16.0±3.2 and the mean student score was 10.0±4.5. The correlation coefficient was −0.41 (P=.06).
Malloy MH, Perkowski L, Callaway M, Speer A. The Relationship Between Preceptor Expectations and Student Performance on 2 Pediatric Objective Structured Clinical Examination Stations. Arch Pediatr Adolesc Med. 1998;152(8):806-811. doi:10.1001/archpedi.152.8.806
We designed 2 pediatric objective structured clinical examination stations, 1 anemia case associated with lead exposure and 1 failure-to-gain-weight case associated with extended breast-feeding, to evaluate third-year medical students who had studied in pediatric community preceptors' offices as part of a 12-week multidisciplinary ambulatory clerkship rotation.
To examine the relationship between preceptor expectations and student performance on these 2 objective structured clinical examination stations.
To elicit community preceptors' expectations of student performance, we constructed a 46-item survey replicating checklists filled out by simulated patients evaluating student performance on the objective structured clinical examination stations. The percentage agreement among preceptors for each checklist item as well as the percentage agreement between preceptor responses and student responses on each checklist item were calculated. A summary score of preceptor responses across all checklist items and a summary score for student responses across all checklist items on each station were calculated. The correlation coefficients between preceptor and student summary scores were then examined.
Fifty-nine preceptor surveys were mailed and 38 were returned (64% response rate). Data were usable from 37 surveys. Eighty-nine percent (33 of 37) of the preceptors agreed that a third-year clerkship student should have the knowledge to care for the patient with anemia and 92% (34 of 37) of the preceptors agreed similarly for the growth-delay case. Agreement among preceptors on individual checklist items varied widely for both cases. Fifty-seven students completed the anemia station and 34 students completed the growth-delay station. The mean±SD agreement across the 26 items on the anemia case between preceptor responses and student responses was 62%±23% and, for the 21 items on the growth-delay case, 60%±17%. The mean±SD preceptor summary score for the anemia case was 17.4±3.8 (maximum, 26) and 16.0±3.6 (maximum, 21) for the growth-delay case. The mean student score on the anemia case was 15.5±3.7 (maximum, 26) and, for the growth-delay case, 10.0±4.5 (maximum, 21). The Pearson correlation coefficient between the preceptor and student scores on the anemia case was 0.19 (P=.15), and for the growth-delay case, −0.41 (P=.06).
These data suggest community preceptors agree on topic areas in which students should be clinically competent. There was, however, considerable variation in agreement among preceptors about what preceptors believe students should be able to do and how the students actually perform. The overall percentage agreement between preceptor expectations and student performance appears to be no better than chance.
As the training of medical students is directed more into the community and ambulatory setting, it becomes more important for academic medical centers responsible for setting learning objectives and evaluation standards to ascertain what is considered important by community physicians and to determine how preceptor expectations may relate to actual student performance.1-5 Depending on how closely these expectations and actual student performance are related, either modification of the preceptors' expectations or the provision of more-structured learning experiences to help students attain the expected level of performance may be necessary.
The incorporation of the objective structured clinical examination (OSCE) in the evaluation process of medical students has become more prevalent and has provided the opportunity to explore relationships that have existed in theory, but may never have been overtly examined.6-11 For example, it would appear educationally sound to develop cases or stations for an OSCE that would evaluate students in areas considered to be important by the majority of physician preceptors and that are linked to course objectives. How closely OSCE stations developed in an academic medical setting are related to a consensus of the knowledge and performance standards considered important to community preceptors is a relatively unexplored area.
Given the paucity of literature concerning the relationship between preceptor expectations and student performance on OSCE stations, we examined this relationship by surveying a group of community physicians with regard to their opinion of how students might perform on 2 pediatric OSCE stations and then related the preceptor responses to actual student performance.
In June 1996, the University of Texas Medical Branch, Galveston, implemented its Multidisciplinary Ambulatory Clerkship. This is a third-year core clerkship in which medical students spend a total of 12 weeks studying in the community offices of pediatric, internal medicine, and family practice physicians. Time is split equally across the 3 disciplines. As part of the student evaluation process, an 8-station, end-of-clerkship OSCE was developed. Three of the stations were pediatric cases. One station was an interview station in which the presenting problem was anemia in an infant. The underlying cause of the anemia was exposure to a lead-contaminated environment. A second station, where students presented their interview findings to a faculty member and were asked several questions concerning the diagnosis of anemia in childhood, was paired with the first. The third station was an interview station in which the student received growth information and dietary information about an 8-month-old infant with a decrease in weight gain associated with prolonged exclusive breast-feeding.
The objectives of our study were 2-fold. First, we wanted to determine how closely community-based pediatric preceptors agree with each other in their expectations of student performance on items used in the pediatric OSCE stations. Second, we wanted to determine how closely pediatric preceptor expectations of student performance agreed with actual student performance. We define our terminology as follows: Percentage agreement among preceptors is a measure of the preceptors' expectations of student performance. Percentage agreement among the preceptors represents the proportion of preceptors responding positively to a checklist item. Percentage agreement between the preceptors and students measures the proportion of students and preceptors who responded similarly, either positively or negatively, to a checklist item. Percentage of agreement between preceptors and students was calculated as follows: (a+d)/(a+b+c+d), where a indicates the positive response of both preceptors and students; b, the negative response of preceptors and the positive response of students; c, the positive response of preceptors and the negative response of students; and d, the negative response of both preceptors and students.
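The percentage agreement formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the example counts are hypothetical and do not come from the study data.

```python
def percent_agreement(a: int, b: int, c: int, d: int) -> float:
    """Proportion of preceptor-student pairs that responded the same way.

    a: preceptor positive, student positive
    b: preceptor negative, student positive
    c: preceptor positive, student negative
    d: preceptor negative, student negative
    """
    total = a + b + c + d
    if total == 0:
        raise ValueError("at least one preceptor-student pair is required")
    return (a + d) / total


# Hypothetical counts for a single checklist item (not from the study).
agreement = percent_agreement(a=30, b=5, c=15, d=7)
print(f"{agreement:.2f}")  # 37 of 57 pairs agree -> 0.65
```

Because both concordant cells (a and d) count toward agreement, an item that nearly everyone answers positively can show high percentage agreement even when preceptors have no real ability to predict their own students.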
We accomplished these objectives through the design of a survey instrument. The survey was used to obtain demographic and practice information of the pediatric Multidisciplinary Ambulatory Clerkship preceptors in the community. The survey instrument presented the case scenarios of the anemia case (Table 1) and the growth-delay case (Table 2). It also contained questions that were identical or very similar to the checklist items for the anemia and growth-delay stations that were completed by the standardized patients during the OSCE. For the survey, the anemia interview and initial evaluation station checklist items were combined. Fifty-nine community pediatric preceptors who had participated in the Multidisciplinary Ambulatory Clerkship received the survey instrument by mail. A follow-up mailing was sent to nonresponders.
Demographic and practice information of the community-based preceptors was summarized. The frequency of preceptor and student responses on the checklist items from the OSCE stations was determined, as was the percentage agreement between the preceptor response and the student response to each item. Because of the possibility of chance agreement between preceptor responses and student responses, κ statistics were calculated for each checklist item to determine the percentage agreement above chance. The κ statistic takes chance into account by the calculation of an expected value of agreement on the basis of chance alone, then subtracts that value from the observed percentage agreement and divides the difference by the maximum agreement attainable beyond chance. Values greater than 0.75 indicate excellent agreement above chance, while values less than 0.40 indicate poor agreement beyond chance.12 A preceptor summary score was derived for each case by summing the responses of the preceptors (ie, the preceptors' positive response of whether a student would be able to perform a task or obtain the necessary information indicated on the checklist was scored 1; a negative response was scored 0) over the individual items of each case. A similar summary score was calculated for the students. Correlation coefficients were used to examine the relationship between the preceptor and student summary scores. Preceptor summary scores were correlated directly with the summary score of the students who had studied in their office.
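A minimal sketch of the κ calculation for one checklist item follows, assuming the standard Cohen κ for a 2×2 table (the article does not spell out its exact formula). The function name is hypothetical. The example cell counts are not reported in the article; they are inferred from the figures given in the results for the first anemia item (51 of 57 preceptor responses positive, 56 of 57 students positive, 88% agreement), which are consistent only with a=50, b=6, c=1, d=0.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 preceptor-by-student table.

    a: both positive; b: preceptor negative, student positive;
    c: preceptor positive, student negative; d: both negative.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("at least one preceptor-student pair is required")
    observed = (a + d) / n
    # Chance agreement from the marginal proportions of each rater.
    preceptor_pos = (a + c) / n
    student_pos = (a + b) / n
    expected = (preceptor_pos * student_pos
                + (1 - preceptor_pos) * (1 - student_pos))
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Cell counts inferred (not reported) for the "type of milk" item.
kappa = cohen_kappa(a=50, b=6, c=1, d=0)
print(f"{kappa:.2f}")  # -0.03
```

Under these inferred counts the observed agreement is 50/57 (88%), while the agreement expected by chance is slightly higher, which is why κ falls just below zero despite the high raw agreement.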
Of the 59 surveys mailed out, 38 were returned for a response rate of 64%. This response rate included the results of a second mailing to nonresponders. The majority of the preceptors were men aged 40 to 49 years, with a wide variation in the number of years at their practice site (Table 3). The number of students they had instructed varied widely; the average number of patients preceptors saw each day was 29.
The preceptors were asked 2 global questions about each case. The first question was, "Do you think it is likely that a student studying in your office would ever come in contact with a case like this?" For the anemia case, 68% (25/37) of the preceptors said yes, 19% (7/37) said no, and 13% (5/37) were uncertain. The second question was, "Do you think a third-year student on the completion of an ambulatory rotation should have the ability and knowledge to care for a case like this?" A total of 89% (33/37) responded yes and 11% (4/37) were uncertain. For the growth-delay case, 97% (36/37) of the preceptors responded affirmatively to the first question and 92% (34/37) replied affirmatively to the second.
The percentage agreement among preceptors on the checklist items, the percentage of students responding positively to the checklist items, and the percentage agreement between preceptor responses and student responses to checklist items are presented in Table 4 and Table 5. For this analysis, preceptor responses were paired with the number of students who studied in their office. For example, if a preceptor had 2 students in his or her office over the past year who were respondents on the examination, the preceptor's response was counted twice. As an example of the responses, the first item for the anemia case concerning inquiry about the type of milk showed that 89% (51/57) of the preceptors agreed that the students would inquire about the type of milk (Table 4). Ninety-eight percent (56/57) of the students actually did inquire about the type of milk. However, the percentage agreement between the preceptor responses and the responses of students who had studied in their offices was only 88%. The κ statistic for this particular item was −0.03, suggesting essentially no agreement above chance. In general, there was a great deal of variation between what preceptors thought students could do and how the students actually performed. The mean percentage agreement between preceptor responses and student responses for all 26 items was 62%, with an average κ statistic of 0.04 and a range from −0.10 to 0.26.
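The pairing rule described above, in which a preceptor's response is counted once for every student who studied in that office, can be sketched as follows. The data layout, identifiers, and function name are assumptions made for illustration; the article does not describe how the data were stored.

```python
def tally_pairs(preceptor_response, students):
    """Build the 2x2 cell counts for one checklist item.

    preceptor_response: dict mapping a preceptor ID to True/False
        (did the preceptor expect the student to perform the item?).
    students: list of (preceptor_id, performed) tuples, one per student.
    A preceptor with two students contributes two pairs.
    """
    cells = {"a": 0, "b": 0, "c": 0, "d": 0}
    for preceptor_id, performed in students:
        expected = preceptor_response[preceptor_id]
        if expected and performed:
            cells["a"] += 1  # both positive
        elif not expected and performed:
            cells["b"] += 1  # preceptor negative, student positive
        elif expected and not performed:
            cells["c"] += 1  # preceptor positive, student negative
        else:
            cells["d"] += 1  # both negative
    return cells


# Hypothetical example: preceptor P1 taught two students, P2 taught one.
responses = {"P1": True, "P2": False}
students = [("P1", True), ("P1", False), ("P2", True)]
print(tally_pairs(responses, students))
# {'a': 1, 'b': 1, 'c': 1, 'd': 0}
```

One consequence of this design is that preceptors who taught more students carry more weight in the item-level percentages.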
For the growth-delay case, a similar variation between preceptor responses and student responses was noted (Table 5). For example, 77% (17/22) of preceptors agreed that the student would show the growth chart to the mother, while 95% (21/22) of the students actually did. As an illustration of the lack of agreement between preceptor responses and student responses, 91% (20/22) of preceptors agreed that the students would inform the mother about the use of acetaminophen following immunization in this case, while only 55% (12/22) of the students did so. The overall mean percentage agreement between preceptor and student responses for this case was 60% and the mean κ score was 0.05.
The relationship between the summary scores for the preceptors and students on the anemia case is illustrated in Figure 1. Fifty-seven students completed this station. The maximum attainable score was 26. The mean preceptor score was 17.4, and the mean student score 15.5. The correlation coefficient between the 2 scores was 0.19 and did not attain significance (P=.15). The relationship between the summary scores for the growth-delay case is illustrated in Figure 2. There are fewer points on this plot, because fewer students completed this station (n=34). The maximum attainable score was 21, and the mean score for the preceptors was 16.0, compared with 10.0 for the students. The correlation coefficient was −0.41 and approached, but did not attain, statistical significance (P=.06).
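For readers who want to see how summary scores and their correlation are computed, here is a minimal sketch, assuming each checklist item is coded 1 for a positive response and 0 for a negative one, as described in the methods. The function names and the example scores are hypothetical and are not the study data.

```python
import math


def summary_score(item_responses):
    """Sum of 0/1 checklist responses for one preceptor or one student."""
    return sum(1 for response in item_responses if response)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired summary scores (preceptor, then that preceptor's student).
preceptor_scores = [18, 20, 15, 22, 17]
student_scores = [14, 12, 16, 15, 13]
r = pearson_r(preceptor_scores, student_scores)
print(f"r = {r:.2f}")
```

Each pair consists of a preceptor's summary score and the summary score of a student who studied in that preceptor's office, which matches the pairing the methods describe.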
The use of simulated patients and OSCEs as part of the evaluation process for undergraduate medical education is a common phenomenon.6-11 The validity of this form of evaluation, however, continues to be difficult to ascertain.13 Several studies14,15 have attempted to establish the validity of these forms of testing by establishing as a "criterion standard" faculty ratings of an observed station and then correlating the criterion standard with the checklist performance of students. This attempt at validation has the advantage of using faculty who may be in agreement with, or who have been at least educated about, the rationale and context of the examination. Thus, it is likely that they may demonstrate higher correlations with checklist items.
The formulation of OSCE stations or simulated patient cases seems to occur mainly through use of course evaluation blueprints6 or derivation from problems presented in a list of course objectives.16 The origin of the course objectives or blueprints varies considerably. In community-based clerkships, the course objectives or blueprints usually arise in the academic centers, not from the community faculty. Although ultimate responsibility for the course resides within the academic setting, review and validation of course objectives and participation in the evaluation process by community faculty might be an important educational process for both academic and community faculty.
As this study points out, the opportunity to determine community preceptor opinion on the content of the OSCE stations, as well as determine how accurately they could predict student performance, provides insight into what is likely taught in the community. The process may also serve to provide the community faculty with gold standard expectations for student performance. The preceptors may, in fact, be unaware of the rationale and context of the evaluation process. The survey of the community faculty may have served as a means for providing them with information about what the students' examinations covered and, thus, may have served as a faculty development tool. Although there was global agreement among the preceptors that the 2 cases used in the OSCE stations were cases that medical students should have knowledge about and be able to manage (89% agreement for the anemia case and 92% agreement for the growth-delay case), there was considerable variation in agreement among the preceptors as to the particular knowledge the students would have to obtain to effectively diagnose the conditions and manage these cases. This variability among experts of clinical standards has been documented previously.17 Only by extensive training and by limiting the number of experts do reliability measures appear to improve markedly.13
In summary, we have surveyed a group of community preceptors to obtain their opinion about how students who have studied in their offices would perform on 2 OSCE stations. The results suggest disparity between what the preceptors think the students can do and what the students actually did. These results imply that further education of the community faculty about course objectives is necessary and that community faculty may serve as a valuable resource to help validate academic faculty perceptions of what information may be important to teach.
Accepted for publication May 4, 1998.
This work was supported in part by a Robert Wood Johnson Generalist Physician Initiative Grant.
Presented in part at the Association of American Medical Colleges, Research in Medical Education Meeting, Washington, DC, November 4, 1997.
Editor's Note: The results of this study are depressing, if not surprising. I wonder what the results would have been if academic center–based preceptors were involved. Any bets?—Catherine D. DeAngelis, MD
Reprints: Michael H. Malloy, MD, MS, Department of Pediatrics, University of Texas Medical Branch, 301 University Blvd, Galveston, TX 77555-0526 (e-mail: firstname.lastname@example.org).