In Reply: Dr Pitrou raises several important points about article abstraction in our systematic review that concern the interrater reliability of the extracted information. Pitrou suggests that the modified Best Evidence in Medical Education abstraction form could have been tested on a random selection of articles to train the abstractors. She asks whether κ coefficients were calculated to assess interrater agreement and whether more details about consensus adjudication could be provided. We agree that training raters to use an abstraction form is a critical first step in developing and using one. All authors abstracted a random sample of articles using an initial version of the modified Best Evidence in Medical Education form. Differences in abstraction were reviewed, and the form was iteratively revised to clarify ambiguous items. We also created and used a coding manual that defined each abstraction item. We agree with Pitrou that quantitative assessment of interrater agreement is important. However, because 2 authors (J.K. and K.H.) abstracted every article, reviewed each abstraction together, and discussed all differences until agreement was reached, we did not calculate κ scores. In the rare instances when agreement could not be reached, a third author (E.H.) adjudicated the disagreement.
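For readers unfamiliar with the statistic under discussion, Cohen's κ quantifies agreement between 2 raters beyond what chance alone would produce. The following is an illustrative sketch only (the data are hypothetical and not from the review); it shows how κ would be computed for 2 raters assigning categorical codes to the same set of items.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical labels on the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the chance agreement implied by each rater's marginal label counts.
    """
    assert len(rater_a) == len(rater_b), "raters must code the same items"
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the two raters' marginal distributions
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical example: two raters code 10 abstraction items as yes/no
a = ["yes"] * 6 + ["no"] * 4
b = ["yes"] * 5 + ["no"] * 5
print(round(cohens_kappa(a, b), 3))  # → 0.8
```

Here the raters agree on 9 of 10 items (90% raw agreement), but chance agreement is 50%, so κ = (0.9 − 0.5)/(1 − 0.5) = 0.8, conventionally interpreted as substantial agreement.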
Kogan JR, Holmboe E, Hauer KE. Tools to Assess Clinical Skills of Medical Trainees—Reply. JAMA. 2010;303(4):331-332. doi:10.1001/jama.2010.21