February 1987

Quantification of Agreement in Psychiatric Diagnosis Revisited

Author Affiliations

From the New York State Psychiatric Institute, New York (Drs Shrout, Spitzer, and Fleiss); and the Division of Biostatistics, School of Public Health (Drs Shrout and Fleiss), and the Department of Psychiatry (Dr Spitzer), Columbia University, New York.

Arch Gen Psychiatry. 1987;44(2):172-177. doi:10.1001/archpsyc.1987.01800140084013

Eighteen years ago in this journal, Spitzer and colleagues1 published "Quantification of Agreement in Psychiatric Diagnosis," in which they argued that a new measure, Cohen's κ statistic,2 was the appropriate index of diagnostic agreement in psychiatry. They pointed out that other measures of diagnostic reliability then in use, such as total percent agreement and the contingency coefficient, were flawed as indexes of agreement: they either overestimated the discriminating power of the diagnosticians or were affected by associations among the diagnoses other than strict agreement. The new statistic seemed to overcome these weaknesses. It took into account the fact that raters agree by chance alone some of the time, and it gave a perfect value only when there was total agreement among the raters. Furthermore, generalizations of the simple κ statistic were already available. This family of statistics could be used to assess