Background
An explicit clinical significance (CS) criterion was added to many DSM-IV diagnoses in an attempt to more closely approximate
the clinical diagnostic process and reduce the proportion of false positives
in epidemiological studies. The American Indian Service Utilization, Psychiatric
Epidemiology, Risk and Protective Factors Project (AI-SUPERPFP) offered a
unique opportunity to examine the success of this effort.
Objective
To determine the impact of distress, impairment, and help-seeking reported
in a lay structured interview on concordance with a clinical reappraisal.
Further, to test the efficacy of 5 operationalizations of CS on the concordance
and prevalence of DSM-IV lifetime disorders.
Design
Completed between 1997 and 2000, a cross-sectional probability sample
survey with clinical reappraisal of approximately 10% of participants.
Setting
General community.
Participants
A population-based sample of 3084 members of 2 American Indian tribal
groups, who were between the ages of 15 and 54 years and resided on or near
their home reservations, were randomly sampled from the tribal rolls and participated
in structured psychiatric interviews. Clinical reappraisals were conducted
with approximately 10% of the lay-interview participants. The response rate
for the lay interview was 75%, and for the clinical reappraisal it was 72%.
Main Outcome Measures
The AI-SUPERPFP Composite International Diagnostic Interview (CIDI),
a culturally adapted version of the CIDI, University of Michigan version.
The instrument was adapted to assess DSM-IV diagnoses, and questions assessing
the CS criterion were inserted in all diagnostic modules. The Structured Clinical
Interview for DSM-III-R (SCID) was used in the clinical
reappraisal.
Results
Most participants who qualified as having AI-SUPERPFP CIDI lifetime
disorders reported at least moderate levels of distress or impairment. Evidence
of increased concordance between the CIDI and the SCID was lacking when more
restrictive operationalizations of CS were used; indeed, the CIDI was very
likely to underdiagnose disorders compared with the SCID (false negatives).
Concomitantly, the CS operationalizations affected prevalence rates dramatically.
Conclusion
The CS criterion, at least as operationalized to date, demonstrates
little effectiveness in increasing the validity of diagnoses using lay-administered
structured interviews.
Recent advances in the Diagnostic and Statistical Manual of Mental Disorders (DSM)1 have focused on developing definitions of mental disorder that faithfully represent clinicians’ experiences and can be consistently replicated among practitioners.2 Although primarily a clinical tool, the DSM’s formal operationalizations of disorder also allow researchers to design and field structured and semistructured interview protocols, making possible estimation of the prevalence and incidence of the more common mental disorders within populations. Recently, critiques of such estimates have shifted their focus from reliability to validity.3-6 In particular, many question whether structured lay interviews overestimate the rates of disorder.3 This “false positive” problem is probably best understood in the context of the largest psychiatric epidemiological studies in the United States to date. The Epidemiologic Catchment Area (ECA) studies7 and the National Comorbidity Surveys (NCS)8,9 suggest that, in any given year, almost 20% to 30% of the population experiences a mental or addictive disorder, while lifetime rates range between 32% and 49%.4,7 Clearly, the implications of such estimates for mental health policy in the United States are enormous and have called into question whether these rates accurately reflect the need for treatment.3,10-13
Critiques of this epidemiological research focus on whether the diagnoses and the criteria defining them successfully differentiate disorder from “problems of living,”14,15 in other words, whether the thresholds for disorder in such instruments are too low. Hence, in the revisions that occurred between the DSM-III-R16 and the DSM-IV,17 an explicit clinical significance (CS) criterion was added to many diagnoses to address at least 1 type of overinclusion: those meeting symptomatic criteria for a disorder but for whom such problems were mild. Although worded somewhat differently across diagnoses, the CS criterion typically asserts that “the symptoms cause clinically significant distress or impairment in social, occupational, or other important areas of functioning.”17(p1857)
The conceptual utility and validity of the CS criterion have been much debated3,13,14,18-20; however, empirical examinations of the CS criterion have only recently appeared in the literature. Narrow et al4 and Regier et al,21 in secondary analyses of the ECA and NCS studies, ascribed CS to participants who had either sought help for specific disorders or reported that these problems interfered with their lives or activities “a lot.” With this operationalization, admittedly limited by the data available in these surveys that predated the DSM-IV, past-year prevalence rates decreased 17% for the ECA (based on the DSM-III22) and 32% for the NCS (DSM-III-R). Participants meeting both the CS and the symptomatic criteria were more likely than those meeting only symptomatic criteria to have sought services for mental health problems, to have reported either not being able to work or needing to cut down on work, and to have been suicidal. Slade and Andrews,23 using data from the Australian National Survey of Mental Health and Well-Being (ANSMHWB),24 found that the inclusion of significant self-reported distress or impairment also decreased the DSM-IV rates of past-month disorder, between 19% for major depressive disorder and 65% for obsessive-compulsive disorder. Controlling for sociodemographics and comorbidity, participants with significant distress/impairment were more likely to report help-seeking, distress, and impairment in other parts of the interview, but only for some diagnoses.
These reanalyses of the ECA, NCS, and ANSMHWB demonstrate that adding CS to the symptomatic criteria substantially lowers prevalence rates and may increase the severity of the resulting diagnoses. However, they also raise additional questions. Each study operationalized CS somewhat differently; also, none was able to compare the impact that multiple operationalizations of CS might have on prevalence. For instance, should only individuals reporting “a lot” of distress or impairment be considered to have met CS or might those reporting moderate or mild distress also qualify as disordered?20 What role should help-seeking play in this calculus? And what is the degree of overlap among distress, impairment, and help-seeking?5 Finally, none of these efforts used direct assessments of whether including the CS criterion would increase the concordance between lay- and clinician-administered interviews.25
Data from the American Indian Service Utilization, Psychiatric Epidemiology, Risk and Protective Factors Project (AI-SUPERPFP) provided us an opportunity to address these questions. Designed before the World Health Organization’s DSM-IV version of the Composite International Diagnostic Interview (WHO-CIDI 2.1,26 used by Slade and Andrews23) was widely available, the AI-SUPERPFP independently supplemented the CIDI, University of Michigan version (UM-CIDI,27 used in the NCS) with items necessary to assess DSM-IV criteria in a format conducive to investigating the utility of different operationalizations of CS and the overlap among these constructs. A clinical reappraisal of more than 10% of the sample enabled us to investigate whether the concordance between the lay- and clinician-administered interviews increased with the inclusion of CS.
AI-SUPERPFP lay-interview design and samples
The AI-SUPERPFP sought to estimate the prevalence of psychiatric disorders and health service utilization in 2 American Indian reservation populations. The AI-SUPERPFP methods are described in detail elsewhere28; the interview and training manual are available online at http://www.uchsc.edu/ai/ncaianmhr/presentresearch/superprj.htm. The populations of inference were enrolled members of either 2 closely related Northern Plains tribes or a Southwest tribe who were 15 to 54 years old at the time of development of the sample frame (1997) and who lived on or within 20 miles of their reservations. Once located and found to be eligible, 73.7% (n = 1446) from the Southwest tribe and 76.8% (n = 1638) from the Northern Plains tribe agreed to participate. Tribal approvals were obtained prior to the project’s beginning. Informed consent was acquired from all participants; with minors, parental/guardian consent was obtained before adolescent assent.
As explained in greater detail elsewhere,28 carefully considered cultural adaptations had already been completed as part of a previous effort conducted between 1992 and 1995.29 Because that process preceded release of the WHO-CIDI 2.1, this measure was independently augmented and adapted to render it consistent with the DSM-IV.
The present analyses included lifetime diagnoses of the AI-SUPERPFP disorders: panic disorder (PD), generalized anxiety disorder (GAD), posttraumatic stress disorder (PTSD), dysthymic disorder (DD), major depressive episode (MDE), substance abuse, and substance dependence. Substances included alcohol, sedatives, tranquilizers, stimulants, analgesics, inhalants, marijuana, cocaine, hallucinogens (including peyote), and heroin, each individually assessed and then combined into substance abuse or substance dependence. Additional aggregations included any anxiety disorder (those with GAD, PD, or PTSD), any depressive disorder (MDE or DD), any substance disorder (either abuse or dependence), any anxiety or depressive disorder, and any disorder.
Assessment of CS constructs
Figure 1 summarizes the operationalizations of CS by Narrow et al4 and Regier et al21 using the Diagnostic Interview Schedule (DIS)30 and the UM-CIDI, those of Slade and Andrews23 using the WHO-CIDI 2.1 in the ANSMHWB, and, finally, the AI-SUPERPFP CIDI measures of distress, impairment, and help-seeking.
Because the ECA-DIS and UM-CIDI predated the DSM-IV, they did not assess CS directly. However, the ECA-DIS used the probe flowchart to distinguish symptoms and syndromes (groupings of symptoms) that were possibly psychiatric from others, a judgment that overlaps but is not synonymous with CS. This set of probes commenced with the question “Did you ever tell a doctor about your [problems just endorsed]?” and continued with questions about other help-seeking, use of medication, and impairment in terms of the specific symptom/syndrome. When the probe flowchart was assessed at the symptom level, only individuals with probable psychiatric symptoms advanced in the diagnostic modules. As shown in Figure 1, the ECA-DIS used the probe flowchart at the symptom level for PD and DD, and thus the probe flowchart was a necessary component of these diagnoses. On the other hand, the probes for MDE and substance disorders were asked at the syndrome level and have been only recently introduced into the prevalence calculus in the work by Narrow et al4 and Regier et al.21 Similarly, the UM-CIDI assessed help-seeking, medication use, and impairment at the syndrome level; again, these data did not influence prevalence estimates until the recent secondary analyses. In the WHO-CIDI 2.1 algorithms, an instrument designed to assess DSM-IV, the probe flowchart was used at the symptom level only for MDE and DD; otherwise, it was used at the syndrome level. The distress and impairment questions specific to the DSM-IV were asked for GAD, PTSD, and MDE (impairment only).
In the AI-SUPERPFP CIDI, identical items assessed self-reported distress, impairment, and help-seeking at the syndrome level across diagnoses. Impairment and help-seeking questions were patterned after those of the UM-CIDI but altered slightly to maximize cultural validity. For instance, whereas the UM-CIDI included “a little” as a response option in its impairment items, this was dropped in the AI-SUPERPFP because focus groups suggested that participants were unlikely to reliably differentiate between “some” and “a little.” Help-seeking questions reflected the service ecology of reservation residents by including specific types of service providers (eg, community health representatives as medical personnel) and directly assessing use of traditional healing resources.
Operationalizations of CS
Using the AI-SUPERPFP data, 5 operationalizations of CS were compared. The first (CS0) excluded the CS criterion. The next 3 operationalizations built upon one another, ranging from less (CS1) to more (CS3) restrictive, and focused on degrees of self-reported distress and impairment, adhering closely to the DSM-IV CS language. The final operationalization (CS4) included both help-seeking and impairment and most closely mimicked the probe flowchart.
Operationalization CS0: no assessment of clinically significant distress or impairment. Diagnoses were based on symptomatic criteria only.
Operationalization CS1: “a lot or some” impairment or distress. A disorder was considered clinically significant if the participant reported “a lot” or “some” distress or impairment. This follows Wakefield and Spitzer’s5 suggestion that those expressing either moderate or severe distress or impairment should be considered “true” positives.
Operationalization CS2: “a lot” of distress or impairment or “some” of both. As a variation of the definition of moderate disability, we included an operationalization whereby experiencing “a lot” of distress or impairment or some deleterious effects in multiple domains merited a diagnosis. This operationalization represented a middle ground between CS1 and CS3.
Operationalization CS3: “a lot” of distress or impairment. Here participants reporting “a lot” of distress or impairment were considered to meet the definition of CS. This operationalization was closest to that of Slade and Andrews.23
Operationalization CS4: help-seeking or “a lot” of impairment.4 This operationalization most closely matches the work of Narrow et al4 and Regier et al,21 albeit at the syndrome level. Seeing or talking to a mental health provider or other medical personnel about the psychiatric symptoms or having been hospitalized for these problems constituted help-seeking.
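The 5 operationalizations can be read as decision rules over the 3 self-report constructs. The following Python sketch is illustrative only and is not the project's scoring code: the response coding ("none", "some", "a lot"), the `Responses` record, and the `meets_cs` function are hypothetical names introduced here, and the rules assume the participant has already met the symptomatic criteria for the disorder.

```python
from dataclasses import dataclass

# Hypothetical ordinal coding of the self-report response options.
LEVELS = {"none": 0, "some": 1, "a lot": 2}


@dataclass
class Responses:
    distress: str                 # "none", "some", or "a lot"
    impairment: str               # "none", "some", or "a lot"
    sought_biomedical_help: bool  # provider contact or hospitalization


def meets_cs(r: Responses, rule: str) -> bool:
    """Return True if the responses satisfy the named CS operationalization."""
    d = LEVELS[r.distress]
    i = LEVELS[r.impairment]
    if rule == "CS0":   # symptomatic criteria only, no CS requirement
        return True
    if rule == "CS1":   # "a lot" or "some" distress or impairment
        return d >= 1 or i >= 1
    if rule == "CS2":   # "a lot" of either, or "some" of both
        return d == 2 or i == 2 or (d >= 1 and i >= 1)
    if rule == "CS3":   # "a lot" of distress or impairment
        return d == 2 or i == 2
    if rule == "CS4":   # help-seeking or "a lot" of impairment
        return r.sought_biomedical_help or i == 2
    raise ValueError(f"unknown operationalization: {rule}")


# Example: moderate distress and impairment, no help-seeking.
moderate = Responses("some", "some", False)
print([rule for rule in ("CS0", "CS1", "CS2", "CS3", "CS4")
       if meets_cs(moderate, rule)])
```

Under this coding, CS1 through CS3 are nested (any participant meeting CS3 also meets CS2 and CS1), whereas CS4 is not nested within them because help-seeking alone suffices.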
The AI-SUPERPFP included a clinical reappraisal of approximately 10% (n = 335) of participants, who were reinterviewed by psychiatrists or clinical psychologists. This component was designed to assess the concordance between the AI-SUPERPFP CIDI and the Structured Clinical Interview for DSM-III-R, nonpatient version (SCID).31 Approximately 75% of the clinical reappraisal sample was randomly chosen based on a positive CIDI diagnosis of the 3 most common disorders: MDE, PTSD, or alcohol abuse/dependence. The remaining sample, also randomly selected, did not qualify for any AI-SUPERPFP diagnosis. Because the greatest source of error between lay and clinical interviews is commonly found in those who have some but not all of the symptoms required for a diagnosis (subthreshold cases),32 approximately half of those in the no-disorder group endorsed significant levels of depressed, anxious, or irritable symptoms on a checklist independent of the CIDI, with the remainder having few or none of these symptoms. The SCID was adapted to allow for changes between the DSM-III-R and the DSM-IV, including the CS criterion. The 8 clinician interviewers had extensive clinical experience (more than 15 years on average) and had worked in American Indian communities. Before entering the field, each demonstrated a high level of interrater reliability (κ ≥ .80) in a series of videotapes coded by an expert panel, and they also performed supervised interviews with members of the local American Indian community. Furthermore, all clinical reappraisals were audiotaped and reviewed by master clinicians for quality assurance purposes. The response rate for the clinical reappraisal substudy was 72.3% and was similar to that for the CIDI-disordered and -nondisordered participants. An average of 120 days elapsed between lay and clinical interviews. There was no association between the time elapsed between interviews and agreement between the CIDI and the SCID. Clinicians were blind to participants’ CIDI diagnostic status.
Variable construction and noninferential analyses were completed using SAS,33 SPSS,34 and Stata.35 First, to better understand the patterns of distress, impairment, and help-seeking among participants meeting symptom criteria for AI-SUPERPFP CIDI disorders, frequencies and associated confidence intervals for these constructs are presented in Table 1 for each disorder. Two Venn diagrams (Figure 2) depict the overlap among these constructs for those qualifying for at least 1 disorder. As a second step, concordance between the AI-SUPERPFP CIDI and the SCID was evaluated by assessing the numbers of true and false positives and negatives generated by the CIDI when the SCID was considered the gold standard, as well as the following set of standard statistics: (1) Cohen κ36; (2) sensitivity and specificity; (3) positive and negative predictive values calculated using the Bayes rule37; and (4) the McNemar χ2 test (a measure of bias).38 Although not an assessment of concordance, the Global Assessment of Functioning39 scores, determined by the clinicians during the reappraisal, provided a measure of severity of disorder and are included in Table 2. Finally, Table 3 presents the differential prevalences, with associated 95% confidence intervals, of specific and aggregated disorders across the 5 CS operationalizations using the AI-SUPERPFP CIDI. The results in Table 1 and Table 3 provide inferences to the populations and were conducted in Stata35 using sample and nonresponse weights. Since the concordance analyses in Table 2 focus on relative functioning of instruments, unweighted estimates were deemed acceptable32 and offered the added benefit of being able to provide the actual numbers of participants in various cells in the table.
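For readers less familiar with these concordance statistics, the sketch below shows how each can be computed from a 2 × 2 cross-classification of CIDI against SCID results. It is a minimal illustration, not the analysis code used in the study: the `concordance` function, its optional `prevalence` argument (the rate of disorder entered into the Bayes rule, defaulting to the sample rate), and the example counts are all hypothetical, and the McNemar statistic is shown without a continuity correction, which is an assumption.

```python
def concordance(tp, fp, fn, tn, prevalence=None):
    """Agreement statistics for a 2x2 table with the SCID as the reference.

    tp, fp, fn, tn: counts of CIDI true positives, false positives,
    false negatives, and true negatives relative to the SCID.
    prevalence: assumed rate of disorder for the Bayes rule; if omitted,
    the sample rate (tp + fn) / n is used.
    """
    n = tp + fp + fn + tn
    sens = tp / (tp + fn)          # CIDI-positive among SCID cases
    spec = tn / (tn + fp)          # CIDI-negative among SCID noncases
    prev = (tp + fn) / n if prevalence is None else prevalence

    # Predictive values from the Bayes rule.
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)

    # Cohen kappa: observed agreement corrected for chance agreement.
    p_obs = (tp + tn) / n
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)

    # McNemar chi-square on the discordant cells (no continuity correction).
    discordant = fp + fn
    mcnemar = (fp - fn) ** 2 / discordant if discordant else 0.0

    return {"sensitivity": sens, "specificity": spec, "ppv": ppv,
            "npv": npv, "kappa": kappa, "mcnemar_chi2": mcnemar}


# Hypothetical counts, for illustration only.
stats = concordance(tp=40, fp=10, fn=20, tn=30)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this framing, an excess of false negatives over false positives (fn > fp) is what a significant McNemar statistic would flag as bias toward underdiagnosis by the lay interview.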
Table 1 presents the self-reported distress, impairment, and help-seeking for participants meeting the symptom criteria for DSM-IV lifetime disorders in AI-SUPERPFP. When considering distress, among those with any AI-SUPERPFP CIDI disorder, 46.5% reported being upset or bothered “a lot” (distressed) by their symptoms. Between 35% and 50% of those with depressive or anxiety disorders reported such levels of distress. Among those with substance problems, a majority with dependence reported high levels of distress compared with about one quarter with substance abuse. When reports of “some” distress were included, more than 90% of participants who were assigned a diagnosis other than substance abuse reported being upset or bothered by their symptoms. Overall, an additional 39.9% (95% confidence interval, 36.4%-46.5%) with any disorder reported being bothered or upset “some” when compared with “a lot,” with the difference somewhat larger for depressive disorders (56.2%) than either anxiety disorders (44.4%) or substance disorders (40.2%).
Similarly, for impairment, one third of participants with any disorder reported that their symptoms interfered with their lives and activities “a lot.” Those with substance dependence ranked highest, while those with substance abuse ranked lowest. The difference between “a lot” and “some” was 49.2% overall and followed patterns similar to those seen for distress.
Turning to help-seeking, almost half of the participants qualifying for 1 of the AI-SUPERPFP disorders had sought biomedical help for their symptoms; about one third had sought help from traditional healing sources. Generally, the pattern of seeking help from traditional sources mirrored that from biomedical sources. Combining traditional with biomedical sources of help-seeking increased the rates by 9.6%.
Overall, 78% of the clinical reappraisal sample was judged by the SCID interviewers to have a disorder; without considering CS (that is, CS0), the CIDI designated 70% of this select sample as having at least 1 DSM-IV disorder. The concordance between the AI-SUPERPFP CIDI and clinical reappraisals with the SCID is presented in Table 2. Similar to others’ reports,38,40 agreement between these clinical and lay methods of case ascertainment was modest. However, the hypothesis tested here was whether inclusion of the CS operationalizations in the lay-interview data would more closely approximate clinical diagnoses.
Focusing first on distress, impairment, and help-seeking constructs, the κ values were lower for the more restrictive response patterns. Sensitivity assesses the ability of the CIDI to record a positive diagnosis for SCID-defined cases; here, a 26% decrease in sensitivities arose when we designated those having “a lot” of distress as meeting CS compared with including “some or a lot.” In contrast, the specificity or ability of the CIDI to identify the SCID-defined noncases increased 10% between the same definitions. The positive predictive values indicated that the vast majority of the CIDI-defined cases also received SCID diagnoses. The negative predictive values revealed that as the CS criteria became more restrictive, a greater percentage of CIDI-defined noncases actually received SCID diagnoses. The bias was more severe as the requirements increased for being labeled distressed, impaired, and a help-seeker. Finally, although the Global Assessment of Functioning scores decreased with the more restrictive operationalizations, the differences between them were minimal.
To this point, we have discussed distress, impairment, and help-seeking as separate constructs. Figure 2 depicts the overlap among distress, impairment, and help-seeking for participants qualifying for any of the AI-SUPERPFP disorders. When reports of “a lot” of either distress or impairment were required, almost one third (32.3%) of participants meeting symptomatic criteria failed to report sufficient distress/impairment or help-seeking. Less than 20% reported all 3, and 16.4% reported help-seeking only. Considering the overlap when participants who reported “some” or “a lot” of distress or impairment were included, many fewer reported no distress, impairment, or help-seeking (8.1%). Here, the largest categories included participants who reported all 3 indicators of CS (42.4%) and those who reported both distress and impairment (35.7%). Thus, the extent of this overlap varied dramatically by the inclusiveness of the responses considered as markers of distress and impairment.
Table 2 also examines the concordance statistics for 4 methods of combining these distress, impairment, and help-seeking measures with varying levels of response inclusiveness, informed by previous work in this area. Thus, CS3 (“a lot” of distress or impairment) approximated the definition found in Slade and Andrews,23 while CS4 (help-seeking or “a lot” of impairment) was closest to the approaches of Narrow et al4 and Regier et al.21 Operationalizations CS1 (“some” distress or impairment) and CS2 (“a lot” of distress or impairment or “some” of both) followed Wakefield and Spitzer’s5 suggestions and included moderate levels of distress and impairment. Once again, as the operationalizations became more restrictive, the κ values decreased, the sensitivities decreased more dramatically than the specificities increased, the positive predictive values remained stable while the negative predictive values decreased dramatically, the biases increased, and the Global Assessment of Functioning scores remained quite stable. These results indicate that CS was not the source of disagreement between the AI-SUPERPFP CIDI and the SCID.
Prevalence across different operationalizations of clinically significant distress or impairment
Next we turn to the effect the 5 operationalizations of CS had on prevalence estimates, focusing first on “any disorder.” Operationalization CS3 is the most conservative: the prevalence rate was 50.1% of that when no CS criterion was used (22.8% using CS3 compared with 45.5% with no CS criterion). Similarly, the relative rate for CS4 was 58.0%, 77.4% for CS2, and 88.3% for CS1. When the “any disorder” category was restricted to only diagnoses with an explicit CS criterion in the DSM-IV (GAD, PTSD, MDE, and DD), the relative rates were less dramatic and ranged between 71.4% for CS3 and 95.6% for CS1. Figure 3 demonstrates the relative diminution by disorder across the various operationalizations. Imposing the CS criterion had the greatest impact for substance abuse. Although the DSM-IV implies that CS is less relevant to defining substance dependence and PD, the patterns in these instances were quite similar to those of other disorders. When compared with the secondary analyses of the NCS of Narrow et al4 and Regier et al21 (see Table 2 of Narrow et al4), the AI-SUPERPFP prevalence rates applying CS4 were reduced to a greater extent than was observed in the NCS (PD reduced 35.3% [100% − (2.2%/3.4%) = 35.3%] compared with 22.7%; GAD, 34.5% compared with 17.6%; MDE, 40.0% compared with 36.6%; DD, 45.5% compared with 28.0%; and substance disorders, 50.6% compared with 33.9% for the AI-SUPERPFP and the NCS, respectively).
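The relative rates and percentage reductions quoted above follow from simple ratios of prevalence estimates. The short Python sketch below reproduces the two worked figures given in the text (CS3 relative to no CS criterion for any disorder, and the CS4 reduction for PD); the function names are hypothetical and the inputs are the rounded percentages reported in this section.

```python
def relative_rate(prev_with_cs, prev_without_cs):
    """Prevalence under a CS rule as a percentage of the CS0 prevalence."""
    return 100.0 * prev_with_cs / prev_without_cs


def percent_reduction(prev_with_cs, prev_without_cs):
    """Percentage by which a CS rule lowers the CS0 prevalence."""
    return 100.0 - relative_rate(prev_with_cs, prev_without_cs)


# Any disorder: 22.8% under CS3 versus 45.5% with no CS criterion.
print(round(relative_rate(22.8, 45.5), 1))      # 50.1

# Panic disorder under CS4: 2.2% versus 3.4% with no CS criterion.
print(round(percent_reduction(2.2, 3.4), 1))    # 35.3
```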
The concern that psychiatric epidemiological methods, especially those relying on structured protocols administered by lay interviewers, should reflect the clinical expertise inherent in the diagnostic process of the DSM was a driving force behind the deliberate inclusion of the CS criterion in the DSM-IV. Subsequently, 2 basic questions have informed epidemiological work on CS. First, were the unexpectedly high rates reported by the ECA and the NCS inflated by “false positives,” that is, by participants’ meeting the symptomatic criterion for disorder but for whom such problems were relatively inconsequential?4,7 Second, given that a central goal of studies like the ECA and the NCS was to project treatment need, did adding CS to the diagnostic algorithm render the estimates more appropriate for mental health policy planning?4,20 The data presented here, and our experiences in cross-cultural settings generally, provide additional insight into these questions and further inform the debate about CS in preparation for the DSM-V.
The AI-SUPERPFP DSM-IV rates of any lifetime disorder were 45.5% when CS was not considered. When the DSM-III-R rates were compared directly with those of the NCS, the prevalence of disorders was similar across the 3 samples, although the American Indian lifetime rates were somewhat higher (a range of 2% to 10% based on tribe and sex) than those in the NCS.41 The next question, then, was whether false positives inflated these rates. Using these methods within this cultural context, false negatives rather than false positives were the major source of discrepancy between lay- and clinician-administered interviews. This finding does not appear to be specific to the AI-SUPERPFP, having also been reported with the ECA data.32,38,42 Within the baseline NCS, MDE exhibited significant levels of false positives when compared with the SCID; the bias for most disorders was in the opposite direction, although statistically significant only for simple phobias.27 The recent NCS-Replication appears to have addressed the issue of false positives for lifetime but not current MDE when compared with clinical reappraisals.9 Thus, with the exception of the baseline NCS MDE rates, lay interviews appear to underestimate rather than overestimate lifetime disorder when compared with clinical interviews, the reverse of the bias the CS criterion was designed to address. Indeed, Spitzer and Wakefield17,20 anticipated that the CS criterion might dramatically increase the false negatives while having little influence on the false positives. These findings support their hypothesis.
Thus, if the SCID were considered a true gold standard, the conclusion that the CS criterion should be ignored would be reasonable, for the AI-SUPERPFP at least. However, we are not yet asserting the primacy of one method of case ascertainment over the other. Rather, we consider both lay- and clinician-administered interviews to be potentially biased. For instance, underreporting may be more common in the lay interviews, while clinicians may be more prone to attribute disorder to “normal” behaviors.43,44 Furthermore, biases inherent in the measurement of DSM-defined diagnoses in these cultural contexts may differentially affect lay- and clinician-administered interviews. While in-depth investigations of the relative validity of the lay- and clinician-administered protocols will be a focus of our investigative energies in coming years, the current analyses have implications for others using DSM-IV definitions of disorder.
As seen in Figure 1, the operationalizations of CS to date have differed considerably. Neither the ECA-DIS nor the UM-CIDI was designed to assess DSM-IV disorders; thus, their differences preceded the explicit inclusion of the CS criterion in the DSM-IV. At the same time, their probe-flowchart data and the resulting definitions of “probable” psychiatric symptoms drove much of the pre-DSM-IV debate about CS. The WHO-CIDI 2.1 was designed for the DSM-IV and, in many senses, represents a compromise between the ECA-DIS and the UM-CIDI approaches with its use of the probe flowchart, mostly at the syndrome level, and with individual items also assessing clinical significance for GAD, PTSD, and MDE. Figure 1 illustrates considerable variation in the measurement of CS; further, the items often do not closely match the DSM-IV language. Survey methodologists have consistently demonstrated the large impact of even small differences in wording.45 As a limitation to the joint ECA/NCS analyses, Narrow et al4 pointed out that the ECA impairment question included “a lot” in the stem whereas NCS did not. Thus, it is unknown whether participants answering “some” to the NCS question might choose “yes” or “no” in the ECA version.4 As shown in Table 1 and Table 3, the AI-SUPERPFP data suggest that such differences may be substantial.
Figure 1 also highlights the differential inclusion of help-seeking in the diagnostic calculus across instruments. Table 1 and Figure 2 provide data on the prevalence of help-seeking in the AI-SUPERPFP samples and the overlap among help-seeking, distress, and impairment. Individuals seeking help for their symptoms are often distressed or impaired; however, the inclusion of help-seeking may carry with it an assumption that adequate services are available and known to be efficacious and acceptable to community members.46 In preparation for the AI-SUPERPFP, both the ECA-DIS and the UM-CIDI were submitted to focus group review, and concerns were raised about the use of the ECA-DIS probe flowchart with its help-seeking stem question. In particular, our informants suggested that many American Indians with emotional problems have learned from hard experience that local service providers are few in number and often lack the expertise or training to treat such matters. Also, many American Indians who suffer from mental disorders seek treatment from traditional healing sources. Even at this earlier stage in our research, therefore, serious concerns were raised about the use of a help-seeking question as a conditional definition of probable symptoms.
Before concluding, limitations of the current work deserve mention. The samples from which these data were derived limit the inferences drawn. These data were restricted to American Indian participants and, even then, represented only 3 of more than 300 federally recognized American Indian tribes; participants were restricted to members living on or near their reservations and covered a limited age range. Further, the analyses were limited; in particular, the concordance analyses required an assumption that the SCID be considered a gold standard, and thus the estimations of the sensitivities and specificities were likely biased to some degree.47 Finally, we did not assess the viability of other constructs such as “harmful dysfunction”6 to explain the remaining false positives; an investigation of the cultural definitions of such constructs in American Indian communities is strongly recommended.
Even with these limitations, the current work informs ongoing debates about the CS criterion and other definitions of probable psychiatric symptoms. As others have noted,5,17 the lack of consistency with which the CS criterion is applied in the DSM-IV is unsettling and may reflect the ambivalence of the field about this construct. In the absence of biological markers, most diagnoses must superimpose a threshold on dimensions of psychopathology.48,49 Further inclusion of thresholds based on disability threatens to make the diagnostic calculus unmanageable and, on the basis of the data reported here, may have limited value. As previously argued,50 and as operationalized now in the International Classification of Diseases, 10th Revision,25 we suggest that the authors of the DSM seriously consider uncoupling assessments of disability from diagnosis, which would serve to emphasize, in a slightly different manner than do CS criteria, that diagnosis should not in itself be equated with medical necessity.3,12
Correspondence: Janette Beals, PhD, American Indian and Alaska Native Programs, University of Colorado Health Sciences Center, MS F800, PO Box 6508, Aurora, CO 80045-0508 (jan.beals@uchsc.edu).
Submitted for Publication: October 20, 2003; final revision received December 30, 2003; accepted April 21, 2004.
Additional Authors/The AI-SUPERPFP Team: Cecelia K. Big Crow, Dedra Buchwald, MD, Buck Chambers, Michelle L. Christensen, PhD, Denise A. Dillard, PhD, Karen DuBray, Paula A. Espinoza, PhD, Candace M. Fleming, PhD, Ann Wilson Frederick, Diana Gurley, PhD, Lori L. Jervis, PhD, Shirlene M. Jim, Carol E. Kaufman, PhD, Ellen M. Keane, Suzell A. Klein, Denise Lee, Monica C. McNulty, Denise L. Middlebrook, PhD, Laurie A. Moore, Tilda D. Nez, Ilena M. Norton, MD, Carlette J. Randall, Angela Sam, James H. Shore, MD, Sylvia G. Simpson, MD, and Lorette L. Yazzie.
Funding/Support: This study was supported by the following grants from the National Institutes of Health (NIH), Bethesda, Md: R01 MH48174 (Dr Manson) and P01 MH42473 (Dr Manson). Manuscript preparation was supported by NIH grants R01 DA14817 (Dr Beals) and R01 AA13420 (Dr Beals).
Acknowledgment: The AI-SUPERPFP would not have been possible without the significant contributions of many people. The following interviewers and computer/data management and administrative staff supplied energy and enthusiasm for an often difficult job: Amelia T. Begay, Cathy A. E. Bell, Mary Cook, Helen J. Curley, Mary C. Davenport, Rhonda Wiegman Dick, Marvine D. Douville, Geneva Emhoolah, Fay Flame, Roslyn Green, Billie K. Greene, Jack Herman, Tamara Holmes, Shelly Hubing, Cameron R. Joe, Louise F. Joe, Cheryl L. Martin, Jeff Miller, Robert H. Moran, Jr, Natalie K. Murphy, Ralph L. Roanhorse, Margo Schwab, PhD, Jennifer Settlemire, Donna M. Shangreaux, Matilda J. Shorty, Selena S. S. Simmons, Jennifer Truel, Lori Trullinger, Jennifer M. Warren, Theresa (Dawn) Wright, Jenny J. Yazzie, and Sheila A. Young. We would also like to acknowledge the contributions of the Methods Advisory Group: Margarita Alegria, PhD, Evelyn J. Bromet, PhD, Dedra Buchwald, MD, Steven G. Heeringa, PhD, Ronald Kessler, PhD, Peter Guarnaccia, PhD, R. Jay Turner, PhD, and William A. Vega, PhD. William E. Narrow, MD, Tim Slade, PhD, and Gavin Andrews, MD, are gratefully acknowledged for excellent suggestions based on a review of the manuscript before submission. We are also indebted to the ARCHIVES reviewers, whose comments greatly improved the manuscript. Finally, we thank the tribal members who so generously answered all the questions asked of them.
1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition. Washington, DC: American Psychiatric Association; 1994.
2. Spitzer RL. Values and assumptions in the development of DSM-III and DSM-III-R: an insider’s perspective and a belated response to Sadler, Hulgus, and Agich’s “On values in recent American psychiatric classification.” J Nerv Ment Dis. 2001;189:351-359.
3. Regier DA, Kaelber CT, Rae DS, Farmer ME, Knauper B, Kessler RC, Norquist GS. Limitations of diagnostic criteria and assessment instruments for mental disorders: implications for research and policy. Arch Gen Psychiatry. 1998;55:109-115.
4. Narrow WE, Rae DS, Robins LN, Regier DA. Revised prevalence estimates of mental disorders in the United States: using a clinical significance criterion to reconcile 2 surveys’ estimates. Arch Gen Psychiatry. 2002;59:115-123.
5. Wakefield JC, Spitzer RL. Why requiring clinical significance does not solve epidemiology’s and DSM’s validity problem: response to Regier and Narrow. In: Helzer JE, Hudziak JJ, eds. Defining Psychopathology in the 21st Century: DSM-V and Beyond. Washington, DC: American Psychiatric Publishing Inc; 2002:31-40.
7. Robins LN, Regier DA. Psychiatric Disorders in America: The Epidemiologic Catchment Area Study. New York, NY: The Free Press; 1991.
8. Kessler RC, McGonagle KA, Zhao S, Nelson CB, Hughes M, Eshleman S, Wittchen HU, Kendler KS. Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States: results from the National Comorbidity Survey. Arch Gen Psychiatry. 1994;51:8-19.
9. Kessler RC, Berglund P, Demler O, Jin R, Koretz D, Merikangas KR, Rush AJ, Walters EE, Wang PS; National Comorbidity Survey Replication. The epidemiology of major depressive disorder: results from the National Comorbidity Survey Replication (NCS-R). JAMA. 2003;289:3095-3105.
10. Andrews G, Henderson AS, eds. Unmet Need in Psychiatry. Cambridge, England: Cambridge University Press; 2000.
11. Regier DA, Narrow WE, Rupp A, Rae DS, Kaelber CT. The epidemiology of mental disorder treatment need: community estimates of “medical necessity.” In: Andrews G, Henderson S, eds. Unmet Need in Psychiatry: Problems, Resources, Responses. New York, NY: Cambridge University Press; 2000:41-58.
12. Ford WE. Medical necessity: its impact in managed mental health care. Psychiatr Serv. 1998;49:183-184.
14. Wakefield JC. DSM-IV: are we making diagnostic progress? Contemp Psychology. 1996;41:646-652.
16. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition. Washington, DC: American Psychiatric Association; 1987.
17. Spitzer RL, Wakefield JC. DSM-IV diagnostic criterion for clinical significance: does it help solve the false positives problem? Am J Psychiatry. 1999;156:1856-1864.
21. Regier DA, Narrow WE. Defining clinically significant psychopathology with epidemiologic data. In: Helzer JE, Hudziak JJ, eds. Defining Psychopathology in the 21st Century: DSM-V and Beyond. Washington, DC: American Psychiatric Publishing Inc; 2002:19-30.
22. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, Third Edition. Washington, DC: American Psychiatric Association; 1980.
24. Andrews G, Henderson S, Hall W. Prevalence, comorbidity, disability and service utilisation: overview of the Australian National Mental Health Survey. Br J Psychiatry. 2001;178:145-153.
25. Ustun TB, Chatterji S, Rehm J. Limitations of diagnostic paradigm: it doesn’t explain “need.” Arch Gen Psychiatry. 1998;55:1145-1148.
26. Andrews G, Peters L. The psychometric properties of the Composite International Diagnostic Interview. Soc Psychiatry Psychiatr Epidemiol. 1998;33:80-88.
27. Kessler RC, Wittchen H-U, Abelson JM, McGonagle KA, Kendler KS, Knauper B, Zhao S. Methodological studies of the Composite International Diagnostic Interview (CIDI) in the US National Comorbidity Survey. Int J Methods Psychiatr Res. 1998;7:33-55.
28. Beals J, Manson SM, Mitchell CM, Spicer P, the AI-SUPERPFP Team. Cultural specificity and comparison in psychiatric epidemiology: walking the tightrope in American Indian research. Cult Med Psychiatry. 2003;27:259-289.
29. Beals J, Manson SM, Shore JH, Friedman M, Ashcraft M, Fairbank JA, Schlenger WE. The prevalence of posttraumatic stress disorder among American Indian Vietnam veterans: disparities and context. J Trauma Stress. 2002;15:89-97.
30. Robins LN, Helzer JE, Croughan J, Ratliff KS. National Institute of Mental Health Diagnostic Interview Schedule: its history, characteristics, and validity. Arch Gen Psychiatry. 1981;38:381-389.
31. Spitzer R, Williams J, Gibbon M. Structured Clinical Interview for DSM-III-R, Version NP. New York, NY: New York Psychiatric Institute Biometrics Research Department; 1987.
32. Eaton WW, Neufeld K, Chen LS, Cai G. A comparison of self-report and clinical diagnostic interviews for depression: diagnostic interview schedule and schedules for clinical assessment in neuropsychiatry in the Baltimore epidemiologic catchment area follow-up. Arch Gen Psychiatry. 2000;57:217-222.
33. SAS Language [computer program]. Version 8.2. Cary, NC: SAS Institute; 2001.
34. SPSS [computer program]. Version 11.0. Chicago, Ill: SPSS Inc; 2001.
35. Kish L. Survey Sampling. New York, NY: John Wiley and Sons; 1965.
37. Rosner B. Fundamentals of Biostatistics. 5th ed. Pacific Grove, Calif: Duxbury; 2000.
38. Helzer JE, Robins LN, McEvoy LT, Spitznagel EL, Stoltzman RK, Farmer A, Brockington IF. A comparison of clinical and diagnostic interview schedule diagnoses: physician reexamination of lay-interviewed cases in the general population. Arch Gen Psychiatry. 1985;42:657-666.
39. Endicott J, Spitzer RL, Fleiss JL, Cohen J. The global assessment scale: a procedure for measuring overall severity of psychiatric disturbance. Arch Gen Psychiatry. 1976;33:766-771.
40. Brugha TS, Jenkins R, Taub N, Meltzer H, Bebbington PE. A general population comparison of the Composite International Diagnostic Interview (CIDI) and the Schedules for Clinical Assessment in Neuropsychiatry (SCAN). Psychol Med. 2001;31:1001-1013.
41. Beals J, Manson SM, Whitesell NR, Spicer P, Novins DK, the AI-SUPERPFP Team. Prevalence of DSM-IV disorders and attendant help-seeking in 2 American Indian reservation populations. Arch Gen Psychiatry. In press.
43. Berkson J. Limitations of the application of fourfold table analyses to hospital data. Biometric Bull. 1946;2:47-53.
45. Schuman H, Presser S. Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context. Thousand Oaks, Calif: Sage Publications Inc; 1996.
46. Rubio-Stipec M, Canino G, Robins LN, Wittchen HU, Sartorius N, Miranda CT. The somatization schedule of the Composite International Diagnostic Interview: the use of the probe chart in 17 different countries. Int J Methods Psychiatr Res. 1993;3:129-136.
47. Buck AA, Gart JJ. Comparison of a screening test and a reference test in epidemiologic studies, I: indices of agreement and their relation to prevalence. Am J Epidemiol. 1966;83:586-592.
48. Kraemer HC, Noda A, O’Hara R. Categorical versus dimensional approaches to diagnosis: methodological challenges. J Psychiatr Res. 2004;38:17-25.
49. Kessler RC. The categorical versus dimensional assessment controversy in the sociology of mental illness. J Health Soc Behav. 2002;43:171-188.
50. Lehman AF. Mental disorders and disability: time to reevaluate the relationship. In: Kupfer DJ, First MB, Regier DA, eds. A Research Agenda for DSM-V. Washington, DC: American Psychiatric Association; 2002:201-218.