Gould MS, Marrocco FA, Kleinman M, Thomas JG, Mostkoff K, Cote J, Davies M. Evaluating Iatrogenic Risk of Youth Suicide Screening Programs: A Randomized Controlled Trial. JAMA. 2005;293(13):1635-1643. doi:10.1001/jama.293.13.1635
Author Affiliations: Division of Child and
Adolescent Psychiatry (Dr Gould, Mss Mostkoff and Cote, and Mr Thomas) and
Department of Epidemiology (Dr Gould), Columbia University and New York State
Psychiatric Institute (Drs Gould and Marrocco, Ms Kleinman, and Mr Davies),
New York, NY.
Context Universal screening for mental health problems and suicide risk is at
the forefront of the national agenda for youth suicide prevention, yet no
study has directly addressed the potential harm of suicide screening.
Objective To examine whether asking about suicidal ideation or behavior during
a screening program creates distress or increases suicidal ideation among
high school students generally or among high-risk students reporting depressive
symptoms, substance use problems, or suicide attempts.
Design, Setting, and Participants A randomized controlled study conducted within the context of a 2-day
screening strategy. Participants were 2342 students in 6 high schools in New
York State in 2002-2004. Classes were randomized to an experimental group
(n = 1172), which received the first survey with suicide questions,
or to a control group (n = 1170), which did not receive suicide questions.
Main Outcome Measures Distress, measured on the Profile of Mood States
adolescent version (POMS-A) instrument at the end of the first survey and at the
beginning of the second survey 2 days later. Suicidal ideation assessed
in the second survey.
Results Experimental and control groups did not differ on distress levels immediately
after the first survey (mean [SD] POMS-A score, 5.5 [9.7] in the experimental
group and 5.1 [10.0] in the control group; P = .66)
or 2 days later (mean [SD] POMS-A score, 4.3 [9.0] in the experimental group
and 3.9 [9.4] in the control group; P = .41),
nor did rates of depressive feelings differ (13.3% and 11.0%, respectively; P = .19). Students exposed to suicide questions
were no more likely to report suicidal ideation after the survey than unexposed
students (4.7% and 3.9%, respectively; P = .49).
High-risk students (defined as those with depression symptoms, substance use
problems, or any previous suicide attempt) in the experimental group were
neither more suicidal nor distressed than high-risk youth in the control group;
on the contrary, depressed students and previous suicide attempters in the
experimental group appeared less distressed (P = .01)
and suicidal (P = .02), respectively, than
high-risk control students.
Conclusions No evidence of iatrogenic effects of suicide screening emerged. Screening
in high schools is a safe component of youth suicide prevention efforts.
The President’s New Freedom Commission1 and
the Children’s Mental Health Screening and Prevention Act2 recommend
increased screening for suicidality and mental illness. The recent enactment
of the Garrett Lee Smith Memorial Act3 further
supports the development of youth suicide prevention and intervention programs.
Despite the proliferation of screening programs in recent years (eg, Signs
of Suicide,4 TeenScreen5),
the current debate about possible iatrogenic effects of other suicide preventive
interventions,6,7 and the belief
that prevention programs may “spur troubled youngsters to try suicide,”8 the potential harm of screening for suicide remains
Screening strategies are based on the valid premise that suicidal adolescents
are underidentified11-15;
have an active, often treatable, mental illness16-18;
and exhibit identifiable risk factors.11 Evidence
for the clinical validity and reliability of school-based screening procedures
has recently emerged. Use of the Suicidal Ideation Questionnaire (SIQ) in
a midwestern US high school yielded a sensitivity ranging from 83% to 100%,
with specificity from 49% to 70%.19 The Suicide
Risk Screen’s use among 581 students in 7 high schools had a sensitivity
ranging from 87% to 100%, with specificity from 54% to 60%.20 Among
2004 teenagers from 8 New York metropolitan-area high schools, Columbia TeenScreen
exhibited a sensitivity of approximately 88% and specificity of 76%.14 Moreover, many high-risk adolescents, defined as
those with a major depressive disorder, frequent suicidal ideation, or any
previous suicide attempt, were previously unidentified. Recently, the Columbia
Suicide Screen, completed by 1729 9th- to 12th-graders, had a sensitivity
and specificity of 75% and 83%, respectively.21 Systematic
clinical evaluations using interviews such as the Suicidal Behavior Interview22 and the Diagnostic Schedule for Children14 have provided the suicidal status criteria in these
studies. An evaluation of the Signs of Suicide school-based suicide prevention
program, which incorporates an educational component and suicide screen, reported
high satisfaction by school personnel23 and
a short-term decrease in students’ suicide attempts, although neither
help-seeking behavior nor suicidal ideation was affected.24
Studies have identified potential strengths of school-based screening
without assessing potential shortcomings. The US Preventive Services Task
Force cited a similar deficit when reviewing suicide screening by primary
care clinicians.9 A pervasive concern is whether
asking a child or adolescent about suicidal thoughts and behavior may trigger
subsequent suicidal ideation and behavior. This article addresses this concern
by examining whether asking about suicidal ideation and behavior during a
screening program creates immediate or persistent distress or increases suicidal
ideation among high school students generally or among high-risk students
with depression symptoms, substance use problems, or previous suicide attempts,
A randomized experimental design was conducted during a 2-day screening
strategy. Classes within each of 6 high schools were randomized to either
an experimental or a control group. Each school had an approximately equal
number of students in each group. The experimental group received a first
screening survey with a set of questions assessing suicidal ideation and behavior;
the control group received the same first survey but without suicide questions.
A measure of transient distress was given at the beginning and end of the
first survey and repeated at the beginning of a second survey administered
2 days after the first. Both groups received the same second survey with suicidal
ideation or behavior questions (Table 1).
During a subsequent safety review, a project child psychiatrist, psychologist,
or social worker interviewed adolescents reporting serious distress, serious
suicidal ideation, or any suicide attempt to assess imminent suicide risk
and the need for further evaluation and possible treatment. Referrals were
arranged with parents by project social workers.
This study targeted adolescents who were aged 13 through 19 years, in
grades 9 to 12, and attending 6 high schools in Nassau, Suffolk, and Westchester
counties in New York State. Five schools were public coeducational schools
and 1 was a parochial all-boys’ school. These schools were identified
from our earlier screening program.25 To examine
as sensitive an issue as iatrogenic risks of screening programs, we had to
recruit schools in which we had previously gained school administrators’
trust. However, students in the current study had not participated in our
previous screening because they were not yet in high school.
We assessed 2342 of 3635 students (64.4% participation rate) from the
fall of 2002 through the spring of 2004. Reasons for nonparticipation included
parental refusals (61.9%), student refusals (14.3%), and absences (23.7%).
The experimental and control groups consisted of 1172 and 1170 students, respectively
(Figure). The ethnic distribution of
the participating sample was 80.3% white, 5.1% black, 7.3% Hispanic, 3.8%
Asian, and 3.5% other. A total of 58.1% of the students were boys (the inclusion
of an all-male parochial school explains the high percentage of boys). The
mean (SD) age of participating students was 14.8 (1.2) years. There were no
significant differences between experimental and control groups or between
participants and nonparticipants in sex, age, and race/ethnicity. Participants
reported race/ethnicity according to options defined by the investigator.
For nonparticipants, demographic information was obtained from school records
by school administrators. Race/ethnicity was assessed because it is among
the demographic factors related to the epidemiology of suicidal behavior.
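As a quick arithmetic check on the sample figures above, the following Python sketch recomputes the participation rate and converts the reported nonparticipation percentages into approximate counts. The counts are derived here for illustration only and were not reported in the article; all variable names are hypothetical.

```python
# Figures reported in the text.
assessed = 2342
eligible = 3635
experimental_n, control_n = 1172, 1170

participation_rate = assessed / eligible
nonparticipants = eligible - assessed

# Reported reasons for nonparticipation, as percentages of nonparticipants.
reasons_pct = {"parental refusal": 61.9, "student refusal": 14.3, "absence": 23.7}

# Approximate counts implied by the rounded percentages (illustration only).
approx_counts = {
    reason: round(nonparticipants * pct / 100) for reason, pct in reasons_pct.items()
}

print(f"participation rate: {participation_rate:.1%}")
print(f"nonparticipants: {nonparticipants}")
print(approx_counts)
```

Because the published percentages are rounded, the implied counts need not sum exactly to the number of nonparticipants.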
The attrition rate from the first to the second survey did not significantly
differ between the experimental (6.0%) and control groups (7.1%) (odds ratio
[OR], 0.83; 95% confidence interval [CI], 0.60-1.16; P = .28).
Furthermore, attrition rates were not significantly related to sex (girls,
7.2%; boys, 6.0%) (OR, 1.01; 95% CI, 0.64-1.60; P = .97)
or race/ethnicity (black, Hispanic, Asian and other groups, 5.4%; white, 6.8%)
(OR, 1.01; 95% CI, 0.53-1.94; P = .97),
nor was there an interaction between these demographics and randomization
group on attrition (OR for sex × randomization, 0.86; 95% CI, 0.44-1.68; P = .65; OR for ethnicity × randomization,
1.2; 95% CI, 0.49-2.95; P = .69). Study
dropouts were older (mean [SD], 15.5 [1.3] years) than those who participated
both days (14.8 [1.2] years) (OR, 1.57; 95% CI, 1.28-1.91, P<.001), but there was no differential relationship of age by randomization
group on attrition (OR for age × randomization, 1.02; 95% CI, 0.79-1.32; P = .88). The relationship of attrition to clinical
risk factors is discussed later.
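The reported odds ratio for attrition by randomization group can be roughly reproduced from the two attrition percentages. The Python sketch below computes a crude (unadjusted) odds ratio and, as a second check, the geometric mean of the reported confidence limits, which should sit near the point estimate when the interval is symmetric on the log-odds scale. The function names are illustrative and are not from the article, and the published models may have included covariates, so exact agreement is not expected.

```python
import math


def odds_ratio(p_a: float, p_b: float) -> float:
    """Crude odds ratio for group A versus group B, given two proportions."""
    return (p_a / (1 - p_a)) / (p_b / (1 - p_b))


def log_scale_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of two confidence limits (midpoint on the log scale)."""
    return math.sqrt(lower * upper)


# Attrition: 6.0% in the experimental group, 7.1% in the control group.
crude_or = odds_ratio(0.060, 0.071)

# Reported 95% CI for the same comparison: 0.60 to 1.16.
implied_or = log_scale_midpoint(0.60, 1.16)

print(f"crude OR from percentages: {crude_or:.3f}")
print(f"OR implied by the CI:      {implied_or:.3f}")
```

Both values land close to the reported estimate of 0.83; small discrepancies are consistent with rounding of the published percentages.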
Students were recruited with an “opt-out” procedure for
parents and active written assent for youth. Two mailings with an information
sheet describing survey content and procedures, a response form, and a stamped
response envelope were sent to parents 6 and 4 weeks before survey administration,
providing parents opportunities to refuse their children’s participation.
Student written assent was obtained immediately before the survey. Parents
and students were informed that the research was designed “to develop
good screening programs and test different methods of screening to minimize
distress in high school students,” and that “alternative formats
of the survey will be used, but over the course of 2 class periods, on separate
days, the same questions will be asked of all students.” The schools’
principals and guidance directors, cognizant of project aims, randomization
procedures, and survey content, approved recruitment and consent procedures.
The study procedures, consistent with the Family Educational Rights and Privacy
Act and the Protection of Pupil Rights Amendment, were approved by the institutional
review board of the New York State Psychiatric Institute/Columbia University
Department of Psychiatry.
Profile of Mood States. The Profile of Mood
States (POMS) is a self-administered adjective checklist measuring transient
mood states.26 Factor analyses of its 65 items,
coded on a 5-point scale,26-30 yielded
6 factors: “tension-anxiety,” “depression-dejection,”
“anger-hostility,” “fatigue-inertia,” “confusion-bewilderment,”
and 1 positive state, “vigor-activity.” The POMS has demonstrated
excellent internal consistency and has proven sensitive to short- and long-term
change.31-39 The
present study used an abbreviated version of the POMS, previously developed
and validated in a sample of nearly 2000 adolescents. Confirmatory factor
analysis supported the factorial validity of a 24-item 6-factor model.39 We used 3 of the 4 top loading items on each factor.
The POMS-A has demonstrated criterion and construct validity, and its “right
now” time frame is sensitive to short-term mood changes.39 The
POMS-A measured at the end of the first survey and at the beginning of the
second survey assessed immediate and persistent distress, respectively.
Suicidal Ideation Questionnaire. The SIQ-JR
assesses suicidal thoughts and is designed for large-scale, school-based screenings
of adolescents.40 The 15-item SIQ-JR uses a
7-point Likert-type scale, ranging from 0 (“I never had this thought”)
to 6 (“This thought was in my mind almost every day”), assessing
the frequency of specific suicidal thoughts during the past month. It assesses
a wide range of thoughts related to death and dying, passive and active suicidal
ideation, and suicidal intent. The SIQ-JR, designed for seventh- to ninth-graders,
accommodated the ninth graders in our sample. Reliability of the SIQ-JR is
high, ranging from 0.91 to 0.96 for
internal consistency40-42 and from 0.87 to 0.93 for test-retest reliability (0.89
overall; 0.87 for adolescent girls and 0.93 for adolescent boys).42 The SIQ-JR has demonstrated criterion validity,22,40,42,43 construct
validity in community41,42,44-47 and clinical
samples,43,48-51 and
Interim Depression and Suicidal Ideation. During
the second survey, 2 questions directly assessed participants’ subjective
experiences of depression and suicidal ideation after the first survey: “Since
the first survey, have you felt depressed? . . . have
you thought about killing yourself?” These were coded on a 5-point scale,
ranging from 0 (not at all) to 5 (a lot). According to clinical judgment,
“a little” suicidal ideation was included as a yes response for
the dichotomized item, whereas “somewhat,” “quite a bit,”
and “a lot” defined yes for depression. These questions were added
to the study after data collection began, and data are available for 4 schools.
Depression Symptoms. The Beck Depression Inventory
(BDI)52 assessed cognitive, behavioral, affective,
and somatic components of depression. Loss of libido was not assessed. The
BDI’s use in more than 200 studies includes those with adolescent samples.53-55 Each response ranged
from 0 (“symptom not present”) to 3 (“symptom is severe”).
Deleting the suicidal ideation question from the control group’s first
survey necessitated omitting this item from both groups’ total scores,
lowering the maximum total score to 57; therefore, we used a cutoff point
of 15 rather than 16, recommended to detect possible depression in normal
Substance Use Problems. The Drug Use Screening
Inventory (DUSI),56-58 designed
to screen for alcohol or drug use and problems among teenagers, has demonstrated
good reliability and discriminant validity and sensitivity and has published
normative cutoff scores.57-63 A
total score combined all 15 items from the substance use scale (assessing
the degree of involvement and severity of consequences from alcohol and drug
use), 3 alcohol or drug items on the school performance adjustment scale,
and 1 additional aggression item assessing the clinically predictive problem
of breaking things or getting into fights while under the influence of alcohol
or drugs.18 A cutoff point of at least 5 dichotomized
total scores according to the recommended cutoff points, roughly corresponding
to 10% of the sample.60
Suicide Attempt History. Seven questions asking
about lifetime and recent suicide attempts were derived from the depression
module of the Diagnostic Interview Schedule for Children64 and
an earlier suicide screen.21 These items have
demonstrated good construct validity.21,65 The
assessment of an attempt included questions about occurrences, injuries sustained,
medical care sought, and hospitalization.66 Any
attempt (regardless of timing, injury, or medical attention) categorized a
student as “high risk.” For purposes of parallel measurement,
attempt history was derived from the second-day survey for the experimental
and control groups. Agreement on attempt history between the first- and second-day
surveys for the experimental group was high (κ = 0.79; SE, 0.05).
The primary sampling unit was school and the secondary sampling unit
was student within school. Thus, we first examined the extent of within-school
clustering to determine whether this clustering variable warranted inclusion
in the analyses. The sample clusters (school) had little impact on the outcomes
(POMS, SIQ-JR) or risk modifiers (depression symptoms, substance use problems,
suicide attempt history), as indicated by the intraclass coefficients, which
were all close to zero. Therefore, the use of mixed-effects linear models
to account for the clustering variable of school was unnecessary. School was
included as a covariate in all analyses.
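To make the clustering check concrete, here is a minimal Python sketch of one common intraclass correlation estimator, the one-way ANOVA ICC with an adjustment for unequal cluster sizes. The article does not state which estimator was used, so this is an assumption, and the scores below are made-up illustrative values, not study data.

```python
from statistics import mean


def icc_oneway(groups: list[list[float]]) -> float:
    """One-way ANOVA intraclass correlation for possibly unequal cluster sizes."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Average cluster size adjusted for imbalance.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical scores: clusters that differ strongly versus clusters that do not.
clustered = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0], [21.0, 22.0, 23.0]]
unclustered = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]

print(f"strong clustering: ICC = {icc_oneway(clustered):.2f}")
print(f"no clustering:     ICC = {icc_oneway(unclustered):.2f}")
```

An ICC near 1 means most of the variance lies between clusters. Values near zero (or negative, which this estimator can produce in small samples) indicate that cluster membership explains little of the outcome variance, which is the situation the text describes for school.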
The primary tests of the a priori hypotheses about immediate distress,
persistent distress, and suicidal ideation involved comparisons of the experimental
and control groups on the outcome measures. Multivariable linear regression
models were estimated to determine the significance of randomization status
(ie, experimental or control group) on immediate distress (POMS-A2, end of
first survey), persistent distress (POMS-A3, beginning of second survey),
or suicidality (SIQ-JR). The total POMS-A1 score (beginning of first survey)
was used as a covariate in the analyses of POMS-A2 and POMS-A3 by design because
an expected high correlation between the pre-POMS and post-POMS scores (A1-A2 r = 0.87, P<.001;
A1-A3 r = 0.76, P<.001)
yields a substantial increase in statistical power to test the primary hypotheses.
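The power gain from the baseline covariate can be illustrated with a standard large-sample result: adjusting for a covariate that correlates r with the outcome leaves roughly a fraction 1 − r² of the outcome variance unexplained. The Python sketch below applies that approximation to the two reported correlations. It ignores the degree of freedom spent on the covariate, and the function name is illustrative rather than taken from the article.

```python
def residual_variance_fraction(r: float) -> float:
    """Approximate share of outcome variance left after adjusting for a baseline
    measure that correlates r with the outcome."""
    return 1 - r ** 2


for label, r in [("POMS-A1 with POMS-A2", 0.87), ("POMS-A1 with POMS-A3", 0.76)]:
    fraction = residual_variance_fraction(r)
    print(
        f"{label}: r = {r:.2f}, residual variance = {fraction:.2f}, "
        f"equivalent sample-size multiplier = {1 / fraction:.1f}"
    )
```

Under this approximation, adjusting for POMS-A1 leaves about a quarter of the POMS-A2 variance and about two fifths of the POMS-A3 variance as error, which is comparable to analyzing a considerably larger unadjusted sample.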
Another series of models included each risk modifier separately (depression
symptoms, substance use problems, suicide attempt history), the risk × randomization
group interaction term, and randomization group to test whether some students
were more susceptible to distress or suicidality from the suicide questions.
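For a binary risk modifier and a binary randomization group, the interaction coefficient in a linear model with no other covariates equals a difference of differences in cell means. The Python sketch below shows that simplified case with made-up scores. The study's own models also included school and, for the POMS-A outcomes, the baseline score, so this is an illustration of how to read the interaction term, not a reproduction of the analysis.

```python
from statistics import mean


def interaction_from_cells(records):
    """records: iterable of (experimental 0/1, high_risk 0/1, score).

    Returns the exposure effect among high-risk students minus the exposure
    effect among low-risk students.
    """
    cells = {(g, r): [] for g in (0, 1) for r in (0, 1)}
    for group, risk, score in records:
        cells[(group, risk)].append(score)
    m = {key: mean(values) for key, values in cells.items()}

    effect_low_risk = m[(1, 0)] - m[(0, 0)]
    effect_high_risk = m[(1, 1)] - m[(0, 1)]
    return effect_high_risk - effect_low_risk


# Hypothetical distress scores, not study data.
records = [
    (0, 0, 4.0), (0, 0, 6.0),    # control, low risk
    (1, 0, 5.0), (1, 0, 5.0),    # experimental, low risk
    (0, 1, 12.0), (0, 1, 14.0),  # control, high risk
    (1, 1, 10.0), (1, 1, 12.0),  # experimental, high risk
]

print(interaction_from_cells(records))
```

A negative value, as in this toy data, means that exposure to the suicide questions is associated with lower scores among high-risk students relative to its association among low-risk students.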
Logistic regression models were estimated to examine the main and interactive
effects on interim depression and suicide. Significance levels were set at
5%. For continuous variables, there was ample statistical power (≥95%)
to detect small main effects (≥15%) and small interaction effects (≥25%).
For dichotomous variables, there was adequate power (≥80%) to detect a
small OR (≥1.4) for a main effect and an interaction OR of 2 for a rare
risk factor (approximately 5% prevalence) and an outcome in excess of 10%
Applying the Consolidated Standards of Reporting Trials67 statement
principles, there was no post hoc adjustment for baseline differences between
the randomized conditions because such adjustment is likely to bias the estimated
treatment effect.68 The DUSI score was the
only baseline variable to differ between the experimental and control groups
(mean [SD], 1.2 [2.4] and 1.0 [2.1], respectively, P<.001, which reflects a minimal effect size [0.1]). The statistical
analyses were conducted using SPSS statistical software, version 12 (SPSS
Inc, Chicago, Ill).
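The "minimal effect size" quoted for the baseline DUSI difference can be checked with a standardized mean difference. The Python sketch below computes Cohen's d with a pooled standard deviation. It assumes the randomized group sizes (1172 and 1170) apply to the DUSI scores, which the article does not state explicitly, and the function name is illustrative.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_variance)


# Baseline DUSI: experimental mean (SD) 1.2 (2.4), control mean (SD) 1.0 (2.1).
d_dusi = cohens_d(1.2, 2.4, 1172, 1.0, 2.1, 1170)
print(f"Cohen's d for baseline DUSI: {d_dusi:.2f}")
```

The result rounds to 0.1, matching the reported effect size. With more than 1000 students per group, a difference this small can still reach statistical significance.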
Attrition rates (Table 2) were
not significantly related to randomization group (P = .28),
depression symptoms (P = .66), substance
use problems (P = .35), or suicide attempt
history (P = .52), nor were there any significant
interactions between risk status and randomization group. Baseline scores
on the POMS, SIQ-JR, BDI, and DUSI were not associated with attrition, nor
did they interact with randomization group. The lack of differential attrition
provides evidence that our subsequent analyses and interpretations are not
vulnerable to this potential threat to the study’s internal validity
and suggests that the experimental group’s high-risk students were no
more distressed than those in the control group.
Experimental and control groups did not significantly differ in distress
levels immediately after the first survey (POMS-A2) or 2 days later (POMS-A3)
(Table 3), nor were there any differences
on the 6 POMS-A subscales. Rates of depressive feelings in the 2-day period
between the surveys were not significantly different between the experimental
(13.3%) and control (11.0%) groups (P = .19).
The experimental group reported no more suicidality after the survey
than the control group (Table 3). Neither
SIQ-JR scores in the second survey nor rates of interim suicidal thoughts
between the first and second surveys were significantly higher among the experimental
group (4.7%) than among the control group (3.9%; P = .49).
Depression Symptoms. Students with depression
symptoms above the cutoff score of 15 on the BDI reported more distress and
suicidal ideation than students below the cutoff score in both experimental
and control groups (Table 4). However,
being exposed to suicide questions in the first survey did not exacerbate
distress or suicidal ideation among depressed students. On the contrary, the
direction of the significant depression by randomization group interactions
on POMS-A2 (β = −1.58; 95% CI = −2.78
to −0.38; P = .01) and POMS-A3 (β = −2.00;
95% CI = −3.52 to −0.48; P = .01)
indicated that among depressed youth, the experimental group had slightly
lower distress scores than the control group.
Substance Use Problems. Students with substance
use problems had significantly higher rates of interim depression symptoms
(P = .047) and interim suicidal ideation
(P<.001) and scored higher on the SIQ-JR (P<.001) than those without these problems; however,
none of the interactions reached statistical significance (Table 5).
History of Suicide Attempt. Students with previous
suicide attempts reported significantly more distress and suicidal ideation
(Table 6). The significant interactions
on the SIQ-JR (β = −5.33; 95% CI = −9.40
to −1.26; P = .01) and interim suicidality
(OR = 0.17; 95% CI = 0.04-0.72; P = .02)
indicated that among previous suicide attempters, the experimental group had
less suicidal ideation than the control group.
This article described 2342 adolescents from 6 high schools in New York
State participating in a school-based suicide screening program. Half the
students were randomized to receive questions about suicidal ideation and
behavior in the first screening survey. The other half did not receive these
questions until a second screening survey 2 days later. There was no evidence
of an iatrogenic effect of asking about suicide. Neither distress nor suicidality
increased among the entire population of surveyed students or high-risk students
who were asked about suicidal ideation or behavior. On the contrary, the findings
suggested that asking about suicidal ideation or behavior may have been beneficial
for students with depression symptoms or previous suicide attempts.
The lack of detrimental effects in the present study contrasts with
findings reported for some suicide-prevention programs,13,69 such
as suicide awareness curricula programs of the 1980s. These usually included
didactic presentations on suicide statistics, “warning signs”
of suicide, and mental health resources. Often a videotape depicted a suicidal
youngster or the consequences of failing to help a suicidal peer.70 Although several studies reported modest increases
in knowledge of symptoms,69,71-73 helpful
help-seeking behavior,74 others reported either
no benefits13,76 or detrimental
effects.13,69 Detrimental effects
included a decrease in desirable attitudes,77 a
reduction in the likelihood of recommending mental health evaluations to a
suicidal friend,72 more hopelessness and maladaptive
coping responses among boys after exposure to the curriculum,69 and
negative reactions among students most at risk for suicide (ie, those with
a history of suicidal behavior).13 Adolescent
suicide attempters said they would not recommend suicide-curriculum programs
to other students, reporting that talking about suicide in the classroom “makes
some kids more likely to try to kill themselves.”13 Our
findings show that detrimental effects should not be inappropriately applied
to all school-based suicide-prevention strategies, such as screening programs.
Our findings also show that extensive research supporting an imitative
effect of suicide reports in the media11,78-80 does
not apply to screening survey questions. Furthermore, the evidence that previous
suicidal behavior may enhance the imitative effect of media reports81,82 cannot be extrapolated to suicide-screening
The present study has several advantages for addressing the impact of
screening programs. First, the randomized experimental design involved the
direct manipulation of the suicide question exposure. Second, an ecologically
valid setting (high schools) was used, rather than a laboratory setting, enabling
generalization to the actual settings of suicide-screening programs. Third,
several outcome indicators exhibited consistent results. Fourth, the large
sample yielded ample statistical power to detect interactions between the
experimental condition and depression symptoms, substance use problems, and
a suicide attempt history.
The study also has important limitations. First, we used suburban schools
with predominantly white populations of limited socioeconomic diversity so
that the results cannot be generalized to urban, more ethnically or socioeconomically
diverse settings. The schools were recruited from an earlier “postvention”
screening project, involving schools with a student who had recently completed
suicide and demographically matched comparison schools without such students.
Three postvention and 3 comparison schools participated in the present study.
In the greater New York metropolitan area, most adolescents who complete suicide
are white; consequently, our project was composed of a largely white population.
Design considerations also dictated our implementation of the postvention
project in the suburban counties surrounding New York City, rather than in
New York City (which has a more ethnically diverse population) because lengthy
delays in the adjudication of suicides in the New York City Medical Examiner’s
office precluded the timely implementation of the postvention protocol.
Second, our recruitment from an earlier postvention study might suggest
that postvention influenced the results, thus limiting generalizability. However,
the average interval since the index suicide was 72 months, ranging from 64
to 84 months, making the influence of the suicide less likely. Moreover, there
were no significant interactions between postvention status and randomization
group on distress or suicidal ideation, indicating that past postvention did
not affect outcome.
Third, our participation rate was low, as is common in other suicide-screening
protocols.21 Despite no significant differences
between participants and nonparticipants in demographic factors (eg, sex,
grade level, ethnicity), the same cannot be said about clinical factors (eg,
risk status, BDI and SIQ-JR scores).
Fourth, by design the experimental group was asked about suicidal ideation
or behavior in the first and second surveys, whereas the control group was
asked these questions once, raising the possibility that attenuation83,84 masked an iatrogenic effect. However,
standardized differences between means from 2 administrations of the SIQ-JR,
using data on its test-retest reliability,42 indicate
minimal attenuation (effect size = −0.03). In the present
study, there was a significant decrease in the SIQ-JR from the first survey
(mean [SD] = 7.7 [11.1]) to the second survey (mean [SD] = 6.5
[11.5]; t = −5.05; P<.001). If attenuation had masked an iatrogenic effect, this large
a decrease (effect size = −0.11) would not have been expected.
A masked iatrogenic effect would have been more consistent with no decrease
in scores. A comparison of the first administration of the SIQ-JR in the control
group (second survey SIQ-JR) and the experimental group’s first survey
SIQ-JR cannot inform this issue because the survey content preceding these
assessments was not comparable.
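The size of the observed decrease in SIQ-JR scores can be recomputed from the reported means and standard deviations. The Python sketch below expresses the change in standard deviation units using two common denominators, because the article does not say which one was used. Variable names are illustrative.

```python
import math

first_mean, first_sd = 7.7, 11.1    # SIQ-JR at the first survey
second_mean, second_sd = 6.5, 11.5  # SIQ-JR at the second survey

change = second_mean - first_mean

# Option 1: scale the change by the first-survey standard deviation.
d_baseline_sd = change / first_sd

# Option 2: scale the change by the root mean square of the two standard deviations.
d_pooled_sd = change / math.sqrt((first_sd ** 2 + second_sd ** 2) / 2)

print(f"change / first-survey SD: {d_baseline_sd:.3f}")
print(f"change / pooled SD:       {d_pooled_sd:.3f}")
```

Either choice rounds to −0.11, which agrees with the reported effect size and is several times larger in magnitude than the −0.03 expected from test-retest attenuation alone.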
Our findings can allay concerns about the potential harm of high school–based
suicide screening. Universal screening for mental health problems and suicide
risk should continue to be at the forefront of the national agenda for youth
suicide prevention. Moreover, our findings should assure health professionals
that they should not refrain from asking their patients about suicidality
for fear of its inducement.
Corresponding Author: Madelyn S. Gould,
PhD, MPH, New York State Psychiatric Institute, 1051 Riverside Dr, New York,
NY 10032 (email@example.com).
Author Contributions: Dr Gould had full access
to all of the data in the study and takes responsibility for the integrity
of the data and the accuracy of the data analysis.
Study concept and design: Gould, Kleinman.
Acquisition of data: Gould, Marrocco, Thomas,
Analysis and interpretation of data: Gould,
Marrocco, Kleinman, Davies.
Drafting of the manuscript: Gould.
Critical revision of the manuscript for important
intellectual content: Gould, Marrocco, Kleinman, Thomas, Mostkoff,
Statistical analysis: Gould, Kleinman, Davies.
Obtained funding: Gould.
Administrative, technical, or material support:
Marrocco, Thomas, Mostkoff, Cote.
Study supervision: Gould.
Financial Disclosures: Mr Davies owns stock
in Merck, Pfizer, Bristol-Myers Squibb, Wyeth, Lilly, GlaxoSmithKline, Johnson
& Johnson, Amgen, Elan, and Bard. No other authors reported financial
Funding/Support: This project was supported
by National Institute of Mental Health (NIMH) grant R01-MH64632.
Role of the Sponsor: The sponsor, NIMH, was
not involved in the design and conduct of the study; collection, management,
analysis, and interpretation of the data; or in the preparation, review, or
approval of the manuscript.
Acknowledgment: We thank Lia Amakawa, BA, for
assistance in manuscript preparation.