Figure 1. A model for determining the auditory consequences of speaking. An internal forward model predicts the auditory feedback (corollary discharge) from a copy of the motor command (efference copy). These predictions are then compared with the actual auditory feedback (reafference). Self-produced speech sounds can be correctly predicted on the basis of the efference copy and are associated with little or no sensory discrepancy resulting from the comparison between predicted and actual feedback. This results in suppression of the auditory cortex response to the self-produced sound, as can be seen in a reduced N100 amplitude. When the actual feedback does not match the predicted feedback (eg, because the feedback has been altered), the discrepancy increases and so does the likelihood that the sound is externally produced. As a result, the cortical suppression decreases and the N100 amplitude increases. Such a system would allow the individual to cancel out the effects of self-produced speech and thereby distinguish auditory feedback due to self-produced speech from sounds caused by the environment.
Figure 2. Behavioral performance during speaking (A) and listening (B) in normal controls (NC), schizophrenic hallucinators (SZH), and schizophrenic nonhallucinators (SZNH). 1 indicates self; 2, self, pitch-shifted; 3, alien; and 4, alien, pitch-shifted.
Figure 3. Grand averages at C3 and scalp amplitude maps during speaking. Note that the scale is different from that used in Figure 4. The green curve indicates unaltered speech; red curve, pitch-shifted speech.
Figure 4. Grand averages at C3 and scalp amplitude maps during listening. Note that the scale is different from that used in Figure 3.
Figure 5. Means and standard deviations for N100 amplitude on the left hemisphere during speaking (A) and listening (B).
Figure 6. Grand averages at Cz during speaking (green) and listening (pink) in the self, unaltered feedback condition for normal controls (A), hallucinators (B), and nonhallucinators (C).
Figure 7. Results of correlational analyses. A, N100 suppression and misattributions. B, N100 suppression and auditory hallucinations. C, Auditory hallucinations and delusions. SAPS indicates Scale for the Assessment of Positive Symptoms.
Theda H. Heinks-Maldonado, Daniel H. Mathalon, John F. Houde, Max Gray, William O. Faustman, Judith M. Ford. Relationship of Imprecise Corollary Discharge in Schizophrenia to Auditory Hallucinations. Arch Gen Psychiatry. 2007;64(3):286–296. doi:10.1001/archpsyc.64.3.286
Context
A forward model of intended thoughts and actions prepares sensory cortex for sensations that are a consequence of those actions. Imprecision of the corollary discharge in schizophrenia may contribute to the misperception of inner experiences and thoughts as “voices” or auditory hallucinations.
Objectives
To assess the precision of the forward model in schizophrenia using the N100 component of the auditory event-related potential to speech that is altered or unaltered, in real time, as it is being spoken. To assess the relationship between auditory hallucinations and the imprecision of the corollary discharge.
Design
Prospective case-control study.
Setting
Community mental health centers and Palo Alto Veterans Affairs Health Care System, Palo Alto, Calif.
Participants
Twenty patients with schizophrenia and 17 sex- and age-matched healthy control subjects.
Main Outcome Measures
N100 responses to auditory feedback, which was altered by pitch-shifting the self-voice, substituting an alien voice, or pitch-shifting the alien voice. On each trial, subjects judged whether feedback was “self,” “other,” or “unsure.” Clinical ratings were used to assess severity of auditory hallucinations in patients.
Results
In controls, N100 to unaltered self-voice feedback was dampened relative to N100 to altered self-voice or alien auditory feedback. This pattern was not seen in hallucinating patients. This imprecision correlated with the severity of hallucinations and with the percentage of misattribution errors.
Conclusion
These data support a connection between auditory verbal hallucinations and the imprecision of the corollary discharge heralding the sensory consequences of thoughts and actions.
Many theoretical models1-3 have been developed to explain how the healthy human brain distinguishes between sensory experiences resulting from self-generated actions and those from external sources. In schizophrenia, this distinction seems to be blurred. Patients hear voices they attribute to others, they have delusions that their thoughts and behaviors are controlled by external forces, and they misinterpret the actions of others as being relevant to themselves. This constellation of symptoms has been attributed to a failure of a self-monitoring system.
Self-monitoring can be accomplished with a “forward model” system, in which an efference copy of a motor command is used to predict the sensory consequences (corollary discharge) of the resulting action.2-4 A comparison of this corollary discharge with actual sensory feedback associated with the action (“reafference”) provides a mechanism for filtering sensory information. When there is a match between predicted and actual sensory feedback, a net cancellation of sensory input results, leading to a dampened sensory experience. When these signals do not match, or when there is no corollary discharge to cancel sensory feedback (as occurs when sensory stimulation results from external events5), sensory experience is intensified, alerting us to potentially important environmental events (Figure 1).
Support for such forward models comes from animal electrophysiologic studies of the auditory system: corollary discharges from motor speech commands prepare auditory cortex for self-generated speech, linking regions of frontal lobes where speech is generated to regions of temporal lobes where it is heard. Early evidence of the effect of vocal production on attenuation of auditory responses came from studies of bats and monkeys. In bats, a 15-dB attenuation of responses in the lateral lemniscus of midbrain is seen during vocalization.6,7 Similarly, in monkeys, activity in the auditory cortex is inhibited during vocalizations.8,9
In humans, there have been reports of dampened temporal lobe responsiveness during speech production. Creutzfeldt et al10 recorded from the exposed surface of the right and left temporal cortices while patients talked and listened during a presurgical planning procedure. During listening, all neurons in superior temporal gyrus responded to various aspects of spoken language. During overt talking, they observed suppression of ongoing medial temporal gyrus activity in about one third of the neurons and also in some neurons of the superior temporal gyrus. Similarly, noninvasive magnetoencephalography recordings have shown that auditory cortical responses to self-produced speech are attenuated compared with responses to tape-recorded speech.11-14 For example, the magnetoencephalography-based M100 component to speech sounds is reduced as they are being spoken compared with when they are played back.12,13 We showed a similar effect on the electroencephalogram-based N100 component of the event-related brain potential (ERP).15 Because both M100 and N100 have a dominant source in primary and secondary auditory cortex,16-21 this attenuation happens early in auditory processing.
Support for such forward models also comes from animal and human electrophysiologic studies of the somatosensory system. Somatosensory responses to self-generated movements are attenuated compared with externally generated movements.22-25
The precision of the feed-forward system has been addressed in studies of the somatosensory system, which show that sensory stimulation has to correspond accurately to the movement producing it to attenuate its perception.26 When different degrees of delay or rotation were introduced between the subject's movement and the resultant tactile stimulation, subjects rated the tactile sensation as more intense than when no distortion was introduced. Furthermore, the subjects reported incremental increases in perceived intensity as the delay or the rotation was parametrically increased.
Forward model precision has also been addressed in the auditory system. In a positron emission tomography study, altered auditory feedback during talking activated different brain regions than unaltered feedback.27 To achieve better temporal resolution, Houde et al12 used magnetoencephalography to compare the M100 to altered and unaltered speech during speech production. When white noise was substituted for speech sounds, M100 suppression during talking was abolished. While suggestive of some specificity of the feed-forward mechanism, white noise produces widespread activation of the auditory cortex and is very different from speech. Recently, our group showed28 that, when self-generated speech sounds were altered slightly (2 semitones) or substituted with someone else's speech, subjects made more errors recognizing the source of the auditory feedback and showed less suppression of the N100.
Consistent with the predictions of Feinberg,29 Feinberg and Guazzelli,30 and Frith and Done,31 which have been substantiated by a behavioral study by Blakemore et al32 and a study investigating smooth-pursuit eye movements,33 our group showed15 neurophysiologic evidence of a dysfunctional corollary discharge system in schizophrenia; N100 reduction to sounds, as they were being spoken, was not seen in patients. We extended this by showing that talking produced greater electroencephalogram coherence between frontal-temporal regions than listening in normal controls, but not in patients.34 We suggested that reduced frontotemporal functional connectivity contributed to misattributions of self-generated thoughts and actions to external sources.
An imprecise corollary discharge in patients with schizophrenia might affect their ability to form precise predictions about sensory consequences of their actions. When visual feedback about a self-generated hand movement was replaced by an alien hand35 or the angle of movement was slightly distorted (15°-20°),36 delusional patients made more errors than controls, suggesting a tendency to say “not me” when movements were simply distorted. Similarly, when voice feedback was pitch-shifted during speaking37 or substituted by an alien voice,38 delusional patients attributed the voice to external sources. Although all patients made more errors than controls, hallucinators were particularly prone to misattribute their distorted voice to an external source, which the authors interpreted as reflecting impaired awareness of internally generated verbal material.
In the current study, we combined neurophysiologic and behavioral measures to link patterns of ERP amplitudes with misattributions of source. To this end, feedback during talking could be pitch-shifted, replaced by an alien voice, or both. Using the paradigm described in our previous report (Heinks-Maldonado et al28), we predicted that N100 amplitude would be reduced during talking compared with listening, even in the patients, but this reduction would be significantly smaller in the patients. We further predicted that the corollary discharge would be undifferentiated in patients, resulting in equivalent N100 amplitudes regardless of the match between corollary discharge and reafference. We further predicted that this abnormality would be related to auditory hallucinations. Consistent with the literature,38 we predicted that all subjects would perform more poorly when feedback was altered and that patients with hallucinations would perform even worse. Also, we predicted that the tendency for misattributions would be related to an imprecise corollary discharge mechanism as reflected in an undifferentiated N100 response to different levels of distortion during talking. Finally, because the corollary discharge is not operating during listening, we predicted no group differences in the error rate during listening.
Twenty-three men with schizophrenia (DSM-IV [Structured Clinical Interview for DSM-IV])39 and 23 age-matched healthy male comparison subjects (screened with the Structured Clinical Interview for DSM-IV for any significant history of Axis I psychiatric illness) participated in the study. (The ERP data from these controls, analyzed with different inclusion criteria, appeared earlier.28) All subjects gave written informed consent after procedures had been fully explained. Patients were recruited from community mental health centers, as well as from inpatient and outpatient services of the Palo Alto Veterans Affairs Health Care System. Controls were recruited through Internet advertisements. Subjects were excluded if they had significant hearing loss, significant head injury (loss of consciousness longer than 30 minutes or resulting in neurologic sequelae), neurologic or other medical illnesses compromising the central nervous system, or DSM-IV alcohol or drug abuse within 30 days before the study. All patients were taking stable doses of atypical antipsychotic medication.
Data from 3 patients were excluded: 1 patient was not able to perform the task and the experiment had to be terminated; 2 patients were excluded from further analyses owing to movement artifact contamination. Data from 3 controls were excluded owing to movement artifact contamination. The experiment was prematurely terminated in 2 control subjects because they reported always hitting the “self” button to avoid being “mistaken for being schizophrenic.” One control had to leave early and was not able to return for a second session.
The final sample included 17 controls and 20 patients (10 hallucinators and 10 nonhallucinators). Details of the final samples appear in Table 1.
Patient symptoms were assessed by at least 2 trained raters (including a psychiatrist or clinical psychologist) administering the 18-item Brief Psychiatric Rating Scale,40,41 the Scale for the Assessment of Positive Symptoms, and the Scale for the Assessment of Negative Symptoms.42,43 A hallucinations score was calculated as the mean of “auditory hallucinations” (item 1), “voices commenting” (item 2), and “voices conversing” (item 3). Patients were classified as hallucinators if they were currently experiencing auditory verbal hallucinations on a regular basis and scored at least 2 (mild) on the auditory hallucinations score, a cutoff determined a priori.
All hallucinators experienced hallucinations in the 4 weeks before the experimental session. Some patients reported experiencing hallucinations during the experimental tasks. Nonhallucinators had not experienced auditory verbal hallucinations in the past 4 weeks (hallucinations score, ≤1). These subgroups also differed on the delusions subscale of the global Scale for the Assessment of Positive Symptoms (U = 15; P = .01).
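The grouping rule described above can be written out as a minimal sketch. The function names and the example ratings are hypothetical and are not taken from the study; only the three-item mean and the cutoff of 2 come from the text.

```python
from statistics import mean

# Cutoff stated in the text: a hallucinations score of at least 2 ("mild").
HALLUCINATOR_CUTOFF = 2.0


def hallucinations_score(auditory, commenting, conversing):
    """Mean of the three hallucination items named in the text."""
    return mean([auditory, commenting, conversing])


def is_hallucinator(score, regular_current_avh):
    """Both criteria must hold: regular current voices and score >= cutoff."""
    return bool(regular_current_avh) and score >= HALLUCINATOR_CUTOFF


# Hypothetical ratings for one patient (illustrative only).
example_score = hallucinations_score(auditory=3, commenting=2, conversing=2)
print(round(example_score, 2), is_hallucinator(example_score, True))
```

The sketch covers only the hallucinator criterion; how a patient with a score between 1 and 2 would be handled is not stated in the text and is left open here.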
Patients also underwent neuropsychological assessment. There was no difference in IQ between the patient groups.
A complete description of how the task was instrumented appears elsewhere.28 In the speaking task, subjects were told to utter a short ah about every 5 seconds. The feedback-voice heard via headphones was varied randomly between their own unaltered voice (self, unaltered), their own voice pitch-shifted downward by 2 semitones (self, pitch-shifted), the alien unaltered voice (alien, unaltered), and the alien voice pitch-shifted downward by 2 semitones (alien, pitch-shifted). The self, unaltered voice needed to be pitch-shifted down 0.3 semitone to best match the subjective experience of self-generated speech.44
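To give a sense of the size of these shifts, the sketch below converts a shift in semitones to a frequency ratio. It assumes the usual equal-tempered definition (12 semitones per octave); the text does not describe how the shift was implemented, and the 120-Hz voice is a hypothetical example.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift in equal-tempered semitones.

    Negative values are downward shifts. Assumes 12 semitones per octave.
    """
    return 2.0 ** (semitones / 12.0)


# The two downward shifts mentioned in the text.
for label, shift in [("pitch-shifted feedback", -2.0),
                     ("self, unaltered correction", -0.3)]:
    print(f"{label}: frequency x {semitone_ratio(shift):.4f}")

# Hypothetical fundamental frequency of 120 Hz, shifted down 2 semitones.
shifted_f0 = 120.0 * semitone_ratio(-2.0)
print(f"{shifted_f0:.1f} Hz")
```

Under this assumption, a 2-semitone downward shift lowers frequency by roughly 11%, whereas the 0.3-semitone correction lowers it by less than 2%.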
After each trial, subjects were prompted to indicate via button press whether the feedback heard was their own voice or the alien voice or whether they were unsure. Subjects were required to respond within 1.5 seconds after the prompt. Responses falling outside that window were considered misses. Subjects were told that their own or the alien voice would sometimes be pitch-shifted, but they were still required to decide whether its source was “self” or “alien.”
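The response rules above can be summarized as a small scoring function. This is a sketch under stated assumptions: the function name, the argument encoding, and the treatment of a missing button press as a miss are illustrative choices, not details from the study.

```python
RESPONSE_WINDOW_S = 1.5  # response deadline after the prompt, from the text


def score_response(true_source, response, rt_s):
    """Classify one trial as 'miss', 'unsure', 'correct', or 'misattribution'.

    true_source: 'self' or 'alien' (pitch-shifting does not change the source)
    response:    'self', 'alien', 'unsure', or None if no button was pressed
    rt_s:        seconds between the prompt and the button press
    """
    if response is None or rt_s is None or rt_s > RESPONSE_WINDOW_S:
        return "miss"
    if response == "unsure":
        return "unsure"
    return "correct" if response == true_source else "misattribution"


# A pitch-shifted self voice judged as "alien" counts as a misattribution.
print(score_response("self", "alien", 0.8))
```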
In the listening task, the recorded feedback sounds from the speaking task were played back, and subjects were instructed to merely listen and then decide about the source of the voice heard. All other features remained the same as in the speaking task, including the same visual cues and volume. The listening task was carried out to replicate the approach of other studies comparing cortical responses during speaking and listening, as well as to determine whether there were differential effects of the feedback conditions when participants were merely listening. Each task consisted of 240 trials with 60 trials per condition.
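One way to generate a trial order that matches the stated counts (4 conditions, 60 trials each, 240 trials per task) is sketched below. The text says only that the conditions varied randomly, so the balanced shuffle and the `seed` parameter are assumptions made for illustration.

```python
import random

CONDITIONS = [
    "self, unaltered",
    "self, pitch-shifted",
    "alien, unaltered",
    "alien, pitch-shifted",
]
TRIALS_PER_CONDITION = 60  # from the text: 240 trials, 60 per condition


def make_trial_sequence(seed=None):
    """Return a shuffled list with each condition repeated 60 times."""
    rng = random.Random(seed)
    trials = [c for c in CONDITIONS for _ in range(TRIALS_PER_CONDITION)]
    rng.shuffle(trials)
    return trials


sequence = make_trial_sequence(seed=1)
print(len(sequence), sequence[:3])
```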
To create the different feedback conditions, we used an audio presentation system described earlier28 that allowed us to detect the subject's vocalization and modulate the subject's voice or substitute it with a prerecorded speech sample of a male voice (alien). When the subject vocalized, the speech signal was picked up by a microphone and sent through a preamplifier to a personal computer equipped with sound processing software and hardware. The incoming audio signal was used to generate a trigger pulse that initiated either a pitch shift or the alien voice sample, which was amplified and played to the subject via headphones. This audio setup allowed us to detect and modulate in real time the subject's vocalizations with only 6 milliseconds of delay as measured with an oscilloscope (Tektronix, Beaverton, Ore). A delay this small is not perceptible,45,46 and it is unlikely to influence the subject's performance or the ERP amplitudes or latencies.
The mean sound pressure level of the subjects' utterances was 76 dB measured 5 cm from the subjects' mouths. During both the speaking and listening tasks, mean sound pressure level of speech sounds played back over the headphones was increased 15 dB over the average sound pressure level of the subject's speech. This was necessary to mask the effect of bone conduction during vocalization.
We acquired electroencephalogram data continuously from 27 sites (F7, F3, Fz, F4, F8, FT7, FC3, FCz, FC4, FT8, T5, C3, Cz, C4, T6, TP7, CP3, CPz, CP4, TP8, P7, P3, Pz, P4, P8, Tp9, and Tp10) referenced to the nose. Additional electrodes were placed on the outer canthi of the eyes to measure horizontal eye movements, and above and below the right eye to monitor blinks and vertical eye movements. Epochs were synchronized to vocalization onset and corrected for eye movements and blinks,47 and then re-referenced relative to the mastoid electrodes to minimize artifacts associated with talking, as well as to be consistent with the reference sites used in our previous studies.15,48 After rejection of trials containing artifacts (voltages exceeding ±50 μV), averages using only correctly identified trials were created and then bandpass filtered from 0.5 to 12 Hz. None of the averages included in the statistical analyses contained fewer than 32 trials. For the control group, 89.0% of all trials were included; for the hallucinators, 91.5%; and for the nonhallucinators, 90.2%. We ran an analysis of variance (ANOVA) on the proportion of included trials and did not find significant differences between groups, tasks, or conditions.
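The trial selection and averaging steps can be sketched as follows. Several points are assumptions made for illustration: the 50-μV limit is applied to absolute voltage, the figure of 32 trials is treated as a minimum for returning an average, and the bandpass filter, eye-movement correction, and re-referencing are omitted. The function and variable names are hypothetical.

```python
ARTIFACT_LIMIT_UV = 50.0  # assumption: limit applies to absolute voltage
MIN_TRIALS = 32           # assumption: treated here as a required minimum


def average_clean_correct(epochs, correct):
    """Average artifact-free, correctly identified trials for one channel.

    epochs:  list of trials, each a list of voltages in microvolts
    correct: list of booleans, True if the source was identified correctly
    Returns (average_waveform, n_kept); average_waveform is None when
    fewer than MIN_TRIALS trials survive.
    """
    kept = [
        ep for ep, ok in zip(epochs, correct)
        if ok and max(abs(v) for v in ep) <= ARTIFACT_LIMIT_UV
    ]
    if len(kept) < MIN_TRIALS:
        return None, len(kept)
    n_samples = len(kept[0])
    average = [sum(ep[i] for ep in kept) / len(kept) for i in range(n_samples)]
    return average, len(kept)
```

Filtering after averaging, as described in the text, would be applied to the returned waveform.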
Our primary neurobiological dependent measure was N100 amplitude, which was defined as the most negative peak between 80 and 120 milliseconds following the onset of the speech sound and was measured relative to a baseline of 100 milliseconds prior to stimulus onset.
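The N100 definition above can be expressed as a short measurement routine. This is a sketch, assuming that the minimum sample in the 80- to 120-millisecond window stands in for the most negative peak and that the baseline is the mean voltage over the 100 milliseconds before onset; the function name and the argument layout are hypothetical.

```python
def n100_amplitude(times_ms, voltages_uv,
                   window=(80.0, 120.0), baseline=(-100.0, 0.0)):
    """Baseline-corrected N100 amplitude for one averaged waveform.

    times_ms:    sample times relative to speech-sound onset (ms)
    voltages_uv: voltages at those times (microvolts)
    """
    base = [v for t, v in zip(times_ms, voltages_uv)
            if baseline[0] <= t < baseline[1]]
    win = [v for t, v in zip(times_ms, voltages_uv)
           if window[0] <= t <= window[1]]
    if not base or not win:
        raise ValueError("waveform does not cover baseline and N100 window")
    baseline_mean = sum(base) / len(base)
    # Most negative value in the window, measured relative to baseline.
    return min(win) - baseline_mean
```

A stricter implementation might require a true local minimum rather than the lowest sample, which matters when the waveform is still falling at the edge of the window.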
To maintain consistency with our previous analysis (Heinks-Maldonado et al28), N100 amplitudes were analyzed in a 5-way ANOVA with the between-subjects factor group (controls, hallucinators, nonhallucinators) and within-subjects factors task (speaking, listening), condition (self, unaltered; self, pitch-shifted; alien, unaltered; alien, pitch-shifted), laterality (left, right), and electrode site. Twenty electrode sites were included in the analysis, 10 for each hemisphere.
To maintain consistency with the analysis of Johns et al,38 the accuracy of the subjects' judgments regarding the source of the speech sounds was analyzed in a 4-way ANOVA with the between-subjects factor group (controls, hallucinators, nonhallucinators) and within-subjects factors task (speaking, listening), source (self, alien), and pitch (unaltered, pitch-shifted).
To understand the simple main effects of a complex interaction, we parsed the interaction by doing separate ANOVAs for each level of a factor involved in the interaction, according to the suggestions of Keppel.49
To simplify correlational analyses with ERP data, a value representing the size of the N100 suppression effect was created. A “suppression value” was calculated by subtracting N100 amplitude in the self, unaltered condition from the mean of the remaining 3 conditions (self, pitch-shifted; alien, unaltered; and alien, pitch-shifted). (A negative suppression value is the normal pattern.) This was done by collapsing across 10 electrodes per hemisphere (left: F7, F3, FT7, FC3, T7, C3, TP7, CP3, P7, P3; right: F8, F4, FT8, FC4, T8, C4, TP8, CP4, P8, P4), resulting in 2 suppression values (left and right). Associations between suppression effects and symptom scores and response accuracy were analyzed by bivariate correlational analyses.
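The suppression value can be sketched as follows, using the electrode groupings and the subtraction described above. The data layout (a nested dictionary of amplitudes by condition and electrode) and the example amplitudes are hypothetical.

```python
from statistics import mean

LEFT = ["F7", "F3", "FT7", "FC3", "T7", "C3", "TP7", "CP3", "P7", "P3"]
RIGHT = ["F8", "F4", "FT8", "FC4", "T8", "C4", "TP8", "CP4", "P8", "P4"]
ALTERED = ["self, pitch-shifted", "alien, unaltered", "alien, pitch-shifted"]


def suppression_value(n100, electrodes):
    """Mean N100 of the 3 altered conditions minus N100 to self, unaltered.

    n100[condition][electrode] holds N100 amplitude in microvolts.
    Amplitudes are collapsed across the given electrodes first.
    """
    def hemisphere_mean(condition):
        return mean(n100[condition][e] for e in electrodes)

    altered_mean = mean(hemisphere_mean(c) for c in ALTERED)
    return altered_mean - hemisphere_mean("self, unaltered")


# Hypothetical amplitudes: N100 is smaller (less negative) to the
# self, unaltered voice than to altered feedback.
example = {
    "self, unaltered": {e: -2.0 for e in LEFT},
    "self, pitch-shifted": {e: -4.0 for e in LEFT},
    "alien, unaltered": {e: -5.0 for e in LEFT},
    "alien, pitch-shifted": {e: -3.0 for e in LEFT},
}
print(suppression_value(example, LEFT))
```

Because N100 is a negative-going component, a smaller response to the self, unaltered voice yields a negative suppression value, which is the normal pattern noted in the text.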
Figure 2 shows the percentage of correct, unsure, and misattribution responses.
In the 4-way ANOVA described in the preceding section, we analyzed the percentage of misattribution errors made by the subjects in the speaking and listening tasks and found significant main effects of source (F1,34 = 4.37, P = .04) and pitch (F1,34 = 19.71, P<.001). These 2 significant main effects indicated that subjects made more errors when the feedback was alien and when it was pitch-shifted. Furthermore, the interaction of source and pitch was significant (F1,34 = 15.98, P<.001), reflecting a higher number of misattribution errors when the feedback was both alien and pitch-shifted.
The groups, however, had different response patterns (source × pitch × group interaction, F2,34 = 3.22, P = .05) that were investigated further by follow-up 1-way ANOVAs. As described earlier, we had expected a difference in behavioral performance between the speaking and listening tasks, but there were no significant main effects or interactions involving task. However, we analyzed the speaking and listening task data separately to allow comparison of our findings with those of Johns et al,38 who found abnormalities in the performance of hallucinators during active speech production. In our study during speech production, hallucinators made more misattribution errors than controls in all 4 feedback conditions (self, unaltered: F1,27 = 11.73, P = .002; self, pitch-shifted: F1,27 = 47.72, P<.001; alien, unaltered: F1,27 = 7.62, P = .01; alien, pitch-shifted: F1,27 = 4.39, P = .046). (The same pattern was found in the analysis of the listening task.) The 2 patient groups differed significantly in their responses to their own pitch-shifted voice (F1,21 = 5.06, P = .04) and the feedback of the unaltered alien voice (F1,21 = 8.33, P = .009). In both conditions, hallucinators made more misattribution errors than the nonhallucinators, ie, hallucinators responded by pushing the “other” button when their own voice was distorted and the “self” button in the alien, unaltered condition.
The results of the principal ANOVA are listed in Table 2. The main effect of task reflects significantly smaller N100s during talking than listening. This considerable difference led us to use different scales for Figure 3 and Figure 4, which show the grand average ERPs at C3.
There was also a main effect of condition, with N100 to the self, unaltered feedback condition being smallest. Most important, these effects were different across groups, as reflected in task × condition × group and task × condition × laterality × group interactions. Follow-up ANOVAs were conducted for each hemisphere separately (laterality factor). While the task × condition × group interaction for the right hemisphere did not reach significance, the ANOVA for the left hemisphere did (F6,102 = 2.57, P = .02).
The data for the left hemisphere were then analyzed for each task separately. As expected, the condition × group interaction was not significant during the listening task, as can be seen in Figure 4 and Figure 5B. During speaking, however, there was a significant condition × group interaction (F6,102 = 2.63, P = .02), demonstrating that the group difference in N100 amplitude depends on feedback. As can be seen in Figures 3 and 5A, controls showed the smallest N100 amplitude to their own unaltered voice feedback (the “internally predicted” feedback) and larger amplitudes to the altered feedback conditions. Helmert contrasts comparing the mean of the self, unaltered condition with the mean of the 3 altered feedback conditions confirm this (F1,16 = 11.89, P = .003). This effect was not seen in the hallucinators (F1,9 = 0.28, P = .61) or the nonhallucinators (F1,9 = 0.04, P = .84). Also, none of the remaining Helmert contrasts (successively comparing the remaining different levels of the factor condition) was significant.
This condition × group interaction during speaking was further parsed with separate ANOVAs for each pairwise group comparison to explore the locus of the interaction, as suggested by Keppel49: (1) controls vs hallucinators, (2) controls vs nonhallucinators, and (3) hallucinators vs nonhallucinators. The condition × group interaction was significant only for the comparison of hallucinators and controls (F3,75 = 3.73, P = .02); only hallucinators differed from healthy controls.
The group × condition interaction was also parsed with follow-up ANOVAs for each group separately and, as predicted, there was a significant effect for condition in the control group (F3,48 = 4.74, P = .009). Although it appears that there was a difference in N100 amplitude between alien voice and alien pitch-shifted voice in the hallucinators (Figure 3), the effect of condition was not significant (F3,27 = 1.82, P = .19). Also, although N100 was smaller in unaltered conditions, regardless of source (self or alien), in nonhallucinators, condition was not significant (F3,27 = 1.23, P = .32).
To confirm our findings from a previous study (Ford et al15), we compared the self, unaltered condition during speaking and listening at midline sites. We performed ANOVAs to compare the groups in a pairwise manner for the factors of task (speaking, listening) and electrode (Fz, FCz, Cz, CPz). There was a significant task × group interaction for the comparison of controls and hallucinators (F1,25 = 4.73, P = .04). No such interaction was found for the other 2 group comparisons. Figure 6 illustrates the differences in amplitude between the 2 tasks for the 3 groups.
In the patients, there was a weak association between percentage of misattribution errors and amount of suppression during speaking (r = 0.32, P = .04, 1-tailed): the more errors, the smaller the suppression. On the basis of the findings of Johns et al,38 we predicted this finding, and we adopted a 1-tailed test of this statistic (see Figure 7A).
As expected, the Scale for the Assessment of Positive Symptoms global score for hallucinations was significantly correlated with the amount of N100 suppression in the left hemisphere (r = 0.49, P = .02): the more severe the hallucinations, the more positive (abnormal) the suppression value (Figure 7B). The delusions global score did not significantly correlate with the amount of suppression (r = 0.30, P = .10). However, hallucinations and delusions (Figure 7C) were highly correlated (r = 0.67, P = .001).
There were no significant correlations between chlorpromazine equivalents and N100 suppression, behavioral performance, or symptom scores.
As reported earlier,28 there was greater N100 suppression to unaltered voice feedback than to altered feedback during talking, especially over the left hemisphere, in healthy controls. This finding is consistent with magnetoencephalography studies12,13 reporting greater differences between speaking and listening on the left than the right.
We interpreted this suppression as a reflection of a precise forward model mechanism that allows the auditory system to distinguish between internal and external sources of auditory information. To the extent that hallucinations result from a failure of such a mechanism,29-31 we predicted that schizophrenic patients who hallucinate would fail to show this effect. That is, rather than having the graded suppression of N100 during talking compared with listening that we saw in controls, we predicted an undifferentiated response to the different types of altered feedback in the hallucinators. Our prediction was borne out; during speech production, auditory cortical responses to voice feedback did not distinguish between feedback sounds that matched and did not match the intended sound in patients who hallucinated. The fact that patients do show some suppression of N100 during talking compared with listening suggests that their corollary discharge is operating, although not at normal levels. With this experiment, we were able to show that the suppression produced by the corollary discharge mechanism in patients did not show a graded pattern to altered feedback, suggesting that it is not very precise and, instead, is more generally dysfunctional.
The pattern of auditory cortical responses in nonhallucinators was less homogeneous; some showed a suppression effect comparable to that of controls, and some showed no evidence of suppression, like the hallucinators. This heterogeneous pattern may reflect differences in hallucination history, ranging from none to some, and raises the question about whether the pattern of N100 suppression reflects hallucination state or trait. Importantly, when hallucinators and nonhallucinators were grouped together, we found that the strength of the N100 amplitude suppression effect was correlated with hallucination severity (P = .02, 2-tailed) and misattribution errors (P = .04, 1-tailed). That is, the smaller the N100 suppression effect, the more errors and the more severe the hallucinations.
Earlier studies in our laboratory investigated suppression during speech production compared with passive listening and found that schizophrenic patients did not show the normal difference in N100 amplitude between talking and listening.15 To confirm this finding, we compared N100 to self, unaltered feedback during speaking and listening. All groups showed smaller N100 amplitudes during speaking than listening; however, hallucinators had the smallest effect, as suggested by the interaction of task and group. Nonhallucinators did not significantly differ from controls. The sample size in the previous study (n = 8) was too small to subgroup patients into hallucinators and nonhallucinators, but the data reported herein suggest that the sample in the previous study was dominated by patients who tended to hallucinate.
All subjects made more misattribution errors and unsure responses when feedback was alien or pitch-shifted. Although even controls responded “self” when hearing alien altered and unaltered feedback, misattribution errors were more likely in patients, especially hallucinators. Furthermore, hallucinators were likely to respond “alien” when hearing self, unaltered feedback. That is, even when their voices were not altered or substituted, hallucinators made fewer correct responses. Most important, hallucinators did not have a higher percentage of misattributions to self, altered feedback than to alien feedback.
This is inconsistent with the data of Johns et al38 and Allen et al,50 who reported a higher proportion of misattribution errors in hallucinators when patients' voice feedback was pitch-shifted than when it was substituted by alien voice feedback. That is, their hallucinators were more likely to respond “alien” than “self” when faced with uncertainty. Johns et al38 and Allen et al50 interpreted their results as a sign of an externalizing bias, which does not fit with our data. Our hallucinators were just as likely to respond “self” as “alien” when faced with altered or alien voices.
Another difference between the results of our study and that of Johns et al38 is the performance of the hallucinators in the self, unaltered condition: 79% of the responses by our hallucinators were correct in that condition vs a much higher percentage in the study by Johns et al.
Several explanations can be advanced for the discrepancies between our results and those of Johns et al. During the alien feedback trials of their study, a research assistant, who sat outside the room where the subject sat, “timed his/her articulation to be approximately synchronous with that of the participant by observing their lip movements through a one-way mirror and by listening to their speech.”38(pp706-707) There is an inevitable delay between the subject's and the research assistant's utterances. The delay is a good clue that the voice was alien, and this clue could have been responsible for the good performance of the subjects in their study,38 who were twice as accurate as ours when feedback was alien. An efference copy contains not only information about the quality of the sound being produced but also critical information about when the sound should be perceived. The delay in our experiment for the alien voice was imperceptible, less than 6 milliseconds, giving our subjects no timing clues about the origin of the sound.
An alternative explanation for the discrepancy in results between the studies by Johns et al and Allen et al and our study may be that the 0.3-semitone pitch shift for the self-unaltered voice in our experiment sounded more similar to the altered speech, making the distinction more difficult. The use of words instead of simple ah's in the other 2 studies may have also contributed to the difference in results. Since words are stimuli of a considerably longer duration, subjects had more time to detect the correct source of the auditory feedback.
Frith and Done’s31 model suggests that hallucinations result from a breakdown in the awareness of self-generated action, which should be apparent during speaking, not during listening. Accordingly, we predicted that misattribution errors would be greater during speaking than listening in hallucinators. Counter to predictions, hallucinators were less accurate than controls during both speaking and listening. Allen et al50 also found that hallucinators performed poorly during listening and concluded that an efference copy deficit cannot underlie the pattern of misattribution errors.
We argue that the behavioral and ERP data reflect dysfunction in different processes: strategic and perceptual processes, respectively. We suggest that the efference copy system may have failed to develop appropriately in a patient, resulting in a lifetime of uncertainty about the source of current perceptions. From a biological perspective, uncertainty can be dangerous because it prevents quick action, even if quick actions are sometimes incorrect. For some schizophrenic patients, misattributions may result from a “coping” strategy learned over time to navigate through life and to be able to act and react, depending on information they receive from the environment.
However, we suggest that, to the extent that “misattribution” connotes a conscious decision, “misperception” more accurately describes auditory hallucinations. We do agree, though, that a decision, on some level, is made about the source of the auditory verbal experience, which results from a mixture of inner thoughts and experiences colliding with ambient noise and Brownian motion. Random noise can increase a system's sensitivity to weak signals through stochastic resonance,51 and it is well known that patients with schizophrenia have “noisier” systems. Coupled with a Bayesian bias based on prior beliefs or delusions,52 the noisy auditory experience could be perceived as voices coming from sources other than self. That is, believing is hearing. In this way, delusions and hallucinations form a bidirectional self-reinforcing system. Thus, we suggest that “misattribution” is not part of the hallucinatory experience, but part of the delusional system used to explain the aberrant experience. We maintain that the failure of N100 to distinguish between self and alien feedback reflects the dysfunction of the efference copy system, which underlies this aberrant experience and not the delusional system built around it. Perhaps because of the interrelationship between delusions and hallucinations, N100 suppression and delusional severity were weakly correlated.
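The Bayesian bias described above can be illustrated with a toy calculation. The sketch below is not a model fitted to or derived from the data in this study; it only shows how a prior belief could shift the attribution of the same sensory discrepancy toward an external source. The function name `posterior_external`, the Gaussian likelihoods, and the parameters `sigma_self` and `sigma_external` are all assumptions introduced for illustration.

```python
import math


def posterior_external(discrepancy, prior_external,
                       sigma_self=1.0, sigma_external=3.0):
    """Toy posterior probability that a sound is externally produced.

    Illustrative assumptions (not taken from the study):
    - `discrepancy` is the mismatch between predicted and actual
      auditory feedback, in arbitrary units.
    - Self-produced sounds yield discrepancies drawn from a narrow
      zero-mean Gaussian (sigma_self); external sounds yield
      discrepancies drawn from a wider one (sigma_external).
    - `prior_external` is the prior belief that a sound is external.
    """
    def gaussian(x, sigma):
        return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    like_self = gaussian(discrepancy, sigma_self)
    like_external = gaussian(discrepancy, sigma_external)

    # Bayes' rule over the two hypotheses (external vs self).
    numerator = like_external * prior_external
    return numerator / (numerator + like_self * (1.0 - prior_external))


if __name__ == "__main__":
    # Same zero discrepancy, neutral prior vs strong prior toward "external".
    print(posterior_external(0.0, prior_external=0.5))
    print(posterior_external(0.0, prior_external=0.9))
    # Large discrepancy with a neutral prior.
    print(posterior_external(3.0, prior_external=0.5))
```

In this sketch, a perfectly predicted sound (zero discrepancy) is attributed to self under a neutral prior, but a strong prior toward an external source raises the posterior for "external" even though the sensory evidence is unchanged, which is one way to read "believing is hearing."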
Correspondence: Judith M. Ford, PhD, Department of Psychiatry, Yale University School of Medicine, 950 Campbell Ave, VA CT Healthcare System, 116A, West Haven, CT 06517 (email@example.com).
Submitted for Publication: November 8, 2005; final revision received June 14, 2006; accepted July 31, 2006.
Financial Disclosure: None reported.
Funding/Support: This work was supported by grants MH40052, MH58262, and MH067967 from the National Institute of Mental Health, and by grants from The German National Merit Foundation, the National Alliance for Research in Schizophrenia and Affective Disorders, and the Department of Veterans Affairs Schizophrenia Biological Research Center.