Figure 1. Proportion of patients classified as having probable or definite amyotrophic lateral sclerosis (ALS). M-H indicates Mantel-Haenszel; rEEC, revised El Escorial criteria.
Figure 2. Pooled sensitivity and diagnostic odds ratio (DOR).
Figure 3. Summary receiver operating characteristic plots (and 95% CI) of sensitivity and specificity. AUC indicates area under the curve.
eAppendix. Statistical analyses and data synthesis (online supplementary material).
eFigure. QUADAS-list results.
Costa J, Swash M, de Carvalho M. Awaji Criteria for the Diagnosis of Amyotrophic Lateral Sclerosis: A Systematic Review. Arch Neurol. 2012;69(11):1410–1416. doi:10.1001/archneurol.2012.254
SECTION EDITOR: DAVID E. PLEASURE, MD
Author Affiliations: Neuromuscular Unit, Institute of Molecular Medicine, Faculty of Medicine (Drs Costa, Swash, and de Carvalho), and Department of Neurosciences, Hospital de Santa Maria (Dr de Carvalho), Lisbon, Portugal; and Department of Neurology, Royal London Hospital, Queen Mary School of Medicine, University of London, London, England (Dr Swash).
Objective To estimate the potential diagnostic added value of the Awaji criteria for diagnosis of amyotrophic lateral sclerosis (ALS), which have been compared in several studies with the previously accepted gold standard, the revised El Escorial criteria.
Data Sources MEDLINE and Web of Science (until October 2011).
Study Selection We searched for studies testing the diagnostic accuracy of the Awaji criteria vs the revised El Escorial criteria in patients referred with suspected ALS.
Data Extraction Evaluation and data extraction of identified studies were done independently. The Quality Assessment of Diagnostic Accuracy Studies list was used to assess study quality. We determined the proportion of patients classified as having probable/definite ALS and derived indices of diagnostic performance (sensitivity, specificity, and diagnostic odds ratio). Quantitative data synthesis was accomplished through random-effects meta-analysis, and heterogeneity was assessed with the I2 test.
Data Synthesis Eight studies were included (3 prospective and 5 retrospective) enrolling 1187 patients. Application of Awaji criteria led to a 23% (95% CI, 12% to 33%; I2 = 84%) increase in the proportion of patients classified as having probable/definite ALS. Diagnostic performance of the Awaji criteria was higher than the revised El Escorial criteria (pooled sensitivity: 81.1% [95% CI, 72.2% to 90.0%; I2 = 91%] vs 62.2% [95% CI, 49.4% to 75.1%; I2 = 93%]; pooled diagnostic odds ratio, 35.8 [95% CI, 15.2 to 84.7; I2 = 3%] vs 8.7 [95% CI, 2.2 to 35.6; I2 = 50%]). Diagnostic accuracy of Awaji criteria was higher in bulbar- than in limb-onset cases.
Conclusion The Awaji criteria have a significant clinical impact allowing earlier diagnosis and clinical trial entry in ALS.
The Awaji recommendations for the use of electrodiagnostic studies in the diagnosis of amyotrophic lateral sclerosis (ALS) were proposed in 2008 to enable earlier diagnosis of ALS,1 meeting an acknowledged need to obviate diagnostic delay and to promote earlier entry into clinical trials.2 For this purpose, a new algorithm was proposed on the background of the well-established revised El Escorial criteria (rEEC).3 This algorithm used the recommendations regarding topographical regions and the same number of abnormal muscles in each region as proposed in the rEEC recommendations, emphasizing the importance of a suitable clinical context. The Awaji criteria recommended that neurophysiological data should be used in the context of clinical information, not as a separate, stand-alone set of data. In addition, fasciculation potentials associated with signs of reinnervation were considered as evidence of lower motor neuron lesion, in particular in cranial-innervated or strong limb muscles. It was suggested that this new set of interpretative guidelines, which essentially followed conventional clinical practice, would increase diagnostic sensitivity without major change in specificity.1 Table 1 summarizes the similarities and differences between the 2 sets of diagnostic criteria.
Following the seminal article, a number of reports from different centers have been published testing the utility of the Awaji criteria in supporting the diagnosis of ALS. Most of these studies were retrospective, based on analysis of databases, and, therefore, did not necessarily follow all the methodological recommendations in the Awaji criteria for diagnostic studies in ALS.4 Uncertainty about the true diagnostic performance of these new criteria led us to undertake this systematic review and meta-analysis5,6 to assess strengths and weaknesses in the recommendations. The results are relevant for everyday clinical practice, as well as for clinical trials.
We considered as eligible for analysis all diagnostic studies, of any study design, that included an assessment at the initial clinical presentation and that addressed the diagnostic accuracy of rEEC and of the Awaji criteria in patients referred with a clinically suspected diagnosis of ALS/motor neuron disease. We included studies regardless of the electromyography (EMG) protocol used in the evaluation. We accepted the diagnosis of ALS as defined by good clinical practice as described in these studies, provided that the neurophysiological and imaging examinations excluded other mimicking conditions and that disease progression was consistent with that expected in ALS. In general, this has been the standard applied in every study of formal criteria for the diagnosis of ALS, because it reflects the empirical nature of clinical practice.
Potentially eligible studies were identified through an electronic search of bibliographic databases (MEDLINE through PubMed and Web of Science) from 2006, the year of the consensus meeting that gave rise to the Awaji recommendations, to October 2011. The free-text search combined the terms “awaji OR escorial OR sensitivity OR specificity OR criteria OR accuracy OR electrodiagn* OR neurophysiol* diagnosis OR electromyograp* diagnosis OR EMG diagnosis” with “amyotrophic lateral sclerosis OR motor neuron disease.” Moreover, reference lists of the discovered articles were cross-checked for potential additional studies. Selection of studies and data extraction were done independently (J.C. and M.C.) and cross-checked for accuracy. Disagreements were resolved by consensus and, if necessary, by a third reviewer (M.S.).
Study quality was assessed independently by 2 reviewers (J.C. and M.C.) using the 11 items from the Quality Assessment of Diagnostic Accuracy Studies list, with each item scored as yes, no, or “unclear.” The Quality Assessment of Diagnostic Accuracy Studies is considered a validated tool to evaluate the presence of bias and variation in studies of diagnostic accuracy.7
We considered as a “representative patient spectrum” subjects suspected of having ALS/motor neuron disease who had been consecutively evaluated in the different centers. An “adequate reference standard” had to include consistent clinical progression over an adequate follow-up. Both index tests (rEEC and Awaji criteria) were required to be applied at the same time.
First, we compared both sets of criteria by determining the individual study and weighted pooled difference with 95% CIs in the proportion of patients who would be classified as having probable ALS (including probable laboratory-supported ALS by rEEC) or definite ALS.
Second, we extracted or derived the sensitivity and specificity from data presented in each primary study for each set of diagnostic criteria and calculated weighted pooled results by plotting sensitivity and specificity estimates and 95% CIs in both forest plots and the receiver operating characteristic space.
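As an illustration of this second step, sensitivity and specificity can be derived from the 2 × 2 table of each set of criteria against the final diagnosis. The following Python sketch is not the procedure used in the review (the analyses were run in RevMan and Meta-DiSc); the function names, the Wald-type 95% CI, and the counts are assumptions chosen only for illustration.

```python
import math

def proportion_ci(successes, total, z=1.96):
    """Proportion with a Wald-type 95% CI, clipped to the [0, 1] range."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

def sens_spec(tp, fn, fp, tn):
    """Sensitivity and specificity (each with its CI) from 2 x 2 counts."""
    return {
        "sensitivity": proportion_ci(tp, tp + fn),  # positives among ALS cases
        "specificity": proportion_ci(tn, tn + fp),  # negatives among non-ALS cases
    }

# Hypothetical counts for a single study, not taken from any included study.
example = sens_spec(tp=80, fn=20, fp=2, tn=98)
for name, (estimate, low, high) in example.items():
    print(f"{name}: {estimate:.3f} (95% CI, {low:.3f} to {high:.3f})")
```

A Wald interval is the simplest choice but behaves poorly when a proportion is close to 0 or 1, as happens with specificity here; an exact or score interval may be preferable in practice.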
Third, we compared both sets of criteria by calculating the individual study and the weighted pooled diagnostic odds ratio (DOR), as well as the relative DOR between the 2 sets of criteria. The DOR is a single indicator of overall diagnostic performance that is particularly convenient when combining diagnostic studies in a systematic review.6 The DOR expresses how much greater the odds of having the disease are for people with a positive test result than for people with a negative test result, combining both positive and negative likelihood ratios.8
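The relationship between the DOR, the 2 × 2 counts, and the likelihood ratios can be sketched as follows. This is a minimal Python illustration, not the review's own computation: the function names and counts are hypothetical, and the addition of 0.5 to every cell when a cell is empty is a common convention assumed here, relevant because most included studies reported no false positives.

```python
import math

def diagnostic_odds_ratio(tp, fn, fp, tn, correction=0.5):
    """DOR with a 95% CI computed on the log scale.

    Assumption: if any cell is zero, `correction` is added to all four cells.
    """
    cells = [tp, fn, fp, tn]
    if any(c == 0 for c in cells):
        tp, fn, fp, tn = (c + correction for c in cells)
    dor = (tp * tn) / (fp * fn)
    se_log = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    low = math.exp(math.log(dor) - 1.96 * se_log)
    high = math.exp(math.log(dor) + 1.96 * se_log)
    return dor, low, high

def likelihood_ratios(tp, fn, fp, tn):
    """Positive and negative likelihood ratios (requires fp > 0 and tn > 0)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens / (1 - spec), (1 - sens) / spec

# Hypothetical counts; the DOR equals LR+ divided by LR-.
dor, low, high = diagnostic_odds_ratio(80, 20, 2, 98)
lr_pos, lr_neg = likelihood_ratios(80, 20, 2, 98)
print(dor, lr_pos / lr_neg)
```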
Quantitative data analysis was accomplished through random-effects meta-analysis to incorporate variation among studies.9 Heterogeneity was assessed with the I2 test, which measures the percentage of total variation across studies due to heterogeneity.10 For all diagnostic indices calculated, subgroup analyses were done according to the region of disease onset, bulbar or limb. Statistical analyses were performed with Cochrane RevMan 5.1 and Meta-DiSc 1.4.11 Additional information on the statistical analysis and data synthesis is available in the eAppendix.
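For readers unfamiliar with these quantities, random-effects pooling and I2 can be sketched in a few lines of Python. The sketch assumes the DerSimonian-Laird estimator of between-study variance, which the text above does not name explicitly, and it uses hypothetical study estimates; it does not reproduce the RevMan or Meta-DiSc output.

```python
import math

def random_effects_pool(estimates, variances):
    """Pool study estimates under a random-effects model.

    Assumes the DerSimonian-Laird between-study variance (tau^2).
    Returns (pooled, ci_low, ci_high, i2_percent, tau2).
    """
    k = len(estimates)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: share of total variation attributable to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, i2, tau2

# Hypothetical differences in proportions from two studies with equal variances.
pooled, low, high, i2, tau2 = random_effects_pool([0.1, 0.3], [0.01, 0.01])
print(f"pooled = {pooled:.2f}, I2 = {i2:.0f}%, tau2 = {tau2:.3f}")
```

Ratio measures such as the DOR would be pooled on the log scale and exponentiated afterward.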
The search yielded a total of 849 citations. After screening titles and abstracts, 27 potentially relevant full-text articles including review articles were retrieved, 8 of which were selected for further analysis on the basis of direct relevance to the study question. Three of these were prospective cohort studies12-14 and 5 were retrospective observational studies.15-19 Together, these studies report a total of 1187 consecutive patients referred to their respective neurological centers because of a clinical suspicion of ALS/motor neuron disease. After clinical and neurophysiological evaluation, 792 had a progressive course consistent with ALS and received a final diagnosis of ALS. All studies were single center and collected data that allowed comparison of the diagnostic accuracy of the rEEC and Awaji criteria at presentation. Clinical and neurophysiological evaluation protocols differed considerably between studies, in particular regarding the number of anatomical regions and muscles evaluated. In only 3 studies13-15 were case ascertainment and diagnosis independent from the physician performing the neurophysiological evaluation. Patient characteristics were, in general, comparable across studies, except for an increased prevalence of patients with bulbar-onset disease in 1 study (42%)18 and a lower rate in another (10%).17 Only 3 studies12,14,18 provided separate data for patients with bulbar and limb onset. The main characteristics of the studies and patients are shown in Table 2.
We considered the range of patients included to be adequate in all studies, because all used an acceptable reference standard for ALS diagnosis. There were no effects from partial or differential verification, because all the studies used similar reference standards in the same way. Both index tests (rEEC and Awaji criteria) were evaluated at the same time. Since the index tests contribute directly to the reference standard, it is not possible to avoid incorporation of, or to blind, the index test results and the resultant reference standard. However, because all studies evaluated both criteria in all patients, comparisons between the 2 sets of diagnostic criteria were not biased by differences between studies. Only 1 study neither reported noninterpretable results nor explained study withdrawals.15 The overall rating quality according to the Quality Assessment of Diagnostic Accuracy Studies list is shown in the eFigure.
The use of the Awaji algorithm added to the rEEC, ie, the Awaji criteria for diagnosis of ALS, was associated with a 23% increase in the proportion of patients (95% CI, 12% to 33%; I2 = 84%; P < .001) classified as having probable or definite ALS. This is a relevant outcome because only patients within these diagnostic categories are usually considered eligible for clinical trials. Subgroup analysis based on data from 3 studies12,14,18 showed that this difference is more pronounced in patients with bulbar-onset (48% [95% CI, 18% to 79%; I2 = 78%; P = .002]) than limb-onset disease (24% [95% CI, −3% to 50%; I2 = 85%; P = .08]), although the difference between subgroups did not reach statistical significance (P = .23) (Figure 1), probably because of the small sample size. In relative terms, for all studies, the application of the Awaji criteria yields a 56% reduction (95% CI, 32% to 72%; I2 = 87%; P < .001) in the proportion of patients who would fail to be eligible to enter a clinical trial on the basis of a requirement for a diagnosis of probable or definite ALS.
Pooled sensitivity was higher with the Awaji criteria than with the rEEC: 81.1% (95% CI, 72.2% to 90.0%; I2 = 91%) and 62.2% (95% CI, 49.4% to 75.1%; I2 = 93%), respectively (Figure 2). The diagnostic specificity was the same using either set of criteria (98.2% [95% CI, 96.7% to 99.7%; I2 = 0%]). Only 1 study17 reported false positives (1.95% for both rEEC and Awaji criteria). In a subgroup analysis, sensitivity using the rEEC was lower for patients with bulbar-onset than limb-onset disease: 46.1% (95% CI, 23.0% to 69.1%; I2 = 78%) and 63.8% (95% CI, 54.5% to 73.0%; I2 = 18%), respectively. On the other hand, sensitivity using the Awaji criteria was greater for patients with bulbar-onset than limb-onset disease: 82.9% (95% CI, 79.4% to 86.3%; I2 = 0%) and 69.4% (95% CI, 45.4% to 93.4%; I2 = 93%), respectively.
Similar results were obtained for DOR, which was higher with the Awaji criteria than with the rEEC: 35.8 (95% CI, 15.2 to 84.7; I2 = 3%) and 8.7 (95% CI, 2.2 to 35.6; I2 = 50%), respectively (Figure 2). The relative DOR was 3.8 (95% CI, 1.1 to 13.6; P = .04); in other words, the overall diagnostic performance in correctly classifying those with and without probable or definite ALS is about 4 times higher with Awaji criteria than with the rEEC.
Heterogeneity of the pooled estimates for DOR was significantly lower in comparison with the other indices estimated. This was particularly marked if we excluded from the analysis the study by Boekestein et al,16 which reported a uniquely high prevalence of true-negative cases. In this case, DOR heterogeneity was 0% for both sets of criteria in comparison with values more than 84% for the other diagnostic indices. As for the other parameters, in subgroup analysis, the DOR of the rEEC was lower for patients with bulbar-onset than limb-onset disease (1.3 [95% CI, 0.2 to 9.4; I2 = 0%] and 3.8 [95% CI, 0.5 to 26.8; I2 = 0%], respectively), while the DOR for the Awaji criteria was higher for patients with bulbar-onset than limb-onset disease (10.6 [95% CI, 1.4 to 81.7; I2 = 0%] and 7.3 [95% CI, 0.99 to 54.3; I2 = 0%], respectively). The DOR was constant (P > .16 for both sets of criteria), and therefore, symmetrical summary receiver operating characteristic (and 95% CI) curves were derived, showing a better overall performance by the Awaji criteria across all the different thresholds (Figure 3).
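When the DOR is constant across thresholds, a symmetrical summary receiver operating characteristic curve follows directly from it: the odds of a true-positive result equal the DOR times the odds of a false-positive result. The Python sketch below applies that relationship, and the closed-form area under such a curve, to the pooled DORs reported above. It is an illustration under the constant-DOR assumption only; the function names are hypothetical and the resulting values are not taken from Figure 3.

```python
import math

def sroc_sensitivity(dor, fpr):
    """Sensitivity on a symmetrical SROC curve at false-positive rate `fpr`.

    Follows from sens / (1 - sens) = DOR * fpr / (1 - fpr).
    """
    return dor * fpr / (1.0 - fpr + dor * fpr)

def sroc_auc(dor):
    """Area under the symmetrical SROC curve for a constant DOR."""
    if abs(dor - 1.0) < 1e-12:
        return 0.5  # DOR of 1 gives the chance diagonal
    return dor / (dor - 1.0) ** 2 * ((dor - 1.0) - math.log(dor))

# Pooled DORs as reported in the text.
POOLED_DOR = {"Awaji": 35.8, "rEEC": 8.7}

curves = {
    name: [(step / 20, sroc_sensitivity(dor, step / 20)) for step in range(21)]
    for name, dor in POOLED_DOR.items()
}
auc = {name: sroc_auc(dor) for name, dor in POOLED_DOR.items()}
for name in POOLED_DOR:
    print(f"{name}: AUC under constant-DOR assumption = {auc[name]:.3f}")
```

Because sensitivity increases with the DOR at every false-positive rate, the curve with the larger constant DOR lies above the other over the whole range, which is consistent with the comparison described for Figure 3.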
Taken together, these results strongly suggest that the added global improvement in diagnostic accuracy with the Awaji criteria is mostly due to an increase in diagnostic accuracy in patients with bulbar-onset disease.
Our study illustrates the utility of statistical methods for the meta-analysis of diagnostic tests, in particular when ideal studies are difficult to perform, because such methods can take into account variations in disease or test characteristics in the early diagnosis of ALS.
The generation of the Awaji criteria, based on a revised algorithm for the application of clinical neurophysiological assessment in the diagnosis of ALS, developed during a consensus expert discussion based on the published literature, has been criticized as potentially bearing several drawbacks.20 Although one might agree with these critics that an extensive prospective study testing several different criteria would represent a more scientific approach, it is pragmatically unlikely that such a costly and time-consuming study will be possible. All studies of suggested formal diagnostic criteria for ALS inevitably test the criteria against conventional clinical diagnosis, which requires follow-up of outcome as the ultimate gold standard.21 The Awaji revised criteria for the diagnosis of ALS have the merit of reinforcing the value of EMG investigation in ALS. Indeed, the publication of this set of criteria was rapidly followed by 8 reports testing the sensitivity of the rEEC and of the Awaji criteria. This data set motivated us to apply recently derived methods of meta-analysis to evaluate these results.
The published studies have followed both retrospective and prospective designs.16 The EMG investigation protocols varied but were generally detailed, and the intensity of the search for fasciculation potentials certainly varied. Moreover, the investigating neurophysiologist was usually not blind to the prior clinical assessment, although uncertain of the clinical diagnosis at the time of the investigation. However, all the reported studies followed the same diagnostic approach. A number of patients suspected of having ALS were referred to specialized centers for confirmatory diagnostic evaluation, but the index electrophysiological tests were always applied before the final clinical diagnosis, which was ultimately defined by clinical signs and disease progression, following exclusion of other conditions by neuroimaging. In only 1 study was this information unclear.15 The quality of all the studies included in the meta-analysis fulfilled the Quality Assessment of Diagnostic Accuracy Studies requirements.
Since the El Escorial clinical criteria for the diagnosis of ALS are considered very reliable, with virtually absent risk of false-positive diagnosis,22,23 we accepted these as the gold standard in our analysis. The EMG criteria as established by the rEEC3 were also used consistently. Makki and Benatar23 tested the electrophysiological subset required by the El Escorial criteria in a population of 73 patients with suspected ALS of whom 35 had a final diagnosis of ALS as classified by clinical follow-up. They found that 2 muscles in a limb and/or 1 muscle affected in bulbar or thoracic regions, in a combination in which at least 2 regions are involved, provided a sensitivity of 57% and a specificity of 97%. The low sensitivity at the time of initial diagnostic workup reported in this study is disappointing. The El Escorial criteria have been evaluated in terms of impact on clinical trial entry by Traynor et al,24 who found that 44% of 388 patients later clinically diagnosed as having ALS would fail clinical trial entry at initial assessment. It was for these reasons that the Awaji set of diagnostic criteria were devised.1 Overall, our meta-analysis shows that the diagnostic sensitivity was increased when applying the Awaji criteria (81.1%) compared with the rEEC (62.2%).
Importantly, the application of the Awaji diagnostic criteria results in a 56% reduction in patients who would fail to achieve eligibility to enter a clinical trial. We found no indication of any reduction in diagnostic specificity when applying the Awaji criteria. The DOR clearly supports the Awaji criteria, in particular in patients with bulbar-onset disease. The advantage of the Awaji criteria in testing patients with bulbar-onset disease derives from the infrequent presence of signs of ongoing denervation in cranial-innervated muscles, although fasciculation potentials are not unusual in these muscles.25,26 In addition, fasciculation potentials in association with signs of reinnervation are frequently found in early-affected limb muscles.27 In patients with spinal-onset disease, in whom weak limb muscles usually present signs of ongoing denervation, the advantage of the Awaji criteria is less marked. A European blind, multicenter, prospective study including a large group of patients that will further address the issue of accurate early diagnosis is in progress. Meanwhile, patients and physicians should benefit from using the Awaji algorithm added to the rEEC.
Correspondence: João Costa, MD, PhD, Neuromuscular Unit, Institute of Molecular Medicine, Faculty of Medicine, University of Lisbon, Av. Prof Egas Moniz, 1649-028 Lisbon, Portugal (firstname.lastname@example.org).
Accepted for Publication: February 8, 2012.
Published Online: August 13, 2012. doi:10.1001/archneurol.2012.254
Author Contributions: Study concept and design: Costa, Swash, and de Carvalho. Acquisition of data: Costa, Swash, and de Carvalho. Analysis and interpretation of data: Costa, Swash, and de Carvalho. Drafting of the manuscript: Costa and Swash. Critical revision of the manuscript for important intellectual content: Costa, Swash, and de Carvalho. Statistical analysis: Costa. Obtained funding: de Carvalho. Administrative, technical, and material support: Swash. Study supervision: Swash and de Carvalho.
Conflict of Interest Disclosures: None reported.
Funding/Support: This was an academic project not funded or sponsored, directly or indirectly, by the industry. This work was partially supported by scientific grant Fundação para a Ciência e Tecnologia PIC/IC/82765/2007.