Diagnostic Accuracy of the Social Attention and Communication Surveillance–Revised With Preschool Tool for Early Autism Detection in Very Young Children

Key Points
Question: Can a developmental surveillance approach be used to train professionals to accurately identify infants, toddlers, and preschoolers on the autism spectrum?
Findings: In this diagnostic accuracy study including 13 511 children aged 11 to 42 months, maternal and child health nurses were trained to use the Social Attention and Communication Surveillance–Revised (SACS-R) and SACS-Preschool (SACS-PR) tools during well-child checkups at 11 to 30 months of age and at follow-up (at 42 months of age). Those children identified as being at high likelihood for autism underwent diagnostic assessments; results indicated the SACS-R with SACS-PR (SACS-R+PR) had very high diagnostic accuracy for early autism detection.
Meaning: Results of this study suggest that the SACS-R+PR population-based developmental surveillance program may be used universally for the early identification of autism.

MCH nurses provided all parents/caregivers of children referred to the University team with an informed consent form and participant information sheet (which described the study and its procedure, how to withdraw from the study, and other relevant information) to provide permission for the University team to contact the family for the "gold standard" diagnostic assessment. For children referred following the 12- to 24-month SACS-R consultations, 15.6% (n = 51) of families did not provide consent for contact; a further 11.01% (n = 36) of families provided consent to be contacted to arrange the University assessment but subsequently declined the diagnostic assessment. For the children referred at the 42-month SACS-PR consultation, 25.00% (n = 49) of families did not provide consent for contact and 15.31% (n = 30) consented to contact but later declined the diagnostic assessment.

University Diagnostic Assessments
Prior to attending the "gold standard" diagnostic assessment, parents/caregivers completed a demographic questionnaire, which included questions regarding parental/caregiver country of birth and ethnicity to enable comparison of the sample to the population of greater Melbourne. Ethnicity options were defined by the parent/caregiver completing the questionnaire and were later grouped by the researchers for analyses. All diagnostic assessments were conducted in a large laboratory playroom with one clinician assessing the child while a second clinician simultaneously conducted the parental interview (see eTable 3 for details of instruments included in the assessments). Children underwent a diagnostic assessment every six months until they reached 30 months of age, with a final assessment undertaken at 42 months.
Arabic and Mandarin translators for both the child and parent/caregiver were utilized where required (Arabic: n = 5; Mandarin: n = 2). No other translation services were required as sufficient English was spoken and understood by the parents/caregivers and/or immediate family who also attended the assessment.
For assessments with children under 24 months of age, the Mullen Scales of Early Learning (MSEL) 9 and the Autism Diagnostic Observation Schedule–Toddler (ADOS-T) 10 were conducted with the child, and a developmental interview was conducted with the parent/caregiver because the Autism Diagnostic Interview–Revised (ADI-R) 11 is suitable for children from 24 months of age. The developmental interview contained questions regarding the child's developmental history, focusing on the diagnostic criteria for autism, specifically social attention and communication development, language, play, and the presence of restricted, repetitive, and sensory behaviors and/or interests that are relevant for very young children (see Barbaro, Ridgway, and Dissanayake). 6 For the 24- and 30-month assessments, the MSEL and the appropriate module of the ADOS-T/ADOS-2 12 and ADI-R were utilized.
At the 42-month follow-up assessment, the ADOS and MSEL were used, with the ADI-R only repeated if the diagnostic outcome was not clear at the child's previous diagnostic assessment (n = 24; 14.72%). In its place, a brief developmental interview was conducted for the remaining children (n = 138; 84.66%). The ADI-R was conducted for all children having a 42-month first-time assessment. Clinicians were not blinded to the child's SACS-R/SACS-PR results or, where applicable, to the child's previous diagnostic assessment data. This was due to the young age of the children and to provide the clinician with all available information to make a clinical judgment, which has consistently been found to produce the most stable diagnosis in very young children. [13][14][15][16][17] All diagnostic assessment sessions were video recorded using two mounted, remotely adjustable cameras, which were angled to focus on the child. This was done to aid with scoring, coding, supervision, and diagnostic decisions. Detailed reports of the diagnostic assessment outcomes were provided to parents/caregivers. A total of 567 diagnostic assessments were conducted for 357 children (eFigure 2). eTable 5 shows children's age at their first diagnostic assessment, with eTable 5 displaying the number of children who underwent single or multiple diagnostic assessments.

Diagnostic Assessment Tools
Autism Diagnostic Observation Schedule–Second Edition (Modules 1-3 and Toddler)
The ADOS-2 modules 1, 2, and 3 12 and the ADOS-2 Toddler module (ADOS-T) 10 are standardized, semi-structured, play-based assessments designed to elicit behaviors relevant to an autism spectrum disorder (ASD) diagnosis through multiple activities. The ADOS-T is for use with children 12-30 months of age, with the ADOS-2 suitable from 31 months. All have excellent test-retest and inter-rater agreement. For the ADOS-2, intraclass correlation coefficients (ICCs) are ≥0.80 for all but Module 1 restricted and repetitive behaviors (RRB) inter-rater agreement (0.79) and RRB test-retest agreement in Modules 1 and 2 (0.68 and 0.73, respectively). For the ADOS-T, ICCs are ≥0.83 for all but the RRB total: inter-rater agreement is 0.74 for the V21-30 algorithm and 0.75 for the 12-20/NV21-30 algorithm, and test-retest agreement is 0.60 for the V21-30 algorithm. The cut-off scores also have excellent sensitivity and specificity for identifying ASD versus other developmental conditions (≥0.81). 10,12,18 Calibrated severity scores (CSS) are reported in this study as they facilitate comparison across the various ADOS modules 19,20 and provide a measure of autism symptoms that is independent of age and language ability. 21 CSS range from 1 to 10, with higher scores indicating more autism symptoms.

Autism Diagnostic Interview–Revised
The ADI-R 11 is a semi-structured diagnostic parental interview for autism assessing communication, reciprocal social interaction, play, and restricted, repetitive, and sensory behaviors over 93 items. With excellent test-retest and inter-rater agreement (ICCs ≥0.92), the ADI-R has excellent discriminant validity between individuals on the autism spectrum and individuals not on the spectrum for each of its domains. 22 The ADI-R toddler algorithm overall score was used for children at the 24-month diagnostic assessment.

Mullen Scales of Early Learning
The MSEL 9 is a standardized developmental assessment, norm-referenced for ages 0 to 68 months. The tool measures children's verbal (expressive and receptive) and non-verbal (gross and fine motor, visual reception) skills, in five subscales. It has excellent test-retest and inter-rater reliability for children aged ≤24 months (r ≥0.82).

Clinician Training and Reliability
All clinicians were registered psychologists, speech pathologists, medical doctors, postdoctoral research fellows, or PhD candidates who had been independently trained in the use of the ADOS-T/2, ADI-R, and MSEL. These clinicians were experienced in using these tools and had reached research reliability. As part of their training, clinicians shadowed the first author (a registered psychologist with 15 years' experience in the assessment of young children on the autism spectrum) for several assessments. Following this, the first author then simultaneously and independently co-coded assessments with each clinician to ensure they maintained research reliability obtained during their independent training (>80% on all items and diagnostic algorithms). The first author ensured regular supervision of the clinicians throughout the course of the study where coding, diagnosis, and other issues were discussed; monthly ADOS coding meetings were also held at the University that clinicians could attend.

Diagnostic Criteria
Information obtained from children's developmental history, previous health records/reports, and the administration of the ADOS-T/ADOS-2, ADI-R, developmental interview, and MSEL was used to make a clinical decision based on the Diagnostic and Statistical Manual of Mental Disorders (5th edition) 23 diagnostic criteria for ASD.

Diagnostic Stability
Of the 240 children referred and assessed between 12 and 24 months, only two children shifted from "developmental and/or language delay" ("DD/LD") to "autism", and four from "possible autism" to "DD/LD", between 24 months and 42 months. One child shifted from "possible autism" at 18 months to "DD/LD" at 42 months (this child did not attend an assessment at 24 months).

Statistical Analyses
Currency Conversion
Australian dollars were converted to US dollars (USD) using the mid-market exchange rate available on 1 July 2014 (the approximate mid-point of the study) of USD0.949. 24

Participants Excluded From MCH Consultation Data
The data from 273 children were excluded from the overall cohort of potentially eligible children (n = 13 808); 176 children were outside of the eligible age range at time of recruitment and 97 were entered into the database after recruitment was completed (see Figure 1 and eFigure 1).

Missing Data
Missing Data From University Diagnostic Assessments
The following data from the diagnostic assessments completed by the University team were missing:
• 24 months: four children had a developmental interview instead of the ADI-R.
• 42 months (follow-up): two children were missing MSEL visual reception and fine motor scores; and one child was missing the MSEL visual reception score, which meant that their Non-verbal developmental quotient (DQ) and the Overall DQ could not be calculated.
• 42 months (first-time): one child was missing the MSEL visual reception score (so their Non-verbal DQ and Overall DQ could not be calculated); one child was missing the ADOS-2; and one child was missing the ADI-R.
Two children were withdrawn from the study after completing the diagnostic assessment (one child who completed the 24-month assessment and one child who completed the 42-month first-time assessment). For these children the diagnostic outcome remained known but all identifiable, clinical, and assessment data were deleted.

Missing Data Due to Parents/Caregivers Declining a University Diagnostic Assessment and Children Lost to Follow-up at 42 Months
Of the 523 children referred for a University diagnostic assessment, parents/caregivers of 166 children declined (31.73%), offering a range of reasons for declining the initial or follow up assessment; these included having no concerns, having too many appointments, already doing early intervention, or being under the care of a pediatrician.
- "Known outcomes": For 73 of these 166 (43.98%) children who were referred but did not complete a University diagnostic assessment, a diagnostic outcome was established through follow-up with parents/caregivers and MCH nurses (34 with a "high likelihood" result at 12-24 months, 35 with a "high likelihood" result at 42 months, and four with a "low likelihood" result plus parental/MCH nurse concerns at 42 months). Information regarding the name of the health professional who gave the diagnosis, and date the diagnosis was given, was also collected.
- Multiple imputation (MI): MI 26 was used to resolve the diagnostic outcome for the 93 children (56.02%) between 12-42 months who were referred for a diagnostic outcome but did not complete a University diagnostic assessment and for whom a diagnostic outcome from the community could not be established through follow-up with parents/caregivers and MCH nurses. MI was also used for children who did not attend their SACS-PR assessment to determine their SACS-PR outcome ("high likelihood" or "low likelihood") and their final diagnostic outcome.
- MI replaces the missing data in each replication with plausible values drawn from an imputation model and was fit using logistic regression in SPSS Statistics for Windows, version 26.0. 27 As no typically developing children had been referred amongst those who did undergo a diagnostic assessment through either the study team or the community, "typically developing" was not an available option in the model. A single model was used for all missing data based on complete values for the following covariates: LGA IRSAD, sex, number of MCH consultations attended over the study duration, total number of atypical key items and overall number of atypical items based on the child's last attended MCH consultation using the SACS-R checklist items.
Imputations were run five, seven, nine, and 11 times for the different sets; imputation 11 was chosen for this study as it provided replicable results, specifically for the smaller samples to avoid a 50-50 split. 28 Additional covariates and one dependent variable were added separately, depending on the outcome to be derived from each set. A full explanation is as follows:
o Phase 1 - Parents/caregivers of 53 of the 327 children with "high likelihood" for autism using the SACS-R between 12-24 months declined to attend a University assessment. MI for this set included all the above-listed covariates, as well as the child's age bracket when first at "high likelihood" and the dependent variable of the observed diagnostic outcomes of the 274 children who attended a University assessment or had a known outcome from the community (total set = 327). The MI resulted in 44 children in the autism group and 9 children in the DD/LD group.
o Phase 2 - Parents/caregivers of 39 of the 168 children with "high likelihood" using the SACS-PR and 1 child from the 28 children with "low likelihood" on the SACS-PR with parental/MCH nurse concerns declined to attend a University assessment. MI for this set included all of the above-listed covariates, as well as the "high"/"low likelihood" status of the children using SACS-PR and the dependent variable of the observed diagnostic outcomes of the 156 children who attended a University assessment or had a known outcome from the community (total set = 196). The MI resulted in 19 children in the "autism" group and 20 children in the "DD/LD" group from the "high likelihood" arm and 1 child in the "autism" group from the "low likelihood" plus parental/MCH nurse concerns arm.
o Phase 2 not attended - There were 4951 children who did not attend their SACS-PR assessment at their 42-month MCH consultation. MI was used to determine the outcome ("high likelihood" or "low likelihood") using SACS-PR for these children. MI for this set included all the above-listed covariates, as well as the dependent variable of the observed "high" and "low likelihood" from the 8233 children who attended their SACS-PR assessment (total set = 13 184). The frequency of each likelihood outcome was derived from imputation 11, which provided the pooled data for the average number of "high"/"low likelihood" outcomes. The results of this imputation yielded 97 children at "high likelihood" and 4854 at "low likelihood" out of the missing 4951 children.
To determine the number of these children who would have been in the "autism" group, we used the results of the imputation and the proportion of the observed/known outcomes data of children at "high" and "low likelihood" on the SACS-PR to estimate the number of children who would have been allocated to the "autism" group. This resulted in 63 children being allocated to the "autism" group: 57 in the "high likelihood" arm and 6 in the "low likelihood" arm. Below is an explanation of how we calculated these figures.
▪ Imputed "high likelihood" SACS-PR result:
  • There were 168 children with a "high likelihood" result using SACS-PR. The total number of attended + known data is 129 (94 attended + 35 known).
  • There were 76 children in the "autism" group of the 129 children (61 attended + 15 known).
  • There were 97 children with an imputed "high likelihood" SACS-PR outcome in the missing data of 4951 children.
  • The proportion of children allocated to the "autism" group out of the 97 children is: (97 × 76) / 129 = 57.
▪ Imputed "low likelihood" SACS-PR result:
  • There were 8065 children with a "low likelihood" result using SACS-PR. One child had a diagnosis imputed and not observed/known: 8065 - 1 = 8064 children.
  • There were 10 children (8 attended + 2 known) with a "low likelihood" outcome on the SACS-PR in the "autism" group.
  • There were 4854 children with an imputed "low likelihood" SACS-PR outcome in the missing data of 4951 children.
  • The proportion of children allocated to the "autism" group out of the 4854 children is: (4854 × 10) / 8064 = 6.
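The proportional allocation described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch; the function name `allocate_autism` is hypothetical and not part of the study's analysis, and rounding to the nearest whole child is an assumption inferred from the reported figures.

```python
def allocate_autism(imputed_n, observed_autism, observed_total):
    """Apply the observed autism proportion to an imputed group.

    imputed_n       -- children with an imputed SACS-PR outcome
    observed_autism -- observed/known children in the "autism" group
    observed_total  -- observed/known children with that SACS-PR outcome
    Rounding to a whole child is an assumption, not stated in the text.
    """
    return round(imputed_n * observed_autism / observed_total)


# "High likelihood" arm: (97 x 76) / 129
high = allocate_autism(97, 76, 129)
# "Low likelihood" arm: (4854 x 10) / 8064
low = allocate_autism(4854, 10, 8064)

print(high, low, high + low)
```

With the counts given in the text, the two arms come to 57 and 6 children, which matches the reported total of 63.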
Chi-square tests and independent samples t-tests were used to identify potential differences between the children identified at "high likelihood" for autism using the SACS-R/SACS-PR who completed a University diagnostic assessment and those who did not because their parents declined the assessment. Where chi-square tests were run, all expected cell frequencies were >10 for 2 × 2 tables and >5 for all other tables. Effect size was evaluated using Cramér's V (φc). Where independent samples t-tests were run, the effect size was calculated using omega squared (ω²). When comparing age identified at "high likelihood" for the 42-month cohort, the assumption of homogeneity of variances was violated and the P value was determined with equal variances not assumed.
For the 12- to 24-month cohort, results indicated no significant differences between the 'attended' and 'declined' groups in relation to child sex (P = .84) and LGA (P = .21) with small to moderate effect sizes (φc = -0.02 and 0.17, respectively). However, a significant difference was found between the groups for the child's age when they were first identified at "high likelihood" for autism on the SACS-R (P = .03) with a small effect size (φc = 0.01), indicating that children whose parents declined a University diagnostic assessment were more likely to be younger when first identified as "high likelihood" on the SACS-R.
For the 42-month cohort, there were no significant differences between the 'attended' and 'declined' groups on sex (P = .41) and LGA (P = .09), with small to large effect sizes (φc = 0.09 and 0.27, respectively). A significant difference was found between the groups for the child's age when they were first identified at "high likelihood" for autism on the SACS-PR (P = .03) with a small effect size (ω² = .02), indicating that children whose parents declined a University diagnostic assessment were more likely to be older when identified at "high likelihood" on the SACS-PR.
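As an illustration of the chi-square effect size mentioned above, the sketch below computes the Pearson chi-square statistic, the smallest expected cell frequency, and Cramér's V for a contingency table, using the standard definition V = sqrt(χ² / (n × (min(rows, columns) - 1))). This is not the study's SPSS procedure: the function name and the counts are hypothetical, no continuity correction is applied, and the result is unsigned (unlike the signed phi coefficient that some software reports for 2 × 2 tables).

```python
import math


def cramers_v(table):
    """Return (chi2, min_expected, V) for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals))
    v = math.sqrt(chi2 / (n * (k - 1)))
    return chi2, min_expected, v


# Hypothetical counts: rows = attended / declined, columns = male / female.
example = [[10, 20],
           [20, 10]]
chi2, min_expected, v = cramers_v(example)
print(f"chi2 = {chi2:.2f}, smallest expected count = {min_expected:.1f}, V = {v:.2f}")
```

Checking `min_expected` against the thresholds stated above (>10 for 2 × 2 tables, >5 otherwise) mirrors the assumption check described in the text.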

Assumption Testing
The level of measurement assumption and the assumption of independence of observations were met. Outliers were detected across multiple variables, as assessed by inspection of boxplots. Analyses were run to compare the results with and without outliers, with similar results; as the outliers were not found to influence the results, these were not removed. Normality was assessed using visual inspection of histograms and Shapiro-Wilk tests of normality. The assumption of normality was violated for:
• Chronological age at 24-month, 42-month (follow-up), and 42-month (first-time) assessments,
• Non-verbal DQ at 18-month and 42-month (first-time) assessments,
• Verbal DQ at 18-month, 24-month, and 42-month (follow-up) assessments,
• Overall DQ at 24-month and 42-month (follow-up) assessments,
• ADI-R Toddler overall total score at 24-month assessments,
• ADI-R Communication (Verbal) at 42-month (first-time) assessments,
• ADI-R RRB at 42-month (follow-up) and 42-month (first-time) assessments, and
• ADOS-2 CSS at 12-month, 18-month, 24-month, 42-month (follow-up), and 42-month (first-time) assessments.
In cases where normality assumptions were violated, Mann-Whitney U tests were used.

Diagnostic Accuracy
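Specificity, sensitivity, PPV, and NPV (see eTables 6 and 7) were calculated from the following 2 × 2 classification via the formulas below:

                      Autism present    Autism absent
"High likelihood"     TP                FP
"Low likelihood"      FN                TN

Sensitivity = TP / (TP + FN)
Specificity = TN / (FP + TN)
PPV = TP / (TP + FP)
NPV = TN / (TN + FN)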
Where: • True positive (TP): Children on the autism spectrum correctly identified at "high likelihood" for autism.
• False positive (FP): Children not on the autism spectrum incorrectly identified at "high likelihood" for autism.
• True negative (TN): Children not on the autism spectrum correctly identified at "low likelihood" for autism.
• False negative (FN): Children on the autism spectrum incorrectly identified at "low likelihood" for autism.
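Using the definitions above, the four diagnostic accuracy measures can be computed directly from the cell counts. The sketch below uses the standard formulas for sensitivity, specificity, PPV, and NPV; the function name and the counts are hypothetical and are not the study's data (see eTables 6 and 7 for the reported values).

```python
def diagnostic_accuracy(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from 2 x 2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # autistic children flagged "high likelihood"
        "specificity": tn / (fp + tn),  # non-autistic children flagged "low likelihood"
        "ppv": tp / (tp + fp),          # "high likelihood" children who are autistic
        "npv": tn / (tn + fn),          # "low likelihood" children who are not autistic
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_accuracy(tp=80, fp=20, tn=890, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```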

Sample Characteristics: Demographics, Autism Symptoms, and Developmental Scores
A summary of clinical assessment scores by age group and final diagnostic status is presented in eTable 4. Given the large amount of data generated in this study, only significant findings from eTable 4 are reported here.
Comparisons between the autism and DD/LD groups using independent t-tests revealed statistically significant differences in chronological age at the 12- and 18-month assessments and the 42-month (first-time) assessments, with small to large effect sizes (ω² = 0.15, ω² = 0.08, and r = .21, respectively). Correlation analyses were run to determine the strength of the relationship between chronological age and the other dependent variables tested. Based on an alpha of .05, none of the dependent variables were significantly related to age; thus, age was not controlled for in group comparisons.
At the 12-month assessment, the autism group had significantly higher ADOS-2 CSS, with a small effect size (r = 0.18).
At the 18-month assessment, the DD/LD group had significantly higher verbal DQ and overall DQ, with small to medium effect sizes (r = 0.33 and ω² = 0.04, respectively). The autism group had significantly higher ADOS-2 CSS, with a large effect size (ω² = -0.54).
Similarly, at the 24-month assessment the DD/LD group had significantly higher verbal DQ and overall DQ, with small effect sizes (r = 0.26 and 0.23 respectively). The autism group had significantly higher ADOS-2 CSS in addition to significantly higher ADI-R Toddler overall total scores; effect sizes were large (r = 0.58 and 0.51, respectively).
At the 42-month (follow-up) assessments the DD/LD group had significantly higher non-verbal DQ, verbal DQ, and overall DQ; effect sizes were small (ω² = 0.03, r = 0.21, and r = 0.20, respectively). The autism group had significantly higher ADOS CSS, with a large effect size (r = 0.63) and higher ADI-R social interaction and ADI-R communication (verbal) domains; effect sizes were large (ω² = 0.12 and ω² = 0.23, respectively).
For the 42-month (first-time) assessments, group differences were significant for every variable tested. The DD/LD group had significantly higher non-verbal DQ, verbal DQ, and overall DQ, with small to medium effect sizes (r = 0.40, ω² = 0.06, and r = 0.13, respectively). The autism group had significantly higher ADOS-2 CSS (r = -.64) and higher ADI-R scores on all domains, with a range of medium to large effect sizes (r = -.53, ω² = 0.58, r = -.44 and 0.35, respectively).