Original Investigation
February 17, 2020

Comparative Accuracy of Developmental Screening Questionnaires

Author Affiliations
  • 1Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, Massachusetts
  • 2Floating Hospital for Children, Division of Developmental-Behavioral Pediatrics, Tufts University School of Medicine and Medical Center, Boston, Massachusetts
  • 3Department of Clinical Psychology, University of Massachusetts, Boston
JAMA Pediatr. 2020;174(4):366-374. doi:10.1001/jamapediatrics.2019.6000
Key Points

Question  Which screening questionnaires are most accurate for detecting developmental delays among infants and young children?

Findings  In this diagnostic accuracy study of 1495 families enrolled from primary care settings, trade-offs in sensitivity and specificity were observed among 3 screening tools (Ages and Stages Questionnaire, Third Edition, Parents’ Evaluation of Developmental Status, and Survey of Well-being of Young Children: Milestones), but no one questionnaire emerged as superior overall. All questionnaires displayed specificity higher than 70%, but sensitivity exceeded 70% only for the Parents’ Evaluation of Developmental Status with respect to severe delays and for the Survey of Well-being of Young Children: Milestones with respect to severe delays among children younger than 42 months.

Meaning  Results of this study suggest that all 3 developmental screening questionnaires offer modest advantages to pediatric practitioners for detecting developmental delays.


Importance  Universal developmental screening is widely recommended, yet studies of the accuracy of commonly used questionnaires reveal mixed results, and previous comparisons of these questionnaires are hampered by important methodological differences across studies.

Objective  To compare the accuracy of 3 developmental screening instruments as standardized tests of developmental status.

Design, Setting, and Participants  This cross-sectional diagnostic accuracy study recruited consecutive parents in waiting rooms at 10 pediatric primary care offices in eastern Massachusetts between October 1, 2013, and January 31, 2017. Parents were included if they were sufficiently literate in the English or Spanish language to complete a packet of screening questionnaires and if their child was of eligible age. Parents completed all questionnaires in counterbalanced order. Participants who screened positive on any questionnaire plus 10% of those who screened negative on all questionnaires (chosen at random) were invited to complete developmental testing. Analyses were weighted for sampling and nonresponse and were conducted from October 1, 2013, to January 31, 2017.
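The sampling design above (all screen-positive families invited to testing, plus a random 10% of those negative on every questionnaire) implies that tested screen-negative children each stand in for roughly 10 children in the enrolled sample. A minimal Python sketch of inverse-probability weighting for sensitivity and specificity follows. It is an illustration only, not the authors' analysis code: the `Child` record, the function names, and the fixed 0.10 sampling fraction are assumptions, and the nonresponse adjustment mentioned in the abstract is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Child:
    # Hypothetical record layout, for illustration only.
    screen_positive: bool  # positive on the questionnaire being evaluated
    any_positive: bool     # positive on at least one of the questionnaires
    delayed: bool          # reference test indicates a delay


def sampling_weight(any_positive: bool, negative_fraction: float = 0.10) -> float:
    """Inverse of the probability of being invited to reference testing."""
    # Children positive on any questionnaire were all invited (probability 1);
    # children negative on all were invited with probability negative_fraction.
    return 1.0 if any_positive else 1.0 / negative_fraction


def weighted_accuracy(children: list[Child]) -> tuple[float, float]:
    """Return (sensitivity, specificity) using sampling weights only."""
    tp = fn = tn = fp = 0.0
    for c in children:
        w = sampling_weight(c.any_positive)
        if c.delayed:
            if c.screen_positive:
                tp += w
            else:
                fn += w
        else:
            if c.screen_positive:
                fp += w
            else:
                tn += w
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Without such weights, children who were negative on all questionnaires would be underrepresented among those tested, which would tend to overstate sensitivity and understate specificity.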

Exposures  The 3 screening instruments used were the Ages & Stages Questionnaire, Third Edition (ASQ-3); Parents’ Evaluation of Developmental Status (PEDS); and Survey of Well-being of Young Children (SWYC): Milestones.

Main Outcomes and Measures  Reference tests administered were Bayley Scales of Infant and Toddler Development, Third Edition, for children aged 0 to 42 months, and Differential Ability Scales, Second Edition, for older children. Age-standardized scores were used as indicators of mild (80-89), moderate (70-79), or severe (<70) delays.
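The score bands above can be written as a small classification rule. The Python sketch below is hedged: the function name is hypothetical, and treating scores of 90 or higher as "none" is an assumption, since the abstract states only the three delay bands. The abstract also does not say whether analyses of mild delay included children with more severe delays.

```python
def delay_category(standard_score: float) -> str:
    """Map an age-standardized reference test score to a delay band."""
    if standard_score < 70:
        return "severe"    # below 70
    if standard_score < 80:
        return "moderate"  # 70-79
    if standard_score < 90:
        return "mild"      # 80-89
    return "none"          # assumption: 90 or above is not counted as a delay
```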

Results  A total of 1495 families of children aged 9 months to 5.5 years participated. The mean (SD) age of the children at enrollment was 2.6 (1.3) years, and 779 (52.1%) were male. Parent respondents were primarily female (1325 [88.7%]), with a mean (SD) age of 33.4 (6.3) years. Of the 20.5% to 29.0% of children with a positive score on each questionnaire, 35% to 60% also received a positive score on a second questionnaire, demonstrating moderate co-occurrence. Among younger children (<42 months), the specificity of the ASQ-3 (89.4%; 95% CI, 85.9%-92.1%) and SWYC Milestones (89.0%; 95% CI, 86.1%-91.4%) was higher than that of the PEDS (79.6%; 95% CI, 75.7%-83.1%; P < .001 and P = .002, respectively), but differences in sensitivity were not statistically significant. Among older children (43-66 months), specificity of the ASQ-3 (92.1%; 95% CI, 85.1%-95.9%) was higher than that of the SWYC Milestones (70.7%; 95% CI, 60.9%-78.8%) and the PEDS (73.7%; 95% CI, 64.3%-81.3%; P < .001), but sensitivity to mild delays of the SWYC Milestones (54.8%; 95% CI, 38.1%-70.4%) and of the PEDS (61.8%; 95% CI, 43.1%-77.5%) was higher than that of the ASQ-3 (23.5%; 95% CI, 9.0%-48.8%; P = .012 and P = .002, respectively). Sensitivity exceeded 70% only with respect to severe delays, with 73.7% (95% CI, 50.1%-88.6%) for the SWYC Milestones among younger children, 78.9% (95% CI, 55.4%-91.9%) for the PEDS among younger children, and 77.8% (95% CI, 41.8%-94.5%) for the PEDS among older children. Attending to parents’ concerns was associated with increased sensitivity of all questionnaires.

Conclusions and Relevance  This study found that 3 frequently used screening questionnaires offer adequate specificity but modest sensitivity for detecting developmental delays among children aged 9 months to 5 years. The results suggest that trade-offs in sensitivity and specificity occurred among the questionnaires, with no one questionnaire emerging as superior overall.
