Ment LR, Vohr B, Allan W, Katz KH, Schneider KC, Westerveld M, Duncan CC, Makuch RW. Change in Cognitive Function Over Time in Very Low-Birth-Weight Infants. JAMA. 2003;289(6):705-711. doi:10.1001/jama.289.6.705
Author Affiliations: Departments of Pediatrics (Dr Ment and Ms Schneider), Neurology (Dr Ment), Epidemiology and Public Health (Ms Katz and Dr Makuch), and Neurosurgery (Drs Westerveld and Duncan), Yale University School of Medicine, New Haven, Conn; Department of Pediatrics, Brown University School of Medicine, Providence, RI (Dr Vohr); and Department of Neurology, Maine Medical Center, Portland (Dr Allan).
Context Preterm very low-birth-weight (VLBW) infants have a high prevalence
of neurodevelopmental disability when evaluated during the first several years
of life. However, recent experimental data suggest that the developing brain
may recover from or compensate for injury.
Objective To determine if there is cognitive improvement throughout early and
middle childhood following VLBW birth.
Design, Setting, and Participants Follow-up data of 296 infants born weighing 600 to 1250 g who participated
in a prospective, randomized, placebo-controlled intraventricular hemorrhage
(IVH) prevention study performed at 3 northeastern US hospitals between September
1989 and August 1992 and who were serially evaluated at 36, 54, 72, and 96
months of corrected age (CA).
Main Outcome Measures The age-normed Peabody Picture Vocabulary Test–Revised (PPVT-R)
score and measures of intelligence.
Results Overall, the median PPVT-R score increased from 88 at 36 months of CA
to 99 at 96 months of CA; when data from 36 and 96 months of CA were compared,
45% of children gained 10 points or more and 12.5% showed a 5- to 9-point
increase in test scores. Similar findings were noted for full-scale and verbal
IQ scores. Multivariate analyses demonstrated that increasing age, residence
in a 2-parent household, and higher levels of maternal education were all
significantly associated with higher PPVT-R scores (for each, P<.001). In addition, early intervention led to greater increases
over time in PPVT-R scores among children whose mothers had less than a high
school education compared with those with a high school education level or
greater (P = .03 by test for interaction). Although
most children showed improvement in PPVT-R scores with increasing CA, children
with early-onset IVH and subsequent significant central nervous system injury
had the lowest PPVT-R scores initially and the scores declined over time (P = .009 by test for interaction).
Conclusions The majority of VLBW children had improvement in verbal and IQ test
scores over time. Only children with early-onset IVH followed by significant
central nervous system injury had low PPVT-R scores that declined over time.
Preterm birth results in significant neurodevelopmental disability during
childhood.1,2 Depending on the
birth weight of the children studied and the years in which they were born,
the incidence of major disabilities ranges from 20% to almost 50% in the first
several years of life.3-8 In
addition, almost one fifth develop major cognitive disabilities by age 8 years,9,10 more than 50% are reported to require
special assistance in the classroom, 20% are in special education, and 15%
have repeated at least 1 grade in elementary school.10-13 However,
recent reports suggest that two thirds of preterm infants require no special
assistance in school at ages 14 to 15 years, almost three quarters graduate
from high school, and more than 40% are enrolled in college programs. These
data suggest change in cognitive function over time in those born preterm.14-18
Survivors in the multicenter Randomized Indomethacin Intraventricular
Hemorrhage (IVH) Prevention Trial were studied to test the hypothesis that
verbal and cognitive performance of very low-birth-weight (VLBW) infants would
improve relative to an age-standardized group throughout early and middle
childhood.19-23 Serial
measures of cognitive and verbal skills were performed and the effects of
time and previously described biological and environmental risk factors were
evaluated.
The Randomized Indomethacin IVH Prevention Trial was conducted at Women
and Infants' Hospital, Providence, RI; Maine Medical Center, Portland; and
Yale New Haven Hospital, New Haven, Conn. The protocols were reviewed and
approved by the institutional review boards of each institution. Informed
consent was obtained from all parents and children.
Between September 5, 1989, and August 31, 1992, 505 infants weighing
600 to 1250 g at birth were admitted within 6 hours of birth to 2 parallel
randomized, prospective, low-dose indomethacin IVH prevention trials.19,20 All infants were examined using cranial
echoencephalography (ECHO) between 5 and 11 hours after birth. Of the 505
enrolled infants, 431 had normal ECHO studies and were considered early IVH
negative. Seventy-four infants had ECHO evidence for IVH at this time; these
were called early IVH positive.
Subsequent scans were performed 24 and 48 hours after the first scan;
4, 5, 7, 14, and 21 days after birth; and at 40 weeks' postmenstrual age.
Scans were interpreted by the institutional radiologist and independently
verified by a central radiologist. In cases of disagreement, data were reexamined
by all participating radiologists and a consensus was reached.
Radiologic assessment was performed without prior knowledge of the infant's
clinical condition. The grading system for hemorrhages was as follows19,20: grade 1 (blood in the germinal matrix
regions); grade 2 (blood within the lateral ventricular system without ventricular
dilation); grade 3 (blood within and distending the lateral ventricles); and
grade 4 (blood within the ventricular system and parenchymal involvement).
Ventriculomegaly was assessed at 40 weeks' postmenstrual age (or, if that scan
was not available, at 21 days after birth). Moderate and severe ventriculomegaly were defined as
measurements of 1.0 to 1.5 cm and more than 1.5 cm, respectively, at the midbody
of the lateral ventricle on sagittal scan.24 Studies
were also evaluated for focal echolucencies. All cases showing focal echolucencies
had cystic areas consistent with periventricular leukomalacia (PVL) on the
ultrasound performed at 40 weeks' postmenstrual age.24 A
study infant was defined as having significant central nervous system (CNS)
injury if he/she had 1 or more of the following 3 ultrasound findings: grade
4 IVH at any time following the first scan at 5 to 11 hours; PVL at term;
or ventriculomegaly at term. No study infant had grade 4 IVH at 6 postnatal
hours.
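The ultrasound definitions above can be expressed as a small classification sketch. This is an illustration only, not the study's code: the function and field names are hypothetical, the label "none" for widths under 1.0 cm is an assumption, and the sketch assumes that either moderate or severe ventriculomegaly at term counts toward significant CNS injury, which the text does not state explicitly.

```python
from dataclasses import dataclass


def ventriculomegaly_grade(midbody_cm: float) -> str:
    """Classify lateral-ventricle midbody width (cm) on sagittal scan.

    Thresholds follow the text: moderate = 1.0 to 1.5 cm, severe = more than 1.5 cm.
    The label "none" for smaller widths is an assumption made for this sketch.
    """
    if midbody_cm > 1.5:
        return "severe"
    if midbody_cm >= 1.0:
        return "moderate"
    return "none"


@dataclass
class UltrasoundSummary:
    """Hypothetical per-infant summary of the serial cranial ultrasound findings."""
    grade4_ivh_after_first_scan: bool  # grade 4 IVH at any time after the 5- to 11-hour scan
    pvl_at_term: bool                  # periventricular leukomalacia at 40 weeks
    midbody_cm_at_term: float          # lateral-ventricle midbody width at term


def significant_cns_injury(u: UltrasoundSummary) -> bool:
    """True if the infant has 1 or more of the 3 ultrasound findings."""
    return (
        u.grade4_ivh_after_first_scan
        or u.pvl_at_term
        # Assumption: moderate or severe ventriculomegaly both count.
        or ventriculomegaly_grade(u.midbody_cm_at_term) != "none"
    )
```

For example, an infant with no grade 4 IVH, no PVL, and a midbody width of 1.2 cm at term would be classified as having moderate ventriculomegaly and therefore significant CNS injury under these assumptions.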
All infants underwent gestational age assessment using a modification
of the Ballard scale.25 Prenatal, perinatal,
and neonatal data were obtained by maternal interviews and review of the maternal
and neonatal charts. Resuscitation scores were as follows: 0, no intervention;
1, blow-by oxygen, tactile stimulation, or both; 2, endotracheal suctioning
only; 3, bag-and-mask positive-pressure ventilation; 4, endotracheal intubation
and positive-pressure ventilation; and 5, endotracheal intubation, positive-pressure
ventilation, and the use of drugs, cardiac massage, or both. Randomization
to low-dose indomethacin was categorized as present/absent. An infant was
diagnosed with bronchopulmonary dysplasia (BPD) if he/she both required oxygen
supplementation and had an abnormal chest radiograph at 28 days of life26; BPD was categorized as present/absent.
Serial neurodevelopmental follow-up evaluations were performed on all
study participants. At 36 months of corrected age (CA, ie, months past the
obstetric due date), each child was tested with the Peabody Picture Vocabulary
Test–Revised (PPVT-R),27 an age-normed
test that requires no verbal responses from the child and measures the receptive
vocabulary of individuals aged 2½ years through adulthood. Raw
scores are converted to standard scores with a mean (SD) of 100 (15) points.
The PPVT-R does not require a motor response and is thus an excellent instrument
for use with children with motor disabilities.28 Study
children also were evaluated at 36 months with the Stanford-Binet Intelligence
Scale (SBIS, Form L-M), which produces standardized scores with a mean (SD)
of 100 (16) points.29
At both 54 and 72 months of CA, each child was tested with the PPVT-R
and the Wechsler Preschool and Primary Scale of Intelligence–Revised
(WPPSI-R).30 The WPPSI-R is an individually
administered, norm-referenced instrument for assessing the intellectual functioning
of children aged 3 years 0 months through 7 years 3 months and provides 3
intelligence scores: a performance IQ (PIQ), verbal IQ (VIQ), and full-scale
IQ. All 3 scores have a mean (SD) of 100 (15).
At 96 months of CA, children were evaluated with the PPVT-R and the
Wechsler Intelligence Scale for Children–Third Edition (WISC-III),31 a norm-referenced instrument for assessing the intellectual
function of children aged 6 years 0 months through 16 years 11 months. The
WISC-III also provides 3 IQ scores: PIQ, VIQ, and full-scale IQ, with a mean
(SD) of 100 (15).
The full battery of tests was administered to all children. Some demonstrated
significant impairments, resulting in raw scores of 0 for some of the tests.
When a raw score on the PPVT-R falls below the lowest score for which a standard
score can be computed, the PPVT-R manual assigns a standard score of less than
20.27 In our statistical analyses, the
earned score of less than 20 was converted to a score of 19. Similarly, if
a basal score was unable to be calculated for any of the IQ assessments, a
score of 1 point less than the lowest score for the test for the overall population
was assigned to that child (ie, 33 for the SBIS, 40 for the WPPSI-R full-scale
IQ, 45 for the WPPSI-R VIQ, and 44 for the WPPSI-R PIQ).31 Only
1 child, who was early IVH negative, was found to have values of 19 for all
PPVT-R tests; the same child received the lowest values for IQ tests at all 4
evaluations.
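The floor-score substitution described above can be sketched as follows. The substitute values (19, 33, 40, 45, and 44) come from the text; the function names, the dictionary keys, and the convention of representing an uncomputable IQ score as `None` are assumptions made for illustration.

```python
from typing import Optional

# Substitute values taken from the text.
PPVT_FLOOR_SUBSTITUTE = 19
IQ_FLOOR_SUBSTITUTE = {
    "SBIS": 33,
    "WPPSI-R full-scale IQ": 40,
    "WPPSI-R VIQ": 45,
    "WPPSI-R PIQ": 44,
}


def ppvt_score_for_analysis(reported: str) -> int:
    """Convert a reported PPVT-R standard score to a number for analysis.

    A reported value of "<20" (below the lowest computable standard score)
    is converted to 19; any other value is parsed as an integer.
    """
    text = reported.strip().replace(" ", "")
    if text == "<20":
        return PPVT_FLOOR_SUBSTITUTE
    return int(text)


def iq_score_for_analysis(test: str, score: Optional[int]) -> int:
    """Return the IQ score, or the test's floor substitute when no score
    could be calculated (represented here as None, an assumption)."""
    if score is None:
        return IQ_FLOOR_SUBSTITUTE[test]
    return score
```

Substituting a fixed value keeps severely impaired children in the analysis rather than dropping them, at the cost of treating every below-floor result as the same score.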
Neurological examinations were performed at each follow-up assessment.24 Assignment of the diagnosis of cerebral palsy was
based on the presence of hypertonicity, hyperreflexia, and dystonic or spastic
movements in the affected extremities. Receipt of blind services was
ascertained from parent/caregiver report and was coded yes/no. None of the
7 children receiving blind services received scores of 19 for any PPVT-R test.
Deaf children were defined as those requiring amplification bilaterally.
Prior to each evaluation time, all neurodevelopmental testers were certified
for the study outcome measures on 4 nonstudy age-appropriate participants
by the study's psychologist. Assessors were recertified annually on 2 nonstudy
age-appropriate children. Teams that performed the neurodevelopmental examinations
remained blinded with regard to the participants' current and previous medical
Demographic information was obtained from the primary caregivers. Maternal
education was categorized as less than high school or high school graduate
or higher. Residence in a 2-parent household at 96 months of CA was coded
yes/no, with a parent defined as a birth mother, adoptive mother, stepmother,
birth father, adoptive father, or stepfather. Receipt of early intervention services,
including occupational therapy, physical therapy, and speech and/or language
therapy, was ascertained from parent/caregiver report. A child was considered
positive for these special services if he/she received 1 or more services
at 36 months of CA, consistent with previous studies.13,32
Because those children with early-onset IVH are at the highest risk
for disability in childhood,33 study participants
were divided into early IVH negative and early IVH positive groups to evaluate
scores across ages. The PPVT-R scores were selected as our primary outcome
measure prior to statistical analysis, because the same test was used at all
4 measurements. The IQ scores were a secondary outcome measure. Finally, z scores were calculated for the full-scale IQ, VIQ, and
PIQ scores for each child at each age.
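Because the IQ instruments differ in their standard deviations (16 points for the SBIS, 15 for the WPPSI-R and WISC-III), a z score puts them on a common scale. The text does not say how the z scores were computed; the sketch below assumes the conventional formula, z = (score - mean) / SD, using the published norms quoted earlier. The names `NORMS` and `z_score` are hypothetical.

```python
# Norm mean and SD for each instrument, as quoted in the text.
NORMS = {
    "SBIS": (100.0, 16.0),
    "WPPSI-R": (100.0, 15.0),
    "WISC-III": (100.0, 15.0),
}


def z_score(score: float, test: str) -> float:
    """Standardize a score against the instrument's norm mean and SD.

    Assumption: z = (score - mean) / SD; the study's exact method is not stated.
    """
    mean, sd = NORMS[test]
    return (score - mean) / sd


if __name__ == "__main__":
    # The same standard score of 85 sits at slightly different positions
    # relative to the norm group on the two scales.
    print(z_score(85, "SBIS"))     # -0.9375
    print(z_score(85, "WPPSI-R"))  # -1.0
```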
To understand important biological or environmental factors that may
be associated with testing scores, factors reported to be associated with
neurodevelopmental outcome in VLBW infants11,13,15,22,32,34,35 were
evaluated: birth weight of 750 g or less, male sex, randomization to indomethacin,
BPD, evidence for significant CNS injury in the neonatal period, non-English
monolingual or bilingual household (language), maternal education, residence
in a 2-parent household, and the use of special services at 36 months of CA.
Categorical data with no expectation of a linear trend among groups
(eg, sex) were analyzed using Fisher exact test. Categorical data with an
a priori expectation of a linear trend among groups were analyzed by the χ2 test for linear trend. The 2-sample Wilcoxon rank sum test was used
for between-group comparisons of continuous data. Because PPVT-R scores were
not always normally distributed, summary statistics for this factor were reported
using medians and ranges.
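For readers unfamiliar with the Fisher exact test, the following is a minimal standard-library sketch of the 2-sided test for a 2x2 table. It is not the SAS procedure used in the study, and it uses one common definition of "2-sided" (summing the probabilities of all tables no more likely than the observed one), so its output may differ slightly from the P values reported in this article.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    With row and column totals fixed, the top-left cell follows a
    hypergeometric distribution. The P value sums the probabilities of
    all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Illustration with counts reported in the Results: special services at
    # 36 months in 68 of 261 early IVH negative vs 15 of 35 early IVH
    # positive children.
    print(fisher_exact_two_sided(68, 261 - 68, 15, 35 - 15))
```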
Hierarchical multivariate regression models were used to examine the
effect of several factors simultaneously. All significant higher-order interaction
terms required inclusion of all lower order (ie, main effect) factors in the
model. The first model evaluated the main effects only to determine which
factors played an independent role in predicting PPVT-R scores. Using this
reduced set of important factors, a second model was developed in which the
main effects and the interaction of these factors with one another were explored.
A significant interaction implies that the effect of one factor on PPVT-R
scores varies significantly according to the level of the other factor. All
statistical analyses were performed using SAS software version 8.2 (SAS Institute
Inc, Cary, NC). All P values are 2-sided and statistical
significance was assigned at P<.05.
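As an illustration of what a test for interaction examines, a simplified version of the second model, with only CA and significant CNS injury as factors, might be written as below. The symbols are assumptions introduced here, not the authors' notation: Y_ij is the PPVT-R score of child i at evaluation j, CA_j is corrected age, and CNS_i is 1 if the child had significant CNS injury and 0 otherwise. The article does not specify the exact model form or error structure.

```latex
\begin{aligned}
Y_{ij} &= \beta_0 + \beta_1\,\mathrm{CA}_{j} + \beta_2\,\mathrm{CNS}_{i}
          + \beta_3\,(\mathrm{CA}_{j} \times \mathrm{CNS}_{i}) + \varepsilon_{ij} \\
\text{slope over CA when } \mathrm{CNS}_i = 0 &: \quad \beta_1 \\
\text{slope over CA when } \mathrm{CNS}_i = 1 &: \quad \beta_1 + \beta_3
\end{aligned}
```

Under this sketch, the test for interaction is a test of whether beta_3 equals 0, that is, whether the change in score with age is the same in children with and without significant CNS injury. Both main-effect terms remain in the model whenever the interaction term is included, consistent with the hierarchical rule stated above.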
At 36 months of CA, there were 440 survivors of the original 505 study
participants (87%). Of the 440, 385 (88%) were observed for at least 1 assessment,
363 (82.5%) were evaluated at 36 months of CA, 359 (81.5%) were observed at
54 months of CA, and 358 (81%) and 368 (84%) were evaluated at 72 and 96 months
of CA, respectively. A total of 296 children (67%) underwent serial PPVT-R
testing at all 4 evaluations.
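The follow-up percentages above can be reproduced from the reported counts with simple arithmetic. The sketch below uses the 440 survivors at 36 months of CA as the denominator, as in the text; the variable and function names are hypothetical.

```python
ENROLLED = 505
SURVIVORS_36_MO = 440

# Children seen, as reported in the text.
SEEN = {
    "at least 1 assessment": 385,
    "36 months": 363,
    "54 months": 359,
    "72 months": 358,
    "96 months": 368,
    "all 4 evaluations": 296,
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def follow_up_rates() -> dict:
    """Follow-up rate (percentage of survivors, 1 decimal) for each assessment."""
    return {
        label: round(percent(n, SURVIVORS_36_MO), 1)
        for label, n in SEEN.items()
    }


if __name__ == "__main__":
    print(f"survival to 36 months: {percent(SURVIVORS_36_MO, ENROLLED):.1f}%")
    for label, rate in follow_up_rates().items():
        print(f"{label}: {rate}%")
```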
Table 1 summarizes baseline
and demographic characteristics of all participants at the 4 evaluations and
the subgroup with measurements at all assessment times. Because these data
showed no difference for any of the variables among the groups of children,
we describe results for the 296 children observed for all 4 evaluations.
Comparing the 261 early IVH negative and 35 early IVH positive children
demonstrated no marked differences for birth weight, incidence of birth weight
of 750 g or less, gestational age, male sex, maternal education, antenatal
steroid exposure, 1-minute Apgar scores, resuscitation scores, or BPD between
the 2 groups. The median (range) 5-minute Apgar score for children who were
early IVH negative was 7 (1-9); for children who were early IVH positive,
it was 6 (1-9; P = .03). The early IVH positive group
had a higher percentage of ventriculomegaly (6 [2%] vs 6 [17%]; P = .001) and grade 4 IVH compared with the early IVH negative group
(5 [2%] vs 3 [9%]; P<.001) with a trend for more
PVL (9 [3.5%] vs 4 [11%]; P = .06). Eight children
who were early IVH positive (23%) had 1 or more of these 3 adverse outcomes
compared with 16 children who were early IVH negative (6%, P<.001).
At 36 months of CA, 68 early IVH negative (26%) and 15 early IVH positive
(43%) children were receiving special services (P =
.03). Fifteen of the 24 children (62.5%) with significant CNS injury were
receiving special services compared with 63 children (25%) without significant
CNS injury (P<.001). Sixty-five early IVH negative
(25%) and 16 early IVH positive (46%) participants were living in households
in which English was not the primary language (P =
Twenty-two of the 258 early IVH negative children (9%) with neurological
examinations and 6 early IVH positive children (17%) had cerebral palsy (P = .12). Similarly, 4 early IVH negative children (2%)
and 3 early IVH positive children (9%) were receiving blind services at age
8 years (P = .04); 6 of these children (3 negative and 3 positive)
had a history of retinopathy of prematurity, and 1 early IVH negative
child had a stroke involving the periventricular white matter. The incidence
of deafness was 3% in both groups. A total of 213 children (185 early IVH
negative [71%] and 28 early IVH positive [80%]; P =
.26) were living in 2-parent households.
Overall, the median (range) PPVT-R scores were 88.0 (20-135), 91.0 (19-142),
97.0 (19-142), and 99.0 (19-153) at 36, 54, 72, and 96 months of CA, respectively.
Scores for the 25th percentile were 74.2, 74.7, 82.5, and 83.5 for 36, 54,
72, and 96 months of CA, respectively; those for the 75th percentile were
100.2, 101.3, 110.2, and 109.5, respectively. Comparison of the scores at
36 and 96 months of CA for all children showed a significant effect of time
(P<.001, Wilcoxon signed rank test). A total of
134 children (45%) had increases in scores of 10 points or more, 37 (12.5%)
had increases of 5 to 9 points, 27 (9%) had no change or an increase
of up to 4 points, 30 (10%) had decreases of 1 to 4 points, 19 (6%) had decreases
of 5 to 9 points, and 49 (17%) had decreases of 10 points or more. Furthermore,
35 of 49 children (71%) with scores in the borderline range (70-80) at 36
months were found to have scores in the normal range (>80) at 96 months and
28 of 57 children (49%) with scores less than 70 at 36 months no longer had
scores in the mental retardation range (<70) at 96 months of CA.
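The change categories reported above can be sketched as a simple binning of the difference between the 96-month and 36-month scores. The category labels and function name are hypothetical; the counts are those reported in the text, and the sketch assumes integer standard scores.

```python
def change_category(score_36: int, score_96: int) -> str:
    """Bin the change in PPVT-R score from 36 to 96 months of CA."""
    delta = score_96 - score_36
    if delta >= 10:
        return "increase of 10 or more"
    if delta >= 5:
        return "increase of 5 to 9"
    if delta >= 0:
        return "increase of 0 to 4"
    if delta >= -4:
        return "decrease of 1 to 4"
    if delta >= -9:
        return "decrease of 5 to 9"
    return "decrease of 10 or more"


# Counts reported in the text for the 296 children seen at all 4 evaluations.
REPORTED_COUNTS = {
    "increase of 10 or more": 134,
    "increase of 5 to 9": 37,
    "increase of 0 to 4": 27,
    "decrease of 1 to 4": 30,
    "decrease of 5 to 9": 19,
    "decrease of 10 or more": 49,
}

if __name__ == "__main__":
    total = sum(REPORTED_COUNTS.values())  # 296
    for label, n in REPORTED_COUNTS.items():
        print(f"{label}: {n} ({100.0 * n / total:.1f}%)")
```

The six reported counts sum to 296, matching the number of children with PPVT-R testing at all 4 evaluations.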
The median PPVT-R scores for the early IVH negative group were higher
at all 4 evaluations compared with the early IVH positive group (Table 2). Both groups show a similar trend
for higher scores with increasing CA. A multivariate regression analysis shows
PPVT-R scores are higher with increasing CA (P<.001),
and a possible trend may exist for these scores to differ between early IVH
negative and positive participants (P = .09).
Given the evidence for higher scores with increasing age, we examined
a variety of biological and environmental factors to determine the nature
and extent of their contribution. In a multivariate model, which included
main effects only, a birth weight of 750 g or less, male sex, randomization
to early low-dose indomethacin, early IVH positive vs early IVH negative,
BPD, and language each were not significantly associated with PPVT-R scores
in the presence of group and time (P = .89, .51,
.64, .72, .65, and .11, respectively). However, increasing CA, higher levels
of maternal education, residence in a 2-parent household, absence of special
services, and absence of significant CNS injury all were significantly associated
with higher PPVT-R scores (P<.001 for all).
Statistically significant factors were selected for further evaluation
of interaction between factors. Early IVH status was retained in the model
to determine its importance in the presence of these factors. The PPVT-R scores
changed with increasing CA in a markedly different way according to the presence
of significant CNS injury (P = .009, test for interaction)
(Figure 1). Participants who were
early IVH negative and did not suffer significant CNS injury had the highest
PPVT-R scores; their scores were similar to those of the early IVH positive children without
significant CNS injury. Participants who were early IVH negative with significant
CNS injury had smaller increases over time on the PPVT-R scores than the first
2 groups. Finally, the participants who were early IVH positive and had significant
CNS injury had the lowest PPVT-R scores. Moreover, in contrast with the other
groups, whose PPVT-R scores were higher with increasing CA, these children
had lower scores with increasing CA.
The extent of the increase in PPVT-R scores over time also differed
according to residence in a 2-parent household. Children residing in a
2-parent household had a greater improvement in scores over time compared
with those children not residing in a 2-parent household (P = .04, test for interaction). Among the children of mothers with
less than a high school education, the use of special services was associated
with a greater increase in PPVT-R scores (P = .03,
test for interaction).
The overall median (range) full-scale IQ scores for the children were
90.0 (33-134), 90.0 (40-133), 93.0 (40-156), and 95.0 (40-139) at 36, 54,
72, and 96 months of CA, respectively. The VIQ scores were 91.0 (45-135),
94.0 (45-151), and 97.5 (46-142) at 54, 72, and 96 months of CA, respectively.
The PIQ scores for the same 3 evaluations were 89.0 (44-126), 91 (44-143),
and 92 (46-139), respectively.
The per-group serial IQ scores for the early IVH negative and early
IVH positive children evaluated at each age are shown in Table 3. A multivariate regression analysis shows significantly
higher full-scale IQ scores (P = .008) and VIQ scores
(P<.001) with increasing CA. A marginal trend
was also observed in PIQ scores (P = .11). There
were no group effects in any of the models; the effect was similar in the
early IVH positive and early IVH negative groups.
When the per-group serial IQ scores are evaluated as z scores, the multivariate regression analysis shows significantly
higher full-scale IQ scores over time (P = .004),
but no marked difference exists between the early IVH negative and early IVH
positive groups (P = .21). The VIQ z scores also were significantly higher with increasing CA (P<.001), with no marked difference between groups (P = .27). A marginal trend was observed for higher PIQ z scores with increasing CA (P = .11), with
no marked difference between groups (P = .23).
The VIQ z scores were analyzed using the same
multivariate model reported for PPVT-R scores. With increasing CA, higher
levels of maternal education (P<.001), residence
in a 2-parent household (P = .002), absence of special
services (P<.001), and absence of significant
CNS injury (P<.001) were all significantly associated
with higher VIQ scores. Among the children of mothers with less than a high
school education, the use of special services was associated with a greater
increase in VIQ z scores compared with absence of
special service use (P = .005, test for interaction).
This longitudinal study of verbal and cognitive function of a large
cohort of VLBW infants through early and middle childhood demonstrates continued
improvement in both PPVT-R and IQ scores compared with standard norms. Both
children with no evidence for cerebral injury at birth and those with early-onset
IVH experienced progressive increases in scores when tested at 36, 54, 72,
and 96 months of CA. However, those children with IVH at 5 to 11 postnatal
hours and significant CNS injury thereafter showed a decline in PPVT-R scores.
Although we report data for the 296 children who were seen at all 4 evaluations,
we obtained similar results (using mixed models that allow for missing data)
when all 385 participants who were seen for at least 1 visit were included
in the analyses (P<.001 for higher PPVT-R and
IQ scores over time).
Analysis of those biological and environmental factors associated with
neurodevelopmental outcome was consistent with prior studies.11,36 Increasing
years of maternal education and residence in a 2-parent household were both
associated with an increase in testing scores with increasing age. Furthermore,
we found that the important effect of education could be altered by the addition
of special services, suggesting the presence of special services was especially
beneficial for children whose mothers had less education. Similar significant
findings were noted in the regression analysis model for full-scale IQ and
VIQ scores as well as for VIQ z scores, providing
additional support for improvement in verbal scores in VLBW children over
time.
Although normative data on the PPVT-R indicate a 4.5-point increase
in median scores of children tested over time,27 our
participants had a 10- to 11-point increase in scores. Forty-nine percent
of children with scores in the mental retardation range at 36 months had scores
greater than 70 at 96 months, and more than two thirds of those with borderline
scores at 36 months had scores in the normal range at age 8 years. The high
rates of morbidity arising from preterm birth result in a large burden on
the education services of our country.37 The
societal implications of a 5-point difference in IQ are large.38 In
a population of 100 million, 2.3 million individuals score below 70, a score
that in many school districts mandates special-education classes. A shift
to the left of the bell-shaped curve for IQ by 5 points increases the number
of scores below 70 to 3.6 million. Remedial education in many school districts
costs close to an additional $6000 for each student per year. In addition,
the label special education carries with it significant individual stigmatization.
Although 20% to 40% of VLBW infants have receptive language delays as
toddlers and young children,39 studies of serial
verbal and cognitive testing in VLBW infants are limited, and several reports
suggest little change with increasing age.11,13,32 Our
findings differ from these published studies, which may be attributed to several
factors. First, both large series include children of higher birth weights
and gestational ages than our participants and children were born prior to
the years in which our participants were born. This raises the possibility
of differences in perinatal intensive care strategies, such as corticosteroid
exposure and surfactant administration.40,41 Neither
study followed up the same cohort of children and tested them with the same
instrument at each evaluation time. Furthermore, children with significant
neurodevelopmental impairments and those residing in adoptive or foster homes
were omitted from one analysis.11 Additionally,
in the other report,13 only population marginal
means derived from multiple linear regression models were given for the 96-month
data. Nonetheless, one study11 noted that
evidence for neonatal white matter injury adversely affected outcome, and
early intervention services were found to positively increase testing scores
in a subset of children in the other.13,32 Both
are consistent with our findings.
Our study has several methodological strengths. Our VLBW infant cohort
was prospectively enrolled and studied longitudinally from the sixth postnatal
hour; both serial cranial ultrasonography and later standardized neurodevelopmental
assessments were performed. Detailed information was obtained on the prenatal,
perinatal, and neonatal course of each child, and demographic data were collected
at each follow-up assessment. In addition, the outcome measures were obtained
by comprehensive in-person evaluations using standardized assessment tools.
Furthermore, no matter whether we selected PPVT-R, IQ, or IQ-associated z scores, all 3 end points led to qualitatively identical
results. Although the changes in IQ scores were slightly smaller than those
in the PPVT-R data, they were of the same general magnitude
and direction. With the VIQ z score analysis, we
obtained statistically significant results that correspond with those observed
for the PPVT-R. Finally, our study cohort represents one of the largest and
longest followed-up groups of VLBW infants in the postsurfactant era.
Our study also has limitations. Cranial ultrasonography remains the
neuroimaging study of choice for screening procedures in VLBW infants but
may not be sufficiently sensitive to white matter injury.42 Our
choice of environmental factors was limited to those reported previously in
studies of VLBW infants at the time our study began 10 years ago. Other unmeasured
and uncontrolled biological or environmental influences may explain our findings.
In addition, our study lacked full-term control infants and relied solely
on published data for PPVT-R and IQ score comparisons.35 Our
findings of significant increases over time in cognitive functioning are a
reflection of the outcome measures used; different measures could lead to
different findings. Also, it is well known that the power to detect statistical interactions
generally is lower than for main effects in regression models. Nonsignificant
interactions in our results may reflect a true absence of interaction
or may be due to lack of sufficient power. Finally, as in any study in which
more than 1 or 2 a priori specified hypotheses are examined and tested, our
tests are subject to the multiple-testing problem, in which falsely positive
results are found to be statistically significant.
In summary, this serial study of verbal and cognitive testing of a large
cohort of VLBW infants during early and middle childhood demonstrates significant
improvement with increasing CA relative to standard norms with 1 notable exception.
Those children with evidence for early-onset IVH followed by later significant
CNS injury showed a decline in their scores with increasing age. These data
suggest that additional serial studies of cognitive function in VLBW infants
with appropriately matched control participants should be undertaken.