June 2003

Beyond the Complete Blood Cell Count and C-Reactive Protein: A Systematic Review of Modern Diagnostic Tests for Neonatal Sepsis

Author Affiliations

From the Division of Neonatology (Dr Kirpalani) and the Departments of Pediatrics (Drs Malik, Pennie, and Kirpalani) and Pathology and Molecular Medicine (Dr Hui), McMaster University, Hamilton, Ontario.


Copyright 2003 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.

Arch Pediatr Adolesc Med. 2003;157(6):511-516. doi:10.1001/archpedi.157.6.511

Objective  To systematically review the accuracy of modern laboratory tests for the diagnosis of serious bacterial infection in newborns.

Methods  The MEDLINE, EMBASE, and Cochrane Library databases were searched using the keywords newborn, infection, sepsis, and diagnosis. We included studies published from 1995 through 2001 that included infants younger than 90 days with proven bacterial growth in a sample from a sterile site. Whenever possible, relevant data were extracted to calculate likelihood ratios (LRs) for whether each test can diagnose a serious bacterial infection. Two independent reviewers selected and reviewed the articles (interobserver agreement, κ = 0.80). All disagreements were resolved by consensus.

Results  Of the 137 citations we retrieved, 37 articles met the inclusion criteria; 17 studies, evaluating 11 different tests, met the highest methodological criteria. The most commonly evaluated test was interleukin 6 (IL-6) level (n = 7 studies). The remaining tests were each evaluated in no more than 3 studies. Positive LRs ranged from 1.5 to ∞. Six individual tests examined in 8 studies had LRs of more than 10 (range, 12.5-∞). Combined tests also had a wide range of LRs (3.4-9.9). All studies were performed in single medical centers and had small sample sizes, making recommendations according to gestational age criteria difficult.

Conclusions  We found few methodologically rigorous studies of the accuracy of laboratory tests for the diagnosis of bacterial infection in newborns; in a significant proportion of studies, the accuracy of the tests could not be independently determined because of a lack of adequate data. There was marked heterogeneity in sample selection and cutoff levels for diagnosis of neonatal sepsis. A few tests showed promising accuracy, but there are insufficient data to support their confident use as clinical tools.

CLINICIANS ARE frustrated by the limitations in the diagnosis of neonatal sepsis and would benefit from reliable tests to diagnose sepsis early in its course. Currently, no single test fulfills the criteria of an ideal diagnostic test.1,2 In neonatology, tests using hematological indices or acute-phase reactants, such as C-reactive protein (CRP), remain in widespread use despite continuing concerns about their reliability. These concerns largely stem from the demonstrated marked variations in the predictive accuracy of hematological parameters.1 We wished to assess the validity of several newly available immunological markers, including acute-phase reactants other than CRP, and inflammatory mediators, levels of which have been claimed to assist in the diagnosis of neonatal sepsis (Table 1).3 Therefore, we examined studies of various diagnostic tests with reference to their methodological rigor.

Table 1. Clinical Characteristics of Modern Diagnostic Tests for Neonatal Sepsis

We identified all relevant articles on bacterial infection in newborns using the following MeSH headings: newborn, infection, sepsis, and diagnosis. We searched the National Library of Medicine's PubMed database, EMBASE, and the Cochrane Library's Cochrane Controlled Trials Register for the years 1995 through 2001. Our search was limited to English-language articles with human data; editorials, commentaries, letters, and reviews were excluded. We searched the bibliographies of review articles, and we also attempted to obtain pertinent missing data from authors.

In addition, we used the following criteria and definitions to select relevant articles. Inclusion criteria were that (1) the diagnostic tests being evaluated were considered to be "new" tests (ie, excluding hematological parameters, such as immature-total neutrophil ratio, or the commonly used acute-phase reactant CRP); (2) the postnatal age of the infants studied was younger than 90 days; and (3) studies were focused on serious bacterial infections, and true infections were proven by a criterion standard. Because of the uncertainty surrounding the concept of clinical sepsis in neonates, we focused only on studies in which the criterion standard for diagnostic tests was unequivocal proof of bacterial infection (ie, bacterial growth) in cultures of blood or cerebrospinal fluid (CSF) samples from sterile body sites. (We did not include urinary tract infections because they are an uncommon cause of sepsis in newborns unless they are accompanied by general sepsis and because for validity, these samples must be obtained by invasive measures that many neonatologists are reluctant to employ.) We defined clinical sepsis as sepsis not meeting the definition of true infection. If studies reported positive bacterial growth from endotracheal tube aspirates, with or without changes on chest radiographs, in the absence of blood or CSF cultures positive for bacterial growth, we considered this to be clinical sepsis.

We excluded studies that examined antenatal tests, including amniotic fluid tests, and studies in which data for antenatal and neonatal infection could not be separated.


Two of us (A.M. and C.P.S.H.) independently assessed each article for methodological quality to establish that all selected studies allowed us to distinguish true bacterial infections from clinical sepsis. These 2 authors also attempted to extract data for an independent calculation of sensitivity, specificity, and likelihood ratio (LR). Adjudication (H.K.) and subsequent consensus resolved all disagreements regarding the inclusion of studies and the extracted data.


If studies provided adequate data, 2 × 2 tables were created to calculate sensitivity, specificity, positive predictive value, negative predictive value, and LRs. Following the recommendations of Sackett et al,2 an a priori rule was defined for this study whereby an LR of less than 10 was considered unlikely to affect clinical diagnosis. Confidence intervals were recorded for the LRs in studies that presented adequate data to perform an independent calculation of pretest probabilities and LRs (StatsDirect statistical software, version 1.9.8; StatsDirect Ltd, Sale, England; Confidence Interval Analysis, version 2.0, Wilson Method, T. Bryant, 2000). If we were not able to independently extract raw data, LRs were calculated from sensitivity and specificity values provided by the authors.
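As a minimal sketch of the 2 × 2 calculations described above (not the StatsDirect procedure used in this review), the following Python function derives the accuracy measures from the four cell counts. The function name, the dictionary keys, and the example counts are hypothetical and are not taken from any included study. When a test produces no false-positive results, the specificity is 100% and the positive LR is reported as ∞.

```python
import math

def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a 2 x 2 table of test result vs culture result.

    tp, fp, fn, tn are counts of true-positive, false-positive,
    false-negative, and true-negative infants. Assumes at least one
    infected (tp + fn > 0) and one noninfected (fp + tn > 0) infant.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    npv = tn / (tn + fn) if (tn + fn) > 0 else math.nan

    # Positive LR = sensitivity / (1 - specificity); infinite with no false positives.
    if fp == 0:
        lr_pos = math.inf if tp > 0 else math.nan
    else:
        lr_pos = sensitivity / (1 - specificity)

    # Negative LR = (1 - sensitivity) / specificity.
    if tn == 0:
        lr_neg = math.inf if fn > 0 else math.nan
    else:
        lr_neg = (1 - sensitivity) / specificity

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }

# Hypothetical counts, for illustration only.
result = diagnostic_accuracy(tp=18, fp=5, fn=2, tn=75)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under the a priori rule described above, the hypothetical test in this example (positive LR of about 14) would count as potentially useful, because its positive LR exceeds 10.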


The literature search generated 137 citations. Of these, 49 were potentially relevant for inclusion based on a review of the abstract. More detailed review resulted in good interobserver agreement between the 2 reviewers regarding eligible articles; full agreement was reached for 42 articles, and consensus discussion was required for only 7 articles (Cohen κ = 0.80). We excluded one study that reported on 548 blood samples but did not specify the number of infants from whom they were drawn,4 making it impossible for us to determine whether multiple samples were from single subjects. Another article5 also included multiple samples from single subjects; however, this article provided the sample size from which the specimens were drawn. Although this limits our ability to fully interpret the data, we chose to include this article because the denominator was provided. In total, 37 articles met the inclusion criteria and were assessed for methodological quality.
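For readers unfamiliar with the Cohen κ statistic reported above, the following Python sketch shows how it corrects observed agreement between 2 reviewers for the agreement expected by chance. The function name and the counts are hypothetical; the article does not report the full cross-tabulation of reviewer decisions, so this example does not reproduce the authors' calculation.

```python
def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen kappa for 2 reviewers making include/exclude decisions.

    both_include / both_exclude: articles on which the reviewers agreed.
    only_a / only_b: articles included by only reviewer A or only reviewer B.
    """
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n

    # Chance agreement from each reviewer's overall inclusion rate.
    a_rate = (both_include + only_a) / n
    b_rate = (both_include + only_b) / n
    expected = a_rate * b_rate + (1 - a_rate) * (1 - b_rate)

    if expected == 1:
        # Both reviewers gave the same single answer for every article.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical counts, for illustration only.
kappa = cohen_kappa(both_include=40, only_a=5, only_b=5, both_exclude=50)
print(f"kappa = {kappa:.2f}")
```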


Of the 37 studies that were included, 17 (46%) clearly distinguished clinically septic infants from those who had true bacterial growth according to our criteria (Table 2).5-21 The cumulative sample sizes of these studies were small, even for the most frequently applied diagnostic tests. Because we deliberately chose bacterial growth as a criterion standard, the remaining 20 studies could not be further analyzed. A list of the retrieved references that did not meet inclusion criteria can be obtained from the corresponding author (H.K.).

Table 2. Included Studies (n = 17)

Included Studies

The 17 studies that met our methodological criteria assessed 11 new tests and a total of 299 septic infants. Of these studies, 7 (41%), enrolling a total of 68 septic infants, provided adequate raw data to allow independent calculation of sensitivities, specificities, and LRs along with confidence intervals (Table 3). In 10 studies, we could only use the values as originally calculated by the authors. From the study by Messer et al,7 we could extract data for interleukin (IL) 6 levels but not for levels of the tumor necrosis factor (TNF) receptors p55 and p75. Similarly, the study by Silveira and Procianoy10 examined 3 diagnostic tests (IL-6, TNF-α, and IL-1β levels), but no data are provided for IL-1β levels. The authors concluded that "IL-1β is not a good marker of neonatal sepsis."10(p650) In the absence of numeric data, we have omitted it from Table 3. Three studies reviewed more than 1 diagnostic test used in combination in an attempt to enhance diagnostic accuracy (Table 4). In the 17 included studies, the most common new reported test was IL-6 level, which was examined by 7 separate studies that enrolled a total of 92 septic and 524 nonseptic infants. The remaining tests were assessed by no more than 3 studies each (cumulative range, 4-72 septic infants).

Table 3. Accuracy of Diagnostic Tests
Table 4. Accuracy of Combined Tests

Cutoff Values Used by the Studies

The cutoff laboratory values that were chosen to distinguish between the presence and absence of infection appear to be unique to each study. Franz et al13 employed 2 separate cutoff values for IL-8 in 2 separate study periods, which allowed a comparison of these values (Table 3). In a later study, Franz et al14 provided more data using the second cutoff value. Franz et al13,14,22 also reported on similar data sets employing IL-8 and procalcitonin levels. We included 2 of these studies13,14 because we could not determine how much they overlapped. We annotated data from all 3 subsets of infants described by these researchers (Table 3 and Table 4). We used the authors' own cutoff values whenever available. Gendrel et al,20 who examined 13 septic infants, did not specify any cutoff level to demarcate between infected and noninfected newborns. However, the authors did provide a scattergram that allowed us to make a distinction. In order to use the study by Gendrel et al,20 we employed the cutoff value for procalcitonin used by Franz et al14 in their similar study.

Performance of the Individual Tests Evaluated

The range of test sensitivity and specificity, both calculated and reported, was large (Table 3). Sensitivities ranged from 57% to 100%, and specificities ranged from 43% to 100%. Similarly, positive LRs ranged from 1.5 to ∞ (Table 3). Six tests in 8 studies5,8,9,11,15,18,20,21 had LRs of more than 10. Of these studies, we were able to perform an independent verification that the positive LR was more than 10 in the study of procalcitonin by Gendrel et al20; the studies of neutrophil CD11b by Weirich et al18 and Nupponen et al15; the study of IL-6 by Bhartiya et al11; and the study of IL-8 by Nupponen et al.15 In total, 2 of 3 studies that evaluated procalcitonin levels20,21 had a positive LR of more than 10.

Combined Tests

We also assessed the accuracy of combinations of tests evaluated in 3 studies (Table 4), although these studies did not provide adequate raw data to allow calculation of LRs. All were small studies; the largest, by Franz et al,13 enrolled only 26 septic infants. None of these test combinations had a positive LR of more than 10.


The rapidly evolving understanding of the molecular physiological processes underlying sepsis and technical advances in biochemical testing hold promise for rapid accurate diagnosis, although it should be remembered that bacterial growth requires at least 12 hours by commercial automated testing. Table 1 addresses the clinical utility of these putative new tests. However, in this systematic review of the accuracy of these newer diagnostic tests, we were unable to provide a summary in a simple statistical form that clinicians could use. This reflects several methodological issues. Perhaps the most striking is that the predominance of small studies, all using such differing approaches, makes formal meta-analysis unproductive. Had studies been of large enough size and power, it would have been possible to report on test characteristics by birth weight and gestational age, because risk of neonatal sepsis is likely to be different with both.

The marked heterogeneity in studies of tests for neonatal sepsis has also been noted by authors reviewing older diagnostic tests.1,23 It is worth emphasizing that, among studies, enrolled subjects were heterogeneous, varying by postnatal age, gestational age, and risk factors. Each study consisted of small numbers of patients from single medical centers, which differed with regard to types of patients (eg, surgical, cardiac, medical, and inborn and outborn) and their demographic characteristics. Often, we were uncertain about the exact nature of the neonatal population in which the diagnostic test was studied because most studies reported only gestational age, sex, and birth weight. Even the prevalence of sepsis in the nurseries studied was recorded in only a small minority of articles. In addition, a wide range of cutoff values was employed, and no study used previously reported values; instead, authors chose to use unique cutoff values, making comparison of these tests difficult. Most articles (Table 3) did not report whether blood culture and lumbar puncture were performed before infants received antibiotics. We were left to assume that these tests were performed first if there was clinical suspicion of sepsis and that antibiotic treatment was instituted afterward. Attempts to contact the authors of several studies to obtain further raw data were by and large unhelpful. These methodological problems prevented us from providing a concise statistical summary in the form of a meta-analysis.

Because some clinicians are more familiar with sensitivity than LRs, we provide these values. Where possible, however, we emphasize LRs because this statistic offers potential advantages compared with measures such as sensitivity and specificity.24,25 Likelihood ratios may be more useful than sensitivity and specificity, largely because of their independence from prevalence.26,27 This is relevant to our review because the prevalence of neonatal sepsis (whether nosocomial or peripartum) appears to vary considerably, from 5% to 42% in a recent study.28 We could not find the true prevalence in most studies included in this review. The prevalence of sepsis varies according to birth weight, gestational age, and the characteristics of the neonatal nursery, including its proportion of infants undergoing medical vs surgical treatment.
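The following Python sketch illustrates why prevalence matters when test results are interpreted. With a fixed sensitivity and specificity, the positive predictive value changes markedly between a nursery with 5% prevalence and one with 42% prevalence, whereas the positive LR, which is computed from sensitivity and specificity alone, does not change. The sensitivity and specificity values (90% each) and the function names are assumptions chosen for illustration; they do not describe any test in this review.

```python
def positive_lr(sensitivity, specificity):
    """Positive likelihood ratio; does not involve prevalence."""
    return sensitivity / (1 - specificity)

def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of infection given a positive test at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical test characteristics, for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# The two prevalence values are the extremes quoted in the text (5% and 42%).
for prevalence in (0.05, 0.42):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    lr = positive_lr(SENSITIVITY, SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.2f}, positive LR = {lr:.1f}")
```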

Sackett et al2 suggest that positive LRs of less than 10 are unlikely to greatly enhance posttest probabilities. We chose to follow this rule in assessing how useful these studies are to clinical practice. Only 8 individual studies examining 6 tests had LRs outside this range. We have not emphasized negative LRs, although we also report these (Table 3 and Table 4). Taking the top prevalence range of Brodie et al28 at 42%, even an extremely negative LR would likely reduce the posttest probability of the patient having sepsis to 5%, as calculated using the LR nomogram in Sackett et al.2 We remain unconvinced that a clinician would accept a 5% risk in choosing not to treat.
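The nomogram reading described above can be approximated arithmetically by converting the pretest probability to odds, multiplying by the LR, and converting back to a probability. The Python sketch below shows this under stated assumptions: the function name is hypothetical, and the negative LR of 0.07 is an illustrative value rather than a result from any included study. At a pretest probability of 42%, such a negative LR leaves a posttest probability of roughly 5%.

```python
import math

def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability and an LR to a posttest probability."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest_probability must be between 0 and 1 (exclusive)")
    if math.isinf(likelihood_ratio):
        # An infinite positive LR implies certainty of disease.
        return 1.0
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

PRETEST = 0.42  # top of the prevalence range quoted in the text

# Illustrative LR values, not taken from any included study.
after_negative = posttest_probability(PRETEST, 0.07)
after_positive = posttest_probability(PRETEST, 10)

print(f"after a negative result (LR 0.07): {after_negative:.1%}")
print(f"after a positive result (LR 10):   {after_positive:.1%}")
```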

A common diagnosis in neonatology is clinical sepsis, without confirmation from blood or CSF cultures. Combining clinical sepsis and documented infections dilutes the true sepsis rate, which should be the denominator for rates. We have avoided these possible problems by specifically examining only studies that enrolled infants with clinical signs leading to identification of an unequivocally true bacterial infection (ie, bacterial growth). Less than half the initial studies met this criterion, and less than half the remaining studies provided adequate data to enable an independent confirmation of the statistics, including the LRs.

In contrast to our insistence on an unequivocal criterion standard, Mehr and Doyle,29 in reviewing selected cytokine levels as markers of bacterial sepsis in newborn infants, chose to examine data from all infants, including those with clinical sepsis. They reviewed articles examining TNF-α, IL-6, and IL-8 levels that were published from 1966 to 1999 in English-language journals listed in MEDLINE. They argued that including "probable/suspected" sepsis was more relevant to "real life" clinical circumstances. However, a strict and uniform definition of culture-negative sepsis was not adhered to in the reviewed studies. In addition, the clinical terms were often not defined, nor were the cutoff levels for various tests consistent across studies. Mehr and Doyle also raised concerns about false-positive results due to skin contaminants and the lack of stringent definitions to distinguish contaminants from true infections, leading to a falsely high positive predictive value. Although we did not attempt to address this in our review, a previous study, for which age-matched and birth weight–matched controls were specifically recruited, showed a lower rate of positive blood cultures in nonsymptomatic infants.30 Mehr and Doyle's overall concerns about the methodological rigor of this field of study were similar to ours.

We recognize that there are other tests for infection, such as heart rate analysis,31 sepsis scores,32 and urinary tests.33 However, we did not include these, because they were not in the scope of our review. In addition, we did not include articles examining CRP because they had previously been reviewed1 and CRP tests appear to be as inaccurate as more standard hematological tests.

We conclude that serious methodological flaws plague current studies that aim to improve the diagnosis of neonatal bacterial sepsis with modern tests. In particular, the predominance of studies at single medical centers with small sample sizes makes it difficult to apply the tests in clinical decision making. Furthermore, diagnostic tests were not applied to differing populations with a mix of inborn and outborn infants, gestational ages, birth weights, and levels of acuity (level 1, 2, and 3 neonatal intensive care units), which made generalizability a problem. A few diagnostic tests remain promising, of which IL-6 level is the most intensively studied, probably because of its acknowledged importance as an alarm cytokine. In addition, procalcitonin levels appear to show considerable promise as a diagnostic test for neonatal sepsis. To give clinicians a firmer recommendation, studies of adequate size and using rigorous methods are now needed to enable estimates of the diagnostic accuracy of these new tests.

Article Information

Corresponding author and reprints: Haresh Kirpalani, BM, MSc, FRCP, Department of Pediatrics, McMaster University Medical School, Room 3N27A, 1200 Main St W, Hamilton, Ontario L8N 1C5, Canada (e-mail:

Accepted for publication February 5, 2003.

Dr Hui received salary support in 2002 and 2003 from a GlaxoSmithKline, Canadian Institutes of Health Research, and Canadian Infectious Disease Society Infectious Disease Research Fellowship.

What This Study Adds

We reviewed all the literature on diagnostic tests for neonatal sepsis published since the last substantive review in 1995. Although many newer tests are being evaluated, this study highlights the fact that few of these tests have been evaluated with methodological rigor. The problems of significant heterogeneity in sample selection and cutoff levels for diagnosis of sepsis persist. A few tests look promising, but until larger multicenter trials are performed, they should not be employed in routine clinical practice.

Fowlie PW, Schmidt B. Diagnostic tests for bacterial infection from birth to 90 days: a systematic review. Arch Dis Child Fetal Neonatal Ed. 1998;78:F92-F98.
Sackett DL, Straus SE, Richardson WS, Rosenberg W, Haynes RB. Evidence-Based Medicine: How to Practice and Teach EBM. London, England: Churchill Livingstone; 2000.
Jurges ES, Henderson DC. Inflammatory and immunological markers in preterm infants: correlation with disease. Clin Exp Immunol. 1996;105:551-555.
Jordan JA, Durso MB. Comparison of 16S rRNA Gene PCR and BACTEC 9240 for detection of neonatal bacteremia. J Clin Microbiol. 2000;38:2574-2578.
Fischer JE, Brunner A, Janousek M, Nadal D, Blau N, Fanconi S. Diagnostic potential of neutrophil elastase inhibitor complex in the routine care of critically ill newborn infants. Eur J Pediatr. 2000;159:659-662.
Kallman J, Ekholm L, Eriksson M, Malmstrom B, Schollin J. Contribution of interleukin-6 in distinguishing between mild respiratory disease and neonatal sepsis in the newborn infant. Acta Paediatr. 1999;88:880-884.
Messer J, Eyer D, Donato L, Gallati H, Matis J, Simeoni U. Evaluation of interleukin-6 and soluble receptors of tumor necrosis factor for early diagnosis of neonatal infection. J Pediatr. 1996;129:574-580.
Panero A, Pacifico L, Rossi N, Mancuso G, Stegagno M, Chiesa C. Interleukin 6 in neonates with early and late onset infection. Pediatr Infect Dis J. 1997;16:370-375.
Kuster H, Weiss M, Willeitner AE, et al. Interleukin-1 receptor antagonist and interleukin-6 for early diagnosis of neonatal sepsis 2 days before clinical manifestation. Lancet. 1998;352:1271-1277.
Silveira R, Procianoy RS. Evaluation of interleukin-6, tumor necrosis factor-α and interleukin-1β for early diagnosis of neonatal sepsis. Acta Paediatr. 1999;88:647-650.
Bhartiya D, Kapadia C, Sanghvi K, Singh H, Kelkar R, Merchant R. Preliminary studies on IL-6 levels in healthy and septic Indian neonates. Indian Pediatr. 2000;37:1361-1367.
Kashlan F, Smulian JC, Shen-Schwarz S, Anwar M, Hiatt M, Hegyi T. Umbilical vein interleukin 6 and tumor necrosis factor α plasma concentrations in the very preterm infant. Pediatr Infect Dis J. 2000;19:238-243.
Franz AR, Steinbach G, Kron M, Pohlandt F. Reduction of unnecessary antibiotic therapy in newborn infants using interleukin-8 and C-reactive protein as markers of bacterial infection. Pediatrics. 1999;104:447-453.
Franz AR, Kron M, Pohlandt F, Steinbach G. Comparison of procalcitonin with interleukin 8, C-reactive protein, and differential white blood cell count for the early diagnosis of bacterial infections in newborn infants. Pediatr Infect Dis J. 1999;18:666-671.
Nupponen I, Andersson S, Jarvenpaa A, Kautiainen H, Repo H. Neutrophil CD11b expression and circulating interleukin-8 as diagnostic markers for early-onset neonatal sepsis. Pediatrics. 2001;108:e12 (August 2001).
Atici A, Satar M, Cetiner S, Yaman A. Serum tumor necrosis factor-α in neonatal sepsis. Am J Perinatol. 1997;14:401-404.
Kocak U, Ezer U, Vidinlisan S. Serum fibronectin in neonatal sepsis: is it valuable in early diagnosis and outcome prediction? Acta Paediatr Jpn. 1997;39:428-432.
Weirich E, Rabin RL, Maldonado Y, et al. Neutrophil CD11b expression as a diagnostic marker for early-onset neonatal infection. J Pediatr. 1998;132:445-451.
Laforgia N, Coppola B, Carbone R, Grassi A, Mautone A, Iolascon A. Rapid detection of neonatal sepsis using polymerase chain reaction. Acta Paediatr. 1997;86:1097-1099.
Gendrel D, Assicot M, Raymond J, et al. Procalcitonin as a marker for the early diagnosis of neonatal infection. J Pediatr. 1996;128:570-573.
Chiesa C, Panero A, Rossi N, et al. Reliability of procalcitonin concentrations for the diagnosis of sepsis in critically ill neonates. Clin Infect Dis. 1998;26:664-672.
Franz AR, Steinbach G, Kron M, Pohlandt F. Interleukin-8: a valuable tool to restrict antibiotic therapy in newborn infants. Acta Paediatr. 2001;90:1025-1032.
Klassen TP, Rowe PC. Selecting diagnostic tests to identify febrile infants less than 3 months of age as being at low risk for serious bacterial infection: a scientific overview. J Pediatr. 1992;121:671-676.
Sonis J. How to use and interpret interval likelihood ratios. Fam Med. 1999;31:432-437.
Fletcher RH, Fletcher SW, Wagner EH. Clinical Epidemiology: The Essentials. 3rd ed. Philadelphia, Pa: Williams & Wilkins; 1996:64-67.
Jaeschke R, Guyatt GH, Sackett DL, for the Evidence Based Medicine Working Group. User's Guide to the Medical Literature, III: how to use an article about a diagnostic test, B: what are the results and will they help me in caring for my patients? JAMA. 1994;271:703-707.
Gierd RW, Hermans J. The diagnostic information of tests for the detection of cancer: the usefulness of the likelihood ratio concept. Eur J Cancer. 1996;32A:2042-2048.
Brodie SB, Sands KE, Gray JE, et al. Occurrence of nosocomial bloodstream infections in 6 neonatal intensive care units. Pediatr Infect Dis J. 2000;19:56-62.
Mehr S, Doyle LW. Cytokines as markers of bacterial sepsis in newborn infants: a review. Pediatr Infect Dis J. 2000;19:879-887.
Schmidt BK, Kirpalani HM, Corey M, Low DE, Philip AG, Ford-Jones EL. Coagulase-negative staphylococci as true pathogens in newborn infants: a cohort study. Pediatr Infect Dis J. 1987;6:1026-1031.
Griffin MP, Moorman JR. Toward the early diagnosis of neonatal sepsis and sepsis-like illness using novel heart rate analysis. Pediatrics. 2001;107:97-104.
Ergenekon E, Gucuyener K, Erbas D, Koc E, Ozturk G, Atalay Y. Urinary nitric oxide in newborns with infections. Biol Neonate. 2000;78:92-97.
Mahieu LM, De Muynck AO, De Dooy JJ, Laroche SM, Van Acker KJ. Prediction of nosocomial sepsis in neonates by means of a computer-weighted bedside scoring system (NOSEP score). Crit Care Med. 2000;28:2026-2033.