eFigure. Flow Diagram for Patient Inclusion
eTable. Simplified FNAST Evaluation Tool
Devlin LA, Breeze JL, Terrin N, et al. Association of a Simplified Finnegan Neonatal Abstinence Scoring Tool With the Need for Pharmacologic Treatment for Neonatal Abstinence Syndrome. JAMA Netw Open. 2020;3(4):e202275. doi:10.1001/jamanetworkopen.2020.2275
Can a simplified Finnegan Neonatal Abstinence Scoring Tool (FNAST), composed only of dichotomized items that are independently associated with the decision to initiate pharmacologic therapy, discriminate between infants who did and did not receive therapy as effectively as the original FNAST?
In this cohort study of 424 neonates with opioid exposure, the simplified FNAST discriminated between neonates who did and did not receive pharmacologic treatment nearly as well as the original FNAST.
Use of a simplified FNAST may provide an accurate means of identifying neonatal abstinence syndrome and may enhance the clinical utility of the tool.
Importance
Observer-rated scales, such as the Finnegan Neonatal Abstinence Scoring Tool (FNAST), are used to quantify the severity of neonatal abstinence syndrome (NAS) and guide pharmacologic therapy. The FNAST, a comprehensive 21-item assessment tool, was developed for research and subsequently integrated into clinical practice; a simpler tool, designed to account for clinically meaningful outcomes, is urgently needed to standardize assessment.
Objective
To identify FNAST items independently associated with the decision to use pharmacologic therapy and to simplify the FNAST while minimizing loss of information for the treatment decision.
Design, Setting, and Participants
This multisite cohort study included 424 neonates with opioid exposure who had a gestational age of at least 36 weeks with follow-up from birth to hospital discharge in the derivation cohort and 109 neonates with opioid exposure from the Maternal Opioid Treatment: Human Experimental Research Study in the validation cohort. Neonates in the derivation cohort were included in a medical record review at the Universities of Louisville and Kentucky or in a randomized clinical trial and observational study conducted at Tufts University (2014-2018); the Maternal Opioid Treatment: Human Experimental Research was conducted from 2005 to 2008. Data analysis was conducted from May 2017 to August 2019.
Exposure
Prenatal opioid exposure.
Main Outcomes and Measures
All FNAST items were dichotomized as present or not present, and logistic regression was used to identify binary items independently associated with pharmacologic treatment. The final model was validated with an independent cohort of neonates with opioid exposure.
Results
Among 424 neonates (gestational age, ≥36 weeks; 217 [51%] female infants), convulsions were not observed, and high-pitched cry and hyperactive Moro reflex had extremely different frequencies across cohorts. Therefore, these 3 FNAST items were removed from further analysis. The 2 tremor items were combined, and 8 of the remaining 17 items were independently associated with pharmacologic treatment, with an area under the curve of 0.86 (95% CI, 0.82-0.89) compared with 0.90 (95% CI, 0.87-0.94) for the 21-item FNAST. External validation of the 8 items resulted in an area under the curve of 0.86 (95% CI, 0.79-0.93). Thresholds of 4 and 5 on the simplified scale yielded the closest agreement with FNAST thresholds of 8 and 12 (weighted κ = 0.55; 95% CI, 0.48-0.61).
Conclusions and Relevance
The findings of this study suggest that 8 signs of NAS may be sufficient to assess whether a neonate meets criteria for pharmacologic therapy. A focus on these signs could simplify the FNAST tool and may enhance its clinical utility.
The medical and nonmedical use of opioids in the United States has increased significantly during the last decade.1-4 Opioid use disorder (OUD) during pregnancy increased from 1.5 per 1000 live births to 6.5 per 1000 live births between 1999 and 2014.5 Antenatal exposure to opioids has led to a 5-fold rise in the incidence of neonatal abstinence syndrome (NAS) between 2004 and 2014, a number that continues to increase.6,7 Neonatal abstinence syndrome is a withdrawal syndrome that occurs after the interruption of the passive transfer of maternal opioids at the time of birth. The diagnosis of NAS is dependent on the presence of key signs of withdrawal, with severity being highly variable.8,9 Every neonate exposed to opioids in utero is somewhere along the continuum of withdrawal. While some neonates have mild signs and normal physiologic functions, others have more severe NAS that requires pharmacologic treatment to avoid major complications.10 Differences in the expression of NAS are associated with many factors, including the type(s) of opioid exposure, coexposure with other illicit drugs and/or psychotropic medications, genetic and epigenetic variability, the gestational age and sex of the neonate, breastfeeding, and parental engagement.11-17
Observer-rated scales have been used for more than 40 years to assess the severity of NAS and guide the initiation and adjustment of pharmacologic therapy.18-21 Differences between raters in the assessment of NAS have been associated with significant differences in initiation and duration of pharmacologic therapy, length of hospital stay, and health care utilization.22 The Finnegan Neonatal Abstinence Scoring Tool (FNAST) is commonly used for the assessment of neonates with NAS.23-25 It is a screening tool that comprises 21 items, with many items having 2 to 4 subcategories and weighting for each category, which varies from 1 to 5.18,19 Although differences among raters have been observed with the use of the FNAST,26,27 these rater differences can be minimized with a comprehensive educational approach that optimizes and then maintains interobserver reliability.22,26-29 Several tools have been developed to shorten and simplify the FNAST,27,30-33 but existing studies are limited by small cohorts that lack generalizability, study samples that do not differentiate between neonates with and without pharmacologic treatment, and a lack of external validation of the findings.
The goal of this study was to significantly shorten and simplify the original FNAST by focusing on signs of withdrawal that prompt clinical intervention. In this article, we address the following questions. First, is information lost by dichotomizing components of the FNAST and eliminating weighting? Second, are there large differences among cohorts in how frequently specific signs are observed? Third, which binary-coded FNAST items are independently associated with the receipt of pharmacologic therapy?
Retrospective medical record reviews of consecutive neonates with antenatal opioid exposure were conducted at the University of Louisville and University of Kentucky for infants born in 2014. In addition, prospective data were obtained from an 8-site clinical trial that compared methadone with morphine for the treatment of NAS (conducted 2014-2018, led by Tufts University)34 and from a concurrent observational study of neonates whose parents gave consent for the clinical trial but did not require treatment or whose parents refused consent for randomization in the clinical trial but consented to data collection (eFigure in the Supplement). Included neonates had a gestational age of at least 36 weeks and had no other significant medical or surgical illness. All sites used the FNAST for assessment, and nonpharmacologic care was the initial treatment for NAS in this study population.
Data from an external cohort of neonates enrolled in the Maternal Opioid Treatment: Human Experimental Research (MOTHER) study (conducted 2005-2008) were used to validate the final model.26 Of the 131 infants in the original study, 109 met inclusion criteria for this analysis (ie, gestational age ≥36 weeks; adequate data on timing of treatment initiation; complete FNAST data on day of treatment or on day 3 of life if not treated). The MOTHER NAS (MNS) tool included 28 items (the FNAST plus 7 additional items). Therefore, data for every FNAST item were available at each assessment in the MOTHER data set. The scores used to initiate pharmacologic therapy in the MOTHER study were calculated from 19 of the 28 MNS items, including 3 that are not on the FNAST.
Each site’s institutional review board approved the study. The deidentified retrospective medical record review from the Kentucky sites was conducted under a waiver of informed consent, and informed consent was obtained for patients included in the clinical trial, observational cohort, and the MOTHER study. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline. Analyses were conducted from May 2017 to August 2019.
The present analysis used a single assessment point for each neonate, ie, the time of the highest score on the day the neonate initially received pharmacologic therapy for NAS or on day 3 of life if never treated. The third day was selected for analyzing untreated neonates because it was the median day when treatment was initiated among neonates who were treated. The criteria for initial treatment for NAS varied by site and included: 3 consecutive scores of at least 8 or 2 consecutive scores of at least 12 on the FNAST at the Kentucky sites; 2 consecutive scores of at least 8 or 1 score of at least 12 on the FNAST for the multisite study; and 2 consecutive scores of at least 9 or 1 score of at least 13 on the NAS scale for the MOTHER study.
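The three site-specific initiation rules described above share one shape: a run of consecutive scores at or above a lower cutoff, or a shorter run at or above an upper cutoff. The following is a minimal Python sketch of that logic, offered only as an illustration. The names `TreatmentRule`, `RULES`, `has_run`, and `meets_criteria` are hypothetical and do not come from the study, which does not describe any software for this step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreatmentRule:
    """Hypothetical container for one site's initiation rule."""
    lower_cutoff: int  # score that must be met on several consecutive assessments
    lower_run: int     # number of consecutive assessments at the lower cutoff
    upper_cutoff: int  # higher score that triggers treatment with a shorter run
    upper_run: int     # number of consecutive assessments at the upper cutoff


# Cutoffs and run lengths as stated in the text; the dictionary keys are illustrative.
RULES = {
    "kentucky": TreatmentRule(8, 3, 12, 2),
    "multisite": TreatmentRule(8, 2, 12, 1),
    "mother": TreatmentRule(9, 2, 13, 1),
}


def has_run(scores, cutoff, run_length):
    """Return True if `scores` contains `run_length` consecutive values >= cutoff."""
    streak = 0
    for score in scores:
        streak = streak + 1 if score >= cutoff else 0
        if streak >= run_length:
            return True
    return False


def meets_criteria(scores, rule):
    """Apply one rule to a chronological list of assessment scores."""
    return (has_run(scores, rule.lower_cutoff, rule.lower_run)
            or has_run(scores, rule.upper_cutoff, rule.upper_run))
```

For example, the score sequence 8, 9, 7 would satisfy the multisite rule (2 consecutive scores of at least 8) but not the Kentucky rule (3 consecutive scores of at least 8).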
Items on the FNAST with multiple categories were dichotomized as present or absent; if any sign was recorded, the neonate was coded as having the sign (eg, hyperactive Moro reflex and markedly hyperactive Moro reflex were both coded as hyperactive Moro reflex). The 2 items related to tremors were combined to form a single binary item.
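The dichotomization step can be illustrated with a short Python sketch. It assumes each assessment is stored as a mapping from item name to the points recorded for that item, with 0 meaning the sign was not recorded. The item names and the function `dichotomize` are hypothetical, chosen only for illustration.

```python
def dichotomize(raw_points):
    """Collapse recorded item points to present (1) or absent (0).

    `raw_points` maps a hypothetical item name to the points recorded
    for that item; any value above 0 counts as the sign being present.
    The two tremor items are merged into a single binary item.
    """
    binary = {item: int(points > 0) for item, points in raw_points.items()}
    # Remove both tremor items first, then combine them into one flag.
    tremor_flags = [binary.pop(key, 0)
                    for key in ("tremors_disturbed", "tremors_undisturbed")]
    binary["tremors"] = int(any(tremor_flags))
    return binary


example = dichotomize({
    "moro_reflex": 3,          # any recorded severity is coded as present
    "tremors_disturbed": 0,
    "tremors_undisturbed": 4,
    "sleep": 0,
})
```

In this example, the Moro reflex and the combined tremor item are coded as present, and sleep is coded as absent, regardless of the severity originally recorded.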
Differences among the 3 cohorts in the frequency of each binary item were tested using a χ2 test for proportions. Endorsement of each item across cohorts was also compared using the area under the curve (AUC). Items that differed by more than 50 percentage points among cohorts and items that were never observed were excluded from subsequent analyses. Forward stepwise multivariable logistic regression was used to determine which of the remaining items were independently associated with receipt of pharmacologic therapy, adjusting for cohort. The criteria for stepping items into the model and retaining items in the model were P < .10 and P < .05, respectively. Model discrimination was evaluated using the AUC. The selected items were validated by calculating the AUC of a multivariable logistic regression on the validation data set. Thresholds for the new score were selected to optimize agreement between it and the FNAST in categorizing scores into 3 groups: low (<8 on FNAST), medium (≥8 to <12), and high (≥12).
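As a reminder of what the AUC measures, it equals the probability that a randomly chosen treated neonate receives a higher predicted value than a randomly chosen untreated neonate, with ties counted as one-half. The sketch below computes that quantity directly in Python from two lists of values. It is not the authors' SAS procedure, which derived the AUC from multivariable logistic regression; the function name `auc` and the example values are hypothetical.

```python
def auc(values_treated, values_untreated):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Counts the fraction of treated/untreated pairs in which the treated
    neonate has the higher value; ties contribute one-half.
    """
    if not values_treated or not values_untreated:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for treated in values_treated:
        for untreated in values_untreated:
            if treated > untreated:
                wins += 1.0
            elif treated == untreated:
                wins += 0.5
    return wins / (len(values_treated) * len(values_untreated))


# Illustrative values only, not study data.
example_auc = auc([5, 6, 4], [2, 4, 3])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.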
Statistical analysis was conducted with SAS statistical software version 9.4 (SAS Institute). Except as stated for model building, statistical significance was set at P < .05, and all tests were 2-tailed.
A total of 424 neonates were included in the primary analysis (mean [SD] birth weight, 3122  g; 217 [51%] female infants). Among 238 treated neonates, the median (interquartile range) time to treatment was 3 (2-4) days from birth. The frequency of each item in the FNAST is shown in Table 1. Generalized convulsions, fever (body temperature ≥38.4 °C), and vomiting (projectile) were not reported. Many of the items had statistically significant differences in percentage endorsement across cohorts (eg, sleeps <3 h: Louisville group, 70 [55.1%]; Tufts group, 108 [53.2%]; Kentucky group, 80 [85.1%]; P < .001; body temperature ≥37.2 °C: Louisville group, 30 [23.6%]; Tufts group, 81 [39.9%]; Kentucky group, 18 [19.1%]; P < .001), but such differences only appeared extreme (ie, greater than 50 percentage points) in the case of high-pitched crying (Louisville group, 98 [77.2%]; Tufts group, 42 [20.7%]; Kentucky group, 75 [79.8%]; P < .001) and hyperactive Moro reflex (Louisville group, 36 [28.3%]; Tufts group, 35 [17.2%]; Kentucky group, 64 [68.1%]; P < .001) (Table 2). These 2 items also had the highest AUCs in models comparing endorsement of the item among cohorts (crying: 0.79 [95% CI, 0.75-0.83]; Moro reflex: 0.72 [95% CI, 0.62-0.77]). Thus, these 2 items as well as generalized convulsions (which were not observed) were excluded from subsequent model building.
The stepwise regression selected 8 items that were independently associated with receipt of pharmacologic therapy (sleeps <3 hours after feeding: odds ratio [OR], 3.1; 95% CI, 1.9-5.0; P < .001; tremors when disturbed: OR, 2.3; 95% CI, 1.3-3.9; P = .003; tremors when undisturbed: OR, 3.5; 95% CI, 2.0-6.2; P < .001; increased muscle tone: OR, 10.0; 95% CI, 3.9-26.0; P < .001; body temperature ≥37.2 °C: OR, 2.1; 95% CI, 1.3-3.5; P = .002; respiratory rate >60/min: OR, 2.6; 95% CI, 1.6-4.1; P < .001; excessive sucking: OR, 2.4; 95% CI, 1.5-3.8; P < .001; poor feeding: OR, 3.0; 95% CI, 1.7-5.3; P < .001; regurgitation: OR, 3.2; 95% CI, 1.7-6.2; P < .001) (Table 3). The AUC for the 8-item model was 0.86 (95% CI, 0.82-0.89) compared with 0.90 (95% CI, 0.87-0.94) for the 21-item FNAST. When validated with the cohort from the MOTHER study, the model had an AUC of 0.86 (95% CI, 0.79-0.93). Comparison of the original and simplified scales is seen in Table 4. For example, the 3 levels of sleeps after feeding were collapsed into 1 category; similarly, the 2 categories of fever were collapsed into 1. Thresholds of 4 and 5 on the simplified scale yielded the closest agreement with FNAST thresholds of 8 and 12, with a weighted κ of 0.55 (95% CI, 0.48-0.61) (Table 5).
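The categorization described above can be sketched in Python. The sketch assumes that the simplified score is the unweighted count of the 8 selected signs that are present and that the simplified thresholds use the same "at or above" convention as the FNAST cutoffs; neither assumption is stated explicitly in this section. The item names and the functions `simplified_score` and `categorize` are hypothetical.

```python
# Hypothetical names for the 8 selected signs (tremor items already combined).
SIMPLIFIED_ITEMS = (
    "sleeps_less_than_3h_after_feeding",
    "tremors",
    "increased_muscle_tone",
    "temperature_at_least_37_2_c",
    "respiratory_rate_over_60",
    "excessive_sucking",
    "poor_feeding",
    "regurgitation",
)


def simplified_score(signs):
    """Count how many of the 8 signs are present (assumed unweighted sum)."""
    return sum(1 for item in SIMPLIFIED_ITEMS if signs.get(item))


def categorize(score, medium_cutoff, high_cutoff):
    """Map a score to low, medium, or high using two cutoffs."""
    if score >= high_cutoff:
        return "high"
    if score >= medium_cutoff:
        return "medium"
    return "low"


# FNAST cutoffs of 8 and 12; simplified-scale cutoffs of 4 and 5.
fnast_group = categorize(9, medium_cutoff=8, high_cutoff=12)
signs_present = {item: True for item in SIMPLIFIED_ITEMS[:5]}
simplified_group = categorize(simplified_score(signs_present),
                              medium_cutoff=4, high_cutoff=5)
```

Under these assumptions, only a simplified score of exactly 4 falls in the medium group, which illustrates how little room the 8-item scale leaves between the two thresholds.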
In this cohort study, a simplified, binary, 8-item FNAST scale was developed to discriminate between neonates who did and did not receive pharmacologic therapy based on traditional FNAST parameters (Table 4; eTable in the Supplement). The items were subsequently validated with an independent cohort of neonates with opioid exposure. The logistic model with the items on the simplified scale discriminated nearly as well as the model that incorporated the original FNAST items (AUC, 0.86 [95% CI, 0.82-0.89] vs 0.90 [95% CI, 0.87-0.94]) despite dichotomizing and eliminating many of the items. The components of the FNAST identified in this 8-item model performed well despite variation in the algorithm used to initiate pharmacologic therapy among the cohorts (ie, 2 consecutive FNAST scores of ≥8 or 1 of ≥12; 3 consecutive scores ≥8 or 2 of ≥12), and they did not lose predictive power in an independent validation sample that used a different algorithm (ie, the MNS) for assessing the need for pharmacologic therapy. The high AUC that we observed in the validation cohort is evidence of the transportability of the 8-item simplified scale, especially since the MNS scale differed from the FNAST by including some signs that were not in the FNAST and excluding others that were.
When the FNAST and simplified FNAST scores were categorized according to treatment cutoffs, the agreement was only moderate (weighted κ = 0.55; 95% CI, 0.48-0.61). There are 2 cutoffs for the original FNAST that were predetermined at 8 and 12 points, leaving little room to set the simplified FNAST thresholds to optimize agreement. In practice, the treatment decision is based on frequent assessments, not a single score. Because there is no criterion standard, there is currently no method for determining whether an algorithm based on the original FNAST or the 8-item version would yield better clinical outcomes.
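For readers unfamiliar with the weighted κ, the Python sketch below computes it for two sets of ordered category labels. The weighting scheme is not stated in this article, so linear disagreement weights are assumed here; the function name `weighted_kappa` and the example labels are hypothetical, and the statistic is undefined when the expected disagreement is 0.

```python
def weighted_kappa(ratings_a, ratings_b, categories=("low", "medium", "high")):
    """Weighted kappa with linear disagreement weights (an assumption).

    kappa_w = 1 - (observed weighted disagreement / expected weighted
    disagreement), where the weight for cells (i, j) is |i - j| / (k - 1).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions for each pair of categories.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions give the agreement expected by chance.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = sum(abs(i - j) / (k - 1) * observed[i][j]
                                for i in range(k) for j in range(k))
    expected_disagreement = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                                for i in range(k) for j in range(k))
    return 1.0 - observed_disagreement / expected_disagreement


# Illustrative labels only, not study data.
perfect = weighted_kappa(["low", "medium", "high"], ["low", "medium", "high"])
chance = weighted_kappa(["low", "low", "high", "high"],
                        ["low", "high", "low", "high"])
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so the observed value of 0.55 lies between the two.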
The model did not select some signs because they occurred infrequently or almost always and hence were not useful in predicting pharmacologic treatment for NAS. Not every item that was associated with the outcome in univariate analyses made it into the final logistic regression model. Having a high odds ratio in a univariate analysis does not ensure retention in a multivariable model if an item is associated with other items and/or the confidence interval is wide. Signs that were absent from the final model may still be clinically important, although they do not have to be checked to make a treatment decision.
Each item on the original FNAST was weighted and based on clinical observations at the time the tool was developed. Discerning the degree of expression of several items on the original FNAST has proven difficult.35 Differences in the assessment of withdrawal severity have been associated with suboptimal short-term outcomes in neonates with NAS.36 In this study, we evaluated differences in the scoring of individual items of the FNAST among different cohorts (Table 3) and found significant differences in the frequency of some items. The most marked differences were noted with high-pitched crying and hyperactive Moro reflex, which were excluded from further analysis.
Several simplified Finnegan-based scoring tools have been developed.27,30,32,33 The 10-item simplified Finnegan Neonatal Abstinence Scoring System (sFNAST),33 published in 2017, included 7 of 8 items in the scale proposed in this study (ie, tremors, increased muscle tone, poor sleep, tachypnea, excessive sucking, poor feeding, and feeding intolerance). This overlap is notable considering the differences in method for deriving the scores. The sFNAST study focused on neonates who received treatment, used multiple assessments from each patient, selected items based on correlation with the original FNAST, and did not have enough sites to evaluate heterogeneity. The present study included large numbers of neonates with and without pharmacologic treatment, used the highest score on the day treatment was initiated (or day 3 of life if never treated), selected items to optimize discrimination, excluded items with marked differences among cohorts, and was validated using an independent cohort.
A 5-item MNS Short Form was developed based on the ability to discriminate between neonates receiving pharmacologic treatment and those who were not treated.32 The MNS Short Form shares 3 items with our scale (ie, tremors, increased muscle tone, and tachypnea). One item on the MNS Short Form (excessive irritability) is not on the original FNAST. Similar to our study, there was a single score per neonate and the goal was optimization of discrimination. The AUC of the 5-item MNS was 0.90, which was close to the AUC of 0.94 for the 19-item MNS. However, there was no independent validation sample. Furthermore, because data came from a single clinical trial, site-to-site homogeneity may have been more favorable than it would have been in a sample of unrelated sites.
Our proposed scale overlaps with the Eat, Sleep, and Console (ESC) assessment method.37-39 This method assesses whether a neonate with opioid exposure can eat at least 1 ounce (or age-appropriate volume) per feed or breastfeed well, sleep at least 1 hour, and be consoled within 10 minutes. While the ESC approach has gained acceptance at many sites in the United States since the initial quality improvement study introduced the approach in 2017, the generalizability and safety of the approach have yet to be demonstrated. Tachypnea, which is a key sign of central nervous system irritability and autonomic dysfunction, is not assessed with the ESC approach. However, it was retained in our simplified FNAST, in the sFNAST, and in the 5-item MNS.
There are several limitations in this study. First, data were retrospective for some of the cohorts and prospective for others. Next, most binary FNAST items had statistically significant differences among cohorts. The 2 items that were eliminated because of pronounced sample differences in percentage endorsement (ie, high-pitched crying and hyperactive Moro reflex) could perhaps be reintegrated in the future, if consistency in assessment can be achieved. Despite the statistically significant variability of some of the included items, the simplified score’s ability to discriminate neonates who were treated vs those who were untreated (as measured by the AUC) was very high, even in the validation cohort. Shorter tools have been published (ie, the 5-item MNS and the ESC), but we retained 8 items because of the trade-off between brevity and potential loss of generalizability. Further prospective studies that incorporate nursing and caregiver input are needed to establish clinical utility and determine the validity and reliability of this simplified tool in comparison with other tools and approaches.
In this study, the 21-item FNAST was simplified to an 8-item scale that discriminated nearly as well as the original and was validated with an independent cohort. This shorter assessment tool could simplify clinical assessment by focusing on components that are relatively consistent across sites. It is important to prospectively validate this scale, which could be widely used and lead to the standardization of the clinical approach and management of neonates prenatally exposed to opioids.
Accepted for Publication: February 6, 2020.
Published: April 8, 2020. doi:10.1001/jamanetworkopen.2020.2275
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Devlin LA et al. JAMA Network Open.
Corresponding Author: Lori A. Devlin, DO, Department of Pediatrics, University of Louisville School of Medicine, 571 S Floyd St, Ste 342, Louisville, KY 40202 (firstname.lastname@example.org).
Author Contributions: Drs Devlin and Davis had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Devlin, Breeze, Terrin, Finnegan, Jones, Davis.
Acquisition, analysis, or interpretation of data: Devlin, Breeze, Terrin, Gomez Pomar, Bada, O’Grady, Jones, Lester, Davis.
Drafting of the manuscript: Devlin, Breeze, Terrin, O’Grady, Jones, Davis.
Critical revision of the manuscript for important intellectual content: Devlin, Breeze, Terrin, Gomez Pomar, Bada, Finnegan, Jones, Lester, Davis.
Statistical analysis: Breeze, Terrin, O’Grady.
Obtained funding: Devlin, Lester.
Administrative, technical, or material support: Devlin, Gomez Pomar, Bada, Finnegan, Jones, Lester, Davis.
Supervision: Terrin, Davis.
Conflict of Interest Disclosures: Dr Devlin reported serving on the advisory board of Chiesi Farmaceutici outside the submitted work. No other disclosures were reported.
Funding/Support: This work was supported by grant R21DA041706-02 from the National Institute on Drug Abuse, award UL1TR002544 from the National Center for Advancing Translational Sciences, National Institutes of Health, the Charles H. Hood Foundation, and Chiesi Farmaceutici.
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Additional Contributions: Julie Burt, AA, and Karen D’Apolito, PhD, assisted with the preparation of the article. Dr D’Apolito was compensated for her time.