Table 1.  Issues Revealed by the Unit Testing Process
Table 2.  Processing Time by Software Type
Research Letter
June 2018

R Package for Pediatric Complex Chronic Condition Classification

Author Affiliations
  • ¹Adult and Child Consortium for Health Outcomes Research and Delivery Science, Children’s Hospital Colorado, University of Colorado School of Medicine, Aurora
  • ²Division of General Pediatrics, University of Colorado School of Medicine, Aurora
  • ³Data Science to Patient Value, University of Colorado Anschutz Medical Campus, Aurora
  • ⁴Neptune and Company Inc, Lakewood, Colorado
  • ⁵Division of General Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
  • ⁶Section of Pediatric Critical Care, University of Colorado School of Medicine, Aurora
JAMA Pediatr. 2018;172(6):596-598. doi:10.1001/jamapediatrics.2018.0256

Identification of children with complex chronic conditions (CCCs) is necessary to improve health care delivery and perform clinical research, because this patient population uses significant inpatient and outpatient medical resources.¹ The original CCC classification was published in 2000.² A second version was published in 2014 to reflect additions to the International Classification of Diseases system and the US adoption of the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision.³ The CCC classification is widely used in research (currently cited in more than 100 peer-reviewed journal publications). However, the current approach to assigning the CCC categories in health care–related data sets is limited by proprietary software and computational inefficiency. SAS and Stata software to assign CCC categories were published as appendices to the 2014 update,³ but not all investigators have access to these statistical packages. In addition, increasingly large data sets are available to investigators. Although the data processing capability of individual computers continues to improve, the SAS and Stata software can take significant time to run on data sets with millions of observations. The objective of this project was to develop computationally efficient software to generate the CCC categories using R, a free, open-source statistical environment.⁴ We then compared the SAS, Stata, and R software with respect to accuracy and speed of classification on a typical desktop system.

Methods

We developed the pccc R package based on the 2014 version 2 CCC system.³ To maximize computational efficiency, we leveraged the ability to call C++ from within R using the Rcpp package.⁵ We used standard software engineering practices, including distributed version control, issue tracking, and unit testing. We tested the pccc package using the same Healthcare Cost and Utilization Project data sets from the Agency for Healthcare Research and Quality used to develop the 2014 software (2009 Kids’ Inpatient Database [KID] and 2010 Nationwide Emergency Department Sample [NEDS]).⁶ On the same desktop system (i7 dual-core, 16-GB RAM), we classified each record using the SAS, Stata, and R software and compared the results. We tested the accuracy (percentage correctly classified) of the R software using SAS as the criterion standard. To test the relative speed of the 3 implementations, we compared processing time (in minutes) for the 3 407 146-record KID data set and the 28 584 301-record NEDS data set. The latest release of the R package is available on the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/pccc/index.html), and the developmental version is on GitHub (https://github.com/CUD2V/pccc). Institutional review board approval was not required for this study using publicly available data sets.
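The accuracy and speed comparisons described above can be sketched in a few lines. The example below is a minimal illustration in Python rather than R, SAS, or Stata, and it is not the code used in the study. The function names `percent_agreement` and `timed_minutes` and the five placeholder labels are assumptions introduced only for this sketch.

```python
import time
from typing import Callable, Hashable, Sequence, Tuple


def percent_agreement(criterion: Sequence[Hashable],
                      candidate: Sequence[Hashable]) -> float:
    """Percentage of records whose candidate label equals the criterion label."""
    if len(criterion) != len(candidate):
        raise ValueError("both implementations must classify the same records")
    if len(criterion) == 0:
        raise ValueError("no records to compare")
    matches = sum(1 for ref, new in zip(criterion, candidate) if ref == new)
    return 100.0 * matches / len(criterion)


def timed_minutes(classify: Callable[[Sequence], list],
                  records: Sequence) -> Tuple[list, float]:
    """Run one implementation and return its output with wall-clock minutes."""
    start = time.perf_counter()
    result = classify(records)
    minutes = (time.perf_counter() - start) / 60.0
    return result, minutes


# Placeholder labels for five records (hypothetical, not study data).
criterion_labels = ["A", "none", "B", "none", "C"]
candidate_labels = ["A", "none", "B", "C", "C"]

# Four of the five records match the criterion standard.
agreement = percent_agreement(criterion_labels, candidate_labels)
print(f"{agreement:.1f}% agreement with the criterion standard")
```

Comparing labels record by record, instead of comparing category totals, means that two errors in opposite directions cannot cancel out.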

Results

Unit testing of the new pccc package revealed several different types of issues present in the 2014 SAS and Stata software (Table 1). We collaborated with the authors of the 2000 (C.F.) and 2014 (J.A.F., C.F., and D.D.) CCC systems to resolve those issues. Subsequently, the R package and the updated SAS and Stata software yielded identical patient CCC categorizations when run on each row of patient data in the KID and NEDS data sets. Processing the same data, the R package was comparable to SAS and significantly more efficient than Stata (Table 2). The updated SAS and Stata software packages are available at https://feudtnerlab.research.chop.edu/ccc_version_2.php.

Discussion

The free and open-source pccc R package provides accurate, efficient, and reproducible pediatric CCC categorization for large files of administrative records. The ability of R to call C++ directly can improve computational efficiency and is an advantage for package developers. Software development practices, including unit testing, can identify errors before code release. Code in the pccc package was developed collaboratively, and that process, including issue tracking, is publicly visible in the GitHub repository. Suggestions or improvements can be submitted through GitHub’s pull request mechanism.
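As an illustration of how unit tests can surface classification errors before release, the sketch below tests a toy classifier that assigns a category by longest matching code prefix. It is written in Python, not R or C++. The prefix-matching design, the `classify` function, and the `PREFIX_TO_CATEGORY` table with its placeholder strings are assumptions made for this example; they do not describe the internals of the pccc package, and the strings are not real diagnosis codes.

```python
import unittest

# Hypothetical prefix table: placeholder strings, not real diagnosis codes.
PREFIX_TO_CATEGORY = {
    "X10": "category_a",
    "X2": "category_b",
}


def classify(code: str) -> str:
    """Return the category of the longest matching prefix, or "none"."""
    # Normalize formatting so "x10.5" and "X105" are treated alike.
    normalized = code.replace(".", "").strip().upper()
    best = ""
    for prefix in PREFIX_TO_CATEGORY:
        if normalized.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return PREFIX_TO_CATEGORY.get(best, "none")


class ClassifyTests(unittest.TestCase):
    def test_exact_and_longer_codes_match(self):
        self.assertEqual(classify("X10"), "category_a")
        self.assertEqual(classify("X1055"), "category_a")

    def test_formatting_is_normalized(self):
        self.assertEqual(classify(" x10.5 "), "category_a")

    def test_unlisted_code_is_none(self):
        # A code shorter than every prefix must not match by accident.
        self.assertEqual(classify("X1"), "none")
        self.assertEqual(classify(""), "none")


# Run the tests in place without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClassifyTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test pins down one behavior (prefix matching, input normalization, the unmatched case), so a failure points to the specific kind of issue introduced by a change.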

Article Information

Accepted for Publication: January 17, 2018.

Corresponding Author: James A. Feinstein, MD, MPH, Adult and Child Consortium for Health Outcomes Research and Delivery Science, Children’s Hospital Colorado, University of Colorado School of Medicine, 13199 E Montview Blvd, Ste 300, Room 312A, Aurora, CO 80045 (james.feinstein@ucdenver.edu).

Published Online: April 23, 2018. doi:10.1001/jamapediatrics.2018.0256

Author Contributions: Drs Feinstein and Bennett had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Feinstein, Russell, DeWitt, Feudtner, Bennett.

Acquisition, analysis, or interpretation of data: Feinstein, Russell, Dai, Bennett.

Drafting of the manuscript: Feinstein, Russell, DeWitt.

Critical revision of the manuscript for important intellectual content: Feinstein, Russell, Feudtner, Dai, Bennett.

Statistical analysis: Feinstein, Russell, Dai, Bennett.

Obtained funding: Bennett.

Administrative, technical, or material support: Russell, DeWitt, Dai, Bennett.

Study supervision: Bennett.

Conflict of Interest Disclosures: None reported.

Funding/Support: This work was supported by grants K23HD091295 (Dr Feinstein) and K23HD074620 (Dr Bennett) from the Eunice Kennedy Shriver National Institute of Child Health and Human Development and by the Data Science to Patient Value Initiative, University of Colorado Anschutz Medical Campus.

Role of the Funder/Sponsor: The sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

References
1. Cohen E, Berry JG, Camacho X, Anderson G, Wodchis W, Guttmann A. Patterns and costs of health care use of children with medical complexity. Pediatrics. 2012;130(6):e1463-e1470.
2. Feudtner C, Hays RM, Haynes G, Geyer JR, Neff JM, Koepsell TD. Deaths attributed to pediatric complex chronic conditions. Pediatrics. 2001;107(6):E99.
3. Feudtner C, Feinstein JA, Zhong W, Hall M, Dai D. Pediatric complex chronic conditions classification system version 2. BMC Pediatr. 2014;14:199.
4. The R Foundation for Statistical Computing. The R Project for Statistical Computing. https://www.r-project.org/. 2017. Accessed January 10, 2018.
5. Rcpp.org. Rcpp for Seamless R and C++ Integration. http://www.rcpp.org/. Accessed January 10, 2018.
6. Healthcare Cost and Utilization Project. Overview of Nationwide Emergency Department Sample (NEDS). http://www.hcup-us.ahrq.gov/nedsoverview.jsp. December 2017. Accessed January 10, 2018.