Can a valid algorithm be derived to identify keratinocyte carcinoma at a population level using health insurance claims data?
By applying recursive partitioning to a data set of 602 371 community laboratory pathology episodes linked to health insurance claims, an algorithm was derived with 82.6% sensitivity, 93.0% specificity, 76.7% positive predictive value, and 95.0% negative predictive value. The derived algorithm also performed well when validated using an independent hospital clinic data set.
The derived algorithm can reliably identify keratinocyte carcinoma for epidemiological research in the absence of cancer registry data. Recursive partitioning is an effective tool for deriving valid claims-based algorithms.
Importance
Keratinocyte carcinoma (nonmelanoma skin cancer) imposes a substantial burden through its high incidence and health care costs but is excluded by most cancer registries in North America. Administrative health insurance claims databases offer an opportunity to identify these cancers using diagnosis and procedural codes submitted for reimbursement purposes.
Objective
To apply recursive partitioning to derive and validate a claims-based algorithm for identifying keratinocyte carcinoma with high sensitivity and specificity.
Design, Setting, and Participants
Retrospective study using population-based administrative databases linked to 602 371 pathology episodes from a community laboratory for adults residing in Ontario, Canada, from January 1, 1992, to December 31, 2009. The final analysis was completed in January 2016. We used recursive partitioning (classification trees) to derive an algorithm based on health insurance claims. The performance of the derived algorithm was compared with 5 prespecified algorithms and validated using an independent academic hospital clinic data set of 2082 patients seen in May and June 2011.
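As a rough illustration of what recursive partitioning (a classification tree) does, the sketch below grows a small tree over binary claim indicators by repeatedly choosing the split with the lowest weighted Gini impurity. It is a toy under stated assumptions, not the study's implementation: the abstract does not name the candidate variables, the splitting criterion, or the software used, so the feature names (`skin_cancer_dx_code`, `excision_fee_code`, `dermatologist_visit`), the Gini criterion, and the eight synthetic rows are all hypothetical.

```python
from itertools import product

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, features):
    """Return the binary feature that most reduces weighted Gini impurity, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        no = [y for r, y in zip(rows, labels) if not r[f]]
        yes = [y for r, y in zip(rows, labels) if r[f]]
        if not no or not yes:
            continue  # this feature does not separate the node
        score = (len(no) * gini(no) + len(yes) * gini(yes)) / len(labels)
        if score < best_score:
            best, best_score = f, score
    return best

def grow_tree(rows, labels, features, max_depth=3):
    """Recursively partition the data; leaves are the majority class (0 or 1)."""
    majority = int(2 * sum(labels) >= len(labels))
    f = best_split(rows, labels, features) if max_depth > 0 else None
    if f is None:
        return majority
    rest = [g for g in features if g != f]
    branches = {}
    for name, keep in (("no", False), ("yes", True)):
        sub = [(r, y) for r, y in zip(rows, labels) if bool(r[f]) == keep]
        branches[name] = grow_tree([r for r, _ in sub], [y for _, y in sub],
                                   rest, max_depth - 1)
    return {"feature": f, **branches}

def predict(tree, row):
    """Walk from the root to a leaf and return the predicted class."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["feature"]] else tree["no"]
    return tree

# Hypothetical binary claim indicators (assumed names, not from the study).
FEATURES = ["skin_cancer_dx_code", "excision_fee_code", "dermatologist_visit"]

# Synthetic toy episodes: every combination of the three indicators, labelled
# positive only when a diagnosis code and an excision fee code co-occur.
rows = [dict(zip(FEATURES, bits)) for bits in product([0, 1], repeat=3)]
labels = [r["skin_cancer_dx_code"] & r["excision_fee_code"] for r in rows]

tree = grow_tree(rows, labels, FEATURES)
predictions = [predict(tree, r) for r in rows]
```

On real claims data, the tree would be grown on one data set and its predictions compared against histopathology in a separate one, as the derivation and validation design above describes.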
Main Outcomes and Measures
Sensitivity, specificity, positive predictive value, and negative predictive value using the histopathological diagnosis as the criterion standard. We aimed to achieve maximal specificity, while maintaining greater than 80% sensitivity.
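The four outcome measures follow directly from a 2×2 table of algorithm result against the histopathological criterion standard. A minimal sketch, assuming the counts are already tabulated; the class and function names are illustrative, and the example counts are made up rather than taken from the study:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # algorithm positive, histopathology positive
    fp: int  # algorithm positive, histopathology negative
    fn: int  # algorithm negative, histopathology positive
    tn: int  # algorithm negative, histopathology negative

def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Standard accuracy measures against the criterion standard."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }

def meets_sensitivity_target(metrics: dict, minimum: float = 0.80) -> bool:
    """The stated aim: keep sensitivity above 80% while maximizing specificity."""
    return metrics["sensitivity"] > minimum

# Made-up counts for illustration only.
toy = ConfusionCounts(tp=85, fp=10, fn=15, tn=190)
toy_metrics = diagnostic_metrics(toy)
```

Candidate algorithms that pass `meets_sensitivity_target` could then be ranked by specificity, which is one plausible reading of the selection rule described above.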
Results
Among 602 371 pathology episodes, 131 562 (21.8%) had a diagnosis of keratinocyte carcinoma. Our final derived algorithm outperformed the 5 simple prespecified algorithms and performed well in both community and hospital data sets in terms of sensitivity (82.6% and 84.9%, respectively), specificity (93.0% and 99.0%, respectively), positive predictive value (76.7% and 69.2%, respectively), and negative predictive value (95.0% and 99.6%, respectively). Algorithm performance did not vary substantially during the 18-year period.
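The community figures can be checked for internal consistency, because predictive values are determined by sensitivity, specificity, and prevalence through Bayes' theorem. The sketch below uses only the reported episode counts and the rounded percentages, so it reproduces the reported values only to within rounding; the function name is illustrative. The same check cannot be run for the hospital data set, since its prevalence is not given here.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

# Reported community data: 131 562 keratinocyte carcinoma diagnoses
# among 602 371 pathology episodes.
prevalence = 131562 / 602371

# Reported (rounded) sensitivity and specificity of the derived algorithm.
ppv, npv = predictive_values(sensitivity=0.826, specificity=0.930,
                             prevalence=prevalence)

print(f"prevalence = {prevalence:.1%}, PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Because prevalence enters both formulas, positive predictive value in particular depends on the case mix of the population the algorithm is applied to, which is consistent with it differing between the community and hospital data sets.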
Conclusions and Relevance
This algorithm offers a reliable mechanism for ascertaining keratinocyte carcinoma for epidemiological research in the absence of cancer registry data. Our findings also demonstrate the value of recursive partitioning in deriving valid claims-based algorithms.
Chan A, Fung K, Tran JM, Kitchen J, Austin PC, Weinstock MA, Rochon PA. Application of Recursive Partitioning to Derive and Validate a Claims-Based Algorithm for Identifying Keratinocyte Carcinoma (Nonmelanoma Skin Cancer). JAMA Dermatol. 2016;152(10):1122-1127. doi:10.1001/jamadermatol.2016.2609