aAs constructed in Loiacono MM et al.33
bPatients aged 65 years or more without indices of multiple deprivation data were excluded (47 170 patients [20.2%] for training data; 9547 patients [27.2%] for validation data).
SIV indicates seasonal influenza vaccine.
eFigure 1. Overview of the selection of the reference cohort
eFigure 2. Binomial deviance versus log(Lambda) for 10-fold within-sample cross-validation of LASSO model among patients aged 18-64 years
eFigure 3. Binomial deviance versus log(Lambda) for 10-fold within-sample cross-validation of ridge model among patients aged 18-64 years
eFigure 4. Binomial deviance versus log(Lambda) for 10-fold within-sample cross-validation of LASSO model among patients aged 65+ years
eFigure 5. Binomial deviance versus log(Lambda) for 10-fold within-sample cross-validation of ridge model among patients aged 65+ years
eFigure 6. ROC curve for out-of-sample validation of stepwise model among patients aged 18-64 years
eFigure 7. ROC curve for out-of-sample validation of LASSO model among patients aged 18-64 years
eFigure 8. ROC curve for out-of-sample validation of ridge model among patients aged 18-64 years
eFigure 9. ROC curve for out-of-sample validation of stepwise model among patients aged 65+ years
eFigure 10. ROC curve for out-of-sample validation of LASSO model among patients aged 65+ years
eFigure 11. ROC curve for out-of-sample validation of ridge model among patients aged 65+ years
eFigure 12. Density plots of predicted SIV uptake probabilities for out-of-sample validation for stepwise, LASSO, and ridge models among patients aged (a) 18-64 (b) 65+ years
eTable 1. Performance measures from sensitivity analysis (10-fold within-sample cross-validation) for stepwise, LASSO, and ridge models among patients aged (a) 18-64 and (b) 65+ years
eMethods 1. Measures of uncertainty
eMethods 2. Statistical software
Loiacono MM, Mitsakakis N, Kwong JC, Gomez GB, Chit A, Grootendorst P. Development and Validation of a Clinical Prediction Tool for Seasonal Influenza Vaccination in England. JAMA Netw Open. 2020;3(6):e207743. doi:10.1001/jamanetworkopen.2020.7743
Can a prediction tool using patient characteristics measured via routinely collected primary care accurately estimate a patient’s probability of receiving the seasonal influenza vaccine during an upcoming influenza season?
In this prognostic study that used retrospectively collected primary care data of 324 284 patients from 2011 to 2016 to train and validate logistic regression models, patient-level seasonal influenza vaccine uptake among at-risk adults in England was predicted with misclassification rates ranging from 0.12-0.16, sensitivity ranging from 0.71-0.94, and specificity ranging from 0.68-0.90.
The results of this study suggested that it may be possible to predict patient-level seasonal influenza vaccine uptake solely based upon patient characteristics identified via primary care records.
Timely identification of patients likely to miss seasonal influenza vaccination (SIV) could help health care practitioners tailor services and gain efficiency.
To develop and validate a predictive model of SIV uptake among at-risk adults.
Design, Setting, and Participants
This prognostic study constructed a prediction model for vaccine uptake by adults at increased risk of influenza-associated complications. Drawing from the Clinical Practice Research Datalink database’s records of primary care data of 324 284 adults routinely collected at general practices across England from January 2011 to December 2016, logistic regression models were trained on data from patients registered from January 2012 to December 2013 and validated with out-of-sample data from patients registered from January 2015 to December 2016. Data were extracted from the database in December 2018 and analyzed between September 2019 and December 2019.
Covariates included sex, age, race/ethnicity, smoking status, socioeconomic status, previous pneumococcal vaccination, prior season SIV uptake, and clinical risk conditions.
Main Outcomes and Measures
The main outcome was patient-level SIV uptake. Model performance was measured via misclassification rate, Brier score, sensitivity, specificity, and area under the curve.
The training data sets consisted of 324 284 (aged 18 to 64 years) and 186 426 (aged 65 years or older) patients. The mean (SD) age in the training data among patients aged 18 to 64 years was 45 (13) years; 161 487 (49.8%) were women, and 102 133 (31.5%) were categorized as white. Among patients aged 65 years or older, the mean (SD) age was 77 (8) years; 96 169 (51.6%) were women, and 64 996 (34.9%) were categorized as white. The validation data sets consisted of 35 210 patients aged 18 to 64 years and 25 497 aged 65 years or older. The mean (SD) age in the validation data set among patients aged 18 to 64 years was 42 (14) years; 17 296 (49.1%) were women, and 13 346 (37.9%) were categorized as white. Among patients aged 65 years or older, the mean (SD) age was 73 (8) years; 13 135 (51.5%) were women, and 9641 (37.8%) were categorized as white. Among patients aged 18 to 64 years, SIV uptake was 35.9% (95% CI, 35.7%-36.0%) and 32.6% (95% CI, 32.1%-33.1%) for the training and validation data sets, respectively. Among patients aged 65 years or older, SIV uptake was 83.1% (95% CI, 82.9%-83.2%) and 76.1% (95% CI, 75.5%-76.6%) for the training and validation data sets, respectively. Prior season SIV uptake and pneumococcal vaccination status were the best predictors of SIV uptake. Predicted SIV uptake probabilities for patients aged 18 to 64 years were reliable, but biased toward underpredicting, whereas, among patients aged 65 years or older, they were variable and biased toward overpredicting. Briefly, in out-of-sample validation among patients aged 18 to 64 years, misclassification rates were 0.163 to 0.164, Brier scores were 0.124 to 0.125, area under the receiver operating characteristic curve values ranged from 0.876 to 0.877, sensitivity ranged from 0.705 to 0.720, and specificity ranged from 0.896 to 0.902.
In patients aged 65 years or older, misclassification rates were 0.120 to 0.125, Brier scores were 0.0953 to 0.0959, area under the receiver operating characteristic curve was 0.877, sensitivity ranged from 0.919 to 0.936, and specificity ranged from 0.680 to 0.753.
Conclusions and Relevance
This study suggests that data obtained from primary care records could accurately predict SIV uptake among at-risk adults. Further research is needed to assess the feasibility and efficacy of implementing this model in clinical settings.
Each year, millions of individuals develop severe influenza disease globally, and as many as 500 000 individuals die from influenza-associated complications.1 While most are susceptible to influenza, the risk of influenza-related morbidity and mortality is greatest among specific subgroups, including young children, adults older than 65 years, pregnant women, those with certain chronic health conditions, and those who are immunosuppressed.2,3
The seasonal influenza vaccine (SIV) remains the most effective means of reducing influenza-associated morbidity and mortality, especially among clinical risk groups.1 Despite the known efficacy and safety of SIVs, as well as the provision of free or highly subsidized vaccines to eligible patients in many jurisdictions, SIV uptake among at-risk adults is suboptimal and has, at best, stagnated for more than a decade across many regions, including North America and Europe.4-6
Countless efforts have been made to improve SIV uptake through patient- and health care professional (HCP)–level interventions, yet the gap between realized and optimal coverage remains.7,8 Notwithstanding, numerous studies have highlighted the pivotal role of HCPs in influencing and advising a patient’s health-related behaviors, such as smoking cessation and cancer screening.9,10 A similar role is taken with vaccinations, where HCP communications and recommendations have been shown to increase uptake of various adolescent and adult vaccines.11-13 Leveraging this unique role, HCP-level interventions represent a promising avenue through which the SIV coverage gap may be addressed.
Two commonly used HCP-level interventions include patient recall-reminder systems and software-based HCP-directed prompts, which have exhibited varying degrees of effectiveness.14-19 Among recall-reminder systems, automated communication systems are most easily implemented, whereas personalized phone calls or even home visits may be more effective, but are substantially more resource intensive to implement.17,20 Software-based prompts via electronic health record (EHR) systems that remind HCPs to vaccinate a patient at the time of consultation are similarly easy to implement and have been shown to be effective across various health systems.21,22 However, these prompts may be rendered ineffective if used too frequently (known as prompt fatigue) and, further, do not provide any unique insights into the patient’s likelihood of being vaccinated.23,24
As for characterizing a patient’s likelihood of being vaccinated, prior SIV uptake determinant studies have shown that patient characteristics—including sociodemographic characteristics, health-related behaviors, and comorbidities—are associated with uptake.25-28 Therefore, it stands that a patient’s SIV uptake could reasonably be predicted based on such characteristics, allowing HCPs to effectively identify patients who are less likely to be vaccinated during an upcoming season.
Given stringent time constraints faced by HCPs, this insight may help them to optimally allocate their time and resources, through selective use of resource-intensive interventions as well as custom tailoring of software prompts to reflect the patient’s likelihood of SIV uptake.29 Rapid improvements in the quality of primary care data can presumably be leveraged to generate real-time insights into a patient’s likelihood of receiving vaccination.30 However, research in this area is limited; to our knowledge, only 1 study has attempted to construct a predictive model of SIV uptake using routinely collected primary care data.31
In this study, we investigated the feasibility of developing and validating an SIV uptake prediction model based only on patient characteristics attainable from primary care data to estimate the probability of patient-level SIV uptake among a population of at-risk adults in England. Using data from the UK’s Clinical Practice Research Datalink (CPRD) database for model training and validation, we assessed the predictive performance of 3 forms of logistic regression models.
Data used for model training and validation were derived from the CPRD database,32 as described in Loiacono et al.33 Briefly, a reference cohort (3 391 975 participants) was originally constructed to identify adults aged 18 years or older in the CPRD database who were registered to English practices for a minimum of 365 consecutive days between January 2011 and December 2016. The inclusion and exclusion criteria for this reference cohort were specified to identify patients with minimal gaps in registration and records of high enough quality for research purposes, as determined by the CPRD’s acceptability metric.32 A detailed diagram of the reference cohort construction is available in eFigure 1 in the Supplement.
From this reference cohort, we identified nonoverlapping cohorts of at-risk patients and constructed model training and out-of-sample validation data sets (Figure 1). Specific clinical risk conditions for inclusion, as defined by the National Health Service, included pregnancy, chronic renal disease, chronic heart disease, chronic respiratory disease, chronic liver disease, diabetes, immunosuppression, chronic neurological disease, and morbid obesity.34 Training data consisted of patients registered to their practice from January 2012 to December 2013. Out-of-sample validation data consisted of patients registered to their practice from January 2015 to December 2016 and who were not present in the training data set. Two years of enrollment were required to assess the patient’s SIV uptake during the prior season. Data were extracted from the CPRD database in December 2018 and analyzed from September 2019 to December 2019.
Among the training and validation data sets, patients without at least 1 clinical risk condition during both the observed and prior influenza season were excluded. While the National Health Service also identifies patient age as a risk condition (≥65 years), in this study only older adults with at least 1 additional clinical risk condition were included to focus model prediction on older patients at greatest risk of influenza-associated morbidity and mortality. Training and validation data sets were stratified by patient age (18-64 years and ≥65 years). This study received approval by the Independent Scientific Advisory Committee of CPRD. Informed consent from study participants was not required, given that individual-level consent was provided prior to data collection, and all data were deidentified prior to CPRD’s collection. The development and validation of the prediction model in this study was performed in accordance with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.35
The dependent variable measured was patient-level SIV uptake. Uptake was assessed annually between September 1 and December 31, a timeframe encompassing most SIVs administered.36 Patients vaccinated outside of this window were excluded as outliers (10 203 patients [0.5%] excluded from training data; 1204 patients [0.1%] excluded from validation data). Covariates included sex, age, race/ethnicity, smoking status, socioeconomic status, history of pneumococcal vaccination, prior season SIV uptake, and clinical risk conditions.34 Age was treated as a continuous variable. Ethnicity was defined as specified by the Office of National Statistics.37 Patient socioeconomic status was approximated using the Indices of Multiple Deprivation (IMD), a socioeconomic measure of the patient’s area of residence.38 Identification of variables for inclusion in models was based on prior SIV uptake determinants research.33
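The uptake window described above can be sketched in Python as follows. This is a minimal illustration, not the authors' extraction code: the function name `classify_uptake`, its return labels, and the treatment of patients with no vaccination record are assumptions made for the example. It also assumes the date passed in has already been attributed to the season in question.

```python
from datetime import date
from typing import Optional


def classify_uptake(vaccination_date: Optional[date], season_year: int) -> str:
    """Classify one patient for one season (labels are hypothetical).

    The September 1 to December 31 window follows the text; the rest of
    this function is an illustrative assumption.
    """
    if vaccination_date is None:
        return "not_vaccinated"
    window_start = date(season_year, 9, 1)
    window_end = date(season_year, 12, 31)
    if window_start <= vaccination_date <= window_end:
        return "vaccinated"
    # The text excludes patients vaccinated outside the window.
    return "excluded"
```

For instance, `classify_uptake(date(2013, 10, 15), 2013)` falls inside the window, whereas a record dated August 2013 would be flagged for exclusion.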
Pneumococcal vaccination history was determined by whether a patient had at least 1 record of a pneumococcal vaccination prior to December 31 of the given year. Clinical risk conditions were identified using Read codes as arranged by Primary Care Information Services, including: pregnancy, chronic renal disease, chronic heart disease, chronic respiratory disease, chronic liver disease, diabetes, immunosuppression, chronic neurological disease, and morbid obesity.39 Time-varying patient characteristics were assessed prior to September 1 of the given year, based on the most recent record. Missing values for ethnicity, smoking status, and morbid obesity were coded as unknown. All codes used for data extraction from the CPRD database are described in detail in Loiacono et al.33
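Coding missing values as their own "unknown" level, rather than imputing or dropping them, might look like the following Python sketch. The helper name `one_hot_with_unknown`, the column naming scheme, and the smoking levels used in the usage note are hypothetical; the source does not list the exact category labels.

```python
from typing import Dict, Optional, Sequence


def one_hot_with_unknown(
    value: Optional[str], known_levels: Sequence[str], prefix: str
) -> Dict[str, int]:
    """One-hot encode a categorical value with an explicit 'unknown' level.

    A missing value (None) or a value outside known_levels is assigned to
    'unknown', so the patient stays in the data set.
    """
    level = value if value in known_levels else "unknown"
    columns = list(known_levels) + ["unknown"]
    return {f"{prefix}_{name}": int(name == level) for name in columns}
```

With assumed levels `["never", "former", "current"]`, a patient with no smoking record would receive `smoking_unknown = 1` and zeros elsewhere, so the model can still produce a prediction for that patient.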
We evaluated 3 types of logistic regression models: stepwise, least absolute shrinkage and selection operator (LASSO), and ridge. We opted to use logistic regression models due to their inherently transparent development process and ease of implementation, all while maintaining strong predictive abilities.40 All variables were considered for modeling across both age strata except for pregnancy and patient IMD. Pregnancy was not used in models for patients aged 65 years or more. Patient IMD was only used in models for patients aged 65 years or more, given prior evidence of the association with SIV uptake specifically in this age stratum.33
Stepwise models were trained using a backward stepwise algorithm that systematically reduced the model via minimization of the Akaike information criterion.41 For the LASSO and ridge models, an optimal value of λ, or the penalty coefficient for the loss functions, was determined via 10-fold cross-validations using the 2013 training data, in which the optimal λ minimized model deviance (eFigure 2, eFigure 3, eFigure 4, and eFigure 5 in the Supplement).42 Both the backward stepwise algorithm and LASSO autonomously performed feature or variable selection, which resulted in a reduced model, whereas ridge maintained the full model.
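The selection of λ by cross-validated deviance can be illustrated with a small, self-contained Python sketch. It is a stand-in rather than the study's code: it fits only a ridge (L2) penalty, uses plain gradient descent, and runs on toy data from a hypothetical `make_synthetic_data` helper. The penalty scaling (mean log-loss plus λ times the sum of squared coefficients), the step size, and the λ grid are all assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam, steps=200, lr=0.3):
    """Minimise mean log-loss + lam * sum(w_j**2) by gradient descent.

    The intercept b is left unpenalised; the penalty scaling is an assumption.
    """
    n, d = len(X), len(X[0])
    b, w = 0.0, [0.0] * d
    for _ in range(steps):
        grad_b, grad_w = 0.0, [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + 2.0 * lam * wj) for wj, gj in zip(w, grad_w)]
    return b, w


def mean_binomial_deviance(X, y, b, w):
    """Mean of -2 * log-likelihood over the supplied observations."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, eps), 1.0 - eps)
        total -= 2.0 * (yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    return total / len(y)


def choose_lambda(X, y, lambdas, k=10, seed=0):
    """Return (best_lambda, {lambda: mean held-out deviance}) from k-fold CV."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    cv_deviance = {}
    for lam in lambdas:
        fold_scores = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            b, w = fit_ridge_logistic(
                [X[i] for i in train], [y[i] for i in train], lam
            )
            fold_scores.append(
                mean_binomial_deviance(
                    [X[i] for i in fold], [y[i] for i in fold], b, w
                )
            )
        cv_deviance[lam] = sum(fold_scores) / k
    best = min(cv_deviance, key=cv_deviance.get)
    return best, cv_deviance


def make_synthetic_data(n=200, seed=1):
    """Toy data with two binary predictors; not derived from CPRD."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [float(rng.random() < 0.5), float(rng.random() < 0.3)]
        p = sigmoid(-1.0 + 2.0 * x[0] + 1.0 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y
```

Calling `choose_lambda(*make_synthetic_data(), lambdas=[0.001, 0.01, 0.1, 1.0])` returns the λ with the lowest mean held-out deviance together with the full deviance profile, which is the quantity plotted against log(λ) in the supplementary cross-validation figures.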
Models were trained on the 2013 training data set and validated on the 2016 out-of-sample validation data set. Given the limited prior research to guide cutoff selection, an uninformative cutoff of 0.5 was specified, a priori, to classify a patient’s SIV uptake status based on their predicted probability (ie, a predicted probability ≤0.5 did not receive SIV; a predicted probability >0.5 received SIV). For models trained on the 2013 training data set, estimated coefficients were reported.
To assess the out-of-sample predictive performance of the models, the following performance metrics were calculated: misclassification rate, Brier score, area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Misclassification rate (ie, proportion of patients with an incorrectly predicted SIV uptake status, in which 0 indicates no misclassification) was calculated as the sum of false positives and false negatives divided by the total number of predictions. Brier score (accuracy of a probabilistic prediction ranging from 0 to 1, in which 0 indicates perfect accuracy) was calculated as Σ(y_i - p_i)^2 / n, where y_i and p_i were the observed SIV uptake status and probabilistic prediction for patient i, respectively, among n total patients.43 The AUC is a measure of the model’s discrimination power, ranging from 0 to 1, in which 1 indicates perfect prediction. Sensitivity and specificity are true-positive and true-negative rates, ranging from 0 to 1, in which 1 indicates perfect true-positive or true-negative prediction.
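The metrics defined above, together with the 0.5 classification cutoff, can be written out in a few lines of Python. This is a minimal sketch: the function name `performance_metrics` is hypothetical, and the AUC is computed by direct pairwise comparison (the probability that a vaccinated patient receives a higher predicted probability than an unvaccinated one, with ties counted as one half), which is a standard equivalent definition but not necessarily the routine used in the study.

```python
def performance_metrics(y_true, probs, cutoff=0.5):
    """Compute the five metrics from observed outcomes (0/1) and probabilities."""
    # A predicted probability > cutoff is classified as receiving SIV.
    preds = [1 if p > cutoff else 0 for p in probs]
    pairs = list(zip(y_true, preds))
    tp = sum(1 for yt, yp in pairs if yt == 1 and yp == 1)
    tn = sum(1 for yt, yp in pairs if yt == 0 and yp == 0)
    fp = sum(1 for yt, yp in pairs if yt == 0 and yp == 1)
    fn = sum(1 for yt, yp in pairs if yt == 1 and yp == 0)
    n = len(y_true)

    # Brier score: mean squared difference between outcome and probability.
    brier = sum((yt - p) ** 2 for yt, p in zip(y_true, probs)) / n

    # AUC by pairwise comparison; quadratic in sample size, so only
    # suitable for small illustrative inputs.
    pos = [p for yt, p in zip(y_true, probs) if yt == 1]
    neg = [p for yt, p in zip(y_true, probs) if yt == 0]
    concordant = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0 for pp in pos for pn in neg
    )

    return {
        "misclassification_rate": (fp + fn) / n,
        "brier_score": brier,
        "auc": concordant / (len(pos) * len(neg)),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

The sketch assumes both outcome classes are present in `y_true`; otherwise sensitivity, specificity, or AUC would be undefined.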
Uncertainty measures for all performance metrics (95% CIs) were calculated using normal approximation methods (eAppendix 1 in the Supplement). Calibration plots were constructed based on the methods previously described in Gerds et al.44 Receiver operating characteristic curves and kernel density plots were constructed and are available (eFigures 6-12 in the Supplement). Additionally, results from a sensitivity analysis, in which models were trained and validated through a within-sample 10-fold cross-validation using the 2013 training data set, are available in eTable in the Supplement. All analyses were performed in R version 3.4.3 (R Project for Statistical Computing). Further details on specific R packages are available in eAppendix 2 in the Supplement.
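For a proportion-type metric, a normal approximation interval can be sketched as below. The exact formulas used in the study are given in its Supplement and are not reproduced here, so the Wald form (estimate plus or minus z times the standard error) and the function name `normal_approx_ci` are assumptions.

```python
import math


def normal_approx_ci(successes, n, z=1.959964):
    """Return (estimate, lower, upper) for a proportion using a Wald interval.

    z defaults to the 97.5th percentile of the standard normal (95% CI).
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)
```

Applied to the training-set uptake among patients aged 18 to 64 years (116 316 of 324 284), this form gives an interval of roughly 35.7% to 36.0%, in line with the interval reported in the Results.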
Training data consisted of 324 284 patients aged 18 to 64 years (mean [SD] age, 45 [13] years; 161 487 women [49.8%]; 102 133 categorized as white [31.5%]) and 186 426 patients aged 65 years or older (mean [SD] age, 77 [8] years; 96 169 women [51.6%]; 64 996 categorized as white [34.9%]) (Table 1). Validation data consisted of 35 210 patients aged 18 to 64 years (mean [SD] age, 42 [14] years; 17 296 women [49.1%]; 13 346 categorized as white [37.9%]) and 25 497 patients aged 65 years or older (mean [SD] age, 73 [8] years; 13 135 women [51.5%]; 9641 categorized as white [37.8%]). Across both age strata, lower SIV uptake (prior and current season) and lower pneumococcal vaccine uptake were noted in the validation data relative to the training data. Among patients aged 18 to 64 years, SIV uptake was 35.9% (116 316 patients; 95% CI, 35.7%-36.0%) and 32.6% (11 493 patients; 95% CI, 32.1%-33.1%) for the training and validation data sets, respectively. Among patients aged 65 years or older, SIV uptake was 83.1% (154 872 patients; 95% CI, 82.9%-83.2%) and 76.1% (11 493 patients; 95% CI, 75.5%-76.6%) for the training and validation data sets, respectively. Pneumococcal vaccine uptake among patients aged 18 to 64 years was 23.4% (75 984 patients) and 16.7% (5882 patients) for training and validation sets, respectively; for patients aged 65 years or older, uptake was 85.1% (158 661 patients) for the training set and 74.9% (19 099 patients) for the validation set.
Estimated coefficients for trained models are presented in Table 2. The most influential predictors across both age strata were prior season SIV uptake and pneumococcal vaccination status. Otherwise, among patients aged 18 to 64 years, pregnancy (estimated coefficient for stepwise model, 1.80; LASSO, 1.76; ridge, 1.24), diabetes (estimated coefficient for stepwise model, 0.88; LASSO, 0.85; ridge, 0.69), and chronic neurological disease (estimated coefficient for stepwise model, 0.68; LASSO, 0.64; ridge, 0.48) had the greatest contributions. Among patients aged 65 years or older, estimated coefficients were overall smaller and less variable. For stepwise and LASSO models, only a small number of predictors were autonomously excluded from the models among patients aged 18 to 64 years (eg, race [mixed, other, and white] and chronic liver disease) as well as among patients aged 65 years or older (eg, race [mixed, other, and white], chronic renal disease, chronic liver disease, and chronic neurological disease), indicating that most predictors contributed nontrivially to the model fit.
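Because these are logistic regression coefficients, each one is a change in the log-odds of uptake, and exponentiating it gives an odds ratio. The Python sketch below applies this to the three stepwise coefficients quoted above for patients aged 18 to 64 years. The intercept is not reported in this section, so the value passed to `predicted_probability` in any usage is hypothetical, and a model restricted to these three predictors is an illustration only, not the fitted model.

```python
import math

# Stepwise-model coefficients for patients aged 18 to 64 years, as quoted in the text.
STEPWISE_COEFFICIENTS_18_64 = {
    "pregnancy": 1.80,
    "diabetes": 0.88,
    "chronic_neurological_disease": 0.68,
}


def odds_ratio(coefficient):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(coefficient)


def predicted_probability(intercept, coefficients, patient):
    """Inverse-logit of the linear predictor.

    `intercept` is a hypothetical placeholder; `patient` maps predictor
    names to 0/1 indicators, with absent names treated as 0.
    """
    linear = intercept + sum(
        beta * patient.get(name, 0) for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))
```

For example, `odds_ratio(1.80)` is about 6.05, meaning the odds of uptake for a pregnant patient are roughly 6 times those of an otherwise identical patient who is not pregnant, holding the other covariates fixed.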
Within each age stratum, the 3 model types performed similarly overall (Table 3). Misclassification rates were lowest for the stepwise and LASSO models among both patients aged 18 to 64 years (0.163, 95% CI, 0.159-0.166) and 65 years or older (0.120, 95% CI, 0.116-0.124). Brier scores were highest for the stepwise and LASSO models among patients aged 18 to 64 years (0.125, 95% CI, 0.122-0.127), whereas they were highest for the ridge model among those aged 65 years or older (0.0959, 95% CI, 0.0932-0.0985). All models across both age strata had comparable AUCs (0.877, 95% CI, 0.873-0.881). Sensitivity was highest for the LASSO model among those aged 18 to 64 years (0.721, 95% CI, 0.717-0.725) and the ridge model among those aged 65 years or older (0.936, 95% CI, 0.933-0.939). Specificity was highest for the ridge model among those aged 18 to 64 years (0.902, 95% CI, 0.899-0.905) and the LASSO model among those aged 65 years or older (0.755, 95% CI, 0.750-0.760).
Predicted probabilities for patients aged 18 to 64 years were overall reliable and comparable among the 3 model types (Figure 2). For the intermediate range of predicted probabilities (eg, 0.25-0.50), prediction was biased toward underpredicting. The ridge model exhibited the highest degree of underprediction bias, particularly for probabilities of 0.50 or more. For patients aged 65 years or older, predicted probabilities were substantially more variable and overall biased toward overpredicting (Figure 2). Prediction bias of the 3 model types was similar for probabilities greater than 0.75. However, for probabilities between approximately 0.20 and 0.60, the ridge model’s bias toward underpredicting was notably greater than those of the stepwise or LASSO models.
This study found that a clinical prediction tool using patient-level characteristics identified via primary care records was able to estimate the probability of patient-level SIV uptake among at-risk adults with an overall high degree of accuracy. As for the specific methods tested, performance of the 3 types of models was comparable, exhibiting only minor discrepancies across the various performance metrics. Relative to the stepwise model, neither the LASSO nor ridge models performed substantially better. These findings suggest that the simplest approach, the stepwise regression, is well suited for this type of prediction model.
Model performance did however differ notably between the 2 age strata. Among younger adults, estimated probabilities were closer to the observed SIV uptake probabilities, whereas estimated probabilities were substantially more variable among older adults. This may be explained in part by the imbalanced outcome variable (eg, approximately 70%-80% vaccine uptake among older adults) but may also be indicative of a lower overall degree of explanatory power for predictors in the models among patients aged 65 years or more. Considering this, either model may reliably be used to identify patients who are least likely to be vaccinated (eg, probability ≤25%), but the model among younger adults may be better suited for evaluating the patient’s specific probability of being vaccinated.
Our work establishes a new possible foundation from which future clinical tools and interventions may be developed, in which the insights gained from these predictions may guide HCP resource allocation. By identifying patients with a low probability of being vaccinated, HCPs can deploy more resource-intensive outreach efforts, such as personalized phone calls. Additionally, the geographical distribution of patients with low probabilities to be vaccinated could be further investigated for instances of clustering, which would highlight areas with potential limitations to health care access.
As for HCP-level interventions, this model could be integrated into EHR systems, to deliver HCP-directed reminders via software prompts at the time of patient consultation. With the model performing real-time calculations behind the scenes, EHR-based prompts could be customized to not only remind the HCP to vaccinate the patient, but also serve the HCP with a unique insight into the patient’s inherent likelihood of being vaccinated and helping them tailor the conversation accordingly. Given the time constraints placed upon HCPs during a consultation, this may help them more efficiently use their time.45
For example, a patient can be characterized as unlikely, moderately likely, or highly likely to be vaccinated based on their predicted probability, and EHR prompts could be constructed accordingly. For patients flagged as highly likely, the prompt may advise the HCP to make a presumptive and time-saving recommendation (eg, “today we will be giving you your flu shot”).46 For patients flagged as unlikely, the prompt may include a list of the patient’s clinical risk conditions, suggesting that the HCP initiate a personalized dialogue with regard to the patient’s specific condition(s) and the importance of vaccination, which would be followed by a presumptive recommendation.47 As for those patients flagged as moderately likely, the prompt may advise the HCP to make a presumptive recommendation, but also to be prepared to discuss the patient’s specific risk conditions if they exhibit hesitancy. This scenario demonstrates 1 way these models potentially can be used to autonomously deliver time-optimizing, patient-tailored guidance to HCPs.
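The three-tier scenario above could be prototyped along the following lines. The text does not specify where the boundaries between tiers should sit, so the thresholds of 0.25 and 0.75, the function name `prompt_tier`, and the prompt wording in `PROMPTS` are all hypothetical and would need to be set and evaluated in a clinical setting.

```python
# Hypothetical prompt wording, paraphrasing the scenario in the text.
PROMPTS = {
    "highly likely": "Make a presumptive recommendation.",
    "moderately likely": (
        "Make a presumptive recommendation; be prepared to discuss the "
        "patient's risk conditions if they show hesitancy."
    ),
    "unlikely": (
        "Review the patient's clinical risk conditions, discuss the "
        "importance of vaccination, then make a presumptive recommendation."
    ),
}


def prompt_tier(probability, low=0.25, high=0.75):
    """Map a predicted uptake probability to a tier (thresholds are assumptions)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability <= low:
        return "unlikely"
    if probability < high:
        return "moderately likely"
    return "highly likely"
```

An EHR integration would then look up `PROMPTS[prompt_tier(p)]` for a patient with predicted probability `p` at the time of consultation.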
Despite the strong predictive performance of these models, this study has some limitations that must be acknowledged. First, we explicitly opted to treat missing values as unknown, rather than the more common approach of imputation or exclusion of observations with missing data.48 However, in a clinical setting, it is likely that values for some predictors may be missing in the patient’s records, such as ethnicity, smoking status, or body mass index. By explicitly modeling these missing values, we preserve their explanatory power among patients that have a known value while retaining the model’s ability to predict among patients with missing values, thereby maximizing the model’s clinical utility. As for excluding patients aged 65 years or older without IMD data, this was decided to reflect real-world circumstances, in which a patient’s IMD measure can simply be obtained via cross-referencing their area of residence with an IMD lookup table.
Second, while the use of CPRD’s database allowed us to capture a wide breadth of patients and perform true out-of-sample validation, a fundamental strength of this study, it also introduced its own respective disadvantages. Given that our model was trained on England-specific data, it may not be applicable to other regions. Nevertheless, the framework that we have implemented here can be replicated elsewhere using similar sources of data to train and validate country-specific prediction models. Doing so would ensure that the model’s predictive capabilities are best suited to the respective health system and its patients.
Additionally, we noted a decrease in both SIV and pneumococcal vaccine uptake in the validation data sets, relative to the training data sets. As explained in Loiacono et al,33 this observed drop in vaccine coverage over time may be explained by the increasing number of CPRD-enrolled practices dropping out of data collection over time, or perhaps even the increase in pharmacy-administered vaccines and the subsequent lack of appropriate data transfers to general practitioners. Nevertheless, within the framework of predictive modeling, these differences are of less concern, as similarity of baseline characteristics between training and validation data is not a prerequisite.
Third, although the model was capable of accurate prediction, it does not explicitly explain why the patient was likely or unlikely to be vaccinated. This is an inherent limitation of using large primary care databases, given that we must model a patient’s vaccine uptake as a function of only the characteristics that we can confidently measure. Thus, this study does not account for other known determinants of SIV uptake, such as personal beliefs, opinions, and other social factors that could not be accurately measured in CPRD’s database.49 Similarly, it is possible that the patient-level attitudes and behavior underlying vaccine uptake may vary over time, making comparison between different years difficult. Nevertheless, the inclusion of a lagged independent variable (ie, prior season SIV uptake) in the model allowed for flexibility in this regard, given that it effectively encompassed and adjusted for changes in patient attitudes based on their historical actions.
The results of this study suggest that primary care records can be leveraged to provide future insights into patient preventive health behaviors such as SIV uptake. Logistic regression models can predict SIV uptake with high accuracy, and the modeling approach implemented here can likely be adapted to other countries and databases. Future research is needed to assess the feasibility of implementing this model in a clinical setting as well as to evaluate its potential effectiveness with regard to improving SIV uptake. Similarly, future studies may wish to investigate the performance of additional methods, such as decision tree learning, to identify a smaller subset of critical predictors that may be used as a decision tool, thus simplifying the model’s implementation.
Accepted for Publication: April 7, 2020.
Published: June 29, 2020. doi:10.1001/jamanetworkopen.2020.7743
Open Access: This is an open access article distributed under the terms of the CC-BY-NC-ND License. © 2020 Loiacono MM et al. JAMA Network Open.
Corresponding Author: Matthew M. Loiacono, MSc, Vaccine Epidemiology and Modeling, Sanofi Pasteur, 1 Discovery Dr, ATTN: Matthew M. Loiacono, B60 - 360, Swiftwater, PA 18370 (firstname.lastname@example.org).
Author Contributions: Mr Loiacono had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Loiacono, Mitsakakis, Chit, Grootendorst.
Acquisition, analysis, or interpretation of data: Loiacono, Mitsakakis, Kwong, Gomez, Chit.
Drafting of the manuscript: Loiacono, Gomez, Grootendorst.
Critical revision of the manuscript for important intellectual content: Loiacono, Mitsakakis, Kwong, Gomez, Chit, Grootendorst.
Statistical analysis: Loiacono.
Obtained funding: Chit.
Administrative, technical, or material support: Gomez, Chit.
Supervision: Mitsakakis, Chit, Grootendorst.
Conflict of Interest Disclosures: Mr Loiacono and Drs Gomez and Chit reported being full-time employees of Sanofi Pasteur. Mr Loiacono reported receiving nonfinancial support from Sanofi Pasteur during the conduct of the study. Dr Grootendorst reported receiving grants from Sanofi Pasteur Canada in the form of financial support for an academic conference he organized and grants from Sanofi US to provide stipends for a PhD student (Mr Loiacono) under his supervision outside the submitted work. No other disclosures were reported.
Funding/Support: This study was funded by Sanofi Pasteur.
Role of the Funder/Sponsor: The funder reviewed and approved the design of the study; enabled collection of the data via a multistudy CPRD data license; and reviewed and approved the manuscript. The funder had no role in conduct of the study; management, analysis, and interpretation of the data; preparation of the manuscript, or decision to submit the manuscript for publication.
Additional Contributions: Daniel Gibbons, MPhil, from UK Behavioral Insights team, provided advice regarding the clinical utility of the prediction model in a practice setting as well as the methodology to construct the model. He received no compensation for his contribution.