Comparison of Mortality and Major Cardiovascular Events Among Adults With Type 2 Diabetes Using Human vs Analogue Insulins

This cohort study examines the association of analogue compared with human insulin use with mortality and major cardiovascular events among adults with type 2 diabetes.


eMethods 1 - Cohort Construction
We searched the electronic health records and administrative databases of the HP, KPCO, KPNC, and KPSC health plans to identify all diabetes patients between 1/1/2000 and 12/31/2013 with a first insulin dispensing between 1/1/2005 and 12/31/2013. The algorithm used to identify diabetes patients is described below (second bullet). The date of first insulin dispensing is referred to as the index date. Each patient who met all of the following criteria was included in the main study cohort:
• age on index date ≥21 and ≤89
• diabetes recognition occurred before or on index date, where the diabetes recognition date was defined from the patient's diagnoses from inpatient, ambulatory, laboratory, and pharmacy encounters. Specifically, diabetes recognition was defined as the earlier of one inpatient diagnosis (ICD-9-CM 250.x, 357.2, 366.41, 362.01-362.07) or any combination of two of the following events occurring within a 24-month period of time, using the date of the first event in the pair as the identification date: 1) A1C > 6.5% (48 mmol/mol); 2) fasting plasma glucose > 126 mg/dl (7.0 mmol/L); 3) random plasma glucose > 200 mg/dl (11.1 mmol/L); 4) an outpatient diagnosis code (same codes as inpatient); 5) any anti-hyperglycemic medication dispensing. For example, an individual with an A1C of 7.5% (57 mmol/mol) followed by an outpatient diagnosis of diabetes would be identified with diabetes on the (earlier) date of the A1C, with a laboratory result as the primary source. When the two events used for identification came from the same source (e.g., two outpatient diagnoses), they were required to occur on separate dates, but no more than 24 months apart. Note the following exception: two dispensings of metformin, thiazolidinediones, or liraglutide, with no other indication of diabetes, were not counted because these agents could be used for diabetes prevention, weight loss, or to treat polycystic ovarian syndrome. Events that were identified during a pregnancy (within 270 days prior to a delivery) were excluded from consideration
• minimum of 12 months of health plan enrollment before index date, allowing for multiple gaps not exceeding 90 days combined
• minimum of 12 months of drug coverage before index date, allowing for multiple gaps not exceeding 90 days combined
• not pregnant on index date
• no evidence of bariatric surgery in the 2 years before the index date, i.e., no record of the following ICD-9 procedure and CPT-
• no evidence of end stage renal disease in the 2 years before the index date, i.e., no record of the following ICD-9 diagnosis, ICD-9 procedure, and CPT-4 codes (kidney transplant): v42.0, 996.81; 55.6, 55.61, 55.69; 50360, 50365, 50380, and most recent GFR laboratory result (if any) ≥15, and no record of 2 or more of the following ICD-9 diagnosis, ICD-9 procedure, and CPT-4 codes dated >90 days apart as primary or secondary diagnosis (dialysis): 585.6
• no evidence of a stage 4 cancer diagnosis in the 2 years before the index date, i.e., no record of the following ICD-9 diagnosis codes: 197.x, 198.x, 199.x
• no evidence of hospice or palliative care in the 2 years before the index date, i.e., no record of a hospice encounter, no record of the ICD-9 diagnosis code v66.7, and no record of the CPT codes 99377 and 99378
• at least one A1c laboratory measurement recorded in the 2 years before the index date
• insulins dispensed on the index date do not include animal or inhaled insulins
• diabetes of type 2, defined by the following ratio being strictly lower than 50%: the number of ICD-9 diagnosis codes 250.x1 and 250.x3 (type 1) in the 2 years before the index date divided by the sum of this number and the number of ICD-9 codes 250.x0 and 250.x2 (type 2) in the 2 years before the index date. If this ratio is not defined (i.e., the denominator is 0), the diabetes type is unknown and the patient is excluded from the study cohort.
In addition to the criteria above, KPCO patients living outside the Denver/Boulder area were excluded due to incomplete data capture.
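The diabetes type criterion above can be illustrated with a minimal Python sketch. The function names and the labels "type 2", "not type 2", and "unknown" are hypothetical and chosen for illustration; only the ratio and the strict 50% threshold come from the text.

```python
from typing import Optional


def type1_code_ratio(n_type1_codes: int, n_type2_codes: int) -> Optional[float]:
    """Share of type 1 codes (250.x1, 250.x3) among all type-specific codes
    recorded in the 2 years before the index date.

    Returns None when the denominator is 0, i.e., the ratio is not defined.
    """
    total = n_type1_codes + n_type2_codes
    if total == 0:
        return None
    return n_type1_codes / total


def classify_diabetes_type(n_type1_codes: int, n_type2_codes: int) -> str:
    """Apply the strict <50% rule; labels are illustrative, not from the source."""
    ratio = type1_code_ratio(n_type1_codes, n_type2_codes)
    if ratio is None:
        return "unknown"  # patient excluded from the study cohort
    return "type 2" if ratio < 0.5 else "not type 2"
```

For instance, a patient with one 250.x1 code and three 250.x0 codes has a ratio of 0.25 and is classified as type 2, whereas a patient with two codes of each kind has a ratio of exactly 0.5 and does not meet the strict inequality.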

eMethods 2 - Data Structure and Notation
All analyses in this report are based on analytic datasets constructed with the MSMstructure SAS macro¹ to coarsen daily EHR data using the 90-day unit of time, i.e., time-dependent variables are updated every 90 days in the resulting analytic datasets. More specifically, for each of the five failure time outcomes considered (eTable 1), a separate analytic dataset is constructed by collecting the realizations of the random variables described below for all patients in the main or CVD study cohort.
Follow-up time (expressed in 90-day units) is denoted by $t$ and, by convention, the first 90 days of follow-up are denoted by $t = 0$. The time when the patient's follow-up ends is denoted by $\tilde{T}$ and is defined as the earliest of the time to failure, denoted by $T$, or the time to a right-censoring event, denoted by $C$. When a patient is right-censored, i.e., $C < T$, the type of right-censoring event experienced by the patient is recorded and denoted by $\Gamma$, with possible values 1-7 to represent the administrative end of study, disenrollment from the health plan, start of a pregnancy, switch in therapy type (i.e., crossover from human-only to analog-containing therapy or vice versa), initiation of a non-standard insulin (i.e., inhaled or animal insulin), interruption of insulin therapy, or death, respectively. The indicator that the end of follow-up is due to the occurrence of a failure event is denoted by $\Delta = I(T \le C)$, i.e., $\Delta = 1$ implies that $\tilde{T} = T$ and $\Delta = 0$ implies that $\tilde{T} = C$.
The indicator that the patient initiated analog-containing insulin therapy on the index date is represented by the binary variable $A_1(0)$ (i.e., $A_1(0) = 0$ indicates exposure to human-only insulin therapy). The indicator of the patient's right-censored status at time $t$ is denoted by $A_2(t)$. We thus have $A_2(t) = 0$ for $t = 0, \ldots, \tilde{T} - 1$ when $\tilde{T} \ge 1$ and $A_2(\tilde{T}) = 1 - \Delta$. The exposure variable denoted by $A(t)$ is defined by $A(0) = (A_1(0), A_2(0))$ and $A(t) = A_2(t)$ for $t > 0$.
At each time point $t = 0, \ldots, \tilde{T}$, covariates such as A1c measurements (eTables 2-3) are denoted by a component $L_j(t)$ of the random vector $L(t)$ and defined from measurements that occur before the exposure at time $t$, $A(t)$, or are otherwise assumed not to be affected by the exposures at time $t$ or thereafter, $(A(t), A(t+1), \ldots)$. If no such measurements were collected, each variable $L_j(t)$ is defined by convention using last observed value carried forward at $t > 0$. If no baseline measurements were collected for a continuous variable in $L(0)$, the variable is defined by convention as the median of the baseline values from patients with observed measurements at $t = 0$. For categorical variables in $L(0)$, a separate level is defined to encode missing baseline measurements. For each time-independent or time-dependent covariate $L_j$ with at least one missing measurement (at baseline or at $t > 0$), an indicator of missing covariate measurement at time $t$ is created and included as a distinct variable (e.g., to encode intensity of clinical monitoring) in the random vector $L(t)$ for all time points $t$.
In addition, the vector of covariates $L(t)$ at time $t$ includes an outcome measurement denoted by $Y(t)$, i.e., $Y(t) \in L(t)$ for $t = 0, \ldots, \tilde{T}$. For each time point $t = 1, \ldots, \tilde{T} + 1$, the outcome is the indicator of past failure, i.e., $Y(t) = I(T \le t - 1)$ and $Y(0) = 0$ by convention. By definition, the outcome is thus 0 for $t = 0, \ldots, \tilde{T}$, not observed at $t = \tilde{T} + 1$ if $\Delta = 0$, and 1 at $t = \tilde{T} + 1$ if $\Delta = 1$.
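The conventions for missing measurements of a continuous covariate can be sketched as follows in Python. This is an illustration only, not the MSMstructure SAS macro: the function names are hypothetical, `None` stands for an interval with no measurement, and carrying the imputed baseline median forward when later intervals are also missing is an assumption.

```python
from statistics import median
from typing import List, Optional, Tuple


def baseline_median(baseline_values: List[Optional[float]]) -> float:
    """Median of the baseline values among patients with an observed value at t = 0."""
    observed = [v for v in baseline_values if v is not None]
    return median(observed)


def fill_covariate(
    series: List[Optional[float]], cohort_baseline_median: float
) -> Tuple[List[float], List[int]]:
    """Fill one patient's covariate series over t = 0, 1, ...

    Returns the filled series and the indicator of a missing measurement at each t.
    """
    filled: List[float] = []
    missing: List[int] = []
    for t, value in enumerate(series):
        missing.append(1 if value is None else 0)
        if value is not None:
            filled.append(value)
        elif t == 0:
            # Missing baseline: use the cohort median of observed baseline values.
            filled.append(cohort_baseline_median)
        else:
            # Missing at t > 0: last value carried forward (assumed to include
            # an imputed baseline value).
            filled.append(filled[-1])
    return filled, missing
```

With a cohort baseline median of 8.0, the series `[None, 7.5, None, None, 8.1]` becomes `[8.0, 7.5, 7.5, 7.5, 8.1]` with missingness indicators `[1, 0, 1, 1, 0]`.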
In short, the observed data in each analytic dataset are realizations of $n$ copies $O_i$ of the random process $O = (\tilde{T}, \Delta, (1 - \Delta)\Gamma, \bar{L}(\tilde{T}), \bar{A}(\tilde{T}), \Delta Y(\tilde{T} + 1))$, where $n = 127{,}600$ in each of the four analytic datasets to evaluate AMI, CHF, CVA, and all-cause mortality, and $n = 95{,}300$ in the analytic dataset to evaluate CVD mortality. In the analyses of each dataset, we assumed² that the random variables $O_i$ are independent and identically distributed.
To simplify expressions below, we use the overbar notation $\bar{\cdot}$ to denote the history of a variable $\cdot$ from baseline to time $t$ (e.g., $\bar{A}(t) = (A(0), \ldots, A(t))$) and, by convention, $L(t)$ and $A(t)$ are nil when $t < 0$.
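The follow-up, censoring, and outcome variables defined in this section can be sketched for a single patient as follows. This is a minimal Python illustration with hypothetical function and key names; it assumes that both the failure time and the censoring time are supplied in 90-day units, and that days since the index date are counted from 0.

```python
from typing import Dict, List, Optional


def interval_index(days_since_index: int) -> int:
    """Map a day of follow-up to its 90-day interval (days 0-89 give t = 0)."""
    return days_since_index // 90


def observed_record(failure_time: int, censoring_time: int) -> Dict[str, object]:
    """Build the end of follow-up, failure indicator, censoring process, and outcome process."""
    t_tilde = min(failure_time, censoring_time)
    delta = 1 if failure_time <= censoring_time else 0
    # Censoring indicator: 0 before the end of follow-up, then 1 - delta at the end.
    a2: List[int] = [0] * t_tilde + [1 - delta]
    # Outcome: 0 for t = 0..t_tilde; at t_tilde + 1 it is 1 after a failure
    # and not observed (None) after right-censoring.
    y: List[Optional[int]] = [0] * (t_tilde + 1) + [1 if delta == 1 else None]
    return {"t_tilde": t_tilde, "delta": delta, "A2": a2, "Y": y}
```

For example, a failure time of 3 with a censoring time of 5 gives an end of follow-up at 3, a failure indicator of 1, a censoring process of four zeros, and an outcome process `[0, 0, 0, 0, 1]`.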
Inferences for the AUC and RD effect measures were derived from prior work⁵ based on the delta method and the influence curve of the IPW estimator $\beta_n$.

eMethods 4 - Denominator of the Inverse Probability Weights
The conditional probabilities $P(A(t) = a_k(t) \mid \bar{L}(t), \bar{Y}(t) = 0, \bar{A}(t - 1) = \bar{a}_k(t - 1))$ for $t = 0, \ldots, 9$ and $k = 0, 1$ that define the denominators of the IP weights used to fit the MSMs described above can be factorized based on the following 10 propensity scores (PS) for:
• baseline initiation of analog-containing insulin therapy denoted by $\mu_1(0)$:
• right-censoring due to administrative end of study denoted by $\mu_2(t)$:
• right-censoring due to disenrollment from the health plan denoted by $\mu_3(t)$:
• right-censoring due to start of pregnancy denoted by $\mu_4(t)$: where $L_{♀}(0)$ denotes the indicator that the patient is female
• right-censoring due to crossover from analog-containing to human-only insulin therapy denoted by $\mu_5(t)$:
• right-censoring due to crossover from human-only to analog-containing insulin therapy denoted by $\mu_6(t)$:
• right-censoring due to initiation of non-standard insulins (animal or inhaled) denoted by $\mu_7(t)$:
• right-censoring due to early (i.e., at $t = 2$) interruption of insulin therapy denoted by $\mu_8(2)$:
• right-censoring due to late (i.e., at $t > 2$) interruption of insulin therapy denoted by $\mu_9(t)$:
• right-censoring due to death denoted by $\mu_{10}(t)$:
We note that the last PS above is not considered to define the IP weights in the analyses that evaluate all-cause mortality because death is then the failure outcome of interest (i.e., there is no right-censoring due to death). For the AMI, CHF, CVA, and CVD mortality outcomes, we constructed the denominators of the IP weights for all outcomes contributing to the MSM fits as follows for $t = 0, \ldots, 9$:
Each of the first three approaches considered for estimating these denominators of the IP weights consists in fitting a separate logistic model for each of the 10 PS $\mu_j(t)$ just described. The three approaches differ only in the set of covariates that defines the main terms included in each logistic model. We describe these sets in the next section.
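To illustrate how cause-specific propensity scores can combine into a weight denominator, the following Python sketch uses a generic sequential factorization. It is an assumption for illustration and not necessarily the factorization used in these analyses: the denominator at time t is taken as the probability of the observed baseline therapy multiplied by the probabilities of escaping each cause of right-censoring at every interval up to t. All function and argument names are hypothetical.

```python
from typing import List, Sequence


def ipw_denominators(
    p_baseline_therapy: float, censoring_probs: Sequence[Sequence[float]]
) -> List[float]:
    """Cumulative IP weight denominators for a patient who remains uncensored.

    p_baseline_therapy: estimated probability of the therapy actually initiated.
    censoring_probs[t][j]: estimated probability of right-censoring by cause j
        at interval t, given no censoring by the causes listed before it
        (assumed ordering).
    """
    denominators: List[float] = []
    running = p_baseline_therapy
    for probs_t in censoring_probs:
        for mu in probs_t:
            running *= 1.0 - mu  # probability of escaping this cause of censoring
        denominators.append(running)
    return denominators


def ip_weights(denominators: Sequence[float]) -> List[float]:
    """Unstabilized inverse probability weights (stabilization is not shown)."""
    return [1.0 / d for d in denominators]
```

For example, a baseline probability of 0.5 with censoring probabilities `[[0.1, 0.0], [0.2]]` gives denominators of 0.45 and 0.36 at t = 0 and t = 1, up to floating-point rounding.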

eMethods 5 - Standard Propensity Score Estimation with Three Covariate Adjustment Sets
In the first approach implemented to estimate the denominators of the IP weights, the main terms included in a given PS logistic model were those associated with covariates presumed to impact both failure and the PS outcome as indicated in eTables 4-5. For instance, in the analyses of CHF, the PS logistic model for baseline initiation of analog-containing (versus human-only) insulin therapy included main terms for all covariates in these tables where a value of 1 is found in both the $\mu_1(0)$ and CHF columns. For the time-dependent covariates selected based on this rationale, only main terms for their current values $L(t)$ were included in the PS logistic models, i.e., no main terms for other summary measures of the covariate histories were considered (e.g., latest change in value $L(t) - L(t - 1)$ or a lagged value $L(t - 1)$). In addition, all PS logistic models except for non-standard insulin initiation included main terms for the patient's age at index date, and the PS logistic model for $\mu_1(0)$ also included main terms for and interaction terms between the dummy variables that encode health plan membership (i.e., HP, KPCO, KPNC, or KPSC) and the index date year. All PS logistic models fitted with pooled data over time (i.e., $\mu_j(t)$ for $j = 2, \ldots, 7, 9, 10$) also included main terms for time $t$ (expressed in 90-day intervals). In addition, except for the PS logistic model for $\mu_1(0)$, all other PS models included a main term for the baseline insulin therapy $A_1(0)$. For the PS logistic models for administrative end of study and start of pregnancy, only main terms for age at index, $t$, and $A_1(0)$ were included in the models. For the PS logistic model for the initiation of non-standard insulins, only main terms for $t$ and $A_1(0)$ were included in the model because <5 patients initiated non-standard insulins, which limited the number of covariates that could be considered.
All continuous variables considered by the various PS logistic models were discretized using the cutoffs given in eTable 6 and main terms for the resulting dummy variables (for the non-reference level) were included in the models. eTable 7 provides an example of the logistic model fit for $\mu_5(t)$ based on the PS estimation approach 1.
The second approach implemented to estimate the denominators of the IP weights followed the same principles with the difference that the main terms included in a given PS logistic model (including for start of pregnancy and administrative end of study) were those associated with covariates presumed to, at least, impact failure as indicated in eTables 4-5. However, for the PS logistic model for the initiation of non-standard insulins, only main terms for $t$ and $A_1(0)$ were included in the model because <5 patients initiated non-standard insulins, which limited the number of covariates that could be considered. All other modeling decisions were identical to those of the first approach described above. eTables 8-9 provide an example of the logistic model fit for $\mu_5(t)$ based on the PS estimation approach 2.
The third approach implemented to estimate the denominators of the IP weights followed the same principles with the difference that the main terms included in a given PS logistic model were those associated with the covariates presumed to impact either failure or the PS outcome as indicated in eTables 4-5. The PS logistic models for the start of pregnancy and administrative end of study included main terms for all covariates presumed to affect failure. However, for the PS logistic model for the initiation of non-standard insulins, only main terms for $t$ and $A_1(0)$ were included in the model because <5 patients initiated non-standard insulins, which limited the number of covariates that could be considered. All other modeling decisions were identical to those of the first approach described above. eTables 10-11 provide an example of the logistic model fit for $\mu_5(t)$ based on the PS estimation approach 3.
Thus, the three sets of variables that define the main terms included in any given PS logistic model according to the three approaches just described are nested and of increasing size.
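The nesting of the three covariate sets follows from their definitions as an intersection, a single set, and a union. A minimal Python sketch, with hypothetical covariate names and 0/1 indicators standing in for the columns of eTables 4-5, and ignoring the terms forced into or excluded from specific models:

```python
from typing import Dict, Set


def covariate_sets(
    affects_failure: Dict[str, int], affects_ps_outcome: Dict[str, int]
) -> Dict[str, Set[str]]:
    """Covariate selection under PS estimation approaches 1-3.

    Each argument maps a covariate name to 1 if the covariate is presumed to
    impact failure (or the PS outcome) and to 0 otherwise.
    """
    failure = {name for name, flag in affects_failure.items() if flag == 1}
    ps_outcome = {name for name, flag in affects_ps_outcome.items() if flag == 1}
    return {
        "approach_1": failure & ps_outcome,  # impacts both
        "approach_2": failure,               # impacts at least failure
        "approach_3": failure | ps_outcome,  # impacts either
    }


# Hypothetical indicator columns for three covariates.
example = covariate_sets(
    {"a1c": 1, "bmi": 1, "insurance_type": 0},
    {"a1c": 1, "bmi": 0, "insurance_type": 1},
)
```

In this example, approach 1 selects only `a1c`, approach 2 selects `a1c` and `bmi`, and approach 3 selects all three covariates.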

eMethods 6 - Data-adaptive Propensity Score Estimation
In the fourth approach implemented to estimate the denominators of the IP weights, a separate super learner⁶ was used to estimate each of the 10 PS $\mu_j(t)$ instead of a separate logistic model (as done in the first three approaches). Each super learner was constructed based on 10-fold cross-validation and three learners corresponding to the same three logistic models considered in the first three PS estimation approaches described above. eTable 12 provides an example of the super learner fit for $\mu_5(t)$ based on the PS estimation approach 4.

eMethods 7 - Results
eTable 13 describes the proportions of patients initiating HI versus AI therapy by site and year of study entry for patients in the main cohort. This table indicates that the great majority of patients from site 4 were first prescribed AI, with little fluctuation over the years of the study. This is in contrast to the other 3 sites, where most patients were first prescribed HI, with relatively little temporal fluctuation at sites 2 and 3 but more temporal fluctuation at site 1 in insulin prescription patterns over the years of the study. Results from eTable 13 motivated the conduct of two sets of sensitivity analyses using, first, only the subset of patients from sites 1-3 (125,257), and second, only the subset of patients from site 1 (64,092).
The distributions of follow-up times by exposure regimen for each of the five primary analyses are described in eTables 14-28.
Results of all primary and sensitivity analyses implemented with the four PS estimation approaches described above, along with their corresponding unadjusted analyses (i.e., same models fitted without weights), are displayed in eTables 29, 30, 31, 32, and 33 for AMI, CHF, CVA, CVD mortality, and all-cause mortality, respectively. Inference for the hazard ratio is given in the column "HR" and derived from the MSM fit that assumes constant hazard ratios over time (proportionality assumption). Inferences in the "AUC", "RD1", and "RD2" columns are derived from the same saturated MSM fit. The "AUC" column contains the p-value from the statistical test that the area between the survival curves is equal to 0. The "RD1" and "RD2" columns provide inferences for the cumulative risk differences at 1 and 2 years (i.e., 4 and 8 quarters) after the index date. 95% confidence intervals for the HR and RDs are given in square brackets, standard errors are given by "SE", and the p-values of the statistical tests that HR=1/RD=0 are given by "p". We note that p-values were not adjusted for multiple testing. The crude (i.e., unadjusted) and SL-based IPW estimates of the counterfactual survival curves associated with the AUC p-values given in the eTables are displayed in eFigures 1-5. Summary statistics for the inverse probability weights involved in all primary and sensitivity analyses are displayed in eTables 34, 35, 36, 37, and 38 for AMI, CHF, CVA, CVD mortality, and all-cause mortality, respectively.
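The relation between interval-specific hazards, survival curves, cumulative risk differences, and the area between the survival curves can be sketched in Python as follows. This is an illustration only, not the estimation procedure: it assumes that the hazards for each 90-day interval are already available for each regimen, that the risk after q quarters is one minus the survival through interval t = q - 1, and that differences are taken as the first regimen minus the second (the sign convention of the eTables is not stated here). All names are hypothetical.

```python
from typing import List, Sequence


def survival_curve(hazards: Sequence[float]) -> List[float]:
    """Survival at the end of each interval from discrete-time hazards."""
    curve: List[float] = []
    surviving = 1.0
    for hazard in hazards:
        surviving *= 1.0 - hazard
        curve.append(surviving)
    return curve


def risk_difference(
    hazards_a: Sequence[float], hazards_b: Sequence[float], quarters: int
) -> float:
    """Cumulative risk under regimen a minus regimen b after `quarters` intervals."""
    risk_a = 1.0 - survival_curve(hazards_a)[quarters - 1]
    risk_b = 1.0 - survival_curve(hazards_b)[quarters - 1]
    return risk_a - risk_b


def area_between_curves(
    hazards_a: Sequence[float], hazards_b: Sequence[float]
) -> float:
    """Sum over intervals of the survival difference (regimen a minus regimen b)."""
    curve_a = survival_curve(hazards_a)
    curve_b = survival_curve(hazards_b)
    return sum(s_a - s_b for s_a, s_b in zip(curve_a, curve_b))
```

Under these assumptions, RD1 and RD2 would correspond to `risk_difference(..., quarters=4)` and `risk_difference(..., quarters=8)`, and two identical hazard sequences give risk differences and an area of 0.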
Null findings from the primary PP analyses are generally supported by the adjusted estimates from sensitivity PP analyses. CHF results from the site 1 sensitivity analyses based on PS estimation with logistic models using covariate sets 2 and 3 and data-adaptive PS estimation with SL provided the greatest statistical evidence of a potential difference between the two exposure regimens considered and suggest a potential beneficial effect of AI against CHF, but not all-cause mortality, CVD, MI, or CVA.
Part II of II - List of covariates considered in the various analyses and whether they are assumed to impact exposure decisions, censoring events, or outcomes.
[Table body not recovered; surviving labels: Initial insulin, Insurance coverage, Death, Adherence to initial insulin, Time-]
eTable 14: Distribution of follow-up time (expressed in 90-day intervals) for patients continuously exposed to analog-containing insulin therapy in the primary AMI analyses (all sites combined).
eTable 17: Distribution of follow-up time (expressed in 90-day intervals) for patients continuously exposed to analog-containing insulin therapy in the primary CHF analyses (all sites combined).
eTable 20: Distribution of follow-up time (expressed in 90-day intervals) for patients continuously exposed to analog-containing insulin therapy in the primary CVA analyses (all sites combined).
eTable 23: Distribution of follow-up time (expressed in 90-day intervals) for patients continuously exposed to analog-containing insulin therapy in the primary CVD mortality analyses (all sites combined).