Association of Intraoperative Transesophageal Echocardiography and Clinical Outcomes After Open Cardiac Valve or Proximal Aortic Surgery

Key Points
Question: Is intraoperative transesophageal echocardiography (TEE) use associated with improved clinical outcomes among patients undergoing cardiac valve or proximal aortic surgery?
Findings: This matched cohort study of 872 936 patients undergoing cardiac valve or aortic surgery between 2011 and 2019 found that intraoperative TEE use was associated with lower 30-day mortality, a lower incidence of stroke or 30-day mortality, and a lower incidence of cardiac reoperation or 30-day mortality.
Meaning: These findings suggest that intraoperative TEE may improve clinical outcomes after open cardiac valve (repair or replacement) and/or aortic surgery.


Baseline Characteristics of the Study Cohort
After applying the inclusion/exclusion criteria, we are left with n = 872,936 patients. eTable 1 (Part I), 2 (Part II), and 3 (Part III) summarize baseline covariates of the study cohort, including demographics, preexisting comorbid conditions, laboratory values, surgical variables, surgery type, and predicted risk scores.

Hospitals' Preference for TEE
We explore hospitals' preference for using TEE during valve surgeries in this section. eFigure 1 plots the overall distribution of hospitals' preference (TEE fraction), and eFigure 2 presents boxplots of hospitals' preference by geographic region.

Surgeons' Preference for TEE
We explore surgeons' preference for using TEE during valve surgeries in this section. eFigure 3 plots the overall distribution of surgeons' preference (TEE fraction), and eFigure 4 presents the boxplots of surgeons' preference by geographic region.

eAppendix 4. Details on Statistical Matching Methodology
Statistical matching is a commonly used method to adjust for observed covariates and embed observational data into an approximately randomized experiment (Rosenbaum, 2002, 2010). We define terminology in Section 4.1, give details of the all-patient matched comparison in Section 4.2, and give details of the two within-surgeon, within-hospital matches in Section 4.3.

Glossary of Matching Terms
Bipartite Matching: Matching cases to controls based on a binary treatment status.
Optimal (Bipartite) Matching: Matching cases to controls in an optimal way, such that a properly defined total case-to-control distance is minimized after matching.
Propensity Score: The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates (Rosenbaum and Rubin, 1983).
Optimal Matching Within Propensity Score Calipers: A hybrid matching method that minimizes the total case-to-control distance subject to the constraint that matched case and control units differ in their estimated propensity scores by no more than a value known as the "caliper" (Rosenbaum and Rubin, 1985). Rosenbaum and Rubin (1985) found that this hybrid method is superior to both metric-based matching and propensity score matching.

Mahalanobis Distance: A multivariate measure of covariate distance between units in a sample (Mahalanobis, 1936; Rubin, 1980). For a single covariate, the Mahalanobis distance equals the absolute difference in covariate values between a treated unit and its matched control unit, divided by the covariate's standard deviation. With several covariates, the Mahalanobis distance also takes into account the correlation structure among them. The distance is zero if two units have the same value for all covariates and increases as two units become more dissimilar.
Exact Matching: Matching cases to controls requiring the same value of a nominal covariate (Rosenbaum, 2002).

Fine Balance: A matching technique that balances exactly the marginal distribution of one nominal variable or the joint distribution of several nominal variables in the treated and control groups after matching (Yu et al., 2020).
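To make the glossary entry on optimal matching within propensity score calipers concrete, the following is a minimal sketch in Python (not the R software used in the study). It assumes a hypothetical data layout in which each unit is a `(propensity_score, covariate)` tuple, uses the absolute difference on a single covariate as a stand-in for a Mahalanobis distance, and finds the optimum by exhaustive search, which is only workable for toy inputs. The function name and example values are illustrative assumptions.

```python
from itertools import permutations


def optimal_caliper_match(focal, pool, caliper):
    """Minimum-total-distance 1:1 match subject to a propensity score caliper.

    focal, pool: lists of (propensity_score, covariate) tuples (hypothetical layout).
    Returns (total_distance, assignment), where assignment[i] is the index in
    `pool` matched to focal unit i, or None if no assignment satisfies the caliper.
    Exhaustive search: only suitable for toy inputs.
    """
    best = None
    for assignment in permutations(range(len(pool)), len(focal)):
        pairs = list(enumerate(assignment))
        # Caliper constraint: matched units must have close propensity scores.
        if any(abs(focal[i][0] - pool[j][0]) > caliper for i, j in pairs):
            continue
        # Covariate distance (absolute difference here, standing in for a
        # Mahalanobis distance on several covariates).
        total = sum(abs(focal[i][1] - pool[j][1]) for i, j in pairs)
        if best is None or total < best[0]:
            best = (total, assignment)
    return best


# Illustrative data only: (propensity score, covariate value).
focal = [(0.30, 60), (0.50, 70)]
pool = [(0.32, 75), (0.48, 61), (0.55, 69), (0.90, 60)]
print(optimal_caliper_match(focal, pool, caliper=0.10))
print(optimal_caliper_match(focal, pool, caliper=1.00))
```

With the tight caliper, the first focal unit can only be paired with pool unit 0, even though pool unit 3 has an identical covariate value. Loosening the caliper lets the match select pool unit 3, which lowers the covariate distance at the cost of a large propensity score gap.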

Study Design
In the all-patient matched comparison, each of the 161,610 patients undergoing valve surgery without TEE in the study cohort is matched to one patient undergoing valve surgery with TEE. We matched exactly on
2. Under the "optimal matching within a propensity score caliper" regime, the size of the caliper is important: too large a caliper renders the matching problem computationally challenging, while too small a caliper renders the matching problem infeasible (i.e., we cannot match each no-TEE unit with one TEE unit). Fortunately, the function optcal in the package bigmatch is able to determine the smallest caliper size such that the matching problem remains feasible, using Glover's algorithm. We therefore used a caliper size equal to E + 0.05, where E is the smallest caliper size such that the matching problem remains feasible, as determined by the function optcal in each of the 25 strata; we added 0.05 to allow some flexibility.
3. Function nfmatch allows researchers to pursue "optimal propensity score matching" while finely balancing the joint distribution of a list of pre-specified categorical variables at the same time.
To do this, we set the option fine in the function nfmatch to the list of 11 variables to be finely balanced, as specified in Section 4.2.1.
See Section 11 for R code implementing the match.
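The idea of a smallest feasible caliper can be illustrated with a short Python sketch. It is not the optcal implementation from bigmatch. It assumes a one-dimensional score, a 1:1 match of every focal unit (here, the no-TEE units) to a distinct unit in a larger pool, and a greedy feasibility check in the spirit of Glover's algorithm for convex bipartite graphs, combined with a binary search over candidate calipers. All function names are hypothetical.

```python
import heapq


def caliper_is_feasible(focal, pool, caliper):
    """Return True if every focal score can be paired 1:1 with a pool score
    lying within `caliper` of it (greedy sweep over the sorted pool)."""
    focal = sorted(focal)
    pool = sorted(pool)
    waiting = []  # min-heap of focal scores whose window has opened
    nxt = 0
    matched = 0
    for score in pool:
        # Open the window of every focal unit that this pool unit can reach from below.
        while nxt < len(focal) and focal[nxt] - score <= caliper:
            heapq.heappush(waiting, focal[nxt])
            nxt += 1
        if waiting:
            # The smallest waiting focal score has the window that closes first.
            if score - waiting[0] > caliper:
                return False  # its window closed without a partner
            heapq.heappop(waiting)
            matched += 1
    return matched == len(focal)


def smallest_feasible_caliper(focal, pool):
    """Binary search over observed pairwise distances for the smallest
    feasible caliper; returns None if no caliper yields a full match."""
    candidates = sorted({abs(f - p) for f in focal for p in pool})
    if not candidates or not caliper_is_feasible(focal, pool, candidates[-1]):
        return None
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if caliper_is_feasible(focal, pool, candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


# Illustrative scores on an integer (percent) scale to keep arithmetic exact.
E = smallest_feasible_caliper([20, 50], [10, 30, 90])
print(E)
```

On a propensity score scale, the caliper described in the text would then be `E + 0.05` within each stratum. The binary search relies on feasibility being monotone in the caliper: if a full match exists at one caliper, it also exists at any larger one.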

Study Design
In a second study design, we consider matching patients of the same surgeon. Since the same surgeon may practice in different hospitals, we further match exactly on the hospital ID. In other words, each matched pair consists of two patients of the same surgeon in the same hospital, one undergoing valve surgery with TEE and the other without. This design may maximally control for selection bias due to surgeon-level and/or hospital-level residual confounding. We considered two variants of this design. Our primary within-surgeon, within-hospital match only considered patients whose surgeons had an overall preference between 30% and 70% for using TEE during valve surgeries. Moreover, in addition to matching exactly on the surgeon ID and hospital ID, we further matched exactly on the surgery type (including multiple surgery types, e.g., a TEE patient with an AV replacement surgery plus a CABG surgery was matched to a no-TEE patient with an AV replacement surgery plus a CABG surgery), ejection fraction being normal (55% to 70%), New York Heart Association classification, and quartiles of predicted mortality. We summarize below the variables matched exactly:
1. Surgeon ID;
2. Hospital ID;
3. An indicator of ejection fraction being normal (1/0);
4. New York Heart Association classification: 0 (none), 1, 2, 3, 4;
5. Predicted mortality rate quartiles and missingness: NA, 1st, 2nd, 3rd, 4th;
6. An indicator for AV repair (1/0);
7. An indicator for AV replacement (1/0);
8. An indicator for MV repair (1/0);
9. An indicator for MV replacement (1/0);
10. An indicator for tricuspid valve repair/replacement (1/0);
11. An indicator for pulmonic valve repair/replacement (1/0);
12. An indicator for aortic proximal (Aortic root/valved conduit (Bentall), AV sparing root, or Aortic homograft or non-valved conduit) (1/0);
13. An indicator for plus CABG (1/0);
14. An indicator for plus other cardiac surgery (1/0).
We also balanced all other variables under the demographics, admission, preexisting comorbidities, hemodynamic data & laboratory values, surgical variables, and surgery type categories in eTables 1 and 2.
As a complementary analysis, we further considered a within-surgeon, within-hospital match that utilized all surgeons.
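The cohort restriction and exact-matching strata described in this design can be sketched in Python as follows. This is a simplified illustration, not the study's R code: the record layout (`surgeon_id`, `hospital_id`, `tee`) and function names are hypothetical, and treating the 30% and 70% bounds as inclusive is an assumption.

```python
from collections import defaultdict


def tee_fraction_by_surgeon(records):
    """Surgeon preference: fraction of each surgeon's cases performed with TEE."""
    counts = defaultdict(lambda: [0, 0])  # surgeon_id -> [n_tee, n_total]
    for r in records:
        counts[r["surgeon_id"]][0] += r["tee"]
        counts[r["surgeon_id"]][1] += 1
    return {s: n_tee / n for s, (n_tee, n) in counts.items()}


def equivocal_surgeons(fractions, low=0.30, high=0.70):
    """Surgeons whose TEE fraction lies in [low, high] (inclusive by assumption)."""
    return {s for s, f in fractions.items() if low <= f <= high}


def exact_match_strata(records, keys):
    """Group record indices by the exact-match keys, keeping only strata that
    contain at least one TEE and one no-TEE patient (and so can yield pairs)."""
    strata = defaultdict(lambda: {"tee": [], "no_tee": []})
    for idx, r in enumerate(records):
        arm = "tee" if r["tee"] else "no_tee"
        strata[tuple(r[k] for k in keys)][arm].append(idx)
    return {k: v for k, v in strata.items() if v["tee"] and v["no_tee"]}


# Illustrative records only.
records = [
    {"surgeon_id": "A", "hospital_id": "H1", "tee": 1},
    {"surgeon_id": "A", "hospital_id": "H1", "tee": 0},
    {"surgeon_id": "A", "hospital_id": "H2", "tee": 1},
    {"surgeon_id": "A", "hospital_id": "H2", "tee": 0},
    {"surgeon_id": "B", "hospital_id": "H1", "tee": 1},
    {"surgeon_id": "B", "hospital_id": "H1", "tee": 1},
    {"surgeon_id": "B", "hospital_id": "H1", "tee": 1},
    {"surgeon_id": "B", "hospital_id": "H1", "tee": 0},
]
fractions = tee_fraction_by_surgeon(records)
keep = equivocal_surgeons(fractions)
primary = [r for r in records if r["surgeon_id"] in keep]
strata = exact_match_strata(primary, keys=("surgeon_id", "hospital_id"))
```

In the full design, the `keys` tuple would list all 14 exactly matched variables, and the complementary analysis would skip the `equivocal_surgeons` filter so that all surgeons contribute.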

Implementation Details
Within each stratum formed by surgeon ID and hospital ID, we used a statistical matching algorithm called optimal subset matching (Rosenbaum, 2012, 2020) implemented in the R package rcbsubset. We implemented the matching algorithm with all tuning parameters set to their default values. See Section 11 for R code implementing the match.

We analyzed the binary outcome 30-day mortality using McNemar's test (Cox and Snell, 1989; Rosenbaum, 2002).

The left panel of Figure 5 examines the pre-surgery and post-surgery creatinine level in the matched treated and matched control groups (excluding the < 3.5% of patients who did not have a post-surgery creatinine measurement). In the matched treated group, the creatinine level is 1.184 (pre-surgery) versus 1.492 (post-surgery); in the matched control group, the creatinine level is 1.173 (pre-surgery) versus 1.487 (post-surgery). On average, the creatinine level rises by 1.487 − 1.173 = 0.314 in the with-TEE group and 1.492 − 1.184 = 0.308 in the without-TEE group. We test the null hypothesis that TEE has no effect on the creatinine elevation (i.e., sample average treatment effect equal to 0). The p-value is 0.026 (95% CI: [0.001, 0.012]).

The left panel of Figure 7 examines the pre-surgery and post-surgery creatinine level in the matched treated and matched control groups (excluding the < 3% of patients without a post-surgery creatinine measurement). In the matched treated group, the creatinine level is 1.131 (pre-surgery) versus 1.394 (post-surgery); in the matched control group, the creatinine level is 1.122 (pre-surgery) versus 1.389 (post-surgery). On average, the creatinine level rises by 1.394 − 1.131 = 0.263 in the with-TEE group and 0.267 in the without-TEE group. We test the null hypothesis that TEE has no effect on the creatinine elevation (i.e., sample average treatment effect equal to 0). The p-value is 0.633.

The left panel of Figure 7 examines the pre-surgery and post-surgery creatinine level in the matched treated and matched control groups (excluding the < 3% of patients without a post-surgery creatinine measurement). In the matched treated group, the creatinine level is 1.122 (pre-surgery) versus 1.392 (post-surgery); in the matched control group, the creatinine level is 1.144 (pre-surgery) versus 1.426 (post-surgery). On average, the creatinine level rises by 1.392 − 1.122 = 0.270 in the with-TEE group and 0.282 in the without-TEE group. We test the null hypothesis that TEE has no effect on the creatinine elevation (i.e., sample average treatment effect equal to 0), and the p-value is 0.002 (95% CI: [−0.021, −0.005]). We detected a very minor effect of TEE on creatinine elevation.
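For readers unfamiliar with McNemar's test, which this section applies to the binary 30-day mortality outcome in matched pairs, the following Python sketch shows the exact (binomial) two-sided version. It is an illustration only: the function name and the pair counts are hypothetical and are not study results, and the study's own computation may have used a different variant of the test.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: matched pairs in which only the no-TEE patient died within 30 days.
    c: matched pairs in which only the TEE patient died within 30 days.
    Concordant pairs (both died or both survived) carry no information
    about the treatment difference and are ignored.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under the null, each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only.
p_value = mcnemar_exact(b=10, c=2)
```

The test conditions on the discordant pairs, so with a large matched cohort only the pairs in which exactly one patient died contribute to the p-value.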

Summary
eTable 20 summarizes results of the unadjusted analysis and the three matched comparisons. In a sensitivity analysis, we investigate how large a bias from unmeasured confounding would need to be to alter our primary analysis results. We used a methodology developed in Rosenbaum (2002, Section 4.3.1) and Rosenbaum and Silber (2009).
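As a rough illustration of this kind of sensitivity analysis, the Python sketch below computes one standard Rosenbaum-style worst-case bound for a matched-pair binary outcome: if unmeasured confounding could change the odds of treatment within a pair by at most a factor Γ, then in each discordant pair the chance that the no-TEE patient is the one who died is at most Γ/(1 + Γ). The precise procedure used for the primary analysis is not reproduced here, so the function names, the one-sided direction, and the counts are assumptions for illustration.

```python
from math import comb


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation (small n only)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def sensitivity_upper_pvalue(n_discordant, n_favoring_tee, gamma):
    """Worst-case one-sided p-value under hidden bias of magnitude gamma >= 1.

    n_discordant: pairs in which exactly one patient died.
    n_favoring_tee: discordant pairs in which only the no-TEE patient died.
    gamma = 1 corresponds to no hidden bias (a one-sided exact McNemar test).
    """
    p_plus = gamma / (1 + gamma)
    return binom_upper_tail(n_discordant, n_favoring_tee, p_plus)


# Hypothetical counts for illustration only.
for gamma in (1.0, 1.5, 2.0):
    print(gamma, sensitivity_upper_pvalue(12, 10, gamma))
```

The bound grows with Γ. The smallest Γ at which it crosses the significance level indicates how strong an unmeasured confounder would have to be to explain away the finding.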

Rationale Behind Negative Control Outcome Selection
While all adjusted analyses agreed and all held up against robust statistical sensitivity analyses, we felt a negative control outcome analysis could serve as an additional check on the validity of our findings associating TEE with improved outcomes. An ideal negative control outcome is one that is implausibly related to TEE use (or lack of use). We considered multiple candidates for a negative control outcome, including surgical site infection, deep vein thrombosis, and arrhythmia. Although all of these outcomes were appropriate in that sense (i.e., implausibly related to TEE use), all were associated with an increased rate of mortality and therefore suffered from survival bias. We therefore felt none of them could be used as a negative control unless combined with mortality and analyzed as a composite, which was not suitable given that our primary outcome was mortality. In contrast, a postoperative creatinine level was obtained in every patient (even those who ultimately died). We therefore selected creatinine elevation, defined as postoperative creatinine level minus preoperative creatinine level, as our negative control outcome. Elevation in creatinine level is an ideal negative control outcome to test for residual confounding in the comparison of TEE vs no TEE for the following three reasons.
1. In our cohort, postoperative creatinine was a laboratory value obtained in every patient (even those who ultimately died) immediately postoperatively and is therefore free from survival bias.
2. While creatinine elevation is highly probable after cardiac surgery (i.e., creatinine is higher after surgery than before), it is reasonable to assume that the increase in creatinine would be roughly evenly distributed across the TEE and no TEE groups.
3. Even if the above assumption (point 2) were untrue, and TEE-guided optimization of hemodynamic management during cardiac surgery limited the degree of acute kidney injury, the TEE group would have a smaller elevation in creatinine (compared to the no TEE group). If the TEE group demonstrated a statistically significantly greater elevation in creatinine (compared to the no TEE group), this "worse outcome" result of the negative control outcome analysis would be in disagreement with the "improved outcome" results of the primary analyses. In other words, a statistically significant finding of a greater elevation in creatinine in the TEE group would represent a bias against our primary finding of improved outcome with TEE, indicating that our primary findings of an association between TEE and improved outcomes could be an underestimate of the true clinical outcomes benefit.
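The negative control outcome defined above (postoperative minus preoperative creatinine) can be compared across matched pairs as in the following Python sketch. It uses a paired normal-approximation test and confidence interval, which is an assumption made for illustration and may differ from the test used in the study; the function name, data layout, and values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def paired_elevation_test(pairs):
    """Compare creatinine elevation within matched pairs.

    pairs: list of ((pre_tee, post_tee), (pre_no_tee, post_no_tee)).
    Returns (mean paired difference in elevation, 95% CI, two-sided p-value),
    where a positive difference means a larger elevation with TEE.
    Requires at least two pairs with non-identical differences.
    """
    diffs = [
        (post_t - pre_t) - (post_c - pre_c)
        for (pre_t, post_t), (pre_c, post_c) in pairs
    ]
    estimate = mean(diffs)
    se = stdev(diffs) / len(diffs) ** 0.5
    z = estimate / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    half_width = NormalDist().inv_cdf(0.975) * se
    return estimate, (estimate - half_width, estimate + half_width), p_value


# Hypothetical creatinine values for illustration only.
pairs = [
    ((1.0, 1.3), (1.0, 1.2)),
    ((1.0, 1.2), (1.0, 1.3)),
    ((1.0, 1.4), (1.0, 1.2)),
    ((1.0, 1.2), (1.0, 1.2)),
]
estimate, ci, p_value = paired_elevation_test(pairs)
```

A confidence interval lying entirely above zero would correspond to the "worse outcome" scenario described in point 3, while one lying entirely below zero would correspond to a smaller elevation in the TEE group.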

Comment on Results of Negative Control
As can be seen in eTable 20, TEE was associated with a statistically significantly greater creatinine elevation (i.e., a "worse outcome") in the unadjusted analysis (0.317 TEE vs 0.309 no TEE; p<0.002) and in the adjusted, all-patient matched analysis (0.314 TEE vs 0.308 no TEE; p=0.028). In the within-hospital, within-equivocal-surgeon matched analysis, there was no statistically significant difference in creatinine elevation between the TEE group and the no TEE group (0.263 vs 0.267; p=0.633).
Only in the within-hospital, within-all-surgeon matched analysis did the TEE group demonstrate a statistically significantly smaller creatinine elevation compared to the no TEE group (0.270 vs 0.282; p=0.002). Three of the four negative control outcome analyses are either not statistically significant or incongruent with the primary results. Compellingly, the incongruent results, in which the TEE group demonstrates a greater increase in postoperative creatinine, indicate the possibility of residual bias against our finding that TEE is associated with improved clinical outcomes. The only result that indicates the possibility of residual confounding in favor of TEE is the within-all-surgeon match, where the TEE group demonstrates a statistically significantly smaller increase (i.e., a "better outcome") compared to the no TEE group. Admittedly, this does indicate the possibility that including the high-TEE surgeons in the within-hospital, within-surgeon match (as opposed to the analysis including only the equivocal-TEE surgeons) reintroduced confounding between TEE and outcomes. However, although statistically significant differences in creatinine elevation were observed between the TEE and no TEE groups, all differences were between -0.013 and +0.007 and were not considered clinically meaningful.