Key Points
Question
Are there differences in supplemental oxygen administration among patients of different races and ethnicities associated with pulse oximeter performance discrepancies?
Findings
In this cohort study of 3069 patients in the intensive care unit, Asian, Black, and Hispanic patients had a higher adjusted time-weighted average pulse oximetry reading and were administered significantly less supplemental oxygen for a given average hemoglobin oxygen saturation compared with White patients.
Meaning
There were differences in supplemental oxygen administration between Asian, Black, and Hispanic patients and White patients that were associated with pulse oximeter performance and may contribute to racial and ethnic disparities in care.
Importance
Pulse oximetry (SpO2) is routinely used for transcutaneous monitoring of blood oxygenation, but it can overestimate actual oxygenation. This is more common in patients of racial and ethnic minority groups. The extent to which these discrepancies are associated with variations in treatment is not known.
Objective
To determine if there are racial and ethnic disparities in supplemental oxygen administration associated with inconsistent pulse oximeter performance.
Design, Setting, and Participants
This retrospective cohort study was based on the Medical Information Mart for Intensive Care (MIMIC)-IV critical care data set. Included patients were documented with a race and ethnicity as Asian, Black, Hispanic, or White and were admitted to the intensive care unit (ICU) for at least 12 hours before needing advanced respiratory support, if any. Oxygenation levels and nasal cannula flow rates for up to 5 days from ICU admission or until the time of intubation, noninvasive positive pressure ventilation, high-flow nasal cannula, or tracheostomy were analyzed.
Main Outcomes and Measures
The primary outcome was time-weighted average supplemental oxygen rate. Covariates included race and ethnicity, sex, SpO2–hemoglobin oxygen saturation discrepancy, data duration, number and timing of blood gas tests on ICU days 1 to 3, partial pressure of carbon dioxide, hemoglobin level, average respiratory rate, Elixhauser comorbidity scores, and need for vasopressors or inotropes.
Results
This cohort included 3069 patients (mean [SD] age, 66.9 [13.5] years; 83 were Asian, 207 were Black, 112 were Hispanic, 2667 were White). In a multivariable linear regression, Asian (coefficient, 0.602; 95% CI, 0.263 to 0.941; P = .001), Black (coefficient, 0.919; 95% CI, 0.698 to 1.140; P < .001), and Hispanic (coefficient, 0.622; 95% CI, 0.329 to 0.915; P < .001) race and ethnicity were all associated with a higher SpO2 for a given hemoglobin oxygen saturation. Asian (coefficient, −0.291; 95% CI, −0.546 to −0.035; P = .03), Black (coefficient, −0.294; 95% CI, −0.460 to −0.128; P = .001), and Hispanic (coefficient, −0.242; 95% CI, −0.463 to −0.020; P = .03) race and ethnicity were associated with lower average oxygen delivery rates. When controlling for the discrepancy between average SpO2 and average hemoglobin oxygen saturation, race and ethnicity were not associated with oxygen delivery rate. This discrepancy mediated the effect of race and ethnicity (−0.157; 95% CI, −0.250 to −0.057; P = .002).
Conclusions and Relevance
In this cohort study, Asian, Black, and Hispanic patients received less supplemental oxygen than White patients, and this was associated with differences in pulse oximeter performance, which may contribute to known race and ethnicity–based disparities in care.
Since its invention in 1974, near-infrared pulse oximetry (SpO2) has allowed for convenient, noninvasive transcutaneous monitoring of arterial hemoglobin oxygen saturation.1,2 This technology uses spectrophotometry to indirectly calculate the arterial hemoglobin saturation by determining the proportion of oxyhemoglobin in peripheral arterial blood. The accuracy of SpO2 is usually within 2 to 3 percentage points of true blood hemoglobin oxygen saturation when arterial saturations are greater than 90%, but the accuracy is reduced as saturations decrease to less than 90%.3,4
It has been known for decades that these readings are affected by various surface pigmentations, including nail polish and skin melanin, which may affect light absorption and scattering.5,6 Previous clinical studies in adults and children have shown clear differences by race and ethnicity in the relationship between temporally matched SpO2 and blood hemoglobin oxygen saturation readings, generally with higher SpO2 readings in patients of racial and ethnic minority groups.7-10 This increases the risk of hidden hypoxemia, in which patients have a falsely elevated SpO2 reading, usually defined as an SpO2 of 92% or greater with a blood hemoglobin oxygen saturation less than 88%.9
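The usual definition of hidden hypoxemia given above can be written as a simple rule. The following is a minimal Python sketch for illustration only; the function name, the default thresholds as parameters, and the paired readings are hypothetical and are not taken from the study code.

```python
def is_hidden_hypoxemia(spo2_pct: float, blood_sat_pct: float,
                        spo2_threshold: float = 92.0,
                        blood_sat_threshold: float = 88.0) -> bool:
    """Flag a paired reading in which the pulse oximeter looks adequate
    (SpO2 of 92% or greater) while the blood hemoglobin oxygen saturation
    is low (less than 88%)."""
    return spo2_pct >= spo2_threshold and blood_sat_pct < blood_sat_threshold


# Hypothetical paired readings: (SpO2 %, blood hemoglobin oxygen saturation %).
pairs = [(95.0, 86.0), (93.0, 91.0), (90.0, 85.0)]
flags = [is_hidden_hypoxemia(spo2, sat) for spo2, sat in pairs]
print(flags)
```

Only the first pair is flagged: the second has an adequate blood saturation, and the third has an SpO2 that already indicates hypoxemia, so it is not hidden.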
While patients of all races and ethnicities are subject to hidden hypoxemia, the higher incidence in patients of racial and ethnic minority groups could possibly lead to more insufficient treatment in this population and contribute to known population disparities in outcomes, including those seen during the COVID-19 pandemic.11-14 In fact, it has been shown that hidden hypoxemia is associated with higher mortality rates.9
However, few studies have directly assessed the factors that may mediate these downstream effects.15 Potential culprits may include insufficient administration of supplemental oxygen and differences in initiation and management of noninvasive and invasive mechanical ventilation. Pulse oximeter performance disparities may also play a role in decision-making regarding fluid management, specialty service consultation, and intensive care unit (ICU) admission.16 Artificially high SpO2 readings in the emergency department could affect the perceived need for cardiology service admission for heart failure management, possibly explaining the finding that Black and Hispanic patients were less likely than White patients to be admitted to a cardiology service.17
The objective of this study was to assess if there are disparities in supplemental oxygen administration between Asian, Black, and Hispanic patients and White patients in the ICU and whether they are associated with discrepancies in pulse oximeter performance.
This study is based on the Medical Information Mart for Intensive Care (MIMIC)-IV critical care data set. It was approved for research by the institutional review boards of Beth Israel Deaconess Medical Center in Boston, Massachusetts (2001-P-001699/14), and the Massachusetts Institute of Technology (0403000206) without a requirement for individual patient informed consent because data were deidentified and publicly available.18
MIMIC-IV data were queried from the Google BigQuery (Alphabet Inc) cloud platform using RStudio, version 1.4 (RStudio PBC) with the R, version 4.0.3 programming language (R Foundation for Statistical Computing). MIMIC-IV includes data on 76 540 ICU stays for 53 150 unique patients (computed by authors) at Beth Israel Deaconess Medical Center. These include the years 2008 through 2019, and therefore there were no patients with COVID-19.19 MIMIC-IV does not provide an admission diagnosis for each patient, but Sequential Organ Failure Assessment (SOFA) scores are used to specify the type and severity of organ dysfunction.
Selection criteria, as well as the number and proportion of patients of each race and ethnicity that were excluded at each step, are shown in Figure 1. An initial cohort was identified by downloading vital sign data for patients’ first ICU admission in the data set. Data were limited to the first period of up to 5 days from ICU admission or until the time point at which the patient received invasive or noninvasive mechanical ventilation, high-flow nasal cannula, or a tracheostomy, whichever came first. This period is subsequently referred to as the index period. Patients were excluded if they did not have a documented race or ethnicity of Asian, Black, Hispanic, or White (groups that were excluded included “other,” “unknown,” “unable to obtain,” and “American Indian/Alaska Native”). Patients were also excluded if they were missing key variables from the index period, including supplemental oxygen (or room air) records, initial partial pressure of carbon dioxide (pCO2), hemoglobin level, and vital sign data, and/or had an index period of less than 12 hours. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines for observational studies.20
Pulse Oximetry and Vital Signs
Vital sign records within the index period, including temperatures (converted to Celsius), heart rates, mean arterial pressures, respiratory rates, and SpO2, were extracted. For each vital sign type, area under the curve was calculated using the trapezoid method with the “MLmetrics” R package21 and was divided by the number of minutes between the first and last reading to produce a time-weighted average value.
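The time-weighted average described above can be sketched as follows. This is an illustrative Python version, not the R code used in the study (which relied on the "MLmetrics" package); the function name and the example readings are hypothetical.

```python
from typing import Sequence


def time_weighted_average(minutes: Sequence[float], values: Sequence[float]) -> float:
    """Trapezoid-rule area under the curve divided by the number of minutes
    between the first and last reading."""
    if len(minutes) != len(values) or len(minutes) < 2:
        raise ValueError("need at least two readings with matching timestamps")
    points = sorted(zip(minutes, values))  # order readings by time
    auc = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        auc += (t1 - t0) * (v0 + v1) / 2.0  # area of one trapezoid
    duration = points[-1][0] - points[0][0]
    if duration <= 0:
        raise ValueError("readings must span a positive duration")
    return auc / duration


# Hypothetical SpO2 readings at 0, 60, and 180 minutes after ICU admission.
twa = time_weighted_average([0, 60, 180], [96.0, 98.0, 94.0])
print(round(twa, 2))
```

In this example the time-weighted average differs from the simple mean of the three readings (96.0) because the readings are unevenly spaced, which is the reason for weighting by time.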
Hemoglobin Oxygen Saturation
To avoid aberrant values as well as likely venous blood gas values, which are not directly distinguished from arterial blood gas values under some laboratory codes, readings were removed if the hemoglobin oxygen saturation was less than 70% or greater than 100%. For each patient, the area under the curve was calculated from all remaining timed blood gas records during the index period and was divided by the difference in time between the first and last blood gas reading to give a time-weighted average value. If patients had only 1 blood gas reading, this value was used instead of a calculated average. The number of blood gas tests on each day of the index period was determined as well.
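The filtering and averaging steps for blood gas readings can be sketched as below. This is a hedged Python illustration rather than the authors' implementation; the function name and sample readings are hypothetical, and the handling of readings that all share one timestamp (a simple mean) is an assumption not described in the text.

```python
from typing import List, Optional, Tuple


def average_blood_gas_saturation(readings: List[Tuple[float, float]]) -> Optional[float]:
    """readings: (minutes from ICU admission, hemoglobin oxygen saturation %)."""
    # Remove aberrant and likely venous values (below 70% or above 100%).
    kept = sorted((t, s) for t, s in readings if 70.0 <= s <= 100.0)
    if not kept:
        return None  # no usable blood gas reading
    if len(kept) == 1:
        return kept[0][1]  # a single reading is used as-is
    duration = kept[-1][0] - kept[0][0]
    if duration == 0:
        # Assumption: identical timestamps fall back to a simple mean.
        return sum(s for _, s in kept) / len(kept)
    auc = sum((t1 - t0) * (s0 + s1) / 2.0
              for (t0, s0), (t1, s1) in zip(kept, kept[1:]))
    return auc / duration


# Hypothetical readings; the 65% value is treated as likely venous and dropped.
avg = average_blood_gas_saturation([(0, 94.0), (120, 65.0), (240, 98.0), (480, 96.0)])
print(avg)
```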
To compute estimates of average supplemental oxygen rates, all nasal cannula flow rates and time on room air records within the index period were downloaded from a derived oxygen supplementation table in MIMIC-IV. Numbers of oxygen records of other device types were reviewed to determine that these were infrequent relative to nasal cannula for most patients and could reasonably be excluded. For example, if a patient had a nasal cannula record at hour 1, face tent at hour 2, and a nasal cannula again at hour 3, the patient was recorded as having a nasal cannula at the respective rates from hour 1 through hour 3. Supplemental oxygen area under the curve was calculated and was divided by the duration of time in seconds between the first and last recorded rate. For 100 patients for whom only 1 oxygen rate was recorded, this rate was used instead of the average.
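The handling of other device types in the example above can be illustrated with a short Python sketch. It is not the study code: the device labels, the record layout, the treatment of room air as 0 L/min, and the use of the trapezoid rule for the area under the curve are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (seconds from ICU admission, device label, flow rate in L/min or None)
Record = Tuple[float, str, Optional[float]]


def average_oxygen_rate(records: List[Record]) -> Optional[float]:
    points = []
    for seconds, device, rate in records:
        if device == "room air":
            points.append((seconds, 0.0))  # assumption: room air counts as 0 L/min
        elif device == "nasal cannula" and rate is not None:
            points.append((seconds, float(rate)))
        # Records for any other device type are skipped.
    points.sort()
    if not points:
        return None
    if len(points) == 1:
        return points[0][1]  # only 1 rate recorded: use it directly
    duration = points[-1][0] - points[0][0]
    if duration == 0:
        return sum(r for _, r in points) / len(points)
    auc = sum((t1 - t0) * (r0 + r1) / 2.0
              for (t0, r0), (t1, r1) in zip(points, points[1:]))
    return auc / duration


# Nasal cannula at hour 1, face tent at hour 2, nasal cannula again at hour 3.
records = [(3600, "nasal cannula", 2.0),
           (7200, "face tent", 10.0),
           (10800, "nasal cannula", 4.0)]
rate = average_oxygen_rate(records)
print(rate)  # 3.0
```

Because the face tent record is skipped, the patient is treated as being on a nasal cannula from hour 1 through hour 3, and the average reflects only the two nasal cannula rates.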
The SOFA score is available in another derived table in MIMIC-IV. It includes 6 components based on organ systems (respiration, coagulation, liver, cardiovascular, central nervous system, and kidney), with a score of 0 to 4 for each system and an overall maximum score of 24 (most severe organ dysfunction).22 For each patient, the maximum of these scores during the index period was extracted. Vasopressor or inotrope dose rates of infusion were extracted from the fluid input table, and a binary variable was created to indicate whether any vasopressors or inotropes were required during the index period. Elixhauser comorbidity scores were computed from International Classification of Diseases, Ninth Revision (ICD-9) and International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) codes using the R “Comorbidity” package.23,24
Continuous values were compared using a nonparametric unpaired Wilcoxon rank sum test. Multivariable linear regression was performed to determine the association between blood hemoglobin oxygen saturation and SpO2, and associations with oxygen delivery. Specifically, models with race and ethnicity as a covariate were developed to show associations, including (1) SpO2 as a function of criterion standard hemoglobin oxygen saturation to confirm different device performance between races and ethnicities; (2) supplemental oxygen rates for a given hemoglobin oxygen saturation to assess for differences in oxygen delivery by race and ethnicity; (3) similar to model 2 with an added variable for the difference between SpO2 and hemoglobin oxygen saturation to determine whether this discrepancy mediates the effect of race and ethnicity; and (4) supplemental oxygen rates, controlling for SpO2, to determine whether there are differences in oxygen rates when titrating to pulse oximeter readings. An additional model (5) used heparin rate as the outcome and the same independent variables as model 2 to rule out a spurious correlation.
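One way to read models 1 through 4 is as the following set of linear specifications. This is an interpretive sketch, not notation from the study: the symbols are assumptions, with $\overline{\mathrm{SaO_2}}$ standing for time-weighted average blood hemoglobin oxygen saturation, $\overline{\mathrm{O_2}}$ for the time-weighted average supplemental oxygen rate, $\mathbf{r}_i$ for race and ethnicity indicators (assumed here to use White patients as the reference group), and $\mathbf{x}_i$ for the remaining covariates, which differ between models. Coefficients are model specific, as marked by the superscripts.

```latex
\begin{align*}
\text{Model 1:}\quad \overline{\mathrm{SpO_2}}_i
  &= \beta_0^{(1)} + \beta_1^{(1)}\,\overline{\mathrm{SaO_2}}_i
   + \boldsymbol{\gamma}^{(1)\top}\mathbf{r}_i
   + \boldsymbol{\delta}^{(1)\top}\mathbf{x}_i + \varepsilon_i \\
\text{Model 2:}\quad \overline{\mathrm{O_2}}_i
  &= \beta_0^{(2)} + \beta_1^{(2)}\,\overline{\mathrm{SaO_2}}_i
   + \boldsymbol{\gamma}^{(2)\top}\mathbf{r}_i
   + \boldsymbol{\delta}^{(2)\top}\mathbf{x}_i + \varepsilon_i \\
\text{Model 3:}\quad \overline{\mathrm{O_2}}_i
  &= \beta_0^{(3)} + \beta_1^{(3)}\,\overline{\mathrm{SaO_2}}_i
   + \beta_2^{(3)}\bigl(\overline{\mathrm{SpO_2}}_i - \overline{\mathrm{SaO_2}}_i\bigr)
   + \boldsymbol{\gamma}^{(3)\top}\mathbf{r}_i
   + \boldsymbol{\delta}^{(3)\top}\mathbf{x}_i + \varepsilon_i \\
\text{Model 4:}\quad \overline{\mathrm{O_2}}_i
  &= \beta_0^{(4)} + \beta_1^{(4)}\,\overline{\mathrm{SpO_2}}_i
   + \boldsymbol{\gamma}^{(4)\top}\mathbf{r}_i
   + \boldsymbol{\delta}^{(4)\top}\mathbf{x}_i + \varepsilon_i
\end{align*}
```

Under this reading, the race and ethnicity coefficients $\boldsymbol{\gamma}$ are the quantities of interest in each model, and model 5 has the same right-hand side as model 2 with heparin rate as the outcome.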
To assess the SpO2–hemoglobin oxygen saturation discrepancy as a mediator of the effect of race and ethnicity, we performed a formal mediation analysis, with race and ethnicity as the treatment variable, using the R “Mediate” package.25 Summary results included the average causal mediation effect, the average direct effect, and the total effect.
For all models, clinically relevant covariates were selected for analysis. Nonsignificant covariates were excluded from the final regression models except to maintain comparability between related models. Given that patients missing key data points for regression analysis were excluded as described above, no imputation was performed. For all analyses, a 2-sided 95% CI was used, and a P value of less than .05 was considered statistically significant.
There were 3069 patients included in this study. Patient baseline characteristics, stratified by race and ethnicity, are shown in Table 1. Additional variables relevant to the hospital course, including numbers of blood gas draws, eventual ventilation and high flow oxygen needs, and care units by race and ethnicity are shown in Table 2. The mean (SD) ages of Asian (n = 83), Black (n = 207), Hispanic (n = 112), and White (n = 2667) patients were 64.2 (14.9), 64.0 (13.8), 59.3 (14.5), and 67.6 (13.3) years, respectively. A lower proportion of Black patients were male compared with patients of other races and ethnicities (Asian, 59 [71.1%]; Black, 103 [49.8%]; Hispanic, 68 [60.7%]; White, 1751 [65.7%]). Mortality rates were 3.6% (3 of 83) for Asian patients, 8.2% (17 of 207) for Black patients, 7.1% (8 of 112) for Hispanic patients, and 6.5% (174 of 2667) for White patients. SOFA score components and laboratory values were overall well matched across races and ethnicities.
Pulse Oximetry and Hemoglobin Oxygen Saturation by Race and Ethnicity
Black patients had a lower median (IQR) blood hemoglobin oxygen saturation than White patients (95.0% [91.0%-97.0%] vs 96.0% [92.0%-97.8%]; P < .001). There were no overall differences in average blood hemoglobin oxygen saturation for Hispanic (96.2% [92.5%-98.0%]; P = .59) and White patients or Asian (96.0% [93.2%-97.9%]; P = .43) and White patients. Black patients had a higher SpO2 than White patients (97.6% [96.1%-98.6%] vs 96.6% [95.5%-97.7%]; P < .001). Similarly, Asian (97.3% [96.0%-98.5%]; P < .001) and Hispanic (97.2% [96.2%-98.1%]; P < .001) patients had a higher SpO2 than White patients.
In multivariable linear regression model 1 (Table 3), Black race was associated with a higher SpO2 for a given hemoglobin oxygen saturation, when controlling for average respiratory rate, initial pCO2, number of blood gas draws on day 1, hemoglobin level, need for vasopressors and inotropes, sex, time duration of index period, and Elixhauser score (coefficient, 0.919; 95% CI, 0.698-1.140; P < .001). The same was true for Hispanic (coefficient, 0.622; 95% CI, 0.329-0.915; P < .001) and Asian (coefficient, 0.602; 95% CI, 0.263-0.941; P = .001) patients. Age was not significant in this or any of the subsequent models and was excluded. Differences in SpO2 for a given hemoglobin oxygen saturation between races and ethnicities are shown in Figure 2A.
Factors Associated With Oxygen Delivery Rates
Oxygen delivery rates by race and ethnicity are shown in Figure 2B, the association between oxygen delivery rates and hemoglobin oxygen saturation is shown in Figure 2C, and the association between oxygen delivery rates and SpO2 is shown in Figure 2D. For clarity, Figures 2C and D are limited to average oxygen rates up to 10 L/min, excluding a small number of outliers. Black patients had lower median (IQR) oxygen delivery rates than White patients (2.8 L/min [2.0-3.6 L/min] vs 3.0 L/min [2.3-3.9 L/min]; P < .001). Asian patients also had lower median (IQR) oxygen delivery rates than White patients (2.6 L/min [2.0-3.5 L/min]; P = .009). There was a trend toward a difference between Hispanic patients (2.8 L/min [2.2-3.6 L/min]) and White patients (P = .06).
Model 2 multivariable linear regression shows that Asian (coefficient, −0.291; 95% CI, −0.546 to −0.035; P = .03), Black (coefficient, −0.294; 95% CI, −0.460 to −0.128; P = .001), and Hispanic (coefficient, −0.242; 95% CI, −0.463 to −0.020; P = .03) race and ethnicity were associated with approximately 0.2 to 0.3 L/min lower average oxygen delivery rates when controlling for average respiratory rate; initial pCO2; number of blood gas tests on days 1, 2, and 3; hemoglobin level; and need for vasopressors or inotropes (Figure 2C). Sex, index period duration, and Elixhauser score were not significant in this model.
We performed another regression (model 3) using the same variables as model 2 regression and added a new variable representing the difference between average SpO2 and average blood hemoglobin oxygen saturation. In this model, a greater discrepancy was associated with lower supplemental oxygen delivery (coefficient, −0.238; 95% CI, −0.262 to −0.214; P < .001). Notably, Asian (coefficient, −0.144; 95% CI, −0.386 to 0.098; P = .24), Black (coefficient, −0.081; 95% CI, −0.239 to 0.077; P = .31), and Hispanic (coefficient, −0.092; 95% CI, −0.301 to 0.118; P = .39) race and ethnicity were not significant in model 3. Several variables from model 2, including hemoglobin level and use of vasopressors or inotropes, were excluded for nonsignificance.
We performed a mediation analysis to assess the effect of the SpO2–average blood hemoglobin oxygen saturation discrepancy as a mediator of differences in oxygen supplementation. The average causal mediation effect was −0.157 (95% CI, −0.250 to −0.057; P = .002), the average direct effect was −0.144 (95% CI, −0.340 to 0.062; P = .17), and the total effect was −0.301 (95% CI, −0.506 to −0.092; P = .008).
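As a consistency check on these estimates, the total effect in a mediation analysis is, by construction, approximately the sum of the average causal mediation effect (ACME) and the average direct effect (ADE). The worked arithmetic below uses only the point estimates reported above; the proportion mediated is not reported in this study and is derived here for illustration, so it should be read as approximate and without a CI.

```latex
\begin{align*}
\text{total effect} &\approx \text{ACME} + \text{ADE}
  = (-0.157) + (-0.144) = -0.301, \\
\text{proportion mediated} &\approx \frac{\text{ACME}}{\text{total effect}}
  = \frac{-0.157}{-0.301} \approx 0.52.
\end{align*}
```

On these point estimates, roughly half of the total association between race and ethnicity and oxygen delivery rate runs through the SpO2–hemoglobin oxygen saturation discrepancy.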
In model 4, neither Asian (coefficient, −0.131; 95% CI, −0.373 to 0.111; P = .29), nor Black (coefficient, −0.060; 95% CI, −0.218 to 0.098; P = .46), nor Hispanic (coefficient, −0.084; 95% CI, −0.293 to 0.125; P = .43) race and ethnicity was associated with a difference in oxygen delivery rates when controlling for SpO2. Nonsignificant variables included age, hemoglobin level, pCO2, need for vasopressors or inotropes, and Elixhauser score.
To rule out a spurious correlation by testing a falsification hypothesis,26 we performed a regression with the same variables as model 2, but replaced oxygen delivery with heparin rate, which would not be expected to differ across races and ethnicities, as the dependent variable (model 5). In this model, race and ethnicity were not significant (Asian coefficient, −44.840; 95% CI, −106.002 to 16.322; P = .15; Black coefficient, 3.852; 95% CI, −35.911 to 43.615; P = .85; Hispanic coefficient, −29.573; 95% CI, −82.504 to 23.359; P = .27).
In this study, we show that Asian, Black, and Hispanic patients had higher average SpO2 readings than White patients for a given blood hemoglobin oxygen saturation. They also received less supplemental oxygen when adjusting for potential confounders, and these disparities appear to be mediated by larger discrepancies between SpO2 and blood hemoglobin oxygen saturation.
Our study is consistent with prior studies that have shown differences of several percentage points in SpO2 for a given hemoglobin oxygen saturation between Black and White patients,7-10 but in the past, the clinical significance of these findings has occasionally been discounted. For example, 1 article stated that for the purpose of replacing arterial blood gas, “oximetry need not have exact accuracy—we are likely to provide oxygen to a patient whether his oxygen saturation (SpO2) value is 88% or 91%, and we feel little need to give it if his SpO2 value is 97% or 100%.”27(p131)
However, by analyzing oxygen rates, we identify a clear association with an important treatment parameter, and although we cannot directly determine the causal association of differences in oxygen supplementation shown in this study with known disparities in outcomes, the finding that race and ethnicity could affect how much oxygen a patient receives is notable and concerning. This will become increasingly important with the advent of closed-loop circuits that automatically titrate supplemental oxygen according to SpO2.28,29 While patients of some races and ethnicities may be more likely to have disease states in which a lower SpO2 is tolerated, no difference in supplemental oxygen in relation to SpO2 was observed to suggest intentional differences in titration practices. Notably, a recent study30 showed that hidden hypoxemia in patients with COVID-19 was associated with delayed or nonadministration of COVID-19 therapies.
Our findings present a unique and compelling opportunity to improve equity through device reengineering and by reevaluating how data are interpreted.31-33 However, this should be done with caution, as some past attempts to correct for race and ethnicity in algorithms have exacerbated disparities and are subject to ethical concerns, such as those leading to the removal of the African American race coefficient from the Chronic Kidney Disease Epidemiology Collaboration estimated glomerular filtration formula in 2021.34,35 Even though skin pigmentation is less nebulous than race and ethnicity–associated factors that have been previously hypothesized to affect GFR estimation, it would still be prudent to justify why a correction factor is appropriate. Ideally, the SpO2 reading would be corrected based on objective colorimetry, not reported race or ethnicity.36
Strengths and Limitations
Strengths of our study include a large cohort of patients and high-resolution data, which allowed us to control for multiple clinical confounders. By averaging oxygenation data over several days, we derived longer-term values to complement the temporal matching studies that we previously referenced.
Our study also has several limitations, including that it was based on data from 1 institution. We initially intended to perform this analysis with a multi-ICU research database, but marked regional differences in oxygen delivery rates combined with substantially different percentages of Black patients, as well as nonstandard reporting of oxygen delivery devices, precluded further analysis. Our study was also limited to nasal cannula oxygen delivery and did not study titration of fraction of inspired oxygen. Time off oxygen is likely underreported as well, because a room-air designation may not be documented with the vital signs.
We also did not temporally match hemoglobin oxygen saturation or SpO2 values with specific oxygen delivery data points for the reasons we have described and because it was not feasible to reliably determine if, how, and when specific blood gas laboratory results were used in minute-by-minute oxygen titration. We did not assess for variable effects at degrees of hypoxemia. Lastly, we were limited to self-reported race and ethnicity, and we did not assess skin tone. Hospital race and ethnicity are ideally documented as reported by the patient or proxy, and while agreement is generally good, this reporting is imperfect.37
We suggest several areas for further studies. Similar analyses should be performed at other institutions and should explore specific factors within a racial and ethnic group that could put some patients at particularly high risk of oxygenation disparities, including skin tone, degree of desaturation, exposure to specific oxygen delivery devices, comorbidities, and other sociodemographic factors. Differences in oxygen supplementation in patients receiving invasive or noninvasive positive pressure ventilation should be studied and a potential association of vasopressors and inotropes should be explored in further detail. Clinical decisions other than oxygen delivery that may be affected by pulse oximeter performance discrepancies should also be explored.
In this cohort study, Asian, Black, and Hispanic patients in the ICU received less supplemental oxygen than White patients, and this finding was associated with differences in pulse oximeter performance. Further research is needed to confirm these findings and explore other clinical factors associated with treatment disparities.
Accepted for Publication: May 13, 2022.
Published Online: July 11, 2022. doi:10.1001/jamainternmed.2022.2587
Corresponding Author: Eric Raphael Gottlieb, MD, MS, Laboratory for Computational Physiology, 77 Massachusetts Ave, E25-505, Cambridge, MA 02139 (ericgott@mit.edu).
Author Contributions: Drs Gottlieb and Celi had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: Gottlieb, Ziegler, Rush.
Drafting of the manuscript: Gottlieb, Ziegler, Morley, Rush.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Gottlieb, Ziegler.
Administrative, technical, or material support: Gottlieb.
Supervision: Rush, Celi.
Conflict of Interest Disclosures: None reported.
Funding/Support: Dr Gottlieb is supported by the National Institutes of Health National Institute of Diabetes and Digestive and Kidney Diseases (T32DK007527). Dr Celi is supported by the National Institutes of Health National Institute of Biomedical Imaging and Bioengineering (R01EB017205).
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Information: MIMIC-IV data are publicly available with training in human participants research and application. Statistical code is available on request to the corresponding author.
11. Donaldson SV, Thomas AN, Gillum RF, Mehari A. Geographic variation in racial disparities in mortality from influenza and pneumonia in the United States in the pre-coronavirus disease 2019 era. Chest. 2021;159(6):2183-2190. doi:10.1016/j.chest.2020.12.029
19. Hirano Y, Shinmoto K, Okada Y, et al. Machine learning approach to predict positive screening of methicillin-resistant Staphylococcus aureus during mechanical ventilation using synthetic dataset from MIMIC-IV database. Front Med (Lausanne). 2021;8:694520. doi:10.3389/fmed.2021.694520
29. Denault M-H, Péloquin F, Lajoie A-C, Lacasse Y. Automatic versus manual oxygen titration in patients requiring supplemental oxygen in the hospital: a systematic review and meta-analysis. Respiration. 2019;98(2):178-188. doi:10.1159/000499119
35. Inker LA, Eneanya ND, Coresh J, et al; Chronic Kidney Disease Epidemiology Collaboration. New creatinine- and cystatin C-based equations to estimate GFR without race. N Engl J Med. 2021;385(19):1737-1749. doi:10.1056/NEJMoa2102953