A, Bars represent mean amount of money raised in campaigns in each neighborhood deprivation index (NDI) level, with error bars denoting bootstrapped 95% CIs. Compared with the least deprived quartile (NDI 1), campaigns in NDI 2 areas raised 13.39% less, those in NDI 3 areas raised 18.14% less, and those in NDI 4 areas raised 26.07% less. B, Bars indicate mean amount of money raised when a particular text feature is present or absent, according to the fitted multivariable regression model described in Table 4. The differences in amounts raised between campaigns that did or did not mention the text feature were 5.23% for self-reliance, 15.40% for bravery, 11.87% for militaristic metaphors, 6.58% for type of treatment, 7.36% for out-of-pocket (OOP) costs, 1.39% for insurance, 9.58% for cancer type, 13.80% for warmth, and 0.09% for gratitude.
aP < .001.
bP < .05.
eAppendix 1. Search Terms Used to Determine Cancer Campaigns
eAppendix 2. Supplemental Methods: Neighborhood Deprivation Index
eFigure 1. US Census Socioeconomic Data by Neighborhood Deprivation Index Quartile
eTable 1. Search Terms Used to Determine and Recode Cancer Type
eTable 2. Search Terms and Regular Expressions (Regex) Used to Determine and Recode Mentions of Insurance, Out-of-Pocket Costs, and Treatment Type
eAppendix 3. Keyword Searches for Deservingness Text Features
eTable 3. Associations Between Amount Raised and Campaign Year
eTable 4. Spearman Rank Correlations Between Amount Raised and Relevant Continuous Variables
eTable 5. Descriptive Statistics for Continuous Variables
eFigure 2. Log Amount Raised by Neighborhood Deprivation Index
eTable 6. Expected and Observed Counts for Text Indicators by County Socioeconomic Status
eTable 7. Full Outputs From Multivariable Regression Model on Amount Raised (Log-Transformed)
Silver ER, Truong HQ, Ostvar S, Hur C, Tatonetti NP. Association of Neighborhood Deprivation Index With Success in Cancer Care Crowdfunding. JAMA Netw Open. 2020;3(12):e2026946. doi:10.1001/jamanetworkopen.2020.26946
How is online crowdfunding for cancer care associated with existing socioeconomic health disparities in the US cancer care setting?
In this cross-sectional study of 144 061 cancer crowdfunding campaigns, those located in US counties with high socioeconomic status raised significantly more than campaigns in lower–socioeconomic status counties. Crowdfunders who used campaign narratives to portray beneficiaries as worthy of donations raised significantly more than those without such portrayals, and the use of these portrayals was unequally distributed across socioeconomic strata.
These findings suggest that crowdfunding’s reliance on access to interpersonal wealth and proficiency in digital self-marketing may disproportionately benefit those with existing socioeconomic advantage.
Financial toxicity resulting from cancer care poses a substantial public health concern, leading some patients to turn to online crowdfunding. However, the practice may exacerbate existing socioeconomic cancer disparities by privileging those with access to interpersonal wealth and digital media literacy.
To test the hypotheses that higher county-level socioeconomic status and the presence (vs absence) of text indicators of beneficiary worth in campaign descriptions are associated with amount raised from cancer crowdfunding.
Design, Setting, and Participants
This cross-sectional analysis examined US cancer crowdfunding campaigns conducted between 2010 and 2019 and data from the American Community Survey (2013-2017). Data analysis was performed from December 2019 to March 2020.
Neighborhood deprivation index of campaign location and campaign text features indicating the beneficiary’s worth.
Main Outcomes and Measures
Amount of money raised.
This study analyzed 144 061 US cancer crowdfunding campaigns. Campaigns in counties with higher neighborhood deprivation raised less (–26.07%; 95% CI, –27.46% to –24.65%; P < .001) than those in counties with less neighborhood deprivation. Campaigns raised more funds when legitimizing details were provided, including clinical details about the cancer type (9.58%; 95% CI, 8.00% to 11.18%; P < .001) and treatment type (6.58%; 95% CI, 5.44% to 7.79%; P < .001) and financial details, such as insurance status (1.39%; 95% CI, 0.20% to 2.63%; P = .02) and out-of-pocket costs (7.36%; 95% CI, 6.18% to 8.55%; P < .001). Campaigns raised more money when beneficiaries were described as warm (13.80%; 95% CI, 12.30% to 15.26%; P < .001), brave (15.40%; 95% CI, 14.11% to 16.65%; P < .001), or self-reliant (5.23%; 95% CI, 3.77% to 6.72%; P < .001).
Conclusions and Relevance
These findings suggest that cancer crowdfunding success may disproportionately benefit those in high–socioeconomic status areas and those with the internet literacy necessary to portray beneficiaries as worthy. By rewarding those with existing socioeconomic advantage, cancer crowdfunding may perpetuate socioeconomic disparities in cancer care access. The findings also underscore the widespread nature of financial toxicity resulting from cancer care.
US individuals experience high rates of disease-related financial burden, with 1 in 4 reporting trouble paying medical bills.1 Rates of financial distress due to medical expenses among patients with cancer in particular are estimated to range from 16% to 73%.2,3 Treatment-related financial hardship can lead patients to adopt unsustainable coping methods, including debt accumulation, medication nonadherence, foregoing medical appointments, and exhausting savings.4,5 Thus, it is not surprising that financial toxicity (FT) resulting from cancer care is associated with adverse mental and physical health outcomes.2,6,7 To alleviate FT, some patients with cancer have turned to online fundraising, or crowdfunding.
Crowdfunding campaigns solicit donations from friends, family, and strangers, mediated by websites that host pages describing the beneficiary’s story and needs. Effective campaigns successfully engage more donors by relaying a sympathetic narrative of the recipient,8 thus demonstrating the beneficiary’s worth or deservingness for financial assistance. Not surprisingly, past work on medical crowdfunding has raised the concern that crowdfunding may exacerbate existing socioeconomic disparities by disproportionately benefiting those with access to interpersonal wealth and the internet, as well as those with the digital media literacy necessary to successfully frame an online campaign.9-13 This concern is especially poignant given the positive association between internet literacy and socioeconomic status (SES) among children,14 college students,15 and adults.16
In the context of crowdfunding, internet and digital literacy may be understood as proficiency in presenting a beneficiary as worthy of donations by appealing to perceptions of deservingness. The stereotype content model is a framework for studying social judgments along 2 dimensions of perceived warmth and competence.17 Applied to SES, low-SES individuals are stereotyped as warmer but less competent than high-SES individuals.18 Consequently, emphasizing a beneficiary’s warmth, gratitude, and kindness can communicate their worth.19 Crowdfunders may also benefit by avoiding language that could lead donors to perceive the beneficiary as incompetent and responsible for their hardship or as misleading donors regarding the true extent of their needs.18,20-23 Specific to cancer, language describing an individual’s compliance with treatment by battling their cancer while remaining brave may also position a beneficiary as more deserving of donations.23
Quantitative studies have yet to examine the associations among SES, text indicators of worth, and cancer crowdfunding on a large scale within the US cancer care setting. To address this gap in the literature, we leveraged open-source data mining tools to analyze all US cancer crowdfunding campaigns shared publicly on GoFundMe.com. Given past work highlighting the importance of digital literacy in perpetuating cancer care disparities through crowdfunding,8,9,11,12 we focused on crowdfunders’ ability to portray beneficiaries as deserving and sympathetic, in addition to the geographical socioeconomic context of the campaign. We tested the hypotheses that the amount raised by campaigns, but not goal amount, would be associated with higher county-level SES and text describing the beneficiary’s worth, including positive stereotypes of low-SES individuals, distancing from negative stereotypes of low-SES individuals, legitimizing clinical and financial details, and consistency with cancer narratives. We also posited that campaigns in higher-SES counties would more often use the text features describing the beneficiary’s worth.
This cross-sectional study used public crowdfunding data from GoFundMe.com and InternetArchive.org, socioeconomic data from the US Census Bureau, and geographic information from Mapbox24,25 and the Federal Communications Commission. Columbia University Medical Center’s institutional review board determined this research to be exempt and classified it as non–human subjects research; thus, informed consent was not needed, in accordance with 45 CFR §46. The code used for this analysis is available on Github.26 This report follows the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline for cross-sectional studies.
We implemented a web crawler in Python programming language version 3.7 (Python) to automatically retrieve information from individual campaign pages using GoFundMe.com’s public sitemap.27 To account for missing information and old campaigns taken down before our collection date, we deployed a second web crawler to scrape archived GoFundMe.com webpages available through InternetArchive.org’s Wayback Machine application programming interface (API).28 We conducted this search on September 10, 2019, and scraped 1 856 154 pages in total. For each campaign, we collected the campaign title, creation date, location, description, tag for campaign category, amount of money raised, goal amount, number of contributors, number of social media shares, and number of likes or followers.
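The first step of such a crawler, listing page URLs from a sitemap, can be sketched as follows. This is a minimal illustration rather than the study's crawler: the function name `extract_urls` and the sample XML are hypothetical, and only the namespace comes from the public sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> List[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample (assumption); a real crawler would download the sitemap files first.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/f/campaign-one</loc></url>
  <url><loc>https://example.org/f/campaign-two</loc></url>
</urlset>"""

campaign_urls = extract_urls(SAMPLE_SITEMAP)
print(campaign_urls)
```

Each returned URL would then be fetched and parsed for the campaign fields listed above. Collecting URLs this way does not depend on the ranking applied by a site's internal search.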
Next, we curated a cancer-specific subset of the resulting data set (Table 1), yielding 144 752 cancer crowdfunding campaigns. Campaigns were selected on the basis of a keyword search of titles and descriptions (eAppendix 1 in the Supplement). Because we intended to capture campaigns for individuals (rather than organizations or initiatives), we restricted campaigns to only those categorized as medical by users in GoFundMe.com’s mutually exclusive campaign categories.
Using location names provided by the campaign posters, we geocoded these 144 752 campaigns using a geocoding API from Mapbox.24,25 We then used an API from the Federal Communications Commission to map the latitudes and longitudes to county federal information processing codes,29 allowing for linkage with county-level SES US Census data. Campaigns failed to geocode if the location was indecipherable (eg, a non–zip code number) or was not in the US. In total, 542 campaigns failed to geocode. We then excluded any campaigns with research in the title to reduce the risk of including campaigns not intended for individuals with cancer. This resulted in a final data set of 144 061 cancer crowdfunding campaigns between the years 2010 and 2019 (Table 1).
After curating the final data set, we Winsorized the goal amount to handle extreme outliers by reassigning values below the 5th percentile and above the 95th percentile, as recommended by past work.30 Using this method, 6.32% of values (9106 campaigns) were reassigned, such that values above the 95th percentile (2113 campaigns [1.47%]) were reassigned to $100 000 (the 95th percentile value), and values below the 5th percentile (6993 campaigns [4.85%]) were reassigned to $2000 (the 5th percentile value).
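Winsorizing at the 5th and 95th percentiles can be sketched as below. The helper `winsorize` and the toy goal amounts are illustrative assumptions; the percentile interpolation rule used in the study is not stated, so the `inclusive` method here is only one reasonable choice.

```python
import statistics
from typing import List, Tuple


def winsorize(
    values: List[float], lower_pct: int = 5, upper_pct: int = 95
) -> Tuple[List[float], float, float]:
    """Clamp values to the [lower_pct, upper_pct] percentile range."""
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    # Values below lo are raised to lo; values above hi are lowered to hi.
    return [min(max(v, lo), hi) for v in values], lo, hi


# Toy goal amounts in dollars (assumption, not study data).
goals = [100, 2000, 5000, 10000, 25000, 100000, 5000000]
winsorized, lo, hi = winsorize(goals)
print(lo, hi, winsorized)
```

Only the extreme tails are reassigned. Values between the two cut points are left unchanged, which is why the method limits the influence of outliers without dropping campaigns.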
We extracted a number of socioeconomic variables31-33 from the American Community Survey using the US Census Bureau’s API (see eAppendix 2 and eFigure 1 in the Supplement for details).34 We used 5-year estimates (2013-2017) for all variables and performed a principal components analysis in 2 steps to compute a single neighborhood deprivation index (NDI), similar to past work.31-33
After running the initial principal components analysis, we omitted variables with a factor loading less than 0.25.31,33 On the basis of this criterion, the final NDI included unemployment, poverty, percentage uninsured, high school completion, internet access, households headed by a single parent, and households with annual income less than $35 000. These factors explained 59.25% of the variability across counties, similar to past work,33,35 and we used them to calculate standardized NDIs (see eAppendix 2 in the Supplement for factor loadings and details). NDI scores were grouped into quartiles,31-33 and index scores were matched to campaigns according to the campaign’s county federal information processing code.
To extract features related to self-reliance, militaristic metaphors, bravery, clinical details, and financial details, we conducted a series of keyword searches using regular expressions, as detailed in eTable 1, eTable 2, eTable 3, and eAppendix 3 in the Supplement. Each search was intended to capture both a single key word (eg, brave) and words indicating similar constructs (eg, courage, strength, and hero). Campaigns were categorized as either containing or not containing the construct in question according to the results of the keyword search.
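A keyword search of this kind can be sketched with a single regular expression per construct. The pattern below is hypothetical and built only from the example words given above (brave, courage, strength, hero); the actual search terms are those listed in the Supplement.

```python
import re

# Hypothetical pattern for the "bravery" construct (assumption, not the study's regex).
BRAVERY = re.compile(
    r"\b(?:brave(?:ry|ly)?|courag(?:e|eous)|strength|hero(?:es|ic)?)\b",
    re.IGNORECASE,
)


def mentions(pattern: "re.Pattern[str]", text: str) -> bool:
    """Return True if the campaign text contains at least one match."""
    return pattern.search(text) is not None


# Made-up campaign descriptions for illustration.
descriptions = [
    "She has been so brave through every round of chemo.",
    "Our hero needs help with travel costs.",
    "Please help us cover rent this month.",
]
flags = [mentions(BRAVERY, d) for d in descriptions]
print(flags)
```

Each campaign receives a binary indicator per construct. A match does not guarantee that the word is used in the intended sense, which is a known trade-off of keyword matching relative to manual coding.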
Our primary outcome was the amount raised by crowdfunding campaigns. We also examined campaign goal amount as a secondary campaign outcome.
We generated descriptive statistics to characterize the monetary campaign information, NDI quartile distributions, data missingness, and presence of text features. For all analyses, we omitted cases with missing data for the variable of interest. To test the hypothesis that the amount raised by campaigns, but not goal amount, would be associated with higher county-level SES, we used 1-way analyses of variance comparing mean amount raised and goal amount across the 4 NDI quartiles. All tests for significance were 2-sided. We examined significant (P < .05) overall differences by NDI quartile using post hoc Tukey HSD tests. To test the hypothesis that the amount raised would be associated with text describing the beneficiary’s worth, we used independent samples t tests assessing differences in amount raised and goal amount by presence of the following text features: warmth, gratitude, self-reliance, cancer type, treatment type, insurance, out-of-pocket costs, bravery, and militaristic metaphors. We used χ2 tests of independence to compare the observed and expected counts of each text mention listed across NDI quartile. We probed significant overall associations using pairwise χ2 tests with a Bonferroni correction for the 6 pairwise comparisons between NDI quartiles, resulting in a significance threshold of P < .0083.36
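The Bonferroni threshold follows directly from the number of quartile pairs, as this short sketch shows. The variable names are illustrative.

```python
from itertools import combinations

quartiles = ["NDI 1", "NDI 2", "NDI 3", "NDI 4"]

# Every unordered pair of quartiles is one pairwise chi-square comparison.
pairs = list(combinations(quartiles, 2))

alpha = 0.05
# Bonferroni correction: divide the familywise alpha by the number of tests.
threshold = alpha / len(pairs)

print(len(pairs), round(threshold, 4))
```

With 4 quartiles there are 6 unordered pairs, so each pairwise test is judged against 0.05 / 6, or approximately .0083.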
We constructed a generalized linear regression model to estimate associations between amount raised with campaign textual characteristics and SES, while adjusting for variation attributable to social media shares, goal amount, year of creation, and number of campaign contributors. We selected these variables a priori on the basis of past work37 and author assumptions. We confirmed their inclusion in the model as potential confounders on the basis of significant univariable associations with amount raised (eTable 3 and eTable 4 in the Supplement). To account for nonnormality in the distribution of errors and considerable positive skew in amount raised (eTable 5 in the Supplement), we log-transformed the amount raised and entered it as the dependent variable in regression models. We calculated the expected percentage change in amount raised when a given feature was present.38 Data analysis was performed using Python version 3.7 and R statistical software version 3.6.3 (R Project for Statistical Computing) from December 2019 to March 2020.
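When the outcome is log-transformed, a coefficient for a binary feature is commonly converted to a percentage change as 100 × (exp(β) − 1). The sketch below assumes this standard conversion applies here. The coefficient shown is back-calculated from a reported percentage for illustration and is not a value taken from the regression output.

```python
import math


def percent_change(beta: float) -> float:
    """Expected % change in the outcome when a binary feature is present,
    given its coefficient in a model of the log-transformed outcome."""
    return (math.exp(beta) - 1.0) * 100.0


def beta_from_percent(pct: float) -> float:
    """Inverse conversion: the log-scale coefficient implied by a % change."""
    return math.log(1.0 + pct / 100.0)


# Illustrative only: coefficient implied by the reported -26.07% for NDI 4 vs NDI 1.
beta_ndi4 = beta_from_percent(-26.07)
print(round(beta_ndi4, 4), round(percent_change(beta_ndi4), 2))
```

The conversion is asymmetric: a coefficient of −0.30 corresponds to a decrease of about 26%, whereas a coefficient of +0.30 corresponds to an increase of about 35%.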
We curated the largest reported sample of US cancer crowdfunding campaigns to date, with 144 061 campaigns included in the analyses. Descriptive statistics are displayed in Table 1 and eTable 5 in the Supplement. An analysis of variance indicated significant differences in amount raised across NDI quartiles. The mean (SD) amounts raised were $7402.94 ($12 184.61) for NDI 1 (the least deprived quartile), $6009.28 ($10 740.33) for NDI 2, $5775.41 ($12 183.32) for NDI 3, and $4857.87 ($9797.49) for NDI 4 (the most deprived quartile) (F3,143 568 = 256.00; P < .001) (Table 2). Tukey honest significant difference (HSD) post hoc tests revealed a dose-dependent effect, such that the amount raised increased significantly with decreasing levels of deprivation. However, there was also a significant difference in the stated goal amount by NDI quartile. As with amount raised, those in less deprived counties sought a higher goal amount than those in more deprived counties, although the means for counties in the second and third deprivation quartiles did not significantly differ from one another (see eFigure 2 in the Supplement).
As shown in Table 3, campaigns that mentioned a beneficiary’s warmth or gratitude raised significantly more money than those that did not (mean [SD], $8050.41 [$13 196.40] vs $5993.32 [$11 194.11]; t143 572 = 25.96; P < .001), but they also indicated a significantly higher goal amount. Campaigns that described the beneficiary as self-reliant raised significantly more than those that did not (mean [SD], $7515.91 [$13 398.57] vs $6167.24 [$11 258.40]; t143 572 = 15.67; P < .001) but had a significantly higher goal amount. Those who mentioned a specific cancer type, specific treatment, insurance type, or out-of-pocket costs raised significantly more money than campaigns that did not mention these features, although they also requested a higher goal amount. Finally, campaigns that invoked militaristic metaphors or described the beneficiary as brave raised significantly more money than campaigns that did not, but they also requested a higher goal amount than campaigns without these features (Table 3).
We found that the use of text features indicating deservingness was unequally distributed across NDI quartiles. In general, campaigns in the least deprived quartile used text features indicating deservingness more often than those in more deprived quartiles, with the exception of insurance, which was mentioned more often in campaigns in more deprived quartiles (see Table 2 for distributions and pairwise comparisons and eTable 6 in the Supplement for expected and observed counts).
The results of a generalized linear regression model with amount raised as the dependent variable are summarized in Table 4 and the Figure. After adjusting for potential effects of social media shares, campaign year, number of contributors, and goal amount on amount raised, we found results similar to those reported in univariable analyses (see eTable 7 in the Supplement for full regression model outputs). Campaigns in the most deprived NDI quartile group raised 26.07% (95% CI, –27.46% to –24.65%; P < .001) less than those in the least deprived NDI quartile group, those in the third NDI quartile raised 18.14% less than those in the least deprived quartile, and those in the second NDI quartile raised 13.39% less than those in the least deprived quartile. Campaigns that mentioned a beneficiary’s warmth raised significantly more than those that did not (13.80%; 95% CI, 12.30% to 15.26%; P < .001), as did campaigns that mentioned the beneficiary’s self-reliance (5.23%; 95% CI, 3.77% to 6.72%; P < .001) and bravery (15.40%; 95% CI, 14.11% to 16.65%; P < .001); the association between amount raised and gratitude was not significant. We found that greater detail regarding the beneficiary’s need was associated with a greater amount raised. Campaigns that mentioned the beneficiary’s cancer type (9.58%; 95% CI, 8.00% to 11.18%; P < .001), treatment type (6.58%; 95% CI, 5.44% to 7.79%; P < .001), insurance (1.39%; 95% CI, 0.20% to 2.63%; P = .02), or out-of-pocket costs (7.36%; 95% CI, 6.18% to 8.55%; P < .001) raised significantly more than campaigns that did not mention these needs (Table 4). Finally, campaigns that mentioned the beneficiary’s bravery or used militaristic metaphors raised significantly more money than campaigns without these features.
This cross-sectional analysis found that crowdfunding campaigns for cancer care expenses raised significantly more money when they were located in US counties with higher SES and when they described beneficiaries as worthy of donations (eg, as consistent with positive low-SES stereotypes, inconsistent with negative low-SES stereotypes, or aligned with cancer narratives of bravery and battle imagery).39,40 The use of these text features was unequally distributed across socioeconomic strata: campaigns in higher-SES counties tended to use these indicators more often than those in lower-SES counties. These findings suggest that online crowdfunding may exacerbate socioeconomic disparities in cancer care and also highlight the widespread nature of difficulty paying for cancer care. As cancer care costs continue to increase,41-43 FT resulting from cancer care poses a significant public health issue. Although lower-SES patients are at greater risk of FT, we found that these patients are the least likely to benefit from cancer crowdfunding as a way to mitigate it.
Although qualitative and theoretical work on medical crowdfunding has suggested that the practice may perpetuate socioeconomic health disparities,9,11,12,44 quantitative work on cancer crowdfunding is limited. One Canadian study of 1788 campaigns linked geospatial campaign counts with corresponding socioeconomic data and found a positive association between crowdfunding use and SES.13 Our study confirms their findings in a much larger sample of 144 061 US cancer crowdfunding campaigns. In addition to corroborating previous associations between high SES and cancer crowdfunding, our results demonstrate the importance of campaign text characteristics, particularly characteristics that portray beneficiaries as worthy. As with other indicators of digital and internet literacy,15,16,45,46 the use of personal marketing strategies was unevenly distributed across socioeconomic strata, often to the benefit of those in higher-SES areas. This supports previously noted concerns that medical crowdfunding disproportionately benefits those with the requisite internet literacy.9,11,12,44
Our findings are comparable to recent work by Cohen and colleagues47 focusing on explicit mentions of being insured and uninsured in US cancer crowdfunding campaigns. In their sample of 1035 campaigns, Cohen et al47 found no differences in amount raised by insurance status but did find higher goal amounts among uninsured and underinsured beneficiaries. We found similar results for goal amount, but we also found a small but significant association between mentioning insurance and amount raised. This discrepancy may be due to our broader definition of insurance mention but may also be related to our novel approach to web scraping and data reduction. Although previous studies have used GoFundMe.com’s internal search bar to find cancer crowdfunding campaigns,13,37,47 we leveraged the domain’s sitemap to collect all indexed campaigns. Because internal search bars commonly use search engine optimization to return results with greater engagement potential and more recent results, our method decreased the risk of this bias by collecting URLs directly from GoFundMe.com’s sitemap.
Although, to our knowledge, our study offers the largest and most comprehensive quantitative analysis of cancer crowdfunding campaigns in the US to date, it is limited in some respects. First, using automated text mining rather than manual coding may include campaigns in counts of text features when the use context of certain words does not align with the connotation we assumed. Similarly, we were unable to extract information regarding cancer stage at campaign initiation. A related concern is that crowdfunding campaigns are often initiated by someone other than the beneficiaries themselves. Future work would do well to use natural language processing techniques to explore this issue. Our automated approach to geocoding the data set carries similar limitations related to trade-offs between precision and quantity. Because geocoding relied on user-provided locations with varying levels of granularity, the smallest unit of geographic analysis for US Census data was the county. Further insights may be gleaned by collaborating with crowdfunding platforms to obtain primary data. Because we did not have access to primary data such as the campaign initiator’s internet protocol (IP) address, we were not able to compare the user-defined campaign location with the initiator’s location. Despite our novel approach to data collection, our sample is prone to biases in data availability. Earlier campaigns may be sparse because of users deleting campaigns in the time since initiation. Because campaigns can be listed as inactive for any reason, it is difficult to anticipate how these campaigns would influence results. We were able to recover data for some inactive campaigns by scraping archived GoFundMe.com webpages with the InternetArchive.org’s API. Future research regarding campaign closure, including time from initiation to closure and reasons for closure, may provide additional insights.
Furthermore, we did not collect other data from campaigns that may be informative, including the number of campaign updates, campaign comments from other users, and image data.
To our knowledge, this cross-sectional study is the first to provide large-scale quantitative support for the notion that cancer crowdfunding may perpetuate SES disparities in US cancer care. In particular, we found that crowdfunders with the digital literacy necessary to market beneficiaries as worthy and those in higher-SES areas raised significantly more money through online cancer crowdfunding than those without these advantages. Our results suggest that rather than remedying socioeconomic disparities in cancer care, online crowdfunding provides additional privileges to those with existing socioeconomic advantage, potentially exacerbating SES disparities and further marginalizing those most at risk of FT.
Accepted for Publication: September 25, 2020.
Published: December 3, 2020. doi:10.1001/jamanetworkopen.2020.26946
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Silver ER et al. JAMA Network Open.
Corresponding Author: Nicholas P. Tatonetti, PhD, Department of Biomedical Informatics, Columbia University Irving Medical Center, 622 W 168th St, PH20, New York, NY 10032 (firstname.lastname@example.org).
Author Contributions: Drs Hur and Tatonetti had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Ms Silver and Mr Truong contributed equally to this work.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: Silver, Truong, Ostvar, Hur.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: Silver, Truong, Hur, Tatonetti.
Statistical analysis: Silver, Truong, Ostvar.
Obtained funding: Tatonetti.
Administrative, technical, or material support: Hur, Tatonetti.
Supervision: Hur, Tatonetti.
Conflict of Interest Disclosures: None reported.