Key Points
Do patient decisions about sharing their electronic health records and biospecimens for research vary according to health care institution, data or biospecimen item, patient characteristics, data recipient, and format in which consent choices are presented?
In this survey study of 1246 patients who completed a data and biospecimen sharing survey after being randomly assigned to 1 of 4 options with different layout and formats of indicating sharing preferences, patient preference for sharing compared with no sharing was significantly higher after controlling for covariates when presented with the opt-out compared with the opt-in format. The form layout (detailed vs simple) was not associated with the sharing decision.
The findings suggest that many patients may be willing to share data and biospecimens for research and that researchers’ affiliations, the design of consent forms, and patient age and health literacy are associated with patient sharing decisions.
Importance
Patients increasingly demand transparency in and control of how their medical records and biospecimens are shared for research. How much they are willing to share and what factors influence their sharing preferences remain understudied in real settings.
Objective
To examine whether and how various presentations of consent forms are associated with differences in electronic health record and biospecimen sharing rates and whether these rates vary according to user interface design, data recipients, data and biospecimen items, and patient characteristics.
Design, Setting, and Participants
For this survey study, a data and biospecimen sharing preference survey was conducted at 2 academic hospitals from May 1, 2017, to September 31, 2018, after simple randomization of patients to 1 of 4 options with different layout and formats of indicating sharing preferences: opt-in simple, opt-in detailed, opt-out simple, and opt-out detailed.
All participants were presented with a list of data and biospecimen items that could be shared for research within the same health care organization or with other nonprofit or for-profit institutions. Participating patients were randomly asked to select the items that they would share (opt-in) or were asked to select items they would not share (opt-out). Patients in these 2 groups were further randomized to select only among 18 categories vs 59 detailed items (simple vs detailed form layout).
Main Outcomes and Measures
The primary end points were the percentages of patients willing to share data and biospecimen categories or items.
Results
Among 1800 eligible participants, 1246 (69.2%) who completed their data sharing survey were included in the analysis, and 850 of these patients (mean [SD] age, 51.1 [16.7] years; 507 [59.6%] female; 677 [79.6%] white) responded to the satisfaction survey. A total of 46 participants (3.7%) declined sharing with the home institution, 352 (28.3%) with nonprofit institutions, and 590 (47.4%) with for-profit institutions. A total of 836 (67.1%) indicated that they would share all items with researchers from the home institution. When comparing opt-out with opt-in interfaces, all 59 sharing choice variables (100%) were associated with the sharing decision. When comparing simple with detailed forms, only 14 variables (23.7%) were associated with the sharing decision.
Conclusions and Relevance
The findings suggest that most patients are willing to share their data and biospecimens for research. Allowing patients to decide with whom they want to share certain types of data may affect research that involves secondary use of electronic health records and/or biosamples.
Use of personal data without explicit user consent has recently put technology companies in the public spotlight.1-4 In contrast, it appears that fewer concerns have been raised regarding the use of medical records and biospecimens,5 which are also sensitive, for secondary use purposes such as research. It is unclear whether this relative lack of concern arises because patients are generally unaware that their deidentified records are being made available to researchers,6 because they do not know that anonymized records can be traced back to individuals,7,8 or simply because there have not been many widely publicized incidents to date.9
Current laws and regulations require health care institutions to comply with a minimally necessary standard in sharing patient medical records and biospecimens for research.10 Some legislation11,12 regulates research reuse of patient medical records and biospecimens so that health care institutions can allow deidentified data sharing and identified data sharing (as long as proper institutional review board approvals have been obtained) unless the patient explicitly declines the use of data and biospecimens for purposes other than direct patient care. This all-or-nothing option is problematic because, alerted by recent high-profile cases, the increasing awareness among the general public regarding inappropriate reuse of personal data without explicit user consent13 may markedly change patients’ attitudes toward secondary use activities that involve their medical records and biospecimens (ie, patients may start denying research access to all of their data and biospecimens, even if they are willing to share most of their data and biospecimens or to share them only with certain types of institutions).14,15 The regulatory landscape is also changing; for example, in the United States as of September 23, 2013,16 newly enrolled patients, who need to sign a Health Insurance Portability and Accountability Act (HIPAA) authorization, must opt in to allow the use of their personal health information for optional substudies and future secondary use. In the European Union, the General Data Protection Regulation17 implemented in 2018 requires that patients provide consent for clinical data use for research. These issues point to a critical need to better understand patients’ sharing preferences.
Surveys using hypothetical scenarios have been conducted,18-20 but there has been a paucity of research studies involving electronic health record (EHR) and biospecimen sharing preferences applied in real settings.21 Tiered consent (ie, breaking the record into smaller units in a consent form and allowing partial use of the EHR) is not routinely available in practice today, limiting patients’ rights and participation in how their health data are being shared, while there is increasing evidence that what patients want to be asked6 and what they consider to be sensitive varies. In California, patient’s specific permission to share mental health, substance abuse, HIV status, and genetic information is required in HIPAA authorization forms, but no other items are specified.22 In many states, there is no requirement for patient’s specific permission for sharing these types of data items.10
Our study assessed patients’ preferences toward sharing specific data items in their EHRs and biospecimens with different types of researchers. We hypothesized that there would be different decisions for sharing, depending on researchers’ affiliations, patient characteristics, and the user interface design format of the consent form in which data sharing preferences were elicited. In this study, we randomly assigned patients to 1 of 4 types of preference elicitation forms to examine whether the form layout and opting-in or opting-out method were associated with patients’ sharing preferences.
Study Design and Population
For this survey study, patients were recruited from 2 academic medical centers in Southern California. They were approached by email invitation or in person in the waiting area of 10 adult outpatient clinics. Inclusion criteria were age of 18 years or older, being a patient at either academic medical center, and ability to read English or Spanish. Among 1800 eligible participants, 1582 signed a consent form to participate in this study. Of these, 1246 (69.2%) completed their data sharing survey and were included in the analysis, and 850 (68.2%) of these responded to the satisfaction survey. Although it was preferred that all research activities be conducted through the research website, the study provided an option to allow patients who did not have easy access to the internet to participate using paper forms. Preference elicitation and surveys were conducted from May 1, 2017, to September 31, 2018. Written informed consent was obtained from the web portal immediately after sign-up for online users and via a paper form for other users. Data were deidentified for the statistical analyses; however, we tracked the identities of the individuals to ensure their data selections were honored during the study and to know who completed the survey so that we could compensate them with a gift certificate. The institutional review boards of the University of California, San Diego and the University of California, Irvine approved this study. This study followed the American Association for Public Opinion Research (AAPOR) reporting guideline.
Study participants were invited to select their preferences for sharing their data and biospecimens for research use. The preferences for data sharing were honored by the institutions during the study period. Each participant also received periodic reports that listed research activities that involved secondary use of their medical records.
The list of data and biospecimens that a participant could choose to share or not share included 59 data and biospecimen items grouped into 18 categories (Box). This taxonomy was developed based on a pilot study21 and 5 focus groups of 18 patients who also provided input on how to best present the selection options on a computer screen and on paper.
Box. List of Data Elements and Categories
Substance abuse–related disease or condition
Mental health disease or condition
Sexual or reproductive disease or condition
Substance abuse–related disease or condition
Mental health disease or condition
Sexual or reproductive disease or condition
Mental health relateda
a Items and categories not previously included in our pilot study.21
Eleven data categories that encompassed 50 data items, 6 data categories (sexual life, pregnancy history, adoption history, body measurements, vital signs, and allergies) without detailed data items plus 3 biospecimen items grouped into 1 biospecimen category were available for selection (Box). The simple form contained 18 categories, and the detailed form contained 53 detailed items plus 6 data categories (ie, there were 59 sharable items in this detailed form). When 2 interventions (opting method and form layout) were combined, each participant was randomized using simple randomization to 1 of 4 conditions: (1) opt-in simple (n = 322), (2) opt-in detailed (n = 319), (3) opt-out simple (n = 298), and (4) opt-out detailed (n = 307).
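As an illustration only, the 2 × 2 assignment described above can be sketched as follows. The function name, the seed, and the use of Python's random module are assumptions made for this sketch; the source does not describe the software used for allocation.

```python
import random
from collections import Counter

# The 2 factors combined in the study: opting method and form layout.
OPTING_METHODS = ("opt-in", "opt-out")
FORM_LAYOUTS = ("simple", "detailed")


def assign_condition(rng):
    """Simple randomization: draw each factor independently with equal probability."""
    return (rng.choice(OPTING_METHODS), rng.choice(FORM_LAYOUTS))


rng = random.Random(2017)  # arbitrary fixed seed so the illustration is reproducible
assignments = [assign_condition(rng) for _ in range(1246)]  # 1246 = analyzed sample size
arm_counts = Counter(assignments)

for arm, n in sorted(arm_counts.items()):
    print(arm, n)
```

Because simple randomization uses no blocking, the 4 arms need not be equal in size.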
For the 4 groups, participants could indicate sharing preferences that could result in 8 combinations of 3 types of researcher’s affiliations (ie, the institution holding their EHRs and biospecimens, including home institution, nonprofit institution, and for-profit institution): (1) do not share, regardless of affiliation; (2) share with the home institution only; (3) share with nonprofit institutions only; (4) share with for-profit institutions only; (5) share with the home institution and nonprofit institutions; (6) share with the home institution and for-profit institutions; (7) share with nonprofit institutions and for-profit institutions; and (8) share with any researcher, regardless of affiliation.
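The 8 combinations listed above are the subsets of the 3 affiliation types (2 × 2 × 2 = 8). A minimal sketch that enumerates them, with hypothetical function and variable names:

```python
from itertools import combinations

AFFILIATIONS = ("home institution", "nonprofit institutions", "for-profit institutions")


def sharing_combinations(affiliations):
    """Return every subset of affiliations, from sharing with none to sharing with all."""
    subsets = []
    for size in range(len(affiliations) + 1):
        subsets.extend(combinations(affiliations, size))
    return subsets


combos = sharing_combinations(AFFILIATIONS)
for combo in combos:
    # The empty subset corresponds to "do not share, regardless of affiliation".
    print(" + ".join(combo) if combo else "do not share")
```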
There was no time limit to complete the sharing preferences, which could be changed over time (preferences submitted as of September 31, 2018, were considered in the analysis). Participants indicated their sharing preferences by selecting an item or category that they wanted to share when they received an opt-in form or unselecting what they did not want to share when they received an opt-out form. For the simple forms, when a category was selected, all items that belonged to that category were selected to compare individual items across groups. Participants could access information about which studies used or did not use their data and modify their future sharing choices at any time. The screenshots of our digital consent system are shown in eFigure 1 and eFigure 2 in the Supplement. Once the intervention period was over, a request to complete a satisfaction survey was sent to assess participant satisfaction with the study and to obtain information about self-reported sociodemographics. Participants had 3 months to complete this survey. Monthly reminder emails were sent, and participants were compensated with a $10 gift card for the completion of the sharing choice form and a $10 gift card for completing the satisfaction survey. They were not compensated when they made changes to previous selections.
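The way form responses translate into item-level sharing decisions can be sketched as below. This is a hedged illustration: the taxonomy is a made-up, abbreviated stand-in for the 18 categories and 59 items, and the function names and data layout are assumptions rather than the study's implementation.

```python
# Hypothetical, abbreviated taxonomy (category -> items); not the study's actual list.
TAXONOMY = {
    "category_a": ["item_a1", "item_a2"],
    "category_b": ["item_b1", "item_b2", "item_b3"],
    "category_c": ["category_c"],  # a category without detailed items
}


def expand(marked, layout, taxonomy):
    """Map marked entries to item level; on a simple form a category covers all its items."""
    if layout == "detailed":
        return set(marked)
    return {item for category in marked for item in taxonomy[category]}


def shared_items(marked, opting_method, layout, taxonomy):
    """Return the set of items shared, given the entries a participant marked."""
    all_items = {item for items in taxonomy.values() for item in items}
    marked_items = expand(marked, layout, taxonomy)
    if opting_method == "opt-in":
        return marked_items          # only actively selected items are shared
    return all_items - marked_items  # opt-out: everything except what was unselected


print(sorted(shared_items(["category_a"], "opt-in", "simple", TAXONOMY)))
print(sorted(shared_items(["item_b1"], "opt-out", "detailed", TAXONOMY)))
```

Resolving every response to the item level is what allows individual items to be compared across the simple and detailed groups.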
We implemented the Short Assessment of Health Literacy, which is designed to assess health literacy by measuring comprehension of the meaning and relation of 18 sets of keywords.23 A participant was deemed to have an adequate level of literacy if at least 15 items or 83.3% were answered correctly; otherwise, literacy was deemed to be inadequate according to the Short Assessment of Health Literacy evaluation criteria.23
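The classification rule above amounts to a single threshold. A minimal sketch, with hypothetical names, assuming the score is the count of correctly answered items out of 18:

```python
TOTAL_ITEMS = 18         # keyword sets in the assessment
ADEQUATE_THRESHOLD = 15  # minimum number correct for adequate literacy


def literacy_level(correct_answers):
    """Classify health literacy as 'adequate' or 'inadequate' from the number correct."""
    if not 0 <= correct_answers <= TOTAL_ITEMS:
        raise ValueError("correct_answers must be between 0 and 18")
    return "adequate" if correct_answers >= ADEQUATE_THRESHOLD else "inadequate"


print(round(100 * ADEQUATE_THRESHOLD / TOTAL_ITEMS, 1))  # 83.3
print(literacy_level(15), literacy_level(14))
```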
The homogeneity of the 4 randomization groups by variable of interest was assessed with the χ2 test on baseline characteristics. In a univariate analysis, for each of 59 sharable items, a 2 × 2 table was constructed using shared vs not shared as response to a binarized exposure variable (ie, exposure vs reference). An unadjusted odds ratio (OR) and its 95% CI were calculated. Assessed exposure variables included the elicitation form’s opting method (opt-out vs opt-in), form layout (detailed vs simple), patient’s age (≥60 vs <60 years), self-reported health status (very good or better vs worse than very good), health literacy (adequate vs inadequate), sex (female vs male), household income (≥US$125 000 vs <US$125 000 per year), race (white vs nonwhite), educational level (≥4-year college vs <4-year college), and site (site 2 vs site 1). A logistic regression was applied for the model-based adjusted OR after controlling for exposure variables as covariates. Statistical significance was determined by 95% CI of the OR for each sharing choice variable.
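For a single sharable item, the unadjusted OR and its 95% CI can be computed from the 2 × 2 table as sketched below. The source does not state which interval method was used; the log-based (Wald) interval here is an assumption, and the function names and example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (log-based) confidence interval.

    a: exposure group, shared        b: exposure group, not shared
    c: reference group, shared       d: reference group, not shared
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all 4 cells must be positive for the log-based interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def is_significant(lower, upper):
    """Significance as described in the text: the 95% CI excludes an OR of 1."""
    return lower > 1.0 or upper < 1.0


# Made-up counts for illustration only (not study data).
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f}); significant: {is_significant(lo, hi)}")
```

The adjusted ORs would instead come from a logistic regression with the other exposure variables as covariates, which this sketch does not cover.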
Of the 1800 patients eligible for this study, 1582 signed a consent form to participate in this study, 1246 (69.2% of eligible participants) who completed their data sharing preference surveys were included in the primary analysis, and 850 (68.2%) of these responded to the exit survey (507 [59.6%] female; 677 [79.6%] white; mean [SD] age, 51.1 [16.7] years). The participant recruitment and randomization processes are summarized in Figure 1. Randomization assignments and characteristics of the participants who completed sharing preferences and who completed the survey are given in Table 1. The higher number in the opt-in group reflects that only this option was available for the 40 participants who elected to use paper forms (simple or detailed). Of the 12 variables in Table 1, none differed significantly among the 4 randomized groups.
A total of 46 participants (3.7%) declined sharing with the home institution, 352 (28.3%) with nonprofit institutions, and 590 (47.4%) with for-profit institutions. A total of 291 patients (23.4%) were willing to share all items with any researcher, whereas 46 (3.7%) were not willing to share any items. The remaining 909 (72.9%) were willing to share selectively, meaning that they wanted to share at least 1 item with at least 1 type of institution with a general preference toward sharing within the institution in which the patient received care, followed by sharing with researchers from nonprofit institutions. A total of 836 patients (67.1%) were willing to share all items with researchers from the home institution.
The analyses based on affiliations focused on those who were not willing to share their data and biospecimens, those who would share their data and biospecimens with the home institution, those who would share their data and biospecimens with the home institution and nonprofit institutions, and those who would share their data and biospecimens with any researcher because the other combinations appeared rarely (4.4%).
Table 2 gives the data sharing preferences of all participants. Demographics, allergies, vital signs, and body measurements were among the items that the participants were most willing to share. Contact information, sexual history, adoption and pregnancy history, and income were the items that the participants were least willing to share.
The sharing preferences were associated with the form’s opting method (opt-out vs opt-in) but not with the layout (detailed vs simple). Participants were willing to share fewer items when they used the opt-in form (Figure 2). Differences according to opting method were significant for all 59 variables (100%). For form layout (eFigure 3 in the Supplement), however, only 14 variables (23.7%) had a significant association with sharing choices. Age of 60 years or older was associated with sharing selections for 56 variables (95%), and adequate health literacy was associated with sharing selections for all 59 variables (100%) (eFigure 4 and eFigure 5 in the Supplement). The associations of opting method with sharing decision remained significant with 1 exception (race) but decreased in magnitude (eFigure 6 in the Supplement) after adjusting for participants’ characteristics and the form layout (from an OR of 1.67 [95% CI, 1.08-2.62] to 1.53 [95% CI, 0.97-2.42]). The adjusted ORs of sharing compared with no sharing for the 59 variables were controlled for form layout, age, educational level, sex, health literacy, household income, self-reported health status, and site in a logistic regression model. For form layout, the number of variables that had a significant association with sharing decision decreased from 14 to 9 after adjusting for participants’ characteristics and the opting method (eFigure 7 in the Supplement).
Participants who were 60 years or older or who were deemed to have an adequate health literacy level were willing to share more items than were their counterparts (eFigure 4 and eFigure 5 in the Supplement). The ORs for all items were greater than 1 and statistically significant, except for sexual life (OR, 1.39; 95% CI, 1.00-1.95), adoption history (OR, 1.37; 95% CI, 0.99-1.92), and pregnancy history (OR, 1.37; 95% CI, 0.98-1.93). Household income, educational level, sex, perceived health status, race, and site were not associated with a higher level of sharing for most variables (eFigures 8-13 in the Supplement).
A total of 850 participants (68.2%) completed the satisfaction survey. Of these, 815 (95.9%) could understand the data or information presented in the forms, whereas 17 (2.0%) thought that the choices in the forms were inadequate. A total of 837 (98.4%) enjoyed participating in the study. A total of 517 (60.8%) indicated that having a detailed form layout to make selections had no influence on their sharing decisions, 288 (33.9%) indicated that it made them more willing to share, and 27 (3.2%) indicated that it made them less willing to share their data and biospecimens. The remaining 18 (2.1%) did not answer this question. Consistent with previous findings,21 637 respondents (74.9%) were highly interested in knowing who would use the data or biospecimens, and 683 (80.3%) were also equally willing to share their data and biospecimens for research and health care.
The finding in this study that most patients were willing to share data from their EHRs and biospecimens with researchers is reassuring. Not only can biomedical research benefit from these resources but also a multisite learning health care system5,24 can continuously advance as a result of data-driven improvements to processes and associated outcomes. The finding that 955 participants (76.6%) made sharing choices to select at least 1 item that they did not want to share with a particular type of researcher is important when considering that this item might lead to a decision to decline sharing of the whole record if only an all-or-nothing option is available. This finding is important because the item to withhold may not be of relevance to a certain study, but the current all-or-nothing option, if chosen, would remove that patient’s data from all research studies. The finding that 291 participants (23.4%) were willing to share all items with everyone can help plan for studies based on EHRs and biospecimens that are expected to be shared with a broad range of researchers. The finding that only a few participants (46 [3.7%]) were not willing to share any item is also reassuring. Opt-in forms appear to be the most conservative opting method to obtain sharing preferences, resulting in less sharing.
An important finding of this study is that most participants indicated at least 1 item that should not be shared. There was a preference to share the data and biospecimens within the institution in which the patient received care, followed by nonprofit institutions. In a system in which people can choose where to receive care, it seems plausible that a patient selects to receive care in the most trusted institution, and this trust may more easily transfer to the care of data and biospecimens.
The reluctance to share data and biospecimens with researchers from for-profit institutions needs further investigation because the category aggregates highly different industries and further refinement might reveal subgroups that have higher association with declining to share than others. Strategies to convey how data and biospecimens are being used or will be used for research that includes the development of commercial products to improve health outcomes need to be developed and implemented so that patients can provide consent that is truly informed.
In addition, studies that require permission to use the whole EHR for research may consider provisions for participants to decline sharing of specific items and for participants to specify the types of researchers who should be authorized to work with their data. This approach may increase participation and satisfaction.
This study has some limitations. Patients who elect to receive care at academic medical centers may be more familiar with research and more willing to share their data and biospecimens for research than patients who receive care in other types of institutions. In addition, health literacy in general was relatively high in our sample, so the results may be optimistic. However, this optimistic figure may be counterbalanced by the fact that patients who participated in our study may be more concerned about data and biospecimen sharing than those who declined participation. Thus, our recruitment may have selected for individuals who would in general be more concerned about the privacy of their data and biospecimens and tended to remove more items than would those who did not want to participate in this study. There could also be geographic factors: both institutions were located in California, where privacy protections for EHRs are higher than in many other states and in which biomedical and data science research are prominent. However, these limitations do not detract from the findings that it was practical to implement a system that used patient data and biospecimen sharing preferences to guide services that make these resources available for research and that most patients were willing to share their EHR data and biospecimens for research.
We found that a tiered-permission system that allowed for specific removal of data items or categories of data could be implemented in practice and that it mattered to participants with whom the EHR data and biospecimens would be shared because there were differences in sharing preferences according to the researchers’ affiliations. Participants appreciated being asked about their data and biospecimen sharing preferences. We also found that the way in which sharing preferences were elicited mattered to patients. In this study, data and biospecimen sharing preferences were equivalent across institutions but were different according to the opting method (an opt-out version was associated with more sharing than an opt-in version). A simple form layout that displays data categories was associated with sharing preferences that were equivalent to those elicited from a detailed form layout that displays specific data items.
Accepted for Publication: June 29, 2019.
Published: August 21, 2019. doi:10.1001/jamanetworkopen.2019.9550
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2019 Kim J et al. JAMA Network Open.
Corresponding Author: Lucila Ohno-Machado, MD, PhD, Department of Biomedical Informatics, UC San Diego Health, University of California, San Diego, 9500 Gilman Dr, Mail Code 0728, La Jolla, CA 92093 (firstname.lastname@example.org).
Author Contributions: Mr J. Kim and Dr H. Kim contributed equally to this work. Mr J. Kim and Dr Ohno-Machado had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: J. Kim, H. Kim, Bath, Jiang, Ohno-Machado.
Acquisition, analysis, or interpretation of data: J. Kim, H. Kim, Bell, Bath, Paul, Pham, Zheng, Ohno-Machado.
Drafting of the manuscript: J. Kim, H. Kim, Bell, Bath, Paul, Pham, Ohno-Machado.
Critical revision of the manuscript for important intellectual content: J. Kim, H. Kim, Bath, Jiang, Zheng, Ohno-Machado.
Statistical analysis: J. Kim, Bath, Jiang.
Obtained funding: H. Kim, Ohno-Machado.
Administrative, technical, or material support: J. Kim, Bell, Bath, Paul, Pham, Zheng, Ohno-Machado.
Supervision: J. Kim, H. Kim, Zheng, Ohno-Machado.
Conflict of Interest Disclosures: Ms Paul and Dr Ohno-Machado reported receiving grants from the National Institutes of Health during the conduct of the study. No other disclosures were reported.
Funding/Support: This work was supported by grants R01HG008802 (Drs H. Kim, Jiang, and Ohno-Machado) and UL1TR001442 from the National Institutes of Health.
Role of the Funder/Sponsor: The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and the decision to submit the manuscript for publication.
Additional Contributions: Mayara Figueiredo, MS (University of California, Irvine), Cinnamon Bloss, PhD (University of California, San Diego [UCSD]), Pamela Sankar, PhD (University of Pennsylvania), Mildred Cho, PhD (Stanford University), Mark Yarborough, PhD (University of California, Davis), Ken Goodman, PhD (University of Miami), and Malia Fullerton, DPhil (University of Washington), provided input in this project. Ms Figueiredo participated in a paid internship at UCSD and performed some statistical analyses as part of her internship project. Dr Bloss was a paid coinvestigator for this project and organized a workshop at UCSD in which Drs Sankar, Cho, Yarborough, Goodman, and Fullerton examined the protocols and materials of Informed Consent for Clinical Data and Sample Use for Research (iCONCUR) and discussed ethical aspects related to the implementation of the system. They were reimbursed for their travel expenses.
LA. Challenges and opportunities in secondary analyses of electronic health record data. In: Secondary Analysis of Electronic Health Records. Basel, Switzerland: Springer International Publishing; 2016:17-26. doi:10.1007/978-3-319-43742-2_3
L. Identifying inference attacks against healthcare data repositories. AMIA Jt Summits Transl Sci Proc. 2013;2013:262-266.
PH. Giving patients granular control of personal health information: using an ethics ‘points to consider’ to inform informatics system designers. Int J Med Inform. 2013;82(12):1136-1143. doi:10.1016/j.ijmedinf.2013.08.010
C. Sharing electronic health records: the patient view. Inform Prim Care. 2006;14(1):55-57.
J. Privacy and Security Solutions for Interoperable Health Information Exchange Report on State Law Requirements for Patient Permission to Disclose Health Information. Chicago, IL: RTI International; 2009.