Assessment of Employee Susceptibility to Phishing Attacks at US Health Care Institutions

Key Points
Question: Are employees at US health care institutions susceptible to phishing attacks?
Findings: In this multicenter quality improvement study, more than 2.9 million simulated emails were sent to employees at 6 hospitals, with a median click rate of 16.7%. Repeated phishing campaigns were associated with decreased odds of clicking on a subsequent phishing email.
Meaning: Employees at US health care institutions may be susceptible to phishing emails, which presents a major cybersecurity risk to hospitals.


Introduction
The security of health care data and systems is rapidly emerging as a critical component of hospital infrastructure, and attacks on hospital information systems have had substantial consequences, with closed practices, canceled surgical procedures, diverted ambulances, disrupted operations, and damaged reputations.
Attacks against hospitals have been increasing, with substantial financial cost as well.
In a recent well-publicized example, a large hospital network was taken offline by a virus for almost 2 weeks, resulting in service disruption, patient confusion, and delays in radiation therapy, among other repercussions. Health care delivery has become increasingly dependent on integrated, complex information systems that are susceptible to disruption. Securing our health information systems is critical to safe and effective care delivery and is now of public health concern.
Phishing is the practice of deceiving individuals into disclosing sensitive personal information or clicking on links that introduce malicious software through deceptive electronic communication. Usually done via email, phishing is a common attack strategy against health care system employees and can be a remarkably accessible, low-cost, and effective way of obtaining real credentials to health care information systems or inducing employees to click on malicious software. Phishing emails can be realistic, and the sender's identity is frequently spoofed, or deliberately faked, so as to appear to be sent by a trusted individual or organization. Once an attacker has access to a system, they can steal personally identifiable information and sell it for profit, disrupt system availability, encrypt a database and demand a ransom payment to unlock it ("ransomware"), manipulate and falsify clinical data, or perform other malicious activities. A recent report indicated that 55% of physicians have experienced a phishing attack.
Employee awareness and training represent an important component of protection against phishing attacks. One method of generating awareness and providing training is to send simulated phishing emails to a group of employees and subsequently target educational material to those who inappropriately click or enter their credentials. For reference, 2 examples of phishing emails are listed in eTable 1 in the Supplement. The first email is a phishing simulation, and the second is an actual phishing email received at 1 of the participating institutions. As shown, the emails can be realistic and often appear to be sent by a trusted individual or member of the employee's organization. Phishing simulation is common in many industries and is also being used in health care, typically as a training and improvement initiative. The simulated emails are designed to be as close as possible to real phishing emails; if the simulated email is clicked, it is used as a real-time opportunity to provide short phishing education to the employee. Several vendors exist that offer phishing simulation as a service (eg, composing and sending the simulation emails, collecting employee responses, providing phishing training, and reporting on click rates to hospital leadership). In this context, we examined the practice of phishing simulation and the extent to which health care employees are vulnerable to phishing simulations and identified potential determinants of vulnerability to email phishing simulation.

Participants
In this retrospective, multicenter quality improvement study, we partnered with a sample of 6 US health care institutions that run phishing simulations using vendor- or custom-developed software tools. These institutions represent a diverse set of organizations across the entire spectrum of care and a range of US geographies, including institutions from the 4 US Census Bureau census regions; all had implemented an information security program. The identities of the specific institutions are anonymized herein for security and privacy concerns. Some participants were health care systems that operated multiple hospitals; in this case, we defined an institution as including multiple hospitals. More information about the institutions is listed in eTable 2 and eTable 3 in the Supplement. The Partners Healthcare Institutional Review Board determined the study to be exempt from review. The requirement of written informed consent was waived for the study. This study adhered to the Standards for Quality Improvement Reporting Excellence (SQUIRE) 2.0 reporting guideline.
Data collected from participating institutions included institution, content of the phishing email, the number of emails delivered, and the number of clicks. Collaborators provided their data per phishing campaign, where a campaign was defined as an email with specific content sent to a group of employees. Individual employee characteristics were not available, and responses of the same employee were not linked over time. No employees were excluded from phishing campaigns: all employees across all types of hospital roles (clinical and nonclinical) were eligible to receive the emails. One institution (site 2) ran several campaigns against small, targeted subsets of the population (eg, information security professionals). Because these campaigns were not general employee campaigns, they were excluded to increase generalizability.

Email Classification
Because different phishing emails might be more likely to be clicked based on their content, we classified all emails into 1 of the following 3 categories: office related, personal, or information technology (IT) related. These categories were generated by consensus among 3 of us (W.J.G., A.W., and A.B.L.). Emails were then separately classified by 2 of us (W.J.G. and A.W.), and disagreements were refereed by another of us (A.B.L.). Examples of each email category are listed in Table 1.
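The two-rater classification with a referee described above can be sketched in a few lines of Python. This is only an illustration of the adjudication rule, not the tooling the authors used; the function name `adjudicate` and the exact label strings are assumptions introduced here.

```python
from typing import Optional

# The 3 categories named in the text; the exact label strings are an assumption.
CATEGORIES = {"office related", "personal", "IT related"}


def adjudicate(rater_a: str, rater_b: str, referee: Optional[str] = None) -> str:
    """Return the final category for one email.

    If the two independent raters agree, their shared label is final.
    If they disagree, the referee's label decides.
    """
    for label in (rater_a, rater_b):
        if label not in CATEGORIES:
            raise ValueError(f"unknown category: {label!r}")
    if rater_a == rater_b:
        return rater_a
    if referee is None:
        raise ValueError("raters disagree; a referee label is required")
    if referee not in CATEGORIES:
        raise ValueError(f"unknown category: {referee!r}")
    return referee
```

For example, `adjudicate("personal", "IT related", referee="office related")` resolves to the referee's label, while two matching labels never consult the referee.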

Statistical Analysis
Institutions were anonymized (site 1 through site 6). The subsequent data set contained no institution- or person-identifiable information. We performed descriptive statistics on the institutions and phishing campaigns. We aggregated our data by institution and by campaign and calculated the proportion of emails that were clicked by employees, as well as the median click rates for each campaign. Multivariable logistic regression, with the use of a generalized estimating equation approach, was used to compute odds ratios (ORs) with 95% CIs for the odds that a phishing email would be clicked during a campaign. We used a generalized estimating equation approach with independence working correlation to obtain robust variance estimates because campaign click rates within an institution may be correlated. Covariates included year (2011-2018, centered on 0), the number of campaigns the institution had run before the phishing email being sent (institutional campaign number 1-5, 6-10, or >10), an indicator for anonymized institution, email category (office related, personal, or IT related), and season. All analyses were conducted using a software program (R; R Foundation for Statistical Computing).
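As a minimal sketch of the descriptive step only (not the regression, and not the authors' R code), the Python below computes a per-campaign click rate, an overall click rate, the median campaign click rate, and the campaign-number grouping used as a covariate. The campaign records are invented for illustration, and the helper names `click_rate` and `campaign_bucket` are assumptions.

```python
from statistics import median

# Hypothetical records: (site, institutional campaign number, delivered, clicked).
# These numbers are made up for illustration and are not study data.
campaigns = [
    ("site A", 1, 1000, 250),
    ("site A", 7, 1200, 180),
    ("site B", 12, 800, 40),
]


def click_rate(delivered: int, clicked: int) -> float:
    """Proportion of delivered emails that were clicked."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return clicked / delivered


def campaign_bucket(campaign_number: int) -> str:
    """Group the institutional campaign number as 1-5, 6-10, or >10."""
    if campaign_number < 1:
        raise ValueError("campaign numbers start at 1")
    if campaign_number <= 5:
        return "1-5"
    if campaign_number <= 10:
        return "6-10"
    return ">10"


# One click rate per campaign, then the median across campaigns.
rates = [click_rate(delivered, clicked) for _, _, delivered, clicked in campaigns]
median_rate = median(rates)

# Overall rate pools all emails, so large campaigns weigh more than small ones.
overall_rate = sum(c[3] for c in campaigns) / sum(c[2] for c in campaigns)
```

The median of per-campaign rates and the pooled overall rate answer different questions, which is why the two summary figures in a study like this one need not match.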

Results
A convenience sample of 6 US health care institutions provided data for the study.  Table 3.

Discussion
In this study of US health care institutions that run phishing simulations, overall click rates varied by institution but were notably high: on average, almost 1 in 7 simulated emails sent were clicked on by employees. In models adjusted for several potential confounders, including year, institutional campaign number, institution, and email category, the adjusted odds ratio for clicking on a phishing email was 0.511 for 6 to 10 campaigns at an institution and 0.335 for more than 10 campaigns at an institution, compared with 1 to 5 campaigns. We also found that there were important institutional differences in click rates, as well as differences in click rates between email category and season.
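To make the odds scale concrete, the sketch below converts a baseline click probability into the probability implied by a given odds ratio. It reads the 0.511 and 0.335 figures as adjusted odds ratios relative to the 1-5 campaign group, which is an interpretation rather than something this section states outright. The baseline probability in the usage note is hypothetical, and the function name `apply_odds_ratio` is an assumption.

```python
def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Return the click probability implied by scaling the baseline odds.

    odds = p / (1 - p); the odds ratio multiplies the odds, not the probability.
    """
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)
```

With a hypothetical baseline click probability of 0.20, calling `apply_odds_ratio(0.20, 0.511)` or `apply_odds_ratio(0.20, 0.335)` shows how an odds ratio below 1 translates into a lower, but not proportionally lower, click probability.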
Our study demonstrates that, similar to other industries, health care institutions conduct phishing simulations to raise awareness and identify employees who may benefit from education and training. We show herein that, under simulation, a large number of employees click on phishing emails, consistent with findings across other industries, where click rates can range from 13% to 49%, depending on industry.
We found that the odds of clicking on a phishing email decreased with greater institutional experience, which we hypothesize may be due to the benefit of running phishing simulation campaigns for employee education and awareness. In addition, we note that there is a wide range of click rates between simulated campaigns. We hypothesize that the range of click rates is due to a number of factors, including prior employee exposure to phishing simulations (eg, from previous employment), complexities of individual phishing emails, email timing, and institutional factors (eg, messaging), as well as individual, employee-level factors that we were unable to collect or control for, which will need further study.
Health care systems have been increasingly targeted by cyberattacks, either as part of larger international events (eg, WannaCry or NotPetya) or as direct targets themselves. Health care delivery organizations are critical infrastructure and are attractive targets for cybercriminals for several reasons, including the value of personal health data (ranging from $10 to $1000 per record in online marketplaces, depending on completeness), the criticality of services provided by hospitals, and an overall lack of information security processes. Phishing is an easily deployable attack strategy, largely because email is an easy access point to hospital employees, many of whom have credentials for several internal information systems (eg, electronic health records). In our experience, email addresses are easy to ascertain, either from published resources (journal articles, public websites, and social media) or through guessing (eg, firstname_lastname[at]hospital[dot]org). In addition, emails are frequently opened, regardless of sender. For example, more than one-third of sales and marketing emails are ultimately opened. The open rate may be even higher for emails that are not sales related.
Health care systems are also uniquely vulnerable to phishing attacks. Employee turnover at hospitals is high, and there is a constant influx of new employees (eg, trainees) who may have no prior cybersecurity training, which creates a continuous stream of newly susceptible employees. Hospital systems are vulnerable due to significant end point complexity, a term used to describe the large number of IT devices that could be targeted in an attack. For example, every employee smartphone that is connected to the network is a potential risk, as are other networked devices (eg, patient monitors, clinical workstations, tablets, and all of the core information systems already in use). In addition, hospital information systems are highly interdependent. An electronic health record is dependent on a laboratory information system to display clinical results. The laboratory information system, in turn, is dependent on a network connection to the laboratory analyzer system to process results. Attacking 1 system could significantly influence multiple downstream systems. Finally, locking down information systems is difficult. In a large health care system, there are typically a vast, heterogeneous, and distributed set of users that need access (eg, affiliated practices, state-level information exchanges, and reporting agencies). It only takes 1 successful phishing email, sent to 1 user, to shut down a critical system, potentially disrupting care across an entire organization.
There are many strategies for preventing or minimizing the consequences of phishing attacks. One strategy is to prevent phishing emails from being received or read in the first place (eg, using technology to filter emails based on patterns suspicious for phishing or modifying emails to indicate they are from external senders). A second strategy is to minimize the value of username and passwords, by requiring multifactor authentication (eg, a unique code generated by a smartphone application that must be entered to log in) or requiring special access controls for specific systems, so that credentials are less useful even if they are obtained. A third strategy is to foster employee awareness and training, and our results suggest that including phishing simulation campaigns as part of employee awareness or training may be helpful. There were several institution-level awareness efforts implemented in conjunction with phishing simulation campaigns. Some examples include distribution of antiphishing laptop decals and multilingual antiphishing posters, as well as phishing awareness in annual employee training programs. These are just some of the components of an information security program, and a robust plan needs to include multiple approaches.
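The first strategy mentions modifying emails to indicate that they come from external senders. A minimal sketch of that idea, using only Python's standard `email` package, is shown here. The internal domain `hospital.example`, the `[EXTERNAL]` tag, and the function name `tag_if_external` are all assumptions for illustration. A production mail gateway would not rely on the `From` header alone, since that header is exactly what phishing emails spoof.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Hypothetical configuration; a real deployment would load its own domain list.
INTERNAL_DOMAINS = {"hospital.example"}
TAG = "[EXTERNAL]"


def tag_if_external(msg: EmailMessage) -> EmailMessage:
    """Prefix the subject with a warning tag when the sender domain is not internal."""
    _, address = parseaddr(msg.get("From", ""))
    domain = address.rpartition("@")[2].lower()
    subject = msg.get("Subject", "")
    if domain not in INTERNAL_DOMAINS and not subject.startswith(TAG):
        # Headers are append-only, so remove the old subject before setting the new one.
        del msg["Subject"]
        msg["Subject"] = f"{TAG} {subject}".strip()
    return msg
```

Checking `subject.startswith(TAG)` keeps the function idempotent, so a message that passes through the filter twice is tagged only once.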

Limitations
There are several limitations to our study. First, we used a convenience sample of institutions, all of which have an information security organization mature enough to conduct phishing simulations. Although this sample is not representative of the entire US health care system, we have no reason to believe that the trends described herein would be different at other institutions. Furthermore, the click rate estimates may be conservative because systems with robust information security programs would likely have lower click rates than other institutions. Second, we did not have access to employee-level data (eg, to look at trends based on department, individual employees, or employee characteristics like age, sex, or role in the organization or to look at correlations between individuals because not all employees received all phishing simulation emails). Third, we did not adjust for additional factors that could influence click rates, such as campaign complexity, timing, and other institutional factors like intercampaign training programs or informal awareness efforts. Fourth, we are also unsure of the sustainability of click rate improvements over time.

Conclusions
In summary, current click rates in phishing simulations at US health care organizations indicate a major cybersecurity risk. These click rates highlight the importance of phishing emails as an attack vector, as well as the challenge of securing information systems. Repeated campaigns were associated with improved click rates, suggesting that simulated phishing campaigns are an important component of a proactive approach to reducing risk. It is necessary for all members of the health care community to understand this risk, particularly as safe and effective health care delivery becomes increasingly dependent on information systems.

Example phishing lure (table fragment): "We are currently updating our database and email center. All unused accounts will be deleted… If you are receiving this message, it means that your email address has been queued for deactivation…" Abbreviation: IT, information technology.

Notes
Emails were placed into 1 of 3 categories based on expert review. Shown are example lures from each of the categories, highlighting the type of content that is used to solicit further engagement with the phishing email from employees. Also shown are the number of campaigns from our sample that fell into each category.

Figure 1. Study Design and Data Acquisition
Data collected for each campaign included year of campaign, institutional campaign number, emails sent, emails clicked, and email category.

Boxplot of Campaign Click Rate Among 95 Simulated Phishing Campaigns, by Site
The click rate distribution is shown by site. Each site is an anonymized institution. Click rate is calculated as a proportion (total emails clicked divided by total emails delivered) across each campaign. The whiskers indicate the minimum and maximum values for each institution. The lower and upper borders of the box represent the first and third quartiles, respectively, while the line in the box represents the median.
Abbreviations: IT, information technology; NA, not applicable; OR, odds ratio.
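The elements of the boxplot described in this caption (minimum, first quartile, median, third quartile, maximum) can be computed for one site as sketched below. The click rates are invented for illustration, the function name `five_number_summary` is an assumption, and the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is a choice made here that may differ slightly from the one used by the plotting software behind the figure.

```python
from statistics import quantiles


def five_number_summary(click_rates: list[float]) -> dict[str, float]:
    """Return the values a min/max-whisker boxplot draws for one site."""
    if len(click_rates) < 2:
        raise ValueError("need at least 2 campaigns to compute quartiles")
    q1, q2, q3 = quantiles(click_rates, n=4, method="inclusive")
    return {
        "min": min(click_rates),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(click_rates),
    }


# Hypothetical per-campaign click rates for a single site (not study data).
site_rates = [0.05, 0.10, 0.15, 0.20, 0.25]
summary = five_number_summary(site_rates)
```

Because the whiskers here span the full minimum-to-maximum range, no campaign is drawn as an outlier, which matches the caption's description.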