Association Between Lottery Prize Size and Self-reported Health Habits in Swedish Lottery Players

This quasi-experimental cohort study uses data from 3 Swedish lotteries to assess whether the magnitude of the prize won is associated with long-term health behaviors and overall health.

1. We dropped prizes if the winning player's personal identification number ("PIN") could not be reliably determined or if key covariates (e.g., information about the number of tickets owned in Kombi) were missing.
2. From each of the two Triss samples, we dropped subjects for whom we had indications that the winning ticket was jointly owned. Such prizes constitute ~7% of the sample (for details on joint ownership, see Section IV in the Online Appendix of our previous study 3). We also dropped a small number of Triss players who won multiple prizes under the same prize plan.
3. We restricted the sample to prizes won by players aged 18 or above at the time of win and who were at most 75 years of age at year-end in 2016 (the year of the survey). The upper age limit is motivated by evidence that survey nonresponse increases with age. 6 We also dropped players deceased by 2011 (the last year for which we have data on mortality).
4. For each large-prize event in Kombi, we sought to identify suitable experimental controls. A non-winning player was deemed a suitable control if their sex, year of birth and number of tickets owned (in the month of win) were identical to those of the winner. For three large-prize winners, we were unable to identify four controls satisfying these criteria; we therefore dropped them.

5. The above restrictions left us with 259 large prizes from Kombi, 3,294 Triss-Lumpsum prizes and 608 Triss-Monthly prizes. We supplied information about these winners to Statistics Sweden, who dropped prizes won by individuals who were deceased or lacked an official Swedish address of residence in 2016. These restrictions leave us with 241 Kombi prizes, 3,065 Triss-Lumpsum prizes and 570 Triss-Monthly winners.
6. In a final step, we added four experimental controls for each large Kombi prize.

Surveyed players were initially mailed a letter of invitation accompanied by the eight-page survey, a return envelope, and a 100 SEK gift certificate included to encourage survey participation. The cover letter explained that subjects who chose to return the mail-in survey were also consenting to having their survey responses linked to administrative registers about socioeconomic outcomes.
As a condition for conducting the survey, Statistics Sweden required that information about these registers be provided to interested subjects, along with information about the selection of the Survey Population. We did not wish to make salient to subjects why they were being surveyed, out of fear that any mention of lotteries might color their responses. Therefore, the cover letter made no reference to individual lotteries or to the administrative lottery sample from which we had identified the Survey Population. Subjects interested in more information about the administrative registers or the selection of the Survey Population were instead referred to a website. Unbeknownst to the subjects, each letter's website URL was unique, and the final data delivered to us therefore contains information about which subjects accessed the website. Only six subjects did, so any biases are likely to be negligibly small.
According to the survey protocol agreed to with Statistics Sweden, subjects who failed to return a survey after the first mailing were sent three reminder letters, the last two of which also included new paper copies of the survey. In a next step, subjects in the Triss-Monthly sample who failed to return a survey after the third reminder were also contacted by telephone and asked to return the survey. Subjects who acquiesced were mailed a new copy of the survey if required. Efforts to establish contact ceased after four calls. For budgetary reasons, we restricted the phone-call reminders to players in the Triss-Monthly sample (observations from this lottery contribute more treatment variation, on average, and are hence more valuable in terms of improving the precision of our estimates).
Respondents Sample. Three weeks after the end of the regular data-collection, Statistics Sweden conducted a follow-up study on a randomly selected subset of 501 subjects who failed to return a mail-in survey. Each subject was invited to participate in an abbreviated version of the survey via the telephone. The phone interview was designed to take six minutes. Statistics Sweden made five attempts at contact before abandoning efforts to obtain responses to the abbreviated telephone version of the survey.
In total, the survey attained a response rate of 69% (see eTable 2 for response rates by lottery).
Here and in what follows, we refer to the survey participants as our Respondents Sample. In eTable 2, we show the distribution of lottery prizes overall and by lottery, in both the Survey Population and the Respondents Sample. All prize amounts are net of taxes and measured in units of year-2011 dollars. For comparability, the Triss-Monthly prizes are converted to a net-present-value amount. Even though the Survey Population is approximately a 1% subsample of the "pooled lottery sample" analyzed in our previous study, 3 the oversampling of large-prize winners allowed us to retain about half of the identifying variation in the original administrative sample.
In addition to the survey data, the data set Statistics Sweden ultimately delivered to us also contains a number of demographic variables from administrative registers and some lottery-specific variables needed to construct the group identifiers. The administrative variables are shown in Panel A of eTable 3 and are available also for survey nonrespondents. Panel B instead reports a set of baseline characteristics used throughout several analyses.
To reduce concerns about investigator degrees of freedom in the selection of our estimation sample, we also defined the procedures by which we would select our estimation sample in each analysis. Following the Analysis Plan, all our analyses are conducted in the largest attainable sample of individuals with non-missing outcome data and baseline characteristics in the year prior to the lottery. (In the Analysis Plan, we noted that under some scenarios, the three diagnostic tests described below could produce results that may justify ex-post changes to the sample-selection criteria, but we committed a priori to clearly describe any such revisions as departures from the original strategy in the eventually published study. Fortunately, no such revisions were deemed necessary and all analyses in the main text were conducted in estimation samples constructed exactly following our pre-registered procedures.)

Representativeness
eTable 4 compares the distributions of demographic characteristics of players in the Respondents Sample, overall as well as by lottery, the Survey Population and a representative sample that has been reweighted to match the sex and age distribution of the Respondents Sample. The demographic characteristics are the baseline characteristics defined in eTable 3, albeit with age and age squared replaced by year of birth. The baseline characteristics of lottery players are measured at year-end in the year prior to the lottery and are similarly distributed in the Respondents Sample and the Survey Population. Column (6) also shows the baseline characteristics for a random sample of Swedish adults drawn in 2010 after it has been reweighted to match the sex and age distribution of the Respondents Sample. Columns (4) and (6) suggest that in terms of these observable baseline characteristics, the two samples are similar.
A previous analysis 3 of the administrative sample from which the Survey Population was drawn also found that, adjusting for compositional sex and age differences, the health-care utilization and mortality of lottery players resemble those of the Swedish population as a whole.
In additional analyses, we calculated the prevalence of 35 ailments, diseases and health conditions included in our survey and compared the results to a representative sample of Swedes. (Further information about the 35 conditions is available in the description of our Health Index below.) eTable 5 reports the prevalence of each condition in the Respondents Sample and the 2010 wave of the Swedish Level of Living Survey (SLLS). The SLLS numbers are calculated in a sample that has been reweighted to match the sex and age distribution of the Respondents Sample. The comparison is subject to some interpretational caveats. First, the questions used in our survey were not formulated identically to those used by the SLLS. Second, the SLLS questions offered several response alternatives designed to measure the intensity of any symptoms, whereas our survey questions did not. Third, the SLLS data are based on face-to-face interviews conducted in 2010, whereas our data are based on a mail-in survey administered in 2016. Prevalence is greater in the Respondents Sample for 22 out of 35 conditions and smaller for the remaining 13 conditions. In most cases, however, the differences are small in magnitude.
To further assess representativeness, we used publicly available data from the 2016 wave of the Swedish Public Health Survey, a representative survey whose methodology is similar to ours. We compared the prevalence of nine indicator variables constructed from questions that were identical across the two surveys. In these analyses, we continue to reweight the data from the representative sample to match the sex and age distribution of our Respondents Sample. The publicly available summary statistics only contain data for four sex and age categories, so the reweighting is unfortunately coarser than in other analyses. eTable 6 suggests that members of the Respondents Sample have somewhat worse health and health habits overall.

Primary Outcomes
The primary outcomes and their pairwise correlations are summarized in eTable 7. Our six primary outcomes were defined as follows.
Subjective Health. This variable is based on the subject's response to the question "How do you judge your overall state of health?". Subjects are offered five response categories, ranging from "Very Poor" to "Excellent". We assigned a numerical value of 1 for subjects who selected "Very Poor", 2 for "Poor" and so on up to 5 for "Excellent".
Health Index. Our survey contains a question adapted from the SLLS. 7 The original survey lists 51 health conditions (an ailment, disease or symptom) and asks the respondent to indicate whether or not they have suffered from each of the conditions in the past 12 months (and if yes, how much). To economize on survey space, we excluded the 16 conditions with the smallest pairwise correlations with subjective health. We asked subjects to indicate whether they suffered from each of the conditions, but not about the severity of any symptoms.
The 35 health conditions we retained in the survey are listed in eTable 5. In addition to these 35 categories, the survey question contains an additional response: "None of the above ailments or diseases during the past twelve months". Respondents who did not check any of the boxes (including the "None of the above…." option) are treated as missing in our analyses. The health index variable, whose construction is described below, is defined for all other respondents. (Table VI in the Analysis Plan incorrectly listed genital discomfort as one of the 35 conditions covered by our survey and inadvertently omitted nausea.)
Our procedure for aggregating the 35 conditions into a health index is similar to that used by Lindahl. 8 Specifically, we regress Subjective Health on indicator variables for each of the 35 conditions, a cubic in age, sex, and sex-by-age interactions. We restrict the estimation sample to individuals aged 18-75 at the time of the survey. We then use the coefficients estimated from SLLS to predict each respondent's subjective health from their covariates. Our final index is simply this predicted value after it has been standardized to have mean zero and unit variance in our estimation sample.
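The aggregation just described can be illustrated with a minimal sketch. It assumes the regression is ordinary least squares, that "sex-by-age interactions" means sex interacted with each term of the age cubic, and that age is rescaled for numerical stability; the function and argument names are hypothetical and not taken from the study's code.

```python
import numpy as np

def design_matrix(conditions, age, female):
    """Intercept, condition indicators, cubic in age, sex, sex-by-age terms."""
    conditions = np.asarray(conditions, dtype=float)
    age = np.asarray(age, dtype=float) / 100.0  # rescaled for stability (assumption)
    female = np.asarray(female, dtype=float)
    age_terms = np.column_stack([age, age**2, age**3])
    return np.column_stack([
        np.ones(len(age)), conditions, age_terms,
        female, female[:, None] * age_terms,
    ])

def health_index(ref_conditions, ref_age, ref_female, ref_subjective,
                 conditions, age, female):
    """Fit weights in a reference sample, predict in the survey sample, standardize."""
    x_ref = design_matrix(ref_conditions, ref_age, ref_female)
    coef, *_ = np.linalg.lstsq(x_ref, np.asarray(ref_subjective, dtype=float),
                               rcond=None)
    predicted = design_matrix(conditions, age, female) @ coef
    # Standardize to mean zero and unit variance in the estimation sample.
    return (predicted - predicted.mean()) / predicted.std()
```

In this sketch the reference arguments play the role of the SLLS data and the remaining arguments the role of the survey respondents.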
Smoking. Our survey asks respondents to state whether they smoke daily, occasionally or never. Daily smokers are also asked how many cigarettes they smoke per day. Our primary outcome is the number of cigarettes the respondent smokes per day (responses were restricted to be a positive integer below 100). For daily smokers, we set this variable equal to the number of cigarettes smoked per day, censoring the variable at 60, a threshold corresponding to three packs of cigarettes per day. For occasional smokers, we set the variable equal to 1 cigarette per day (thus ensuring that no occasional smoker has a recorded smoking quantity strictly greater than the minimum quantity a daily smoker can report). For never smokers, we set the variable's value equal to 0.
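The coding rule for the smoking outcome can be written as a short function. This is a sketch only: the status labels and the treatment of unanswered items as missing are assumptions, and the function name is hypothetical.

```python
from typing import Optional

def cigarettes_per_day(status: Optional[str],
                       daily_count: Optional[int] = None) -> Optional[int]:
    """Code the smoking outcome from smoking status and reported daily quantity."""
    if status == "never":
        return 0
    if status == "occasionally":
        return 1  # occasional smokers are assigned one cigarette per day
    if status == "daily":
        if daily_count is None:
            return None  # assumption: missing quantity yields a missing outcome
        return min(daily_count, 60)  # censor at 60 (three packs per day)
    return None  # assumption: unanswered status yields a missing outcome
```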
Alcohol. This variable is the respondent's score on a 3-item screening test for problem drinking. We use a previously described procedure to determine each respondent's score. 9 The first question asks respondents how often they had an alcoholic beverage last year. Possible answers are never (0 points), monthly or less (1 point), 2 to 4 times a month (2 points), 2 to 3 times a week (3 points) or 4 or more times a week (4 points). The second question asks respondents how many drinks they consumed on a typical drinking day during the past year, and respondents were shown a picture of what a standardized drink refers to. Responses are coded as follows: 1 to 2 drinks (0 points), 3 to 4 drinks (1 point), 5 to 6 drinks (2 points), 7 to 9 drinks (3 points), or 10 or more drinks (4 points). The third question asks how often the respondent had 6 or more drinks on one occasion during the last 12 months. Responses were less than monthly (1 point), monthly (2 points), weekly (3 points), or daily or almost daily (4 points). With one exception, the primary outcome measure is the sum of points from the three questions, i.e. a possible score between 0 and 12. The exception is that individuals who answer "never" to the first question are scored as zero (regardless of their responses to the remaining questions). Respondents whose response to the first question indicates some drinking are coded as missing unless they respond to both follow-up questions.
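The scoring rule, including the exception for never-drinkers and the handling of missing follow-up items, might be sketched as follows. The inputs are the item points already assigned as described above (or None if unanswered); treating an unanswered first question as missing is an assumption, and the function name is hypothetical.

```python
from typing import Optional

def alcohol_score(frequency: Optional[int],
                  typical_quantity: Optional[int],
                  heavy_episodes: Optional[int]) -> Optional[int]:
    """Sum the item points of the 3-item screening test."""
    if frequency is None:
        return None  # assumption: no answer to the first question -> missing
    if frequency == 0:
        return 0  # "never" is scored zero regardless of the other items
    if typical_quantity is None or heavy_episodes is None:
        return None  # drinkers must answer both follow-up questions
    return frequency + typical_quantity + heavy_episodes
```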
Physical Activity. Our survey contains two questions about physical activity inspired by the International Physical Activity Questionnaire (IPAQ). The short IPAQ questionnaire asks respondents to indicate how many minutes per week they spend on nine separate activities. 10 Physical activities are weighted by their metabolic equivalent (MET) to form an overall measure of the total number of MET-minutes per week. Our first question asks how much time the respondent spends exercising during a regular week and the second how much is spent doing moderately physically demanding everyday activities like walking and biking. We assume that the MET values for the activities in the first and second questions are 8 and 4, respectively. To make it easier for respondents to answer the questions, we did not allow open-ended answers and provided a number of response alternatives, e.g. 0, 0-29, 30-59, 60-89, 90-119, 120 minutes or more for the question about exercise. We translate each response into minutes using the midpoint of these intervals. For the highest choice alternative, we add half the width of the second-highest alternative to its lower bound, i.e. 120 + 15 minutes for exercise. If a respondent has answered only one of the two questions, the variable is coded as missing. The resulting measure of MET-minutes per week is our primary outcome.
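As an illustration of the interval-to-minutes conversion and the MET weighting, consider the sketch below. The bins for the exercise question are those listed above; the bins for the everyday-activity question are not stated in this section, so reusing the same bins for it is purely an assumption, and all names are hypothetical.

```python
from typing import Optional

# (lower, upper) bounds in minutes; None marks the open-ended top alternative.
EXERCISE_BINS = [(0, 0), (0, 29), (30, 59), (60, 89), (90, 119), (120, None)]
EVERYDAY_BINS = EXERCISE_BINS  # assumption: second question's bins not given here

def bin_to_minutes(index: int, bins) -> float:
    """Interval midpoint; top alternative = lower bound + half the previous width."""
    lower, upper = bins[index]
    if upper is not None:
        return (lower + upper) / 2
    prev_lower, prev_upper = bins[index - 1]
    return lower + (prev_upper + 1 - prev_lower) / 2  # e.g. 120 + 15

def met_minutes(exercise_bin: Optional[int],
                everyday_bin: Optional[int]) -> Optional[float]:
    """MET-minutes per week with MET weights of 8 (exercise) and 4 (everyday)."""
    if exercise_bin is None or everyday_bin is None:
        return None  # missing unless both questions are answered
    return (8 * bin_to_minutes(exercise_bin, EXERCISE_BINS)
            + 4 * bin_to_minutes(everyday_bin, EVERYDAY_BINS))
```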
Healthy Diet. This variable is derived from responses to three questions about dietary habits. The first question asks respondents how often they eat vegetables (excluding potatoes). The second asks how often they consume soda and other sweet drinks and the last question how often they eat seafood. Each question has between 5 and 7 response alternatives. For the questions about seafood and vegetables, we assign the number 0 to the response indicating the lowest frequency of consumption, 1 to the response indicating the second lowest frequency, and so on. For soda, we proceed the same way, but reverse-code the responses so that higher values denote lower consumption of soda. We subsequently standardize the three numerical variables. Our final index is the sum of the three (standardized) variables. For subjects with exactly one missing item-level response, we replace the missing value by the question-specific mean before calculating the value of the index. We set the index to missing if at least two of the questions have missing values.
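The construction of the diet index can be sketched in a few lines. The sketch assumes each item is standardized using its non-missing responses and that imputing the question-specific mean is equivalent to imputing zero on the standardized scale; the order of imputation and standardization is not fully specified above, so this is one possible reading. Names are hypothetical.

```python
import numpy as np

def healthy_diet_index(vegetables, seafood, soda, soda_max):
    """Sum of three standardized items; soda is reverse-coded first.

    Items are coded 0, 1, 2, ... from lowest to highest frequency, with
    np.nan for missing responses. soda_max is the highest soda code.
    """
    vegetables = np.asarray(vegetables, dtype=float)
    seafood = np.asarray(seafood, dtype=float)
    soda_reversed = soda_max - np.asarray(soda, dtype=float)
    items = np.column_stack([vegetables, seafood, soda_reversed])
    z = (items - np.nanmean(items, axis=0)) / np.nanstd(items, axis=0)
    n_missing = np.isnan(z).sum(axis=1)
    # The item mean is zero after standardization, so mean imputation fills 0.
    index = np.where(np.isnan(z), 0.0, z).sum(axis=1)
    index[n_missing >= 2] = np.nan  # two or more missing items -> missing index
    return index
```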

Statistical Inference
Throughout, we report p-values based on analytical standard errors that have been clustered 11 at the individual level. In our main analysis of the primary outcomes, we also report permutation-based p-values constructed by simulating the distribution of the coefficient estimates under the null hypothesis of no association. In each simulation iteration, we independently permute the prize column in each group. We next use our estimating equation to generate an estimate of the prize coefficient and its standard error. Repeating this process 10,000 times gives us a simulated distribution that we use to calculate the probability of observing a test statistic as extreme as the one observed under the null hypothesis. Finally, in our main analyses of the primary outcomes, we also report p-values that have been adjusted to account for the fact that we examined six primary outcomes. To calculate these familywise error rate adjusted p-values, we apply the free step-down resampling method of Westfall and Young. 12 In the tables, we refer to the resulting p-values as FWER-adjusted p-values.
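The within-group permutation procedure can be illustrated with the simplified sketch below. It is not the study's estimating equation: it omits baseline controls, absorbs the group identifiers by demeaning within groups, and uses the absolute coefficient rather than a studentized statistic as the test statistic. All names are hypothetical.

```python
import numpy as np

def demean_by_group(x, codes, n_groups):
    """Subtract the group mean from each observation."""
    sums = np.bincount(codes, weights=x, minlength=n_groups)
    counts = np.bincount(codes, minlength=n_groups)
    return x - (sums / counts)[codes]

def permutation_p_value(y, prize, groups, n_iter=10_000, seed=0):
    """Two-sided p-value from permuting prizes independently within each group."""
    y = np.asarray(y, dtype=float)
    prize = np.asarray(prize, dtype=float)
    _, codes = np.unique(np.asarray(groups), return_inverse=True)
    n_groups = int(codes.max()) + 1
    y_d = demean_by_group(y, codes, n_groups)

    def slope(p):
        # Coefficient on prize in a regression with group fixed effects.
        p_d = demean_by_group(p, codes, n_groups)
        return float(p_d @ y_d) / float(p_d @ p_d)

    observed = abs(slope(prize))
    members = [np.flatnonzero(codes == g) for g in range(n_groups)]
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_iter):
        shuffled = prize.copy()
        for idx in members:
            shuffled[idx] = rng.permutation(prize[idx])
        if abs(slope(shuffled)) >= observed:
            exceed += 1
    return exceed / n_iter
```

Permuting only within groups preserves the feature that prizes are compared among players in the same group, which is why the shuffle is done group by group rather than across the whole sample.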

Survey Nonresponse and Tests of Endogenous Attrition
A potentially serious concern is the possibility that the lottery outcome is related to the likelihood that a respondent agrees to participate in the survey. Such endogeneity can lead to violations of the key identifying assumption for causal inference (i.e., that the treatment is independent of potential outcomes conditional on our group-identifier effects) in the Respondents Sample even if it holds in the Survey Population. To test for selection biases, we prespecified three distinct tests for endogenous selection. Below, we present these tests and the results.

Diagnostic Test 1. Association between Wealth and Survey Participation.
In our first diagnostic test, we examined whether survey participation was associated with prize amount. The results from this test are shown in eTable 8. The first two columns report coefficient estimates from a regression of an indicator variable equal to 1 for subjects who returned a mail-in survey and 0 for subjects who did not, on prize amount won. The results without group identifier fixed effects are shown in column 1 and the results with the group identifier fixed effects are in column 2. Column 3 shows the results from an analogous specification estimated among players invited to the abbreviated telephone survey. Here, the dependent variable is an indicator equal to one for subjects who agreed to participate. Finally, column 4 shows the results from a specification in which survey participation is defined as either having returned the mail-in survey or having answered the abbreviated telephone survey. Across all specifications, we fail to see any indications that lottery prize is associated with survey participation.

Diagnostic Test 2. Testing for Balance in Baseline Characteristics.
Our second pre-specified test checks for covariate imbalance using the following estimating equation: $L_i = \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_{i,-1} \boldsymbol{\gamma} + \varepsilon_i$, where $L_i$ is the prize amount, $\mathbf{X}_i$ are indicator variables for the group identifiers and $\mathbf{Z}_{i,-1}$ the pre-specified baseline characteristics measured in the year prior to the lottery event. The Analysis Plan showed that the covariates are balanced in the Survey Population, as indeed one should expect given that it is virtually attrition free. In eTable 9, we reproduce the results from the original analyses of the Survey Population, alongside results from analogous analyses conducted in the Respondents Sample. Controlling for the group-identifier fixed effects, no individual coefficient estimate is statistically distinguishable from zero at the 5% level and an F-test of the joint significance of the baseline characteristics also fails to reject the null hypothesis that pre-lottery characteristics are not jointly predictive of the lottery outcome. These conclusions hold in both the Survey Population and the Respondents Sample. We reach similar conclusions when Kombi is analyzed separately from the two Triss lotteries. (The Kombi specification omits sex and age, because these variables never vary among players in the same group.)

Diagnostic Test 3. Lottery Estimates in Survey Population vs Respondents Sample.
In our third test, we estimated the association between lottery wealth and a number of register-based outcome variables in the Survey Population and examined whether the coefficients moved appreciably when the estimation sample was restricted to the Respondents Sample. Evidence of systematic differences between the two sets of coefficient estimates could, but need not, be an indication of endogenous selection into the Respondents Sample. In eTable 10, we report the results from these analyses. In columns 1, 3, 5, and 7 we report estimates from the Survey Population (the smaller sample sizes in columns 1 and 3 reflect the fact that financial variables are only available for 2000-2007 and net wealth and debt at year-end in the year of the lottery event are only defined for players who won in these years). In columns 2, 4, 6, and 8, we report the results from exactly analogous analyses conducted with non-respondents omitted from the estimation sample. For all pre-specified outcomes (t = 0 net wealth, t = 0 debt, t = 1 capital income, and t = 1 labor income), the estimates are similar in magnitude.

Robustness Analyses
Our Analysis Plan also specified a set of robustness analyses, the results of which are reported in eTable 12 and eFigure 3. In our first robustness analysis, we omitted large-prize winners. Our original Analysis Plan defined prizes above 4M SEK (year-2011 prices) as large. For expositional ease, we rounded this cutoff to 500K USD (year-2011 prices) in eFigure 3. For completeness, eTable 12 reports results for both cutoffs, and unsurprisingly, the results are highly similar. Dropping large prizes increases standard errors, but the coefficient estimates are broadly similar, suggesting that our main findings are unlikely to be driven entirely by large wealth shocks. In our second robustness analysis, we reran the main analysis for Subjective Health weighting mail-in survey respondents and telephone survey respondents to match the population response rate to the mail-in survey. The reweighting leads to a somewhat larger estimate.

Benchmarking Lottery Estimates
Rescaling Lottery Estimates for Comparability with Gradients. Since much of the literature on the relationship between income and health is correlational, we expected that some readers would find it informative to compare our lottery estimates to income-health gradients. In this section, we therefore describe and motivate the methodology used to generate the income-well-being gradients and rescaled lottery estimates reported in the main text.
The methodology follows four general principles (outlined in the Analysis Plan). First, all else equal, gradients should be estimated in Swedish samples reweighted to match the sex and age distribution in the Respondents Sample. Second, all else equal, it is desirable to use outcomes defined as similarly as possible to the primary outcomes. Third, it is desirable to smooth out transitory fluctuations in year-to-year income whenever possible. And fourth, when multiple income measures are available, the measure most highly correlated with net annual household income is preferable.
The lottery estimates in eTable 11 are not on a scale that easily permits comparisons to income-health gradients. The lottery prizes we study represent a substantial one-time increase in lifetime wealth. Previous work on the Swedish administrative sample has shown that large-prize winners enjoy sustained improvements in economic conditions that are robustly detectable for well over a decade after the windfall. 3-5 For example, winners reduce their labor supply following a win, but the reduction is modest, and persists for up to two decades (and possibly longer; the number of lottery players who can be tracked for at least two decades is not yet large enough to allow well-powered analyses). The lottery wealth is spent down over time, but at a rate modest enough to detect associations between lottery wealth and financial assets and real estate wealth measured a decade after the lottery event. This evidence, which is consistent with the conclusions from interview-based research on lottery winners in multiple countries, 13-16 suggests lottery prizes induce a major shift in the long-run income status of the winner's household.
The general idea behind our comparison is therefore to measure, for each lottery prize, an approximate increase in annual income that it could sustain over a long time period. With lump-sum prizes converted to a measure of a long-run increase in annual income, our lottery estimates can be interpreted as associations with a permanent increase in annual income, and are more directly comparable to income-well-being gradients. We have no unassailable method for translating the lump-sum prize to a corresponding increase, but following the Analysis Plan and previous research, 3 we proceed by translating each lottery prize into an annual income, measured as the annual payout the prize would generate if it were annuitized over a 20-year period at an actuarially fair return. For point of reference, a lump-sum prize of $100,000 translates into an increase in permanent annual after-tax income of $5,996.
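The annuitization can be illustrated with a short calculation. The interest rate and payment timing are not stated in this section; the sketch assumes a 2% annual real return with payouts at the start of each year, a combination chosen only because it reproduces the $5,996 reference figure. The function name is hypothetical.

```python
def annual_payout(lump_sum: float, years: int = 20, rate: float = 0.02) -> float:
    """Annual payout of a lump sum annuitized over `years` at `rate`.

    Assumes payments at the start of each year (an annuity due); both the
    rate and the timing are assumptions, not values given in the text.
    """
    ordinary_factor = (1 - (1 + rate) ** -years) / rate
    annuity_due_factor = ordinary_factor * (1 + rate)
    return lump_sum / annuity_due_factor

print(round(annual_payout(100_000)))  # 5996 under the stated assumptions
```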

Income Gradients in Respondents Sample and European Social Survey.
In the main text, we compare rescaled lottery estimates with gradients estimated in the full Respondents Sample. To obtain a long-run measure of economic status purged of transitory year-to-year fluctuations in income, we defined the permanent annual income for each player as the average disposable household income over the period 2004-2014. Our measure of annual household income is net of taxes and we left-censor annual observations at $6,000. We then estimate a well-being gradient for each primary outcome by regressing it on permanent income, controlling for sex, a fourth-order polynomial in age and sex-by-age interactions. A potential limitation of this approach is that some of the variation in our measure of permanent income is endogenous to the lottery. We therefore also ran these analyses in a sample restricted to small-prize winners, defined as players who won prizes smaller than $20K. In this sample, the average prize won ($8,483) is small enough that any endogeneity is likely to be negligible. eTable 14 shows that gradients estimated using the full and restricted Respondents Sample are consistently very similar in magnitude.
We also compare the gradients in the Respondents Sample with gradients estimated among Swedish respondents in wave seven (2014) of the European Social Survey (ESS). To maximize comparability, we estimated the ESS gradients using the same sex and age controls, in a sample reweighted to match the sex and age distribution of the Respondents Sample. ESS respondents are asked to indicate their household income, net of taxes, by choosing one of several categories. Each category corresponds to an interval. We assign each respondent an income equal to the midpoint of the chosen interval. We set the annual after-tax income to 0.66M SEK (year-2014 prices) for households in the top decile. For comparability, our final income variables are converted to units of year-2011 10K USD, and we apply the same left-censoring threshold ($6,000) as in the Respondents Sample.
For each of our primary outcomes, we sought to construct a similar variable in the ESS. Our ESS measures of subjective health, smoking and risk for alcohol dependency are derived from questions that are very similar to those used to construct the primary outcomes Subjective Health, Smoking and Alcohol. Our ESS analogue of the Health Index is a linear combination of 22 indicator variables derived from responses to the questions (1) "Which of the health problems on this card have you had or experienced in the past 12 months …" and (2) "Which of the health problems on this card have you experienced in the last 12 months hampered you in your daily activities in any way?". The item weights are derived from a regression analogous to the one used to construct Health Index. Our measure of physical activity is derived from responses to the question "On how many of the last 7 days did you walk quickly, do sports or other physical activity for 30 minutes or longer?" Finally, we use two questions from the ESS to construct a variable comparable to the primary outcome Healthy Diet. The first asks respondents how often they eat fruit and the second about vegetables (both measured on a 1-7 scale). The two scores are standardized and summed to form a single outcome variable. eTable 14 compares gradients from our Respondents Sample to those in the ESS. In all specifications, the dependent variable has been standardized to have unit variance. For most outcomes, the gradients are similar in magnitude. For example, the estimated ESS and Respondents Sample gradients for Subjective Health are 0.080 (SE = 0.07) and 0.086 (SE = 0.012), respectively. Across the six outcomes, there is no systematic tendency for the gradients to be steeper in either sample: for three out of the six primary outcomes, the absolute value of the estimated gradient is greater in the ESS.
The gradients in the ESS are weaker for Health Index and Physical Activity, but these are the outcomes whose ESS measures differ the most from those in our survey. The positive relationship between Alcohol and household income is stronger in the ESS than in the Respondents Sample, despite the two outcomes being measured similarly.
Comparison to Income Gradients. Having established that gradients in our restricted Respondents Sample replicate standard patterns in the literature, we compare our rescaled lottery estimates to gradients from the Respondents Sample. The results from this comparison are shown in eTable 15 and Figure 4.
Comparison to Published Estimates from Lottery Studies. The Analysis Plan briefly mentioned our intention to benchmark our findings against estimates in previous quasi-experimental studies, especially studies of lottery players. Here, we describe the inclusion criteria used to identify the final list of quasi-experimental estimates used in our final comparisons. We also explain how we transformed the estimates in the original studies to make them comparable to ours.
We conducted a systematic search for quasi-experimental studies of lottery players' overall health or health behaviors. We identified four studies that analyzed at least one outcome comparable to one of our six primary outcomes. 8,17-19 However, two of these analyzed a very similar set of outcomes in the same dataset (the British Household Panel Survey). 17,19 To simplify the exposition, we only retained one of them.
The first study used data from three waves of SLLS to study the health of Swedish lottery winners. 8 The SLLS survey data can be used to calculate the sum of monetary prizes won between 1969 and 1981. Winners are not asked about the exact year of win or the name of the lottery, however. The study therefore compared the health outcomes of N = 626 winners who won prizes of different magnitudes; the key identifying assumption for causal inference is that in this sample of winners, the reported lottery wealth is as good as randomly assigned. One of the health outcomes, an index of Bad Health, closely resembles our Health Index (except that it is coded so greater values denote worse health). Table 3 of the study reports that the estimated effect of 130,000 SEK (year-1998 prices) on Bad Health, measured up to 12 years after the win, ranges from -0.04 to -0.07 SD units. The estimates vary depending on the set of controls used (but the standard error, rounded to three decimal places, is 0.029 in all three specifications). We therefore use the midpoint of these estimates, -0.055, in our comparison. To make this estimate comparable to ours, we first calculated that 130,000 year-1998 SEK is equal to 22,264 year-2011 USD. The estimates must therefore be converted by a factor of -100,000/22,264 = -4.49. The negative sign is needed to align the directional coding of the two indices. Applying this conversion factor gives a rescaled estimate of 0.25 SD units (SE = 0.13) to be compared to our lottery estimate for Health Index, which is equal to -0.003 (SE = 0.015).
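The arithmetic of this rescaling is reproduced in the short sketch below, using only the figures quoted above. The function name and its arguments are hypothetical; the standard error is scaled by the absolute value of the conversion factor.

```python
def rescale_estimate(estimate, std_error, shock_usd,
                     target_usd=100_000, flip_sign=False):
    """Rescale an estimate for a shock of `shock_usd` to one of `target_usd`.

    flip_sign aligns the direction of coding (e.g. Bad Health vs Health Index).
    """
    factor = target_usd / shock_usd
    if flip_sign:
        factor = -factor
    return estimate * factor, abs(factor) * std_error

est, se = rescale_estimate(-0.055, 0.029, 22_264, flip_sign=True)
print(round(est, 2), round(se, 2))  # 0.25 0.13
```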
Raschke 18 uses the German Socioeconomic Panel to study the effects of lottery wealth on a number of outcomes comparable to ours. Using the longitudinal data, he analyzes within-person changes in the outcome shortly after a large win (2,500 euros or more). The study's key identifying assumption is therefore that, in the sample of large-prize winners, the timing of the win is unrelated to other time-varying and unobserved factors that affect the outcome. In his primary specification, used in our comparisons below, Raschke estimates the association with an indicator variable for having won a large prize in the last year.
The study considers three measures of overall health. The first is an index of physical health which we consider comparable to our Health Index. The next two variables are binary and comparable to the question about self-assessed overall health we used to generate Subjective Health. Raschke 18 defines Bad Health as an indicator variable equal to one for individuals who indicate that their overall health is "Poor" or "Very Poor". His second variable, Good Health, is an indicator variable for individuals who rate their health as "Good" or "Very Good". The study also analyzes binary variables for smoking and frequent drinking (but not quantities consumed).
The study reports a small and statistically insignificant negative association with the index of physical health. In SD units, the estimate is -0.01 (SE = 0.086). To facilitate comparisons to the four binary outcomes, we constructed similar outcomes in our sample and reran our main analyses. Specifically, we use our question about self-assessed health and the same cutoffs as Raschke to generate binary indicators for Good Health and Bad Health. We classify a person as a smoker if he or she reports a daily consumption of cigarettes greater than zero. To get a binary measure of drinking, we define an indicator variable for respondents whose score on our screening test exceeded 7, the cutoff for dependence recommended by a recent validation study conducted in Sweden. 20 Raschke finds that immediately following a win, winners are 7.7 percentage points (SE = 2.4) more likely to evaluate their own health status as bad or very bad (Bad Health), 4.6 percentage points (SE = 2.9) less likely to evaluate their own health as good or very good (Good Health), 2.4 percentage points (SE = 2.4) less likely to be smokers and 3.3 percentage points (SE = 3.3) less likely to drink alcohol frequently. Our lottery estimates for all four binary outcomes are close to zero, with standard errors at least eight times smaller. For example, estimates based on our sample suggest that a net wealth shock of $100,000 reduces the long-run probability of Bad Health by less than one tenth of a percentage point, compared to a rescaled estimate of 17.1 percentage points (SE = 5.4) implied by Raschke's results.
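The recoding of our survey responses into the four binary outcomes described above can be sketched as follows. The record layout, the field names and the assumption that our response labels match Raschke's ("Good", "Very Good", "Poor", "Very Poor") are hypothetical; only the cutoffs (cigarettes greater than zero, screening score above 7) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical record layout; field names are illustrative only.
    health_rating: str          # self-assessed overall health label
    cigarettes_per_day: float   # reported daily consumption
    alcohol_screen_score: int   # score on the alcohol screening test


def binary_outcomes(r: Respondent) -> dict:
    """Recode one respondent into the four binary comparison outcomes."""
    return {
        "good_health": r.health_rating in {"Good", "Very Good"},
        "bad_health": r.health_rating in {"Poor", "Very Poor"},
        "smoker": r.cigarettes_per_day > 0,
        # Score must exceed 7, the cutoff cited in the text.
        "frequent_drinking_proxy": r.alcohol_screen_score > 7,
    }


example = Respondent("Good", 0.0, 8)
print(binary_outcomes(example))
```

Note that a respondent with a middle rating would be coded zero on both health indicators, which is why Good Health and Bad Health are analyzed as separate outcomes.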
Three out of the four comparisons of binary outcomes are based on outcomes that were defined using very similar procedures in the two studies. For Alcohol, an interpretational caveat is that our binary variable is derived from a score on a test designed to screen individuals at risk for alcohol dependence. Alcohol dependence is conceptually distinct from frequent consumption of alcohol. Since the two variables are empirically related, we find the comparison informative nonetheless, but we urge caution in making this specific comparison.
A study based on the British Household Panel Survey concludes that lottery wealth is associated with more smoking and social drinking in the years following a win. 17,19 For smoking, this conclusion is based in part on a comparison of the outcomes of "big-prize winners" (defined as winners who report prizes greater than £500 in year-2005 prices) to small-prize winners in (i) the two years following a win, (ii) the year following a win, or (iii) the year of win. For expositional ease, our comparisons below are restricted to estimates based on the two-year comparisons of big-prize and small-prize winners. However, none of our conclusions change substantively if we instead use estimates that are based on tighter follow-up windows, or one of the other identification strategies used in the study.
The study reports results for binary and non-binary measures of smoking and drinking (specifically, an index of social drinking). Our primary outcomes Alcohol and Cigarettes are not binary, so we use the variables measuring the quantities consumed in our comparisons. Relative to the small-prize winners, the study finds that big-prize winners smoke 0.936 more cigarettes per day (SE = 0.341) and score 0.035 points higher on an index of drinking (SE = 0.041). Using frequencies depicted in the paper's Figure 1, we infer that the standard deviation of the index is approximately 1.5. In SD units the difference in the drinking index is therefore 0.023 (SE = 0.028). The study also reports that big-prize winners are 0.99 percentage points (SE = 1.19) less likely to report that their self-assessed health is in the highest category of a five-point response scale. This is the only measure of overall health analyzed, so we use it in our comparisons.
The study reports the average size of small prizes (£61.64, see p. 525), the fraction of prizes that are big wins (6%, see p. 524) and the average prize size overall (£245, see p. 524). From this information, we infer that the average big prize is approximately £3,120 in year-2005 prices, or $5,800 in year-2011 prices. Hence, we multiply the estimates by 100,000/5,800 = 17.24.
All three previous studies conclude that lottery wealth impacts at least one of our primary outcomes. One way to reconcile our overall pattern of null results with the findings in these previous studies is to argue that short-run effects of wealth are substantially larger than long-run effects. This explanation may have some merit, but it fails to explain some patterns in the data. First, Lindahl found large positive associations with a long-run measure of overall health. Second, Apouey and Clark's results do not suggest a systematic tendency for lottery wealth to have a large effect in the year of win that subsequently decays. A second possibility is that the previous studies relied on identification strategies that required stronger identifying assumptions for causal inference. Indeed, none of the previous studies compared the outcomes of players from the same lottery, because the data sets used do not contain information about the lottery associated with each prize won.
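The back-calculation of the average big prize in the British study, from the three reported quantities quoted above, follows from the identity that the overall mean is a share-weighted average of the small-prize and big-prize means. The sketch below uses only the figures in the text; the variable names are illustrative, and the conversion of pounds to year-2011 dollars is taken as given rather than derived.

```python
# Reported in the British study (year-2005 GBP), as quoted in the text.
mean_small = 61.64   # average small prize
share_big = 0.06     # fraction of prizes that are big wins
mean_all = 245.0     # average prize overall

# mean_all = share_big * mean_big + (1 - share_big) * mean_small
mean_big = (mean_all - (1 - share_big) * mean_small) / share_big
print(f"Implied average big prize: about GBP {mean_big:,.0f}")

# The text converts this to roughly 5,800 year-2011 USD; that conversion
# is not reproduced here and is simply taken from the text.
big_prize_usd_2011 = 5_800
factor = 100_000 / big_prize_usd_2011
print(f"Rescaling factor: {factor:.2f}")  # prints "Rescaling factor: 17.24"
```

The implied mean is a little under £3,120, consistent with the rounded figure used in the comparison.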
Under assumptions about realistic effect sizes that are informed by our new evidence, it is plausible that the earlier studies were underpowered. To illustrate some of the consequences of low power, suppose the true treatment effect on Health Index is smaller than 0.033 SD units (the upper limit of the 95% confidence interval of our estimate). The statistical power to detect an effect of such magnitude, at α = 0.05, did not exceed 5.7% in any of the above three studies. Conditional on finding a statistically significant effect, design calculations 21 reveal that studies with statistical power this low will report an estimated effect with the wrong sign ("type S error") between 23% and 42% of the time. Moreover, the expected factor by which a statistically significant estimate overstates the true effect size ("type M error") ranges from 9 to 34.
This table compares the income-health gradient in our sample of small-prize winners (<20K) and the full Respondents Sample to wave 7 (2014) of the Swedish data from the European Social Survey (ESS). All gradients are estimated controlling for sex, a fourth-order age polynomial and sex-by-age interactions. To maximize comparability, the ESS regressions are weighted to ensure a sex and age distribution matching the Respondents Sample. In the Respondents Sample, we define income as the respondent's average annual household disposable income between 2004 and 2014, left-censored at $6K in year-2011 prices, whereas we use the self-reported household income from ESS. We constructed outcomes in ESS to be maximally similar to the primary outcomes used in our survey. The Subjective Health measures are near-identical to those used in our survey. The Health Index is a linear combination of 22 dummy variables representing responses to the questions "Which of the health problems on this card have you had or experienced in the past 12 months …" and "Which of the health problems on this card have you experience in the last 12 months hampered you in your daily activities in any way?". Weights are derived by regressing the subjective health rating on these dummy variables. The smoking and alcohol questions are similar to our survey and the outcomes are defined in the same way as our primary outcomes. The measure of physical activity in ESS is determined by responses to the question "On how many of the last 7 days did you walk quickly, do sports or other physical activity for 30 minutes or longer?" Dietary quality is quantified using two questions in the ESS that measure respondents' frequency of eating fruit and frequency of eating vegetables on a 1-7 scale. These two scores are standardized and summed to form a single outcome variable.
This table compares lottery estimates to income-health gradients estimated in the Respondents Sample. Lottery estimates have been rescaled assuming lottery prizes are annuitized over 20 years at a 2% real interest rate. Baseline controls and cell fixed effects are included when estimating effect sizes, whereas gradients are estimated controlling for sex, a fourth-order age polynomial and sex-by-age interactions. Gradients are estimated using the respondent's average annual household disposable income between 2004 and 2014, left-censored at $6K in year-2011 prices. "p equal" is the p-value obtained from a Wald test that the rescaled causal estimate is equal to the gradient estimated in the full sample.
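Returning to the design calculations used in the discussion of statistical power (power, type S error and type M error), the sketch below shows one way such quantities can be computed for a normally distributed estimator. It uses closed-form truncated-normal expressions rather than simulation, which is a choice made here and not necessarily the procedure of the cited reference. Pairing the assumed true effect of 0.033 SD units with a standard error of 0.13 (the rescaled standard error of the first study) is likewise an assumption made for illustration; the function name `design` is hypothetical.

```python
from math import erf, exp, pi, sqrt

Z_CRIT = 1.959964  # two-sided critical value for alpha = 0.05


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def design(true_effect, se, z_crit=Z_CRIT):
    """Power, type S rate and type M ratio for estimate ~ N(true_effect, se^2).

    Assumes true_effect > 0 so that "wrong sign" means a negative estimate.
    """
    mu = true_effect / se                # true effect in standard-error units
    p_right = norm_cdf(mu - z_crit)      # significant with the correct sign
    p_wrong = norm_cdf(-mu - z_crit)     # significant with the wrong sign
    power = p_right + p_wrong
    type_s = p_wrong / power
    # E[|z|; significant] from the two truncated-normal tails.
    mean_abs = mu * (p_right - p_wrong) + norm_pdf(z_crit - mu) + norm_pdf(z_crit + mu)
    type_m = mean_abs / power / mu       # expected exaggeration ratio
    return power, type_s, type_m


power, type_s, type_m = design(true_effect=0.033, se=0.13)
print(f"power = {power:.3f}, type S = {type_s:.2f}, type M = {type_m:.1f}")
```

With these inputs the sketch gives power of roughly 5.7%, a type S rate of roughly 23% and an exaggeration ratio of roughly 9, which corresponds to the most favorable end of the ranges quoted in the power discussion; studies with larger standard errors fare worse on all three measures.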