Fiellin DA, Reid MC, O'Connor PG. Screening for Alcohol Problems in Primary Care: A Systematic Review. Arch Intern Med. 2000;160(13):1977–1989. doi:10.1001/archinte.160.13.1977
Copyright 2000 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Primary care physicians can play a unique role in recognizing and treating patients with alcohol problems.
To evaluate the accuracy of screening methods for alcohol problems in primary care.
We performed a search of MEDLINE for years 1966 through 1998. We included studies that were in English, were performed in primary care, and reported the performance characteristics of screening methods for alcohol problems against a criterion standard. Two reviewers appraised all articles for methodological content and results.
Thirty-eight studies were identified. Eleven screened for at-risk, hazardous, or harmful drinking; 27 screened for alcohol abuse and dependence. A variety of screening methods were evaluated. The Alcohol Use Disorders Identification Test (AUDIT) was most effective in identifying subjects with at-risk, hazardous, or harmful drinking (sensitivity, 51%-97%; specificity, 78%-96%), while the CAGE questions proved superior for detecting alcohol abuse and dependence (sensitivity, 43%-94%; specificity, 70%-97%). These 2 formal screening instruments consistently performed better than other methods, including quantity-frequency questions. The studies inconsistently adhered to methodological standards for diagnostic test research: 3 (8%) provided a full description of patient spectrum (demographics and comorbidity), 30 (79%) avoided workup bias, 12 (of 34 studies [35%]) avoided review bias, and 21 (55%) performed an analysis in pertinent clinical subgroups.
Despite methodological limitations, the literature supports the use of formal screening instruments over other clinical measures to increase the recognition of alcohol problems in primary care. Future research in this field will benefit from increased adherence to methodological standards for diagnostic tests.
Excessive alcohol consumption is associated with considerable morbidity and mortality and substantial direct and indirect economic costs.1 It is estimated that alcohol use is responsible for 100,000 deaths annually and a $100 billion cost.1 Primary care physicians provide routine care for a large number of patients with alcohol problems; prevalence rates range from 2% to 29%, depending on the type of disorder, in ambulatory patients.2-4
Primary care physicians are encouraged by the National Institute on Alcohol Abuse and Alcoholism to screen patients not only for alcohol abuse and dependence, but also for alcohol consumption that would place them at risk for current or future adverse health events.5,6 The rationale for this recommendation is that primary care physicians can play an instrumental role in recognizing alcohol problems, initiating therapy, providing advice for further treatment options, monitoring response to therapy, and promoting relapse prevention.7,8
Studies of screening instruments in primary care have focused on a wide spectrum of alcohol consumption, including at-risk, heavy, or harmful drinking, and alcohol abuse and dependence (Table 1). At-risk or hazardous drinking is usually defined by establishing a threshold amount of alcohol consumption (eg, daily, weekly, or per occasion) and is also referred to as problem, heavy, or excessive drinking.5 This pattern of drinking is thought to put patients at risk for alcohol-related consequences either because of the amount they drink or because of the effect of alcohol on comorbid medical conditions. Harmful drinkers exhibit physical or psychological harm from alcohol consumption but may not meet criteria for alcohol dependence.5,9 Patients with alcohol abuse and dependence experience marked and repeated negative physical and social effects from alcohol.10 These diagnostic classification schemes can be used by clinicians to stratify patients with respect to severity, prognosis, and appropriate treatment regimens. In addition, they should be considered when a strategy is chosen for identifying patients with alcohol problems. For instance, a method that performs well in detecting patients with alcohol abuse or dependence may perform poorly in identifying patients drinking at harmful or hazardous levels.
The goal of the current review is to answer the clinical question: "Are there effective screening strategies to identify patients with alcohol problems in primary care settings?" To answer this question, we reviewed the literature on the detection of alcohol problems in primary care settings and assessed the strength of the evidence in support of these efforts, on the basis of an appraisal of the methods used in these studies.
We searched the MEDLINE database by using specific medical subject heading and text words to identify candidate articles for review (Table 2). Potential articles were examined to determine if they met the following eligibility criteria: (1) were published in peer-reviewed journals between 1966 and 1998, (2) were written in English, (3) were performed in a primary care setting, (4) examined the performance characteristics of screening methods for alcohol problems, (5) compared a screening method to a criterion standard, and (6) reported performance characteristics (eg, sensitivity and specificity) for the method.
Evaluating the accuracy of a screening instrument requires that a reference or criterion standard be used to determine whether a diagnosis is present or absent. The choice of criterion standard depends on the disorder that is the target of the screening. While accurate diagnosis of an alcohol problem can be difficult because of the complexity of the disorders and the variety of diagnostic schemas available (Table 1), standardized diagnostic instruments exist. For the purpose of this review, we considered that a study compared a screening method with a criterion standard if an identified diagnostic instrument, eg, Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Revised Third Edition,11 or operational definition (eg, quantity and frequency of alcohol consumption) was used to establish the presence or absence of an alcohol problem.
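The performance characteristics referred to throughout this review follow from a 2 x 2 comparison of the screening result with the criterion standard. The sketch below is illustrative only: the function name and the counts are hypothetical and are not taken from any reviewed study.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2 x 2 counts.

    tp: screen positive, criterion standard positive
    fp: screen positive, criterion standard negative
    fn: screen negative, criterion standard positive
    tn: screen negative, criterion standard negative
    """
    sensitivity = tp / (tp + fn)  # share of true cases the screen detects
    specificity = tn / (tn + fp)  # share of non-cases the screen clears
    return sensitivity, specificity


# Hypothetical counts for 250 screened patients (assumed for illustration).
sens, spec = sensitivity_specificity(tp=40, fp=18, fn=10, tn=182)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

With these assumed counts, 40 of 50 patients with the disorder screen positive and 182 of 200 patients without it screen negative.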
The goal in performing this systematic review was to evaluate the evidence, and the quality of that evidence, for screening instruments for alcohol problems in primary care. To evaluate the quality of the evidence, we determined whether the methods used in the studies conformed to standards designed to increase validity in this type of research. We appraised each report according to standards that are used to assess the quality of evidence in screening and diagnostic test research.12-15 All eligible articles were appraised by means of a standardized form to record pertinent study characteristics and results according to prespecified coding criteria. The methodological standards and the realms evaluated are described below.
An adequate description of the spectrum of patients included in a study can help clinicians know whether to generalize the results to their patients. To allow clinicians to decide whether the study populations were representative of an unbiased selection of patients, and similar to their own, we considered the spectrum of the patients included in the studies. A study met this standard if the following information was provided about the study population: (1) demographics (age and sex distribution), (2) comorbidity (medical and psychiatric), and (3) eligibility criteria and the number of eligible and screened subjects (ie, participation rate).
Workup bias12 occurs when subjects with a positive (or negative) result on a screening test preferentially receive the criterion standard evaluation and can distort a test's performance. For instance, if patients with positive, as opposed to negative, results on screening tests preferentially receive the criterion standard evaluation, the sensitivity of the test can be falsely elevated because of the incorrect exclusion of subjects (false negatives) from the analysis. Therefore, a study met this standard if all subjects received both the screening and criterion standard test.
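The effect of workup bias on sensitivity can be illustrated with a small worked example. The cohort below is hypothetical, as is the assumption that every screen-positive subject but only 1 in 10 screen-negative subjects receives the criterion standard evaluation.

```python
def sens_spec(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2 x 2 counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort in which everyone receives the criterion standard.
tp, fn = 60, 40    # 100 subjects with the disorder
tn, fp = 810, 90   # 900 subjects without the disorder
full_sens, full_spec = sens_spec(tp, fp, fn, tn)

# Assumed partial verification: all screen positives are evaluated,
# but only 1 in 10 screen negatives are.
verified_one_in = 10
fn_verified = fn // verified_one_in   # 4 false negatives remain in the analysis
tn_verified = tn // verified_one_in   # 81 true negatives remain in the analysis
biased_sens, biased_spec = sens_spec(tp, fp, fn_verified, tn_verified)

print(f"complete verification: sensitivity {full_sens:.0%}, specificity {full_spec:.0%}")
print(f"partial verification:  sensitivity {biased_sens:.0%}, specificity {biased_spec:.0%}")
```

Under these assumptions the apparent sensitivity rises from 60% to about 94% because most false negatives never enter the analysis, while the apparent specificity falls.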
Review bias12,15 occurs when knowledge of a subject's results on a screening examination affects the interpretation of the criterion standard test. This can occur when the screening test and criterion standard procedure are not performed in a blinded fashion. For instance, a patient's response on a screening evaluation (eg, 4 positive responses to the CAGE questions, a questionnaire for detecting alcohol problems ["Have you ever felt you should cut down on your drinking?" "Have people annoyed you by criticizing your drinking?" "Have you ever felt bad or guilty about drinking?" "Have you ever taken a drink first thing in the morning (eye-opener) to steady your nerves or get rid of a hangover?"]) could potentially influence scoring on a subsequent diagnostic interview to evaluate for alcohol use disorders. Failure to meet this standard could result in an overestimation of the test's performance. To assess for avoidance of review bias, we considered the sequence of the screening and criterion standard evaluation and whether blinding was described. To meet this standard, we required that investigators report that blinding was performed.
Whereas studies may evaluate the accuracy of a screening test in a population with a broad range of drinking disorders and patient characteristics, clinicians may be interested in a test's accuracy in a particular clinical subgroup. For instance, test accuracy may vary according to demographic or clinical (eg, severity of alcohol problem) factors. If study results are presented as an aggregate, the clinician can only extrapolate results from one group to another without any assurance that the test performs equally well in each group. To check for the clinical utility of the results, we determined whether an analysis was performed on pertinent clinical subgroups. We considered that this standard was met if there was a separate analysis by demographic characteristic or diagnostic category, eg, current vs lifetime disorder.
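A brief sketch of why aggregate results can mislead: the two subgroups and their counts below are hypothetical and chosen only to show that a single pooled sensitivity can conceal very different performance in each group.

```python
def sensitivity(tp, fn):
    """Share of subjects with the disorder who screen positive."""
    return tp / (tp + fn)


# Hypothetical counts among subjects with the disorder, by subgroup.
subgroups = {
    "subgroup A": {"tp": 45, "fn": 5},
    "subgroup B": {"tp": 15, "fn": 35},
}

by_group = {name: sensitivity(**counts) for name, counts in subgroups.items()}
aggregate = sensitivity(
    tp=sum(counts["tp"] for counts in subgroups.values()),
    fn=sum(counts["fn"] for counts in subgroups.values()),
)

for name, value in by_group.items():
    print(f"{name}: sensitivity {value:.0%}")
print(f"aggregate: sensitivity {aggregate:.0%}")
```

Here the pooled sensitivity of 60% describes neither subgroup (90% and 30%), which is the situation a separate subgroup analysis is meant to expose.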
Our MEDLINE database search yielded 373 citations. We excluded nonresearch publications such as reviews, letters, and editorials (n=56); studies that were not performed in primary care settings (n=73); studies that did not examine the performance characteristics of screening methods for alcohol problems (n=151); and studies that did not compare a screening method with a criterion standard (n=55), leaving 38 articles in the final sample. Some studies evaluated more than 1 screening instrument. The number of studies that examined the performance of each screening test is as follows: the Alcohol Use Disorders Identification Test (AUDIT) or a variation (n=9), the CAGE questions or a variation (n=15), the Michigan Alcoholism Screening Test (MAST) or a variation (n=8), the 2-question screen proposed by Cyr and Wartman16 (n=3), mental or general health screens (n=4), quantity-frequency questions (n=6), and clinical indicators such as clinician recognition or laboratory tests (n=7).
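The selection of the final sample can be checked with simple arithmetic. The counts are those reported above; the dictionary labels are shorthand introduced here for illustration.

```python
citations = 373

# Exclusion counts as reported in the text; keys are shorthand labels.
exclusions = {
    "nonresearch publications": 56,
    "not performed in primary care": 73,
    "performance characteristics not examined": 151,
    "no comparison with a criterion standard": 55,
}

remaining = citations - sum(exclusions.values())
print(remaining)  # 38
```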
The main focus of screening was at-risk, heavy, and harmful drinking in 11 of the studies and alcohol abuse or dependence in 27. We will discuss studies that evaluated screening tests for at-risk, heavy, and harmful drinking, followed by studies that examined tests for detecting alcohol abuse and dependence. Articles that report results of screening for at-risk, heavy, or harmful drinking, as well as alcohol abuse and dependence, are discussed in the first section.
The 11 studies on screening for at-risk, heavy, and harmful drinking were performed in a variety of primary care settings (Table 3). Five of the studies were performed outside of the United States.3,17-20 The mean age of the subjects, when reported, ranged from 35 to 47 years,2,3,17,18,20 while 1 study4 included only subjects aged 60 years or older. Between 30% and 100% of the subjects were male. The prevalence of alcohol problems in the populations ranged from 1% to 44% and varied by sex and disorder. Finally, in all of the studies, either the screens were self-administered or screening was conducted by the research staff.
To allow for meaningful comparisons within screening instruments and across studies, this section describes the accuracy of the screening methods organized by instrument.
Six studies evaluated the AUDIT for detecting at-risk, harmful, or heavy drinking2,3,17,21-23 (Table 3). The AUDIT had a sensitivity of 97% and a specificity of 78% for hazardous use and a sensitivity of 95% and a specificity of 85% for harmful use when a cutoff of 8 or more was used.17 Using the same cutoff, but different criterion standards, others have reported sensitivities between 51% and 59% and specificities of 91% to 96% for detecting at-risk drinking or heavy drinking.2,21-23 Piccinelli and colleagues3 reported a sensitivity of 84% and a specificity of 90% for combined hazardous, harmful, or dependent drinking when a cutoff of 5 or more was used. A brief version of the AUDIT that includes only the first 3 (consumption) questions was evaluated and found to have a sensitivity of 54% to 98% and a specificity of 57% to 93% for various definitions of heavy drinking.21-23
Four studies evaluated the CAGE questionnaire as a screening tool for at-risk, harmful, or hazardous drinkers in primary care.4,18,19,23 King18 evaluated the ability of the CAGE questions to detect at-risk drinkers, defined as those who consumed 64 g or more of alcohol per day, and found that this 4-item screen had a sensitivity of 84% and a specificity of 95% when a cutoff of 2 or more positive responses was used. Using the same criteria for a positive score, Adams et al,4 however, found that the CAGE questionnaire had a sensitivity of 14% and a specificity of 97% for detecting at-risk drinking (according to National Institute on Alcohol Abuse and Alcoholism criteria) among patients older than 60 years. The CAGE questionnaire had a sensitivity between 49% and 69% and a specificity between 75% and 95% in screening for patients with heavy drinking.19,23 An augmented CAGE questionnaire, which includes the 4 CAGE questions, the first 2 quantity and frequency questions of the AUDIT, and a question pertaining to history of drinking problems, had a sensitivity of 65% and a specificity of 74% in 1 study.23
Three other studies examined the operating characteristics of a screen for this spectrum of drinking in primary care settings.19,20,24 Taj et al24 evaluated the properties of a single question, "On any single occasion during the past 3 months, have you had more than 5 drinks containing alcohol?" among primary care patients. This single-item screen had a sensitivity of 62% and a specificity of 93% for detecting problem drinkers.24 Two studies investigated the operating characteristics of selected laboratory values for identifying patients with this spectrum of alcohol problem. Carbohydrate-deficient transferrin, a new serological marker for recent alcohol ingestion, had a sensitivity of 39% to 69% and a specificity of 29% to 81% for heavy drinking.19,20 In addition, mean corpuscular volume, aspartate aminotransferase, alanine aminotransferase, and γ-glutamyltransferase had limited utility as screening tests for this disorder,19,20 although 1 group found a sensitivity of 77% and a specificity of 81% for γ-glutamyltransferase.19
The 27 studies on screening for alcohol abuse and dependence are described in Table 4. These studies were conducted in a variety of primary care settings, with 4 of the studies performed outside of the United States.26-29 The mean age of subjects in studies reporting demographic information ranged from 36 to 72 years. Males represented between 19% and 100% of the subjects. The prevalence of alcohol problems in the population ranged from 2% to 41%, depending on the diagnosis and whether lifetime or current criteria were applied. Finally, in most studies (66%), screening was performed by research staff, whereas in the remaining investigations the screen was either self-administered (15%) or clinician-administered (19%).
The AUDIT is designed to detect less severe alcohol problems, such as hazardous and harmful drinking, as well as alcohol abuse and dependence disorders. Five studies have examined the performance of the AUDIT as a screening tool for alcohol abuse or dependence. The operating characteristics of the screen varied with the cutoff used to define a positive result and with whether the target was a lifetime diagnosis (ie, patients met criteria for these disorders at any point in their life) or a current diagnosis. For instance, in 1 study,30 the AUDIT had a sensitivity of 61% and a specificity of 90% for a current alcohol use disorder with the use of a cutoff of 8. Changing the cutoff score to greater than 11 resulted in an expected decrease in sensitivity (40%) and an increase in specificity (96%). The performance characteristics changed dramatically when the investigators considered lifetime alcohol use disorders. In this situation, the AUDIT had a sensitivity of 46% and 30% with a specificity of 90% and 97% with the use of cutoff scores of 8 and 11, respectively.30 Other investigators found that the AUDIT had a sensitivity of 63% and 93% and a specificity of 96% and 96%, for a lifetime or current diagnosis, respectively, of alcohol abuse or dependence.31 The AUDIT did not perform as well as a screening test in a study by Schmidt et al.32 In this study, the AUDIT had a sensitivity of 38% with a specificity of 95% for a lifetime diagnosis of alcohol abuse or dependence. These results are similar to those obtained by Morton et al33 with a cutoff of 8 in a population older than 65 years. In this study, the AUDIT had a sensitivity of 33% and a specificity of 91%.33 The AUDIT was noted to have different performance characteristics within different ethnic and sex populations.34 In 1 study, the AUDIT, with a cutoff of 8 for a positive test, had a sensitivity between 70% and 92% with a specificity of 73% to 94%, with variation based on sex and ethnic background.34
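The trade-off between sensitivity and specificity as the cutoff is raised can be sketched as follows. The score lists are hypothetical and do not reproduce data from any reviewed study; the function name is likewise illustrative.

```python
def sens_spec_at_cutoff(scores_with_disorder, scores_without_disorder, cutoff):
    """Treat a score at or above `cutoff` as a positive screen."""
    tp = sum(score >= cutoff for score in scores_with_disorder)
    fn = len(scores_with_disorder) - tp
    fp = sum(score >= cutoff for score in scores_without_disorder)
    tn = len(scores_without_disorder) - fp
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical questionnaire scores (assumed for illustration only).
with_disorder = [3, 6, 8, 9, 10, 12, 14, 15, 18, 22]
without_disorder = [0, 1, 1, 2, 2, 3, 4, 5, 7, 9,
                    0, 1, 3, 4, 6, 8, 2, 12, 1, 0]

results = {
    cutoff: sens_spec_at_cutoff(with_disorder, without_disorder, cutoff)
    for cutoff in (8, 12)
}
for cutoff, (sens, spec) in results.items():
    print(f"cutoff >= {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this made-up sample, moving the cutoff from 8 to 12 (ie, greater than 11) lowers sensitivity from 80% to 50% and raises specificity from 85% to 95%, the same direction of change reported for the AUDIT.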
Ten studies evaluated the performance of the CAGE questionnaire in screening patients for alcohol abuse and/or dependence in primary care settings.28,33-41 Sensitivities of 21% to 94% with specificities of 77% to 97% were found when a cutoff score of 2 or more was used.34,35,38-41 Lowering the cutoff to 1 or more positive responses to CAGE questions resulted in a sensitivity of 60% to 71% and a specificity of 84% to 88%.39,40 In older primary care populations, sensitivities ranged from 63% to 70% and specificities from 82% to 91% with CAGE questionnaire scores of 2 or more.33,36 The CAGE questions had a sensitivity of 53% and a specificity of 93% when a combined target of alcohol abuse, dependence and harmful drinking was the goal of screening.28
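Scoring of the CAGE questions at the two cutoffs discussed here can be sketched in a few lines. The item keys and function names are hypothetical labels for the 4 questions; the example patient is invented.

```python
# Hypothetical short labels for the 4 CAGE items.
CAGE_ITEMS = ("cut_down", "annoyed", "guilty", "eye_opener")


def cage_score(responses):
    """Count the number of positive (yes) responses among the 4 items."""
    return sum(bool(responses[item]) for item in CAGE_ITEMS)


def cage_positive(responses, cutoff=2):
    """A screen is positive when the score reaches the chosen cutoff."""
    return cage_score(responses) >= cutoff


# Invented patient who answers yes to a single item.
patient = {"cut_down": True, "annoyed": False, "guilty": False, "eye_opener": False}

print(cage_score(patient))               # 1
print(cage_positive(patient, cutoff=2))  # False
print(cage_positive(patient, cutoff=1))  # True
```

A patient with a single positive response is missed at a cutoff of 2 but detected at a cutoff of 1, which is why lowering the cutoff changes the operating characteristics of the screen.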
One study investigated each of the 4 CAGE questions in a population screened for alcohol use disorders with the use of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition10 criteria as the criterion standard.37 The proportion of subjects answering yes to a specific CAGE question varied by race, sex, and item. For instance, the question "Have you ever felt the need to cut down on your drinking?" had a sensitivity of 63% and a specificity of 84%, whereas the question "Have you ever taken a drink (eye opener) first thing in the morning?" had a sensitivity of 21% and a specificity of 95%. As with the AUDIT, the CAGE questions were noted to have varying performance characteristics within different ethnic and sex populations.34
Seven studies evaluated the MAST or variants of the MAST as screening tools for alcohol abuse and/or dependence.28,30,33,42-44 As with the other screening tests, the operating characteristics of the MAST and its derivatives varied by cutoff score and diagnosis (ie, current or lifetime alcohol abuse or dependence disorder). For instance, unweighted scoring of the Short Michigan Alcoholism Screening Test (SMAST) with a cutoff of 2 or more points had a sensitivity of 82% and 100% with a specificity of 96% and 85% for detecting patients with lifetime and current diagnoses, respectively, of alcohol abuse and dependence.43 Another study, using the same cutoff, found that the SMAST had a sensitivity of 48% and a specificity of 95%, although no distinction was made regarding current or lifetime disorders.28 Weighted scoring of the SMAST typically uses a cutoff of 5 or more points and had a sensitivity of 57% and 66% and specificity of 80% and 80% for current and lifetime alcohol use disorders, respectively.30 Others have reported sensitivities of 38% to 80% and specificities of 79% to 97% with the use of various cutoffs for the SMAST.39,42,44 Finally, the recently developed geriatric version of the MAST had a sensitivity of 70% and a specificity of 80% when a cutoff score of 5 or more was used in a geriatric (>65 years old) primary care population.33 A shortened, 9-item Self-administered Alcoholism Screening Test had a sensitivity of between 13% and 69% with a specificity of between 67% and 95% in different ethnic and sex groups in primary care.34
Cyr and Wartman16 found that the combination of a positive response to the question "Have you ever had a drinking problem?" and/or "When was your last drink?" (within 24 hours was considered a positive response) had a sensitivity of 91% and a specificity of 90% when MAST scores were the criterion standard. However, other investigators have attempted to replicate these findings in other primary care settings and found sensitivities between 48% and 53% and specificities between 76% and 93%.45,46 Permutations of the single question "Have you ever had a drinking problem?" have had a sensitivity of 40% to 70% with a specificity ranging between 93% and 99%.16,40,42,45,46
The TWEAK questions (tolerance, worry, eye-opener, amnesia, kut down), a combination of items from the CAGE questionnaire and MAST developed to identify at-risk drinking among pregnant women, were found to have a sensitivity of 75% with a specificity of 90% in 1 study.41
Three investigations evaluated quantity-frequency questions as a screen for alcohol abuse or dependence disorders. One study found a sensitivity of 47% and a specificity of 96%, with the use of MAST scores as the criterion standard, and a quantity cutoff score of 4 or more drinks per day.16 Fleming and Barry40 found sensitivities of 50% and 20% with specificities of 87% and 97%, with the use of a cutoff of 7 and 20 drinks per week, respectively. In 1 study, there was a gradual decrease in sensitivity (100%-21%) with a corresponding increase in specificity (43%-97%) as the number of drinks consumed per week increased from 0 to 24 or more.47
Six studies examined clinical strategies such as clinical judgment and/or laboratory values to detect alcohol problems.26,27,29,31,43,48 In 2 studies,43,48 physicians identified only 36% to 77% of patients with current alcohol problems and 21% of patients with inactive alcohol problems.48 More formal assessments have found that physicians' judgment had a sensitivity of 18% to 44% with a specificity of 96% to 99% for a diagnosis of alcohol abuse and dependence.29,31
Attempts to formalize the use of clinical indicators have led to the creation of the Alcohol Clinical Index26 and the use of a diagnostic grid that combines the CAGE questionnaire and features of the history and physical examination.27 The Alcohol Clinical Index had a sensitivity of 28% and a specificity of 86% for alcohol dependence. The grid had a reported sensitivity of 99% and a specificity of 96% for alcohol dependence; however, it should be noted that the same physician provided the criterion standard diagnosis and filled out the grid.27
Laboratory methods for detecting alcohol problems, such as mean corpuscular volume, γ-glutamyltransferase, aspartate aminotransferase, and alanine aminotransferase, have performed poorly as screening tools.26,43 In a receiver operating characteristic analysis, the SMAST screening test consistently outperformed physician judgment and laboratory evaluations.43 In another evaluation, Escobar et al26 found that in a select group of subjects, use of the mean corpuscular volume, elevated γ-glutamyltransferase level, or aspartate aminotransferase–alanine aminotransferase ratio of 2 or more had sensitivities that ranged from 13% to 63% and specificities that ranged from 48% to 94%.
Two studies evaluated a screen for mental disorders, including alcohol dependence, by means of disease-specific modules.49,50 The alcohol items in the Symptom-Driven Diagnostic System for Primary Care cover worry about drinking, excessive drinking, and morning drinking. This screen had a sensitivity of 38% to 75% and a specificity of 97% to 99% for a current diagnosis of alcohol dependence in primary care populations.49,50
The Health Screening Survey, a masked screen for alcohol abuse and dependence that includes items about alcohol use buried among questions about exercise, nutrition, and smoking, was found to have a sensitivity of 78% and a specificity of 71% in a primary care population.40 Finally, the Spare Time Activity Questionnaire, another disguised questionnaire, had a sensitivity of 100% and a specificity of 72% when compared with psychiatrist assessment of addiction to alcohol.51
Overall compliance with the individual standards by study is shown in Table 5. Most investigations (29 [76%]) provided a pertinent description of the demographic characteristics (age distribution and sex of the subjects) of their respective study populations; however, only 3 (8%) described the medical or psychiatric comorbidity of the screened subjects. Overall, 23 (61%) of the articles provided eligibility criteria and rates of participation. Thirty (79%) of the investigations used methods designed to avoid workup bias, while only 12 (35%) of 34 of the studies met the standard for avoidance of review bias. Finally, 21 (55%) of the studies examined the performance of the screening instruments among different clinical subgroups.
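The reported proportions can be reproduced from the counts given in the text. The labels below are shorthand introduced here; the denominators are 38 studies, except for review bias, where the text reports 34.

```python
total_studies = 38

# (studies meeting the standard, denominator) as reported in the text.
standards = {
    "demographics described": (29, total_studies),
    "comorbidity described": (3, total_studies),
    "eligibility and participation reported": (23, total_studies),
    "workup bias avoided": (30, total_studies),
    "review bias avoided": (12, 34),
    "subgroup analysis performed": (21, total_studies),
}

percentages = {name: round(100 * met / n) for name, (met, n) in standards.items()}
for name, pct in percentages.items():
    met, n = standards[name]
    print(f"{name}: {met}/{n} = {pct}%")
```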
Accuracy data on the most frequently investigated screening instruments, the AUDIT, CAGE questionnaire, and SMAST, are presented in Table 6 by diagnostic category. We chose these instruments because each was evaluated in at least 3 investigations and they have been advocated for use as screening tools.52 Of the 17 studies investigating screening for distinct diagnoses (eg, alcohol dependence) with these instruments, 6 met the 2 standards designed to minimize workup and review bias,4,19,21,23,40,42 and 2 met all the standards.21,23
This evaluation of screening for alcohol problems in primary care reveals that a number of strategies have been evaluated in a variety of settings. To date, screening has generally been directed toward alcohol abuse and dependence. In addition, methodological standards designed to increase the validity of diagnostic test research are inconsistently adhered to in these investigations. Unfortunately, few studies have evaluated multiple instruments, which would allow for a direct comparison of the screens' accuracy under similar conditions. Finally, clinicians infrequently performed screening in these investigations. Despite these limitations, the literature supports the effectiveness of select screening instruments for primary care.
On the basis of the reported accuracy of the techniques included in this evaluation, the literature supports screening for less severe alcohol problems such as at-risk, harmful, and hazardous drinking by means of the AUDIT. Designed specifically to increase detection of this spectrum of alcohol problems, and incorporating questions about quantity and frequency of consumption, the AUDIT appears to be more accurate than the other screening methods, with a reported sensitivity between 57% and 97% and a specificity between 78% and 96%. Restricting the analysis to studies that reported on screening for distinct diagnoses and that also met the standards designed for avoidance of workup and review bias limits the sample to 2 studies, which reported sensitivities between 57% and 59% and specificities between 91% and 96%.21,23 In this same population, modifications of the AUDIT restricted to consumption questions have shown promise but will need to be validated in a wider population and varied settings.21,22
The literature also supports screening for lifetime and current abuse or dependence disorders by means of the CAGE questions. There is evidence that the CAGE questionnaire has increased accuracy compared with other screening instruments for this spectrum of alcohol problems in primary care. This 4-item screen had a sensitivity of 43% to 94% with a specificity of 70% to 97% and performed better than the AUDIT or SMAST. Restricting the analysis to studies that reported on screening for distinct diagnoses and that also met the standards for avoidance of workup and review bias limits the sample to 2 studies that found sensitivities of 43% to 77% and specificities of 70% to 86%.23,40
Our analysis provides a framework for implementing strategies to screen for alcohol problems in primary care in a number of ways. First, this research shows that there are effective, although imperfect, strategies that clinicians can use to identify unrecognized patients with alcohol problems in primary care settings. Structured instruments generally perform better than quantity-frequency questions, clinical impressions, or laboratory data that clinicians report they frequently use to detect alcohol problems in their patients.53 Second, decisions about screening options should include a consideration of the accuracy of instruments across the spectrum of alcohol problems. For instance, the CAGE questions perform better in identifying patients with alcohol abuse and dependence. Conversely, the AUDIT is more sensitive for hazardous and harmful drinkers. This doubtless reflects the fact that the AUDIT includes measures of quantity and frequency that are used to establish these diagnoses. Third, the accuracy of screening instruments is responsive to clinical factors. For example, demographic characteristics and stage of diagnosis (current vs lifetime) have a profound effect on test performance. Finally, as one might expect, variation in study characteristics, including patient spectrum, study methods, criterion standard, and analysis, accounts for disparate findings regarding the accuracy of screening instruments in the medical literature. Attention to these components of study design and execution can help clinicians determine the utility of the results in their own practices.
One potential limitation of the current analysis is that it may not include all studies on screening for alcohol problems in primary care performed to date. However, we have attempted to identify the most appropriate studies in this field by using a search strategy that was both broad and unbiased. We believe that our search strategy, by using established terms, identified studies that were more likely to meet methodological standards.
A separate limitation to our conclusions is imposed by the state of the art in constructing appropriate criterion standards for the diagnosis of the various alcohol problems. The field has a number of diagnostic schemes from organizations including the World Health Organization, the National Institute on Alcohol Abuse and Alcoholism, and the American Psychiatric Association, and researchers may choose among these (and others) in selecting their diagnostic categories. In addition, there is variability in the criterion standard assessments that are used to establish an alcohol-related diagnosis. Therefore, conflicting or inconsistent results between reported accuracy of screening instruments may result from the definition used for the disorder, the choice of criterion standard, or differences between screening instruments. Further attention to developing uniform diagnostic schemes and accurate criterion standard tests will help to advance screening efforts.
Primary care clinicians should strive to identify patients across the spectrum of alcohol problems. The National Institute on Alcohol Abuse and Alcoholism recommendation that all patients who drink alcohol should be screened with the CAGE questions6 is supported by our findings. The primary drawback to this strategy, however, is the relatively poor performance of the CAGE questions, compared with the AUDIT, in recognizing less severe drinking disorders. Therefore, in situations where time allows for more in-depth interviewing, incorporating the AUDIT may help to identify a wider spectrum of alcohol problems. Nonetheless, the concise nature of the CAGE questions makes them more amenable to primary care clinical encounters than are other longer instruments. A strategy that incorporates the CAGE questionnaire, followed by questions about quantity and frequency of consumption,6 such as the augmented CAGE questionnaire, is pragmatic and shows promise.23 Additional history should be obtained from all patients who have positive results with standardized screening instruments or quantity-frequency questions, and those suspected of having an alcohol disorder irrespective of their screening scores. Further diagnostic efforts to assess for specific disorders, eg, alcohol dependence or harmful or hazardous drinking, should be undertaken in this group of patients.
This review has identified substantial heterogeneity in adherence to methodological standards designed to improve validity in diagnostic test research and reporting of results. These findings undoubtedly reflect the difficulty of conducting research in clinical settings. Nonetheless, some recent studies have been successful in adhering to many or all of the standards.21,22,50 To aid clinicians' efforts at recognizing and caring for patients with alcohol problems, future studies will benefit from increased attention to these standards. Among the areas that warrant increased attention are the reporting of characteristics of the study population, the avoidance of workup and review bias, and description of test performance in pertinent clinical subgroups. In addition, physician-based screening strategies should be empirically tested. The results in clinical settings may differ from those obtained in research because of factors that affect test performance, such as interviewing technique and competing clinical priorities, that were not accounted for in earlier studies.
Future research, by addressing these limitations, will help clinicians feel confident about extrapolating results to clinical practice and facilitate recognition of patients with alcohol problems in primary care settings. Once clinicians have identified these patients, they can begin the process of helping patients reduce the harm associated with excessive alcohol consumption.7
Accepted for publication January 10, 2000.
Dr Fiellin is supported by the National Institute on Drug Abuse Physician Scientist Award (NIDA K12 DA00167), Bethesda, Md. Dr Reid is supported by a Career Development Award from the Department of Veterans Affairs Health Services Research and Development Service, Washington, DC.
Reprints: David A. Fiellin, MD, Yale University School of Medicine, 333 Cedar St, PO Box 208025, New Haven, CT 06520-8025 (e-mail: firstname.lastname@example.org).