HPV indicates human papillomavirus; NHANES, National Health and Nutrition Examination Survey. Error bars indicate 95% confidence intervals. Both low-risk and high-risk HPV types were detected in some females. Low-risk HPV types are defined as HPV type 6, 11, 32, 40, 42, 44, 54, 55, 61, 62, 64, 71, 72, 74, 81, 83, 84, 87, 89, and 91; and high-risk HPV types as HPV type 16, 18, 26, 31, 33, 35, 39, 45, 51, 52, 53, 56, 58, 59, 66, 67, 68, 69, 70, 73, 82, 85, and IS39.
HPV indicates human papillomavirus; NHANES, National Health and Nutrition Examination Survey. Error bars indicate 95% confidence intervals.
*HPV types with a relative SE of more than 30% are not presented, except for HPV-11, since it is a vaccine type.
Dunne EF, Unger ER, Sternberg M, et al. Prevalence of HPV Infection Among Females in the United States. JAMA. 2007;297(8):813–819. doi:10.1001/jama.297.8.813
Context
Human papillomavirus (HPV) infection is estimated to be the most common sexually transmitted infection. Baseline population prevalence data for HPV infection in the United States before widespread availability of a prophylactic HPV vaccine would be useful.
Objective
To determine the prevalence of HPV among females in the United States.
Design, Setting, and Participants
The National Health and Nutrition Examination Survey (NHANES) uses a representative sample of the US noninstitutionalized civilian population. Females aged 14 to 59 years who were interviewed at home for NHANES 2003-2004 were examined in a mobile examination center and provided a self-collected vaginal swab specimen. Swabs were analyzed for HPV DNA by L1 consensus polymerase chain reaction followed by type-specific hybridization. Demographic and sexual behavior information was obtained from all participants.
Main Outcome Measures
HPV prevalence by polymerase chain reaction.
Results
The overall HPV prevalence was 26.8% (95% confidence interval [CI], 23.3%-30.9%) among US females aged 14 to 59 years (n = 1921). HPV prevalence was 24.5% (95% CI, 19.6%-30.5%) among females aged 14 to 19 years, 44.8% (95% CI, 36.3%-55.3%) among women aged 20 to 24 years, 27.4% (95% CI, 21.9%-34.2%) among women aged 25 to 29 years, 27.5% (95% CI, 20.8%-36.4%) among women aged 30 to 39 years, 25.2% (95% CI, 19.7%-32.2%) among women aged 40 to 49 years, and 19.6% (95% CI, 14.3%-26.8%) among women aged 50 to 59 years. There was a statistically significant trend for increasing HPV prevalence with each year of age from 14 to 24 years (P<.001), followed by a gradual decline in prevalence through 59 years (P = .06). HPV vaccine types 6 and 11 (low-risk types) and 16 and 18 (high-risk types) were detected in 3.4% of female participants; HPV-6 was detected in 1.3% (95% CI, 0.8%-2.3%), HPV-11 in 0.1% (95% CI, 0.03%-0.3%), HPV-16 in 1.5% (95% CI, 0.9%-2.6%), and HPV-18 in 0.8% (95% CI, 0.4%-1.5%) of female participants. Independent risk factors for HPV detection were age, marital status, and increasing numbers of lifetime and recent sex partners.
Conclusions
HPV is common among females in the United States. Our data indicate that the burden of prevalent HPV infection among females was greater than previous estimates and was highest among those aged 20 to 24 years. However, the prevalence of HPV vaccine types was relatively low.
Human papillomavirus (HPV) is estimated to be the most common sexually transmitted infection in the United States.1 HPV prevalence has been found to be highest among young persons within the first few years after sexual debut.2-6 However, there are no data on the prevalence of HPV among women across a broad age range and representative of the US population.
Genital HPV types are categorized according to their epidemiological association with cervical cancer. Infections with low-risk types, such as HPV types 6 and 11, can cause benign or low-grade changes in cells of the cervix, genital warts, and recurrent respiratory papillomatosis. High-risk HPV types can cause cervical, anal, and other genital cancers. High-risk HPV types are detected in 99% of cervical cancers, and worldwide approximately 70% of cervical cancers are due to HPV types 16 and 18.7,8 Although HPV infection is common, studies suggest approximately 90% of infections clear within 2 years.9,10
A highly efficacious prophylactic vaccine against HPV types 6, 11, 16, and 18 was licensed in June 2006 and recommended for routine use in females aged 11 to 12 years in the United States.11-14 Clinical studies of the quadrivalent HPV vaccine demonstrated close to 100% efficacy in preventing infection and disease (cervical cancer precursors, genital lesions) associated with types included in the vaccine in analyses restricted to those women who were naive to HPV types 6, 11, 16, or 18 (either by HPV DNA or HPV antibodies).
Representative data on type-specific prevalence of HPV DNA detection in the United States could provide a baseline estimate to measure the wide-scale impact of the vaccine for reducing infection and could help guide models evaluating impact and cost effectiveness. With widespread implementation of the prophylactic HPV vaccine, decreases in the prevalence of vaccine HPV types would be expected. To determine a prevaccine population-based prevalence of cervicovaginal HPV in the United States, we performed HPV DNA testing on self-collected vaginal swabs among females participating in the National Health and Nutrition Examination Survey (NHANES) 2003-2004.
NHANES is conducted by the National Center for Health Statistics, Centers for Disease Control and Prevention, and uses a representative sample of the US noninstitutionalized civilian population. The sample is obtained by using a complex, stratified, multistage probability design with unequal probabilities of selection.15 Certain subgroups of people, such as adolescents, non-Hispanic blacks, and Mexican Americans, are oversampled. All females aged 14 to 59 years selected for NHANES 2003-2004 were eligible for participation in this study. Of 2482 females aged 14 to 59 years who were interviewed at home for the 2003-2004 cycle, 2387 (96.2%) were examined in a mobile examination center. A total of 2026 females (81.6%) submitted cervicovaginal swab specimens. Four hundred sixty-six females (23.0%) were considered nonresponders because they either submitted an inadequate swab specimen (n = 105) or they did not submit a swab specimen (n = 361). There were significant differences between nonresponders and responders on some demographic and behavioral variables (nonresponders were significantly more likely than responders to be of other race/ethnicity, to be younger [<40 years], to be born outside the United States or Mexico, and to have never had sex). Written informed consent was obtained from all participants, and parental permission was obtained for females younger than 18 years. This human subjects research was approved by the Centers for Disease Control and Prevention institutional review board.
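The participation figures above can be cross-checked with simple arithmetic. The following sketch recomputes each reported percentage from the stated counts. The denominator used for each percentage is inferred by matching the rounded values and is not stated explicitly in the text.

```python
# Counts as stated in the text.
interviewed = 2482   # females aged 14-59 years interviewed at home
examined = 2387      # examined in a mobile examination center
submitted = 2026     # submitted a cervicovaginal swab specimen
inadequate = 105     # swab submitted but inadequate for interpretation

not_submitted = examined - submitted        # examined, no swab submitted
nonresponders = inadequate + not_submitted  # as defined in the text
adequate = submitted - inadequate           # specimens left for analysis


def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# ASSUMPTION: denominators are inferred from the reported percentages.
print(pct(examined, interviewed))     # examined among interviewed
print(pct(submitted, interviewed))    # swab submitted among interviewed
print(pct(submitted, examined))       # swab submitted among examined
print(pct(nonresponders, submitted))  # nonresponders relative to swabs submitted
print(not_submitted, nonresponders, adequate)
```

Under these inferred denominators the sketch reproduces the 96.2%, 81.6%, and 23.0% figures, the 361 and 466 counts, and the 1921 adequate specimens used in the analysis.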
Race and ethnicity were self-reported into categories, including non-Hispanic black, non-Hispanic white, and Mexican American. Poverty index ratio was calculated by dividing total family income by the poverty threshold index, adjusted for family size at year of interview.16 Estimates of the total number of cases were generated by multiplying the appropriate population size from the January 2004 monthly postcensal civilian noninstitutionalized population17 by the weighted prevalence estimate among females in the appropriate age category.
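The case-count estimate described above is a single multiplication, sketched below. The population figure in the example is an assumption: it is back-calculated from the 24.9 million infections and 26.8% prevalence reported in this article, and it is not a postcensal value quoted by the authors.

```python
def estimated_cases(population, weighted_prevalence):
    """Estimated number of prevalent infections in an age group."""
    return population * weighted_prevalence


# ASSUMPTION: about 92.9 million females aged 14-59 years, back-calculated
# as 24.9 million / 0.268. The article does not report this denominator.
assumed_population = 92.9e6
cases = estimated_cases(assumed_population, 0.268)
print(f"{cases / 1e6:.1f} million")
```

The same function applies to any age category, given the matching population size and weighted prevalence.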
Sex was defined as vaginal, oral, or anal sex. For those females who had at least 1 lifetime sex partner, additional questions were asked on age of first sex and lifetime number of sex partners. For those females aged 14 to 17 years who were sexually active, additional questions about sexual behavior were asked, including age at first sex, number of lifetime partners, and ever use of a condom. For women aged 18 years or older who were sexually active, additional questions were asked on number and gender of sex partners in the last 12 months and lifetime sex partners, and past history of sexually transmitted infections.
Females aged 14 to 59 years were asked to self-collect a cervicovaginal sample in the mobile examination center. In brief, each female was given a collection device, which was a small foam swab on a plastic handle packaged in an individual reclosable plastic sleeve (Catch-All Sample Collection Swabs Epicenter, Madison, Wis). Participants were instructed to wash their hands before opening the swab, to hold the swab by the end of the handle, to insert the foam swab into the vagina similar to inserting a tampon, to gently turn the swab during a count of 10, and to replace the swab in the plastic sleeve, avoiding contact with the external genitalia. Participants took swabs and instructions into a bathroom and collected the samples in privacy. Swabs were given to NHANES personnel, stored at room temperature, and mailed to the Centers for Disease Control and Prevention laboratory for processing.
DNA was extracted using slight modifications of the QIAmp Mini Kit protocol (Qiagen, Valencia, Calif) within 1 month of sample collection. Briefly, swabs were incubated at 56°C for at least 12 hours in proteinase K lysis solution. One half of each sample was added to an equal volume of 100% ethanol and applied to each of 2 QIAmp Mini columns for DNA isolation. The eluates were collected, washed, and concentrated in Microcon 100 concentrators (Fisher Scientific, Pittsburgh, Pa), adjusting the final volume to 100 μL with deionized distilled water. Samples were tested immediately or stored at −20°C. For every 40 samples, a water blank was processed through all steps of extraction to serve as a contamination control.
HPV detection and typing were performed by using the Roche prototype line blot assay (reagents provided as a gift from Roche Molecular Systems Inc, Pleasanton, Calif). This assay uses HPV L1 consensus polymerase chain reaction with biotinylated PGMY09/11 primer sets and β-globin as an internal control for sample amplification.18,19 Five μL of the DNA was used in the 100-μL polymerase chain reaction. Amplicons (10 μL) were evaluated for β-globin and HPV bands with 1.5% agarose gel electrophoresis stained with ethidium bromide, and those amplicons with an HPV band were hybridized to the typing strips. The first generation strip used from January 2003 to April 2004 included probes for 27 HPV types (6, 11, 16, 18, 26, 31, 33, 35, 39, 40, 42, 45, 51, 52, 53, 54, 55, 56, 57, 58, 59, 66, 68, 73, 82, 83, and 84). The second generation strip used after April 2004 included additional types (61, 62, 64, 67, 69, 70, 71, 72, 81, 89, and IS39) but omitted HPV-57.20 Samples that did not hybridize to the strip were sequenced as previously described to determine HPV type.21 Samples negative for both β-globin and HPV (n = 105, 5.2%) were considered inadequate for interpretation and were omitted from further analysis. We considered low-risk HPV types as HPV type 6, 11, 32, 40, 42, 44, 54, 55, 61, 62, 64, 71, 72, 74, 81, 83, 84, 87, 89, and 91; and high-risk HPV types as HPV type 16, 18, 26, 31, 33, 35, 39, 45, 51, 52, 53, 56, 58, 59, 66, 67, 68, 69, 70, 73, 82, 85, and IS39.
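The risk grouping above amounts to a lookup against two fixed sets. The sketch below shows one way to flag a participant's detected types. The function name, the output keys, and the use of string identifiers are illustrative choices, not part of the laboratory protocol. Types that appear in neither list (for example HPV-57, which is on the first generation strip) are reported separately rather than assigned to a group.

```python
# Type lists copied from the text; "IS39" is why identifiers are strings.
LOW_RISK = {"6", "11", "32", "40", "42", "44", "54", "55", "61", "62",
            "64", "71", "72", "74", "81", "83", "84", "87", "89", "91"}
HIGH_RISK = {"16", "18", "26", "31", "33", "35", "39", "45", "51", "52",
             "53", "56", "58", "59", "66", "67", "68", "69", "70", "73",
             "82", "85", "IS39"}
VACCINE_TYPES = {"6", "11", "16", "18"}  # quadrivalent vaccine types


def classify(detected_types):
    """Summarize one participant's detected HPV types (hypothetical helper)."""
    detected = {str(t) for t in detected_types}
    return {
        "any_hpv": bool(detected),
        "low_risk": bool(detected & LOW_RISK),
        "high_risk": bool(detected & HIGH_RISK),
        "vaccine_type": bool(detected & VACCINE_TYPES),
        "n_types": len(detected),
        # Detected types that fall in neither list.
        "unclassified": sorted(detected - LOW_RISK - HIGH_RISK),
    }


# A participant with one high-risk and one low-risk type is flagged in both groups.
print(classify([16, 62]))
```

Because a participant can carry types from both groups, the low-risk and high-risk flags are not mutually exclusive.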
Females who submitted an adequate swab for HPV evaluation were included in the final analysis (n = 1921). Statistical analyses were conducted using SAS version 9.122 and SAS callable SUDAAN.23 Variance estimates were calculated by using a Taylor series linearization that incorporated the complex sample design of the survey.24 All estimates were weighted using the 2003-2004 medical examination weights provided by the National Center for Health Statistics to account for the unequal probabilities of selection and adjustment for nonresponse. The weighting methodology has been described previously.25 Because there were some missing laboratory specimens, we investigated whether any additional nonresponse adjustments to the original NHANES weights were needed. We found that using weights with an additional nonresponse and poststratification adjustment always provided prevalence estimates within the 95% confidence interval (CI) based on the original NHANES weight; therefore, no additional adjustments to the NHANES weights were made.
We considered a prevalence estimate unreliable if the relative SE was more than 30% of the prevalence estimate; these estimates are not presented. Confidence intervals were calculated by using a log transformation with the SE of the log prevalence based on the delta method and applying SUDAAN estimated SEs.26 Tests of association between HPV and the demographic or behavioral characteristics were based on the Wald χ2 statistic. To compare the prevalence between HPV types, we applied a version of the McNemar test for complex surveys. No adjustments were made to the P values for multiple comparisons.
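The interval construction and the reliability rule above can be sketched as follows. By the delta method, the SE of the log prevalence is approximately the SE of the prevalence divided by the prevalence, which is the relative SE. The critical value of 1.96 is an assumption. The text does not say whether a normal or a design-based t quantile was used. The example inputs are hypothetical and are not estimates from this study.

```python
import math


def relative_se(prevalence, se):
    """Relative SE: the SE expressed as a fraction of the estimate."""
    return se / prevalence


def log_ci(prevalence, se, crit=1.96):
    """Confidence interval computed on the log scale and back-transformed.

    Delta method: SE(log p) ~= SE(p) / p.
    ASSUMPTION: crit=1.96 (normal quantile); the source does not specify.
    """
    half_width = crit * se / prevalence
    return (prevalence * math.exp(-half_width),
            prevalence * math.exp(half_width))


def is_reliable(prevalence, se, threshold=0.30):
    """Apply the rule from the text: unreliable if relative SE exceeds 30%."""
    return relative_se(prevalence, se) <= threshold


# HYPOTHETICAL inputs for illustration only.
lower, upper = log_ci(0.25, 0.03)
print(lower < 0.25 < upper, is_reliable(0.25, 0.03))
```

An interval built this way is asymmetric around the estimate on the original scale, with the estimate equal to the geometric mean of the two limits. The intervals reported in this article show the same asymmetry.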
To explore the association between age and overall HPV prevalence, age was categorized into 5-year intervals (14-19, 20-24, 25-29, etc) and plotted against the log odds of HPV. The plot revealed a nonlinear association between age and HPV prevalence. Prevalence increased up to age 20 to 24 years and then decreased. To test for the presence of an increasing linear trend among those females younger than 24 years and a decreasing linear trend for those females older than 24 years, 2 separate logistic regressions were used. Each logistic regression treated age as a continuous variable and was restricted to the age range under investigation. A trend was considered statistically significant if the β coefficient for the independent variable was nonzero at P<.05, using the Satterthwaite adjusted F test.
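The two-regression approach can be illustrated with a small, self-contained sketch. It fits an unweighted logistic regression of HPV status on continuous age by Newton-Raphson, separately within each age range, and reports the sign of the age coefficient. This simplifies the analysis described above: it ignores the survey weights and design, does not compute the Satterthwaite adjusted F test, and uses synthetic counts rather than NHANES data. The split at 24 and 25 years is also an assumption.

```python
import math


def fit_logistic(ages, outcomes, iters=50):
    """Fit logit P(y=1) = b0 + b1*age by Newton-Raphson; returns (b0, b1)."""
    mean_age = sum(ages) / len(ages)
    x = [a - mean_age for a in ages]  # center age for numerical stability
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p              # score (gradient) terms
            g1 += (yi - p) * xi
            v = p * (1.0 - p)         # information matrix terms
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # solve the 2x2 Newton system
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0 - b1 * mean_age, b1     # undo the centering


def expand(counts):
    """Turn {age: (positives, total)} into parallel age and outcome lists."""
    ages, outcomes = [], []
    for age, (positives, total) in counts.items():
        ages += [age] * total
        outcomes += [1] * positives + [0] * (total - positives)
    return ages, outcomes


# SYNTHETIC counts for illustration only; these are not NHANES data.
young = {age: (2 + (age - 14), 20) for age in range(14, 25)}
older = {age: (6 - (age - 25) // 10, 20) for age in range(25, 60)}

_, slope_young = fit_logistic(*expand(young))
_, slope_older = fit_logistic(*expand(older))
print(slope_young > 0, slope_older < 0)
```

Restricting each fit to its own age range is what lets a single linear term capture an association that is nonlinear across the full 14 to 59 year span.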
Demographic and behavioral characteristics of participants were evaluated. Characteristics significant at the P<.20 level based on a Wald χ2 statistic were considered in a multivariate model. There are only 15 df available to develop a multivariate model, due to the complex survey design in the 2-year cycle of NHANES. To limit the df used when including age in the model, we collapsed age into 4 categories (18-19, 20-24, 25-29, and 30-59 years), because the prevalences for age categories between ages 30 and 59 years did not differ significantly. The multivariate model was limited to females aged 18 to 59 years because all questions regarding sexual exposures were asked of this group. We used SUDAAN for logistic regression to model independent associations between prevalence of any HPV and demographic and behavioral variables among sexually active females aged 18 to 59 years. We eliminated variables in a backward fashion that did not meet the criteria of P<.05 by Satterthwaite adjusted F test at each step. Any participants with missing data on variables included in the multivariate analysis were excluded. Once all variables in the model were statistically significant at the P<.05 level, all pairwise interactions were evaluated and retained only if the overall P value for the interaction was <.05. Goodness of fit for the final step of the model was assessed using the Hosmer-Lemeshow Satterthwaite adjusted F test.
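The backward elimination step can be sketched generically. The loop below removes one variable per step, the least significant, until every remaining variable meets P<.05. Removing a single variable per step is an assumption, because the text states only that variables failing the criterion were eliminated at each step. The p-value function is a placeholder for a refit of the survey-weighted model, and the variable names and p-values are hypothetical.

```python
def backward_eliminate(variables, p_value_fn, alpha=0.05):
    """Drop the least significant variable until all remaining have p < alpha.

    p_value_fn(variable, kept) should return the p-value for `variable`
    in a model refit with only the variables in `kept`.
    """
    kept = list(variables)
    while kept:
        p_values = {v: p_value_fn(v, kept) for v in kept}
        worst = max(kept, key=lambda v: p_values[v])
        if p_values[worst] < alpha:
            break  # every remaining variable meets the criterion
        kept.remove(worst)
    return kept


# HYPOTHETICAL, fixed p-values for illustration; a real analysis would
# recompute them from the refitted model after each removal.
toy_p = {"var_a": 0.001, "var_b": 0.30, "var_c": 0.12, "var_d": 0.01}
result = backward_eliminate(toy_p, lambda v, kept: toy_p[v])
print(result)
```

With these toy values the loop removes var_b, then var_c, and stops with var_a and var_d retained.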
In 2003-2004, 2026 vaginal swab specimens were collected from female NHANES participants aged 14 to 59 years. Of the 1921 adequate specimens, 26.8% (95% CI, 23.3%-30.9%) were positive for any HPV DNA. Using 2000 Census data, this corresponds to 24.9 million females in this age range with prevalent HPV infection.
Prevalence of HPV infection was highest among females aged 20 to 24 years (44.8%; 95% CI, 36.3%-55.3%); overall prevalence among females aged 14 to 24 years was 33.8% (95% CI, 28.6%-40.0%) (Table 1). There was a statistically significant trend for increasing HPV prevalence with each year of age from 14 to 24 years (P<.001), which was followed by a nonsignificant gradual decline in HPV prevalence through 59 years (P = .06).
When the analysis was restricted to sexually active females, the prevalence of HPV was still highest among those aged 20 to 24 years. Among sexually active females, HPV prevalence was 39.6% (95% CI, 32.9%-47.8%) for 14 to 19 years, 49.3% (95% CI, 40.7%-59.6%) for 20 to 24 years, 27.8% (95% CI, 21.7%-35.7%) for 25 to 29 years, 27.3% (95% CI, 20.2%-36.8%) for 30 to 39 years, 23.9% (95% CI, 18.8%-30.5%) for 40 to 49 years, and 20.2% (95% CI, 14.6%-27.8%) for 50 to 59 years.
The overall prevalence of high- and low-risk HPV types was 15.2% and 17.8%, respectively. Prevalences of both low-risk and high-risk HPV types were highest in females aged 20 to 24 years (Figure 1). There was a statistically significant difference between low- and high-risk HPV types among females aged 14 to 19 years and 50 to 59 years. Prevalence of high-risk types decreased after 20 to 29 years, and prevalence of low-risk types plateaued after 30 to 39 years.
The most common HPV types detected were HPV-62 (3.3%; 95% CI, 2.2%-5.1%) and HPV-84 (3.3%; 95% CI, 2.2%-5.1%), HPV-53 (2.8%; 95% CI, 2.1%-3.7%), and HPV-89 (2.4%; 95% CI, 1.4%-4.3%) and HPV-61 (2.4%; 95% CI, 1.6%-3.8%) (Figure 2). HPV-16 was detected in 1.5% (95% CI, 0.9%-2.6%) of females aged 14 to 59 years. There was no statistically significant difference in the prevalence of HPV-16 and the 13 more commonly detected types, except for HPV-84 and HPV-62. HPV-6 was detected in 1.3% (95% CI, 0.8%-2.3%), HPV-11 in 0.1% (95% CI, 0.03%-0.3%; relative SE≥30%), and HPV-18 in 0.8% (95% CI, 0.4%-1.5%) of female participants. Most participants infected with HPV (60.1%) had only 1 HPV type detected (95% CI, 53.2%-67.9%); however, 23.9% had 2 types (95% CI, 18.3%-31.3%) and 16% had 3 or more types detected (95% CI, 12.0%-21.2%). Overall, HPV types 6, 11, 16, or 18 were detected in 3.4% of the study participants, corresponding with 3.1 million females with prevalent infection with HPV types included in the quadrivalent HPV vaccine. Few participants (0.10%) had both HPV types 16 and 18 and none had all 4 HPV vaccine types. At least 1 of these 4 HPV types was detected in 6.2% (95% CI, 3.8%-10.3%) of females aged 14 to 19 years.
The variables associated with HPV DNA detection that were significant in the bivariate analysis were age, race, poverty index, education, marital status, and sexual behavior (Table 1 and Table 2).
HPV DNA was detected in 5.2% of females who reported never having had sex (Table 2). In an unweighted analysis, 88% of these females were aged 14 to 19 years, and the remaining 12% were aged 40 to 49 years. Two of these participants were also positive for herpes simplex type 2 by type-specific antibody testing,27 and 1 female was positive for chlamydia by nucleic acid amplification testing.
The final multivariate model demonstrated that age younger than 25 years, marital status, and increasing numbers of recent or lifetime sex partners were independently associated with HPV detection (Table 3). Race/ethnicity was not significant in the multivariate model. Increasing numbers of recent sex partners was more strongly associated with HPV detection than was lifetime sex partners. No pairwise interactions among the variables in the final model were statistically significant at the P<.05 level.
Prevalence of HPV DNA in a representative sample of US females aged 14 to 59 years was 26.8%, with the highest prevalence (44.8%) among women aged 20 to 24 years. The overall prevalence of HPV among females aged 14 to 24 years was 33.8%. This prevalence corresponds with 7.5 million females with HPV infection, which is higher than the previous estimate of 4.6 million prevalent HPV infections among females in this same age group in the United States.1 The overall prevalences of high-risk HPV types 16 and 18 were 1.5% (95% CI, 0.9%-2.6%) and 0.8% (95% CI, 0.4%-1.5%), respectively.
We found that the prevalence of HPV infection increased from 14 years through 24 years, and then decreased at older ages. Most evaluations in the United States have found that the prevalence of any HPV was highest in the younger age groups (<20 years). Manhart et al6 in a population-based assessment of sexually active 18- to 25-year-old women using a urine sample found the prevalence of HPV-6 or HPV-11 to be 2.2%, and the prevalence of HPV-16 or HPV-18 to be 7.8%. Among sexually active 18- to 25-year-old women in our study, the prevalence of HPV-6 or HPV-11 (2.2%; 95% CI, 1.0%-5.0%; relative SE >30%) was similar to that found by Manhart et al,6 and the prevalence of HPV-16 or HPV-18 (3.5%; 95% CI, 1.8%-6.7%; relative SE >30%) was lower. These differences may be due to differences in the type of test used for the HPV detection, differences in the study population, or both.
We found the overall prevalence of HPV-16 to be low and that other HPV types were more prevalent. In most other studies, HPV-16 has been found to be the most prevalent type, although prevalence varied based on the population evaluated. Population-based studies outside the United States have found lower prevalence of HPV-16 than clinic-based studies have.28 It is possible that our cervicovaginal assessment using the self-collected vaginal swabs was more likely to detect HPV types not related to cervical infection. Castle et al28 hypothesized that there is tropism of some phylogenetic groupings (A3/A4/A15) to the vagina. In our assessment, the most common types detected were HPV-84 and HPV-62, both in the A3 phylogenetic grouping. Manhart et al6 found HPV types 84 and 62 were also frequently detected in urine samples among females aged 18 to 25 years.
Independent risk factors for HPV DNA detection in our analysis included sexual behavior (number of sex partners in the last year, number of lifetime sex partners) and demographic variables, including young age and marital status, consistent with risk factors for HPV detection found in other studies.
HPV DNA was detected in approximately 5% of women in our study who reported never having had sex. In an unweighted analysis of these women, we found that most were young and some had other sexually transmitted infections, suggesting their self-reported sexual history may not be accurate. Genital HPV is primarily associated with sexual intercourse; however, 1 study3 found that nonpenetrative sexual contact, such as genital-genital contact, could also result in HPV transmission. A detailed sexual history was not collected; therefore, we could not evaluate specific types of sexual contact.
The 84.9% response rate (number of collected swabs/number of eligible swabs = 2026/2387) for vaginal swab collection in our study suggests that the self-collected swabs were a feasible and effective method for HPV DNA detection in a large survey. Previous assessments of the acceptability of self-collected vaginal swabs found this collection method to be more acceptable than specimens collected by health care professionals.29,30 Evaluations have also demonstrated a high correlation of HPV DNA detection, both high-risk and low-risk types, in self-collected swabs compared with swabs collected by health care professionals.31-35 There are few studies comparing correlation of specific HPV types. Available studies on type-specific correlations suggest different types may be detected from these specimens, although a strong correlation, at least in 1 study, existed for HPV types 6, 11, 16, and 18.35 There are no evaluations in the general US population using self-collected swabs, which precludes a direct comparison of our data with those from other studies.
There were several limitations to our study. Nonresponders were significantly different from responders by certain demographic characteristics (race, age, country of birth, and ever had sex) and this could bias prevalence estimates; however, as noted, evaluations of the weighted prevalence estimates taking the nonresponders into consideration did not substantially change our point estimates. A self-collected cervicovaginal sample was obtained, which may not detect the same HPV types as cervical mucosa samples obtained by health care professionals. Also, HPV DNA point prevalence will most certainly underestimate cumulative incidence as many infections clear9,10; this assessment only measures current infection and does not indicate past exposure to HPV. A study in a previous NHANES sample found overall HPV-16 seroprevalence in females to be 17.9%, with the prevalence peaking in 20- to 29-year-olds.36 Seroprevalence provides a better estimate of cumulative exposure. The difference between the seroprevalence found in that study and the current DNA prevalence in our study reflects the high clearance of HPV infections. This assessment, as all assessments of HPV DNA, could not determine if the HPV detected was from the participant or a partner, or if it represented active infection. Finally, we had only 2 years of data; the distribution of HPV types may change with additional years of data and a larger sample. We did not present demographic and behavioral factors associated with HPV vaccine type infection because the analysis was limited to subgroups with prevalence estimates in which the relative SE was 30% or less; a relative SE of more than 30% means that the SE is quite large relative to the estimate and hence considered unreliable.
Our study provides the first national estimate of prevalent HPV infection among females aged 14 to 59 years in the United States. Overall, HPV prevalence was high (26.8%), and prevalence was highest among females aged 20 to 24 years. Our data indicate that the burden of prevalent HPV infection among women was higher than previous estimates. However, the prevalence of HPV vaccine types was relatively low.
Corresponding Author: Eileen F. Dunne, MD, MPH, Centers for Disease Control and Prevention, 1600 Clifton Rd, MS E-02, Atlanta, GA 30333 (email@example.com).
Author Contributions: Dr Dunne had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Unger, McQuillan, Markowitz.
Acquisition of data: Unger, McQuillan, Swan, Patel, Markowitz.
Analysis and interpretation of data: Dunne, Unger, Sternberg, McQuillan, Patel, Markowitz.
Drafting of the manuscript: Dunne, Patel, Markowitz.
Critical revision of the manuscript for important intellectual content: Dunne, Unger, Sternberg, McQuillan, Swan.
Statistical analysis: Sternberg, McQuillan.
Administrative, technical, or material support: Unger, McQuillan, Swan.
Study supervision: Unger, Markowitz.
Funding/Support: This study was supported by the Division of STD Prevention, Centers for Disease Control and Prevention.
Role of the Sponsor: The funding organization, National Center for Health Statistics, Centers for Disease Control and Prevention, assisted with the conduct of the study, in the collection and management of the data, and in the preparation and review of the manuscript.