Bifani PJ, Mathema B, Liu Z, Moghazeh SL, Shopsin B, Tempalski B, Driscoll J, Frothingham R, Musser JM, Alcabes P, Kreiswirth BN. Identification of a W Variant Outbreak of Mycobacterium tuberculosis via Population-Based Molecular Epidemiology. JAMA. 1999;282(24):2321-2327. doi:10.1001/jama.282.24.2321
Author Affiliations: Public Health Research Institute Tuberculosis Center, New York City, NY (Messrs Bifani, Mathema, Moghazeh, and Shopsin and Dr Kreiswirth); Departments of Microbiology (Messrs Bifani and Shopsin) and Environmental Medicine (Dr Alcabes), New York University School of Medicine, New York; New Jersey Department of Health and Senior Services, Division of Communicable Disease, Trenton (Dr Liu); Department of Geography, University of Washington, Seattle (Ms Tempalski); New York State Department of Health, Wadsworth Center, Albany (Dr Driscoll); Durham Veterans Affairs Medical Center, Durham, NC (Dr Frothingham); and Institute for the Study of Human Bacterial Pathogenesis, Department of Pathology, Baylor College of Medicine, Houston, Tex (Dr Musser).
Context Typing of Mycobacterium tuberculosis could
provide a more sensitive means of identifying outbreaks than use of conventional
surveillance techniques alone. Variants of the New York City W strain of M tuberculosis were identified in New Jersey.
Objective To describe the spread of the W family of M tuberculosis strains in New Jersey identified by molecular typing and surveillance.
Design Population-based cross-sectional study.
Setting and Subjects All incident culture-positive tuberculosis cases reported in New Jersey
from January 1996 to September 1998, for which the W family was defined by
insertion sequence (IS) IS6110 DNA fingerprinting,
polymorphic GC-rich repetitive sequence (PGRS) typing, spacer oligotyping
(spoligotyping), and variable number tandem repeat (VNTR) analysis.
Main Outcome Measure Identification and characterization of W family clones supplemented
by surveillance data.
Results Isolates from 1207 cases were analyzed, of which 68 isolates (6%) belonged
to the W family based on IS6110 and spoligotype hybridization
patterns. The IS6110 hybridization patterns or fingerprints
revealed that 43 patients (designated group A) shared a unique banding motif
not present in other W family isolates. Strains collected from the remaining
25 patients (designated group B), while related to W, displayed a variety
of IS6110 patterns and did not share this motif.
The PGRS and VNTR typing confirmed the division of the W family into groups
A and B and again showed group A strains to be closely related and group B
strains to be more diverse. The demographic characteristics of individuals
from groups A and B were specific and defined. Group A patients were more
likely than group B patients to be US born (91% vs 24%, P<.001), black (76% vs 16%, P<.001),
human immunodeficiency virus positive (40% vs 0%, P
= .007), and residents of urban northeast New Jersey counties (P<.001). Patients with group B strains were primarily non-US born,
of Asian descent, and more dispersed throughout New Jersey. No outbreak had
been detected using conventional surveillance alone.
Conclusions The implementation of multiple molecular techniques in conjunction with
surveillance data enabled us to identify a previously undetected outbreak
in a defined geographical setting. The outbreak isolates comprise members
of a distinct branch of the W family phylogenetic lineage. The use of molecular
strain typing provides a proactive approach that may be used to initiate,
and not just augment, traditional surveillance outbreak investigations.
Despite the introduction of the first antituberculosis drugs almost 50
years ago, morbidity and mortality associated with Mycobacterium
tuberculosis remain a major public health threat. Recently, the study
of tuberculosis (TB) epidemiology and transmission, traditionally accomplished
by patient contact tracing, has been augmented by the use of molecular strain
typing. A striking example was the identification of the W strain, a multidrug-resistant
(MDR) clone that caused disease in more than 350 patients in New York City
and accounted for more than 25% of all MDR cases in the United States in the
early 1990s.1-4
This successful MDR clone, associated with high mortality rates in both
New York prisons and hospitals, has since become the "index" strain in the
Public Health Research Institute (PHRI) TB Center (New York, NY) and has been
the focus of a number of molecular epidemiological studies.1-6
It is generally accepted that M tuberculosis
isolates with identical insertion sequence (IS) IS6110
fingerprint patterns, such as the 350 W isolates responsible for the New York
City outbreak, are clonal and indicative of recent transmission while isolates
with unique patterns represent cases of reactivation and are unrelated.7,8 However, the use of multiple typing
techniques has provided insight into the relatedness of strains with similar,
but not identical, IS6110 patterns.
The use of multiple molecular tools in combination has demonstrated
the phylogenetic relatedness of strains from diverse temporal and geographic
areas that have genetic markers similar to the MDR W strain. These strains,
which are grouped in the W family lineage, have a common genotype with the
previously described Beijing clones, which are the predominant strains in
China and are found throughout Asia.9-11
These strains are now viewed as being members of the same phylogenetic lineage
and recent ancestors to the MDR W strain. Together, the W and Beijing families
share distinctive chromosomal markers, as they all belong to genotypic group
1, have the identical spacer oligotype (spoligotype) pattern S00034, and have
in common unique IS6110 chromosomal insertions.4,12,13
In the past, molecular epidemiological studies of TB have primarily
focused on the analysis of disease spread in small areas in which molecular
data had been used to confirm epidemiological linkage or test hypotheses.1,7,8,14-16
The PHRI TB Center, in collaboration with the New Jersey Department of Health
and Senior Services (NJDHSS), has been genotyping all viable M tuberculosis cultures from reported TB cases in the state of New
Jersey. The molecular analysis is routinely combined with patient surveillance
data. A major goal of this collaboration has been to develop public health
strategies and TB control protocols that integrate M tuberculosis molecular information and case surveillance data on a state population.
We conducted an investigation of the spread of the M tuberculosis W family in the New Jersey TB population during the
years 1996-1998, a time when no outbreaks of TB had been identified by conventional
contact tracing methods.
The study population included all culture-positive TB cases reported
to the NJDHSS between January 1996 and September 1998. All available isolates
from culture-positive cases were genotyped by IS6110
fingerprinting as part of the National Tuberculosis Genotyping and Surveillance
Network, Centers for Disease Control and Prevention (CDC). During the study
period, 1575 culture-positive TB cases were reported and isolates from 1207
cases (77%) were genotyped. Isolates from 368 cases (23%) were nonviable or
not available. Out of 1207 total isolates, 68 belonged to the W family based
on their IS6110 pattern similarities to the index
W strain. These 68 isolates were further analyzed.
Mycobacterium tuberculosis isolates were cultured
on Lowenstein-Jensen slants and grown at 37°C for 3 to 5 weeks. Right
and left IS6110 DNA fingerprint analysis was performed
as previously described.17 The hybridization
patterns were compared on a Sun Sparc 5 workstation (Sun Microsystems, Palo
Alto, Calif) with BioImage Whole Band Analyzer software version 3.4 (BioImage,
Ann Arbor, Mich). Classification of the DNA fingerprint patterns was previously
described.18 Isolates with identical banding
patterns were assigned the same arbitrary letter code (eg, W, C, BE) to indicate
that at least 2 TB cases were caused by the same strain. The IS6110 patterns that resembled, but were not identical to, 1 of the strain
types were denoted by the addition of a number to the cluster letter (eg,
W4, W79). Since 1992, PHRI has characterized nearly 10,500 M tuberculosis isolates of which 80% were cultured from New York City
and New Jersey patients. The other isolates were from 7 additional states
in the United States, the former Soviet Union, Singapore, South Africa, Romania,
Egypt, Israel, Venezuela, Honduras, Mexico, India, Chile, and Kenya.
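The naming convention described above (a shared letter code only when at least 2 cases have identical banding patterns) can be sketched in a few lines. This is a minimal illustration, not the algorithm used by the BioImage software: the band positions and case identifiers are hypothetical, and "identical" is assumed to mean exact equality of band sets, whereas real gel comparison needs a tolerance for band-position error.

```python
from collections import defaultdict


def group_by_pattern(isolates):
    """Map each distinct banding pattern to the isolate IDs that share it."""
    groups = defaultdict(list)
    for isolate_id, bands in isolates.items():
        # frozenset makes the pattern hashable and order-independent
        groups[frozenset(bands)].append(isolate_id)
    return groups


def split_clustered_unique(isolates):
    """Return (clusters, unique): clusters are patterns seen in >= 2 cases."""
    groups = group_by_pattern(isolates)
    clustered = sorted(sorted(ids) for ids in groups.values() if len(ids) >= 2)
    unique = sorted(ids[0] for ids in groups.values() if len(ids) == 1)
    return clustered, unique


# Hypothetical band positions (arbitrary units), for illustration only.
isolates = {
    "case1": [1200, 2300, 4100],
    "case2": [1200, 2300, 4100],  # identical to case1 -> same strain type
    "case3": [1200, 2300, 5000],  # similar but not identical -> variant
    "case4": [900, 3100],
}
clustered, unique = split_clustered_unique(isolates)
print(clustered)
print(unique)
```

In this sketch, case3 would be a candidate for a numbered variant designation (in the style of W4 or W79) because it resembles, but does not match, the clustered pattern; deciding what counts as "resembles" is left out here.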
The W family isolates were further characterized using a number of previously
described secondary typing methods. Sequence determination of codons 463 and
95 in the genes encoding catalase peroxidase (katG)
and the A subunit of DNA gyrase (gyrA), respectively,
cataloged the isolates into 1 of 3 principal genotypic groups.12,19
The unique direct repeat region of the M tuberculosis
chromosome was compared for each isolate using the spoligotype membrane format.20,21 Specific IS6110 insertion site mapping probes were used to determine the presence
of insertions in the origin of replication and in the NTF chromosomal region.12,13 The DNA was also compared on the
basis of Southern blot hybridization using a consensus polymorphic GC-rich
repetitive sequence (PGRS) probe.22,23
Polymerase chain reaction was used to determine the exact number of tandem
DNA repeats at each of 5 chromosomal loci containing variable numbers of tandem
repeats (variable number tandem repeat [VNTR] loci ETR-A through ETR-E), as
The demographic and clinical data were obtained from the TB surveillance
system in New Jersey. This includes data from the NJDHSS contact investigation
reports, which were used to evaluate epidemiological links between the patients.
Routine contact investigations were conducted on all proven or suspected pulmonary
TB cases. Investigations included an index patient interview and identification
of close contacts. Patient contacts were interviewed, and tuberculin tests
and chest radiography were performed if necessary. Cases identified through
the contact investigation were considered to have epidemiologic links to the
index case. Tuberculosis patients from Essex, Hudson, and Passaic counties
were defined in this study as residents of urban northeast New Jersey counties.
Analysis was carried out using SAS, version 6.12 (SAS Institute, Cary,
NC). Fisher exact and χ² tests were used
to compare the proportions of categorical variables between groups. Crude
odds ratios were calculated with SAS; a value of 0.5 was added when tables
consisted of 0 values.
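The crude odds ratio and Fisher exact test can be illustrated with a short sketch. It is not the SAS procedure used in the study. Two assumptions are made: the 0.5 correction is applied to all four cells of a table that contains a zero (the usual Haldane-Anscombe convention; the text does not say which cells were adjusted), and the two-sided Fisher P value sums all tables no more probable than the observed one. The US-born counts are reconstructed from figures reported in the Results (43 group A patients, 4 of them born outside the United States; 23 non-US-born patients among all 68 W family cases); the zero-cell table is purely hypothetical.

```python
from math import comb


def crude_odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    Assumption: if any cell is 0, add 0.5 to every cell before computing.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum every table at least as extreme (no more probable) than observed
    return sum(p for p in map(prob, support) if p <= p_obs * (1 + 1e-9))


# Rows: group A, group B. Columns: US born, non-US born (reconstructed counts).
us_born = (39, 4, 6, 19)
or_us_born = crude_odds_ratio(*us_born)
p_us_born = fisher_exact_two_sided(*us_born)
print(round(or_us_born, 3), p_us_born < 0.001)

# Hypothetical table with a zero cell, to show the 0.5 correction.
or_zero_cell = crude_odds_ratio(5, 5, 0, 5)
print(or_zero_cell)
```

The reconstructed US-born table gives an odds ratio of 30.875, which agrees with the value of 30.88 reported for group A vs group B. Confidence intervals are omitted because the interval method is not stated in the text.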
Geographic Information System mapping was carried out for all cases
included in this study. Mapping coordinates were abstracted from the New Jersey
Topologically Integrated Geographic Encoding and Referencing file (US Bureau
of Census and US Geological Survey) and linked to 1990 Census of Population
and Housing. Maps were generated using ARC/INFO (v 7.12; Environmental Systems
Research Institute Inc, Redlands, Calif).
Of 1207 New Jersey TB cases typed by IS6110
DNA fingerprinting, isolates from 433 cases (36%) had IS6110 hybridization patterns that were unlike any others in the New
Jersey collection or the PHRI TB Center archive (unique isolates). Among the
remaining 774 cases, 237 (31%) were assigned to 11 major strain types (defined
as ≥10 cases each during the 45-month study period). In addition, 179 isolates
fell into 40 strain types with 3 to 9 TB cases each and 90 cases segregated
into 45 strain types with 2 cases each.
Compared with the entire population of New Jersey TB patients, those
with unique isolates and those with isolates related to other cases were similar
in sex (men, 248/433 [57%] vs 459/774 [59%]; women, 185/433 [43%] vs 315/774
[41%]; P = .50) and proportion of Hispanic (95/433
[22%] vs 155/774 [20%]; P = .21) and white persons
(90/433 [21%] vs 155/774 [20%]; P = .77). Patients
with related isolates were younger than those with unique patterns (mean age,
47 vs 45 years; P<.02). Patients with unique isolates
were more likely to be Asian (146/433 [34%] vs 122/774 [16%]; P<.001), whereas those with related isolates were more likely to
be non-Hispanic black (102/433 [23%] vs 342/774 [44%]; P<.001). Among patients with known human immunodeficiency virus
(HIV) status, there was a higher percentage of HIV-seropositivity in the related-isolate
group (43/433 [10%] vs 175/774 [23%]; P<.001).
Unique isolates were more likely to come from non–US-born persons (287/433
[62%] vs 288/774 [37%]) and related isolates were more likely to be seen in
US-born patients (146/433 [38%] vs 486/774 [63%]; P<.001).
Related isolates were also more likely to come from residents of urban northeast
New Jersey counties (175/433 [40%] vs 426/774 [55%]; P<.001).
Solely on the basis of IS6110 DNA fingerprints
that resembled the 18-band W strain pattern signature, a total of 68 isolates
(6%) with 29 different patterns were assigned to the W family. Genotypic grouping,
multiplex polymerase chain reaction, and IS6110 insertion
site mapping, all molecular methods previously used to distinguish the W family,4 confirmed the identity of the W family isolates in
A sample of 234 isolates from New Jersey, including all strains with
limited copies of IS6110, were spoligotyped, and
the patterns were analyzed against the Wadsworth database, which contains
an additional 847 samples from the Northeast TB population. All 68 isolates
grouped to the W family had spoligotype S00034; this pattern was not found
among other New Jersey isolates analyzed.
In summary, all 68 W family isolates were genotypic group 1, had the
A1 IS6110 insertion in the origin of replication,
the single IS6110 copy in the NTF, and spoligotype
Among the 29 IS6110 hybridization patterns
similar to the index W strain (Figure 1),
2 subtypes, W4 and W69, represented 25 and 10 individual cases, respectively.
Their closely related fingerprint patterns had a common signature-banding
motif (the W4 motif) and an IS6110 copy number ranging
between 20 and 22 insertions. This motif was also identified in 5 additional
types (W79, W91, W150, W152, and W164) isolated from 1 to 3 cases each. These
7 types, cultured from a total of 43 patients, defined the relatively homogeneous
group A samples in our study. The group A strains are viewed as a distinct
branch of the W family.
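The group assignment rule used here (a W family pattern belongs to group A if it contains the W4 signature motif, and to group B otherwise) can be expressed as a simple containment check. The sketch below is illustrative only: the motif band positions and the type names are hypothetical, since the actual band sizes of the W4 motif are not given in the text, and exact matching of band positions is assumed.

```python
def has_motif(bands, motif):
    """True if every band in the signature motif is present in the pattern."""
    return set(motif) <= set(bands)


def assign_group(bands, motif):
    """Group A if the pattern carries the motif, otherwise group B."""
    return "A" if has_motif(bands, motif) else "B"


# Hypothetical motif and fingerprint patterns (arbitrary band positions).
W4_MOTIF = (1500, 2750)
patterns = {
    "typeX": (800, 1500, 2750, 4000),  # carries both motif bands
    "typeY": (800, 1500, 3900, 4000),  # lacks one motif band
}
groups = {name: assign_group(bands, W4_MOTIF) for name, bands in patterns.items()}
print(groups)
```

A rule of this kind groups similar but non-identical fingerprints, which strict identity matching would treat as unrelated strain types.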
Twenty-three additional patterns in 25 patients lacked the W4 motif
and therefore were assigned to group B. These isolates had 16 to 25 IS6110 insertions (Table
1 and Figure 1). The
MDR W strain prototype (ie, New York City outbreak strain) fingerprint pattern
and the Beijing family strains isolated throughout Asia were also assigned
to group B.
As shown in Table 1, the
PGRS and VNTR genotypes among the 29 IS6110 subtypes
divided the samples into groups A and B in agreement with the fingerprint
pattern distinctions. Only 2 PGRS hybridization patterns (42/43 isolates were
type P0002) were found among the group A strains; all group A strains had
the 32435 VNTR pattern. The 23 group B IS6110 subtypes
had 3 different VNTR subtypes and 14 PGRS patterns. Among group B strains,
only W65, isolated from a single individual, had the VNTR pattern typical
of group A.
Of the 68 cases associated with W family strains, 49 (72%) occurred
in men, 23 (34%) in non–US-born individuals, and 37 (54%) among non-Hispanic
blacks. Among the 33 persons for whom HIV status was known, 17 (52%) were
co-infected with HIV.
Table 2 shows a comparison
of characteristics of group A and B patients. Group A patients were more likely
to be non-Hispanic black (odds ratio [OR], 17.33; 95% confidence interval
[CI], 4.8-62.4) and US born (OR, 30.88; 95% CI, 9.2-103.4). Among cases for
whom data were available, group A patients were 21.7 times more likely to
be HIV-seropositive. Four group A patients were born outside the United States
(Table 1), all of whom had known
risk factors for TB acquisition (defined as at least 1 of the following: HIV
seropositivity, history of incarceration, intravenous drug use) that suggested
Figure 2 illustrates the geographic
occurrence of W family cases in New Jersey. Group A and group B patients were
from 8 and 10 New Jersey counties, respectively. Eighty-one percent of group
A patients were from 3 neighboring northeast New Jersey counties (see "Methods"
section), of whom 94% lived in 5 neighboring cities. In comparison,
only 16% of group B patients resided in these 3 counties. Forty-two percent
(18 cases) of all group A cases occurred in Paterson, NJ. Furthermore,
25% of all cases reported in this city (n = 71) were either W4 or a closely
associated variant (8, W4 cases; 8, W69; 1, W150; and 1, W152). Most counties
with group A cases were located in northeastern New Jersey. Conversely, group
B strains were geographically more dispersed throughout New Jersey.
In this study, a combination of molecular techniques segregated a large
population of related M tuberculosis strains into
2 epidemiologically significant groups. Among a genetically well-characterized
family of strains drawn from a population-based sample from New Jersey, we
showed that 1 set of closely related strain variants, group A, appeared on
molecular grounds to be a separate phylogenetic branch of the W family. Epidemiological
characteristics of group A isolates such as geographical aggregation, near
absence of non–US-born patients, and high prevalence of specific demographic
factors indicate a locally produced cluster. In contrast to group A, group
B variants comprised a heterogeneous set of distantly related isolates from
the W family. This group exhibited epidemiological correlates of an endemic
and globally prevalent disease as defined by geographical dispersion, high
proportion of non–US-born patients, and a lack of demographic uniformity.
This study emphasizes the usefulness of grouping strains with similar,
but not identical, IS6110 fingerprint patterns to
identify variants that may represent the extension of an outbreak. In New
Jersey, neither routine contact investigations nor relating strains by strict
IS6110 fingerprint interpretation would have recognized
the extent of the large group A cluster. The identification of group A was
made possible by a combination of W4-signature pattern analysis in conjunction
with additional molecular techniques.
The results presented in this study are of methodological significance.
They show that molecular methods for characterizing M tuberculosis are valid not only in elucidating transmission patterns when the epidemiological
situation is known, as in recognized outbreaks14-16,26
or epidemic situations in small areas,7,8
but are also relevant at a large-scale population level when outbreaks are not
suspected. Indeed, group A cases associated by genotype analysis display clustering
predominantly in black men in cities in urban northeast New Jersey counties
that is suggestive of an ongoing or recent outbreak in that area. This assertion
is further supported by the particularly strong local clustering of group
A cases (representing 25% of all reported cases) in 1 city. Significantly,
this clustering is in contrast to the background of diverse strain types associated
with other cases from this location (data not shown). Without molecular studies
this cluster would not have been suspected against the background of a generally
high TB case rate in that part of the state.
Our findings affirm and extend those of a recent study in which IS6110 fingerprinting was used in combination with geographic
analysis to assess M tuberculosis transmission in
Baltimore, Md.25 In the Baltimore study, strains
clustered by fingerprint analysis were primarily found in localized areas
of low socioeconomic status and in a patient population with high rates of
HIV infection, alcohol and drug abuse, and homelessness. In contrast, the
unclustered isolates were found in middle-class neighborhoods. Extensive contact
investigation identified only 24% of epidemiological links within the cluster
population. Based on these results, the investigators concluded that location-based
strain identification might render routine contact investigation more effective.
Similar conclusions were drawn from a recent Texas study showing the spread
of TB in frequented social settings.27 Our
findings, which cover fewer TB cases but refer to an even larger and more
diverse geographic area and invoke a wider array of molecular results for
strain characterization, support the conclusions of the Baltimore and Texas studies.
These results also indicate that other factors, in addition to an MDR
phenotype, contributed to the appearance of the W strain outbreak. Indeed,
the outbreak strain that wreaked so much havoc in New York prisons5 and hospitals1,3
and across the United States6 in the early
1990s appears to have been a branch of a lineage that continues to develop,
evolving into variants that could be clearly differentiated. In this study,
the phylogenetic relatedness observed in group A isolates represents the local
spread and evolution of a particular strain variant. Individually, these variants
carry outbreak potential that can be augmented in certain situations, as was
the case with the W strain and drug resistance.
Consistent with the finding that these strains are geographically clustered,
only 3 TB cases from group A were identified in the New York population from
1993-1999. In addition, IS6110 fingerprint patterns
that define the group A cases have not been reported from the other states
that participate in the CDC National Tuberculosis Genotyping and Surveillance Network.
Our study has a number of limitations. First, patient data linking cases
that appeared on molecular grounds to be related were not available to us.
This limits the inference that geographically or epidemiologically clustered
cases represent an outbreak or local spread. We purposely blinded ourselves
to the results of contact investigation to avoid biasing the analysis of the
correlation of surveillance data with molecular results. Second, we were only
able to fingerprint 77% of all culture-positive cases. Most of the cases with
no available isolates were reported from private clinics in southern New Jersey.
The demographic features of the patients may differ from those reported in
our study. However, this sampling bias is unlikely to alter the main conclusions
in this study, particularly the demographic contrasts between groups A and
B within the W family.
The integration of molecular and surveillance data has allowed public
health workers to focus their epidemiological investigation on patients infected
with related strains suspected to be the product of recent transmission vs
unique isolates that are most probably cases of reactivation. Consequently,
several molecularly guided cluster investigations have been initiated, including
a reinvestigation of group A cases. Since our study was completed, 5 additional
group A cases (4, W4; and 1, W69) from New Jersey have been identified. All
molecular and surveillance data are in agreement with the work presented here.
All 5 patients are US-born from urban northern New Jersey, and 3 of them are
In our study, molecular analysis identified the spread of M tuberculosis variants not previously recognized by classic epidemiology
and provided a better understanding of both endemic and outbreak strain transmission.
We believe that relating strains with the use of molecular typing may facilitate
a proactive approach to TB investigation that, guided by knowledge of strain
information, will allow a more rapid as well as rational application of traditional
contact investigation and better use of limited public health resources.