Figure 1. Number of cases analyzed in this study by state. Our patients were distributed throughout the United States, and our results would not have been influenced by any particular surgeon or pathologist, thus increasing the generalizability of our findings.
Figure 2. Lymph node counts in 918 colorectal cancers in our 2 US nationwide prospective cohort studies. A, Distribution of the negative node count. B, Distribution of the total node count. Both negative and total node counts approximately follow a gamma-Poisson–like distribution. In the boxplot above each graph, the vertical line in the middle of each box indicates the median, the diamond indicates the mean, and the left and right borders of the box mark the 25th and 75th percentiles, respectively. The whiskers extending from the left and right ends of the box mark the 5th and 95th percentiles, respectively. The points beyond the whiskers are outliers beyond the 5th or 95th percentile. C, Correlation between the negative node count and specimen length. D, Correlation between the negative node count and tumor size.
Morikawa T, Tanaka N, Kuchiba A, Nosho K, Yamauchi M, Hornick JL, Swanson RS, Chan AT, Meyerhardt JA, Huttenhower C, Schrag D, Fuchs CS, Ogino S. Predictors of Lymph Node Count in Colorectal Cancer Resections: Data From US Nationwide Prospective Cohort Studies. Arch Surg. 2012;147(8):715–723. doi:10.1001/archsurg.2012.353
Author Affiliations: Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School (Drs Morikawa, Kuchiba, Yamauchi, Meyerhardt, Schrag, Fuchs, and Ogino), Departments of Pathology (Drs Hornick and Ogino) and Surgery (Dr Swanson) and Channing Laboratory, Department of Medicine (Drs Chan and Fuchs), Brigham and Women's Hospital and Harvard Medical School, Gastrointestinal Unit, Massachusetts General Hospital (Dr Chan), and Department of Biostatistics, Harvard School of Public Health (Dr Huttenhower), Boston, Massachusetts; National Surgical Adjuvant Breast and Bowel Project Operations and Biostatistics Center, Pittsburgh, Pennsylvania (Dr Tanaka); and First Department of Internal Medicine, Sapporo Medical University, Sapporo, Japan (Dr Nosho).
Objective To identify factors that influence the total and negative lymph node counts in colorectal cancer resection specimens independent of pathologists and surgeons.
Design We used multivariate negative binomial regression. Covariates included age, sex, body mass index, family history of colorectal carcinoma, year of diagnosis, hospital setting, tumor location, resected colorectal length (specimen length), tumor size, circumferential growth, TNM stage, lymphocytic reactions and other pathological features, and tumor molecular features (microsatellite instability, CpG island methylator phenotype, long interspersed nucleotide element 1 [LINE-1] methylation, and BRAF, KRAS, and PIK3CA mutations).
Setting Two US nationwide prospective cohort studies.
Patients Patients with rectal and colon cancer (N = 918).
Main Outcome Measures The negative and total node counts (continuous).
Results Specimen length, tumor size, ascending colon location, T3N0M0 stage, and year of diagnosis were positively associated with the negative node count (all P ≤ .002). Mutation of KRAS might also be positively associated with the negative node count (P = .03; borderline significance considering multiple hypothesis testing). Among node-negative (stages I and II) cases, specimen length, tumor size, and ascending colon location remained significantly associated with the node count (all P ≤ .002), and PIK3CA and KRAS mutations might also be positively associated (P = .03 and P = .049, respectively, with borderline significance).
Conclusions This molecular pathological epidemiology study shows that specimen length, tumor size, tumor location, TNM stage, and year of diagnosis are operator-independent predictors of the lymph node count. These crucial variables should be examined in any future evaluation of the adequacy of lymph node harvest and nodal staging when devising individualized treatment plans for patients with colorectal cancer.
The presence of lymph node metastasis has important implications in the prognosis and treatment of patients with colorectal cancer.1 Observational studies indicate that the number of lymph nodes assessed by pathological examination (in particular, the negative node count) is associated with longer survival in colorectal cancer.2 Thus, along with disease stage and tumor molecular features, the node count is often used for treatment decision making by oncologists. However, the optimal number of lymph nodes that must be assessed remains controversial.2-5 Although the average number of lymph nodes evaluated for colorectal cancer has increased in the past decade, it remains uncertain how the node count is influenced by demographic, clinical, and tumor molecular factors.6,7
The number of recovered lymph nodes may be influenced not only by surgeons and pathologists but also by factors independent of surgeons and pathologists. Those “operator-independent” factors include tumor location, disease stage, tumor size, host immune response,8 and tumor molecular features, such as microsatellite instability (MSI) and the CpG island methylator phenotype (CIMP).9 Molecular features of colorectal cancer and host immune response have been associated with the node count.9,10 Previous studies have examined the relationship between the recovered node count and various demographic and clinical features in population-based11-16 and hospital-based17-27 studies, but all those studies11-26,28,29 lacked comprehensive data on specimen length, tumor size, host immune reaction to tumor, and tumor molecular features. Beyond surgeon- and pathologist-related (ie, operator) factors, it is important to identify patient-specific node count predictors (ie, clinical, pathological, or tumor molecular factors) to assess the adequacy of lymph node examination for each patient. To accomplish this aim, a comprehensive database of a large number of colorectal cancer cases with clinical, specimen, pathological, and molecular annotations is needed.
We therefore conducted this molecular pathological epidemiology8,30,31 study using a database of 918 colorectal cancer cases in 2 prospective cohort studies. Considering the overall distribution of the node count, we used negative binomial regression analysis to identify factors associated with the negative and total node counts. Because we used a US nationwide cohort database with clinical, specimen, pathological, and tumor molecular variables (including MSI, CIMP, and KRAS (HGNC 6407), BRAF (HGNC 1097), and PIK3CA (HGNC 8975) mutations), we could assess each node count predictor independent of operator (surgeon and pathologist) factors.
We used the databases of 2 prospective cohort studies: the Nurses' Health Study (consisting of 121 701 women who have been followed up since 1976) and the Health Professionals Follow-up Study (consisting of 51 529 men who have been followed up since 1986).32,33 Every 2 years, participants have been sent follow-up questionnaires to update information on potential risk factors and to identify newly diagnosed cancer in themselves and their first-degree relatives. For nonresponders, we searched the National Death Index to discover deaths and to ascertain the causes of death and any diagnosis of cancer. Study physicians reviewed medical records, including pathology reports, and recorded tumor location and pathological TNM (tumor-node-metastasis) stage, the positive and negative node counts,9 tumor size, circumferential growth along the bowel wall, and resected colorectal length (specimen length). We collected paraffin-embedded tissue blocks from hospitals where patients underwent tumor resections.33 We collected diagnostic biopsy specimens for patients with rectal cancer who received preoperative treatment to avoid artifacts or bias introduced by treatment. Based on the availability of data on the node count and tumor molecular features, we included a total of 918 colorectal cancer cases diagnosed up to 2006. Hospitals where our participants underwent colorectal resections were distributed throughout the United States (Figure 1). Informed consent was obtained from all study subjects. This study was approved by the Human Subjects Committees at Brigham and Women's Hospital and the Harvard School of Public Health.
Tissue blocks from all colorectal cancer cases were evaluated by a pathologist (S.O.). Tumor differentiation was categorized as poor vs well-moderate (≤50% vs >50% glandular areas). The presence and extent of a mucinous and/or signet ring cell component were recorded. Lymphocytic reaction patterns, such as peritumoral lymphocytic reaction and tumor infiltrating lymphocytes, were examined as previously described.34 A subset of cases (n > 100) was reviewed by another pathologist (T.M.), and concordance was as follows: κ = 0.72 for tumor differentiation, Spearman r = 0.87 for the percentage of mucin, Spearman r = 0.65 for the percentage of signet ring cells, and Spearman r = 0.65 for the summation score of peritumoral reaction and tumor infiltrating lymphocytes.
We extracted DNA from each tumor, and polymerase chain reaction (PCR) analysis and pyrosequencing targeted for KRAS (codons 12 and 13),35 BRAF (codon 600),36 and PIK3CA (exons 9 and 20)37 were performed. The MSI status was determined using D2S123, D5S346, D17S250, BAT25, BAT26, BAT40, D18S55, D18S56, D18S67, and D18S487.38,39 We defined MSI-high as the presence of instability in 30% or more of the markers and microsatellite stability/MSI-low as instability in 0% to 29% of the markers.
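The MSI threshold rule can be illustrated with a minimal sketch. The function name, the argument names, and the "MSS/MSI-low" label string are assumptions made for illustration only; the 30% cutoff is the one stated above.

```python
def classify_msi(unstable_markers: int, evaluable_markers: int) -> str:
    """Classify MSI status using the 30% instability threshold.

    Hypothetical helper: names and error handling are illustrative
    assumptions, not taken from the study's own software.
    """
    if evaluable_markers <= 0:
        raise ValueError("at least one evaluable marker is required")
    if not 0 <= unstable_markers <= evaluable_markers:
        raise ValueError("unstable_markers must lie between 0 and evaluable_markers")
    # Integer comparison avoids floating-point edge cases at exactly 30%.
    if unstable_markers * 100 >= 30 * evaluable_markers:
        return "MSI-high"
    return "MSS/MSI-low"


# With the 10-marker panel, 3 unstable markers (30%) reach the threshold.
print(classify_msi(3, 10))
print(classify_msi(2, 10))
```

With all 10 markers evaluable, 3 or more unstable markers would be classified as MSI-high under this rule.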
Sodium bisulfite treatment on DNA and real-time PCR (MethyLight) assays were validated and performed.40 We quantified promoter methylation in 8 CIMP-specific markers (CACNA1G [HGNC 1394], CDKN2A [HGNC 1787], CRABP1 [HGNC 2338], IGF2 [HGNC 5466], MLH1 [HGNC 7127], NEUROG1 [HGNC 7764], RUNX3 [HGNC 10473], and SOCS1 [HGNC 19383]).38,41,42 We defined CIMP-high as 6 or more methylated markers and CIMP-low/0 as 0 to 5 methylated markers according to the previously established criteria.38 To accurately quantify relatively high long interspersed nucleotide element 1 (LINE-1) methylation levels, we used pyrosequencing.43,44
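The CIMP classification rule can be sketched in the same way. The helper below is hypothetical: it assumes each marker has already been called methylated or unmethylated, and it does not model the MethyLight quantification step. The marker names and the cutoff of 6 or more of 8 are taken from the text.

```python
from typing import Iterable

# The 8 CIMP-specific markers listed in the text.
CIMP_MARKERS = (
    "CACNA1G", "CDKN2A", "CRABP1", "IGF2",
    "MLH1", "NEUROG1", "RUNX3", "SOCS1",
)


def classify_cimp(methylated_markers: Iterable[str]) -> str:
    """Return 'CIMP-high' if 6 or more of the 8 panel markers are methylated.

    Hypothetical helper for illustration; the input is assumed to be the
    names of markers already called methylated.
    """
    methylated = set(methylated_markers)
    unknown = methylated - set(CIMP_MARKERS)
    if unknown:
        raise ValueError(f"markers not in the CIMP panel: {sorted(unknown)}")
    return "CIMP-high" if len(methylated) >= 6 else "CIMP-low/0"


print(classify_cimp(["CACNA1G", "CDKN2A", "CRABP1", "IGF2", "MLH1", "NEUROG1"]))
print(classify_cimp(["MLH1", "RUNX3"]))
```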
We used SAS software (version 9.1.3; SAS Institute, Inc) for statistical analysis. All P values were 2-sided. Because of multiple hypothesis testing, a P value for significance was adjusted conservatively by Bonferroni correction to .0023 (P = .05/22). The χ2 test was used to assess the association between categorical variables, and analysis of variance was used to compare continuous variables across categories.
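The Bonferroni-adjusted threshold quoted above follows from dividing the nominal significance level by the number of hypotheses tested. A minimal sketch of that arithmetic (the function name is an illustrative assumption):

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 22)
print(round(threshold, 4))  # 0.0023
```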
We adopted multivariate negative binomial regression analysis to assess predictors of the node count because the marginal distribution of the total or negative node count fit the gamma-Poisson–like distribution (Figure 2); overdispersion occurred with Poisson generalized linear models. The process of estimation was based on a negative binomial distribution that can be conceptualized as a mixture of a Poisson distribution and a gamma distribution.45 Variables initially included in a model were sex, age (continuous), prediagnosis body mass index (continuous), family history of colorectal cancer in any first-degree relative, year of diagnosis (continuous), hospital setting (academic vs nonacademic), tumor size (continuous), specimen length (continuous), circumferential growth (100% complete vs incomplete), tumor location and TNM stage (categorized as in Table 1 and Table 2), tumor differentiation (poor vs well-moderate), mucinous component (reported as a continuous percentage), signet ring cells (reported as a continuous percentage), peritumoral lymphocytic reaction and tumor infiltrating lymphocytes (absent/minimal vs mild vs moderate vs marked; ordinal), MSI (MSI-high vs microsatellite stability/MSI-low), CIMP (CIMP-high vs CIMP-low/0), LINE-1 methylation (continuous), and KRAS, BRAF, and PIK3CA (mutations vs wild-type variants). A backward elimination with a threshold of P = .1 was performed to select variables in the final model except for TNM stage and tumor location, for which all categories were forced into the model. For cases with missing data in a covariate, we carried out 2 separate analyses: the first analysis included all patients, with creation of a categorical indicator for missing responses (missing indicator method); the second analysis included all patients, with missing responses imputed (multiple imputation [MI]). The MI procedure in SAS was used to perform 20 imputations of all variables with missing cases by using the regression method. The results from the regression analysis were then appropriately combined by using the MIANALYZE procedure.
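Combining estimates across imputed data sets is conventionally done with Rubin's rules, and the sketch below shows that pooling step for a single regression coefficient. It is an illustration in Python, not the SAS code used in the study, and the function name and input values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputations using Rubin's rules.

    estimates: point estimates, one per imputed data set.
    variances: squared standard errors, one per imputed data set.
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least 2 imputations")
    pooled = mean(estimates)            # average of the point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical coefficients from 3 imputations (the study used 20).
est, se = pool_rubin([0.10, 0.20, 0.30], [0.01, 0.01, 0.01])
print(est, se)
```

The pooled standard error exceeds the average within-imputation standard error whenever the estimates differ across imputations, which is how the extra uncertainty from the missing data enters the final inference.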
The total node count showed a skewed (gamma-Poisson–like) distribution (Figure 2): range, 0 to 54; mean, 12.0; median, 10; and interquartile range, 6 to 16 nodes. The negative node count also showed a skewed (gamma-Poisson–like) distribution: range, 0 to 54; mean, 10.5; median, 8; and interquartile range, 4 to 15 nodes.
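The gamma-Poisson (negative binomial) form referred to here can be written out explicitly. The symbols below (Y for the node count, μ for its mean, r for the gamma shape parameter) are notation introduced for illustration and do not appear in the original article.

```latex
\begin{aligned}
Y \mid \lambda &\sim \operatorname{Poisson}(\lambda), \qquad
\lambda \sim \operatorname{Gamma}\!\left(\text{shape } r,\ \text{scale } \mu / r\right), \\[4pt]
\Pr(Y = y) &= \frac{\Gamma(y + r)}{y!\,\Gamma(r)}
  \left(\frac{r}{r + \mu}\right)^{r}
  \left(\frac{\mu}{r + \mu}\right)^{y}, \qquad y = 0, 1, 2, \ldots, \\[4pt]
\operatorname{E}[Y] &= \mu, \qquad
\operatorname{Var}(Y) = \mu + \frac{\mu^{2}}{r}.
\end{aligned}
```

Because the variance exceeds the mean by the term μ²/r, this model can accommodate counts that are more dispersed than a Poisson model with the same mean would allow; as r grows large, the distribution approaches the Poisson.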
Tables 1 and 2 show the clinical, pathological, and molecular features of colorectal cancers according to quartiles of the negative or total node count. The negative and total node counts were both positively associated with specimen length, tumor size, ascending colon location, T3N0M0 stage, MSI status, and CIMP status (all P < .001).
In a multivariate negative binomial regression model, factors independently associated with the negative node count included specimen length, tumor location, TNM stage, year of diagnosis, and tumor size (all P ≤ .002) (Table 3). In Table 3, for example, specimens with rectal cancer on average yielded a negative node count of approximately two-thirds (0.67) of that in specimens with ascending colon cancer after controlling for the effects of other variables. A KRAS mutation appeared to be a predictor of the negative node count (P = .03), although multiple hypothesis testing should be considered and the finding confirmed by an independent data set.
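The ratio interpretation in this paragraph can be made concrete with a short sketch. It assumes the usual log link for negative binomial regression, under which an exponentiated coefficient is a ratio of expected counts. The 0.67 ratio is from the text; the reference count of 12 nodes and the function names are hypothetical.

```python
import math


def ratio_to_coefficient(count_ratio: float) -> float:
    """Log-scale regression coefficient corresponding to a count ratio."""
    return math.log(count_ratio)


def expected_count(reference_count: float, count_ratio: float) -> float:
    """Expected count in a category, given the reference category's count."""
    return reference_count * count_ratio


beta_rectum = ratio_to_coefficient(0.67)   # negative: fewer nodes than reference
print(round(beta_rectum, 2))

# Hypothetical: if an ascending colon specimen were expected to yield 12
# negative nodes, a comparable rectal specimen would be expected to yield
# about two-thirds of that, holding the other covariates fixed.
print(round(expected_count(12, 0.67), 2))
```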
Factors independently associated with the total node count included specimen length, tumor location, TNM stage, and tumor size after controlling for the effects of other variables (all P < .001) (Table 4). A KRAS mutation appeared to be a predictor of the total node count (P = .009), although multiple hypothesis testing should be considered.
In patients with stage I or II colorectal cancer, ascending colon tumor location, tumor size, and specimen length were positively associated with the node count after controlling for the effects of other variables (all P ≤ .002) (Table 5). Mutations of PIK3CA (P = .03) and KRAS (P = .049) also appeared to predict the node count, although multiple hypothesis testing should be considered.
We conducted this study to identify clinical, pathological, and tumor molecular variables that predict the node count in colorectal cancer resections independent of human operator factors (ie, surgeons and pathologists). We found that specimen length, tumor size, T3N0M0 stage, ascending colon tumor location, and year of diagnosis were positively associated with the negative node count. Mutation of KRAS might also be positively associated with the negative node count, but this finding should be confirmed by an independent data set. These results indicate that operator-independent variables influence the node count in colorectal resection and should be examined in any future study that assesses the adequacy of lymph node harvesting and staging.
Comprehensive assessment of clinical, pathological, and molecular features is important in cancer research.46-49 Previous studies have reported that the recovered node count is positively associated with specimen length,14,18,19,21-24 proximal colon cancer,13,18,19,25,28 larger tumor size,11,14,18,20 the number of vascular pedicles,50 and higher disease stage.17,18 However, those studies11,13,14,17-24,28 produced no data on host immune response to tumor or tumor molecular features despite the possible influence of immune reaction and tumor molecular variables on the node count.9 Previous studies that examined the relationship between MSI and the node count lacked specimen length and molecular variables besides MSI.51,52 In contrast to all previous studies that examined potential predictors of the node count,11,13,14,17-24,28 we have used a US nationwide cohort database with well-annotated clinical, specimen, pathological, and tumor molecular data, including MSI, CIMP, and KRAS, BRAF, and PIK3CA mutations, all of which are potential predictors of the node count.
With regard to the influence of medical care quality or socioeconomic status on the lymph node count,12,53 academic hospital status5,14,25 and the degree of practicing experience of the surgeons11,18 and pathologists11,13,17,18,54 have been associated with the node count (for review, see Storli et al55). However, all but three14,18,24 of those previous studies on medical care quality or socioeconomic status5,11,13,16,17,53,54 lacked data on specimen length.
Our ability to use the database of 2 US nationwide prospective cohort studies to assess operator-independent predictors of the node count provided advantages. First, cohort participants who developed cancer were treated at hospitals throughout the United States (Figure 1) and were more representative of colorectal cancer cases in the general US population than patients drawn from 1 or a few hospitals would be. Second, because of our study design, no particular surgeon or pathologist could have influenced our results, increasing the generalizability of our findings. Third, our rich molecular pathological epidemiology8,30,31 database enabled us to simultaneously assess a number of variables and to adjust for potential confounding.
One weakness of our study is that participants of our cohort studies were US health professionals and predominantly non-Hispanic white individuals, thus constituting a rather homogeneous group, and the studies lacked other occupational and ethnic groups. One of the primary reasons for selecting health professionals as subjects in the cohort studies was that they have a good understanding of various diseases as well as of the value of the cohort studies, which increases the reliability and completeness of questionnaire-based follow-up and data collection. Second, we excluded a subset of cancer cases without available tumor tissue, which might cause bias. Nonetheless, the tumor specimen procurement rate has been 60% to 70% of attempts, and a previous study has shown that there is no substantial demographic or clinical difference between cases with and without tumor tissue analyzed.32
Our ultimate goal is to determine how many nodes must be harvested to attain optimal care when devising an individualized treatment plan in each case. To achieve this goal, we need to assemble an adequate database with prospective follow-up to record detailed outcome data, preferably in a trial setting. We do not have enough data now to recommend a specific number of nodes that should be examined for optimal patient care. Nonetheless, our unique data set, which has provided strong evidence of the effects of specimen length, tumor size, tumor location, and TNM stage on the node count, will likely serve as a guide for future trials.
In conclusion, our study has shown that specimen length, tumor size and location, and TNM stage are predictors of the lymph node count in colorectal cancer resections independent of operator (surgeon and pathologist) factors. In addition, some tumor molecular features, such as KRAS mutation, might influence the node count but must be confirmed by an independent data set. Our data suggest that these clinical, pathological, specimen, and molecular variables should be examined as crucial elements in any future evaluation of the adequacy of lymph node examination for patients with colorectal cancer.
Correspondence: Shuji Ogino, MD, PhD, MS(Epidemiology), Department of Medical Oncology, Center for Molecular Oncologic Pathology, Dana-Farber Cancer Institute, Brigham and Women's Hospital, 450 Brookline Ave, Room JF-215C, Boston, MA 02215 (firstname.lastname@example.org).
Accepted for Publication: January 30, 2012.
Published Online: April 16, 2012. doi:10.1001/archsurg.2012.353
Author Contributions: Drs Morikawa, Tanaka, Kuchiba, and Nosho contributed equally. Drs Tanaka and Ogino had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Swanson, Schrag, Fuchs, and Ogino. Acquisition of data: Morikawa, Nosho, Yamauchi, Chan, Schrag, Fuchs, and Ogino. Analysis and interpretation of data: Morikawa, Tanaka, Kuchiba, Hornick, Swanson, Chan, Meyerhardt, Huttenhower, Schrag, Fuchs, and Ogino. Drafting of the manuscript: Morikawa, Tanaka, Nosho, Yamauchi, Schrag, Fuchs, and Ogino. Critical revision of the manuscript for important intellectual content: Morikawa, Kuchiba, Hornick, Swanson, Chan, Meyerhardt, Huttenhower, Schrag, Fuchs, and Ogino. Statistical analysis: Tanaka, Kuchiba, Nosho, Huttenhower, Schrag, and Fuchs. Obtained funding: Chan, Schrag, Fuchs, and Ogino. Administrative, technical, and material support: Morikawa, Fuchs, and Ogino. Study supervision: Swanson, Huttenhower, Schrag, Fuchs, and Ogino.
Financial Disclosure: None reported.
Funding/Support: This work was supported by grants P01 CA87969, P01 CA55075, P50 CA127003 (Dr Fuchs), R01 CA151993 (Dr Ogino), and R01 CA137178 (Dr Chan) from the National Institutes of Health; the Bennett Family Fund for Targeted Therapies Research; and the Entertainment Industry Foundation through the National Colorectal Cancer Research Alliance.
Disclaimer: The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
Role of the Sponsor: The funding agencies had no role in the design of the study; in the collection, analysis, or interpretation of the data; in the decision to submit the manuscript for publication; or in the writing of the manuscript.
Additional Contributions: We thank the participants and staff of the Nurses' Health Study and the Health Professionals Follow-up Study for their valuable contributions, as well as the following state cancer registries for their help: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Nebraska, New Hampshire, New Jersey, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, Tennessee, Texas, Virginia, Washington, and Wyoming.