Association of Peritumoral Radiomics With Tumor Biology and Pathologic Response to Preoperative Targeted Therapy for HER2 (ERBB2)–Positive Breast Cancer

Key Points
Question: Can quantitative imaging features extracted from the tumor and tumor environment on breast magnetic resonance imaging characterize tumor biological features relevant to outcome of targeted therapy?
Findings: In this diagnostic study of 209 patients, among HER2 (ERBB2)-positive breast cancers, an intratumoral and peritumoral imaging signature capable of discriminating the response-associated HER2-enriched molecular subtype was identified. When evaluated among recipients of HER2-targeted therapy, this signature was found to be associated with response to neoadjuvant chemotherapy.
Meaning: Quantitative analysis of the tumor and its surroundings may provide valuable cues into breast cancer biological features and likelihood of response to targeted therapy.


eMethods. Details on Design

Annotation Protocol
Lesions were annotated by 7 readers: 6 board-certified radiologists with 6 to 29 years of experience in radiology (DP, BNB, PT, ME, DDBB, KG) and 1 medical doctor with extensive experience in medical imaging (KB). All readers with fewer than 5 years as a practicing radiologist (ME, DDBB, KG, KB) performed annotations in consensus with a senior radiologist (DP, 25 years practicing; BNB, 11 years practicing; PL, 7 years practicing). Lesions were annotated on 3 adjacent slices of the DCE-MRI scan, chosen to maximize tumor size while avoiding biopsy markers and artifacts. Radiologists were instructed to annotate the outer boundary of tumor enhancement, including spiculations and internal non-enhancing regions surrounded by tumor (e.g., necrotic core).

Additional MRI Acquisition Details for the Discovery Cohort
One or more 3D fat-saturated T1-weighted images were acquired in the axial plane following contrast agent injection using a 1.5T (n=37) or 3.0T (n=5) magnet. Pixel dimensions in the axial plane ranged from 0.50 mm × 0.50 mm to 1.06 mm × 1.06 mm, with a mean of 0.70 mm × 0.70 mm (± 0.14 mm). Mean slice thickness was 1.93 mm ± 0.40 mm (range: 1-3 mm). Patients were injected with a mean of 15.5 mL (range: 8-36 mL) of gadolinium-based contrast agent (Magnevist, Multihance, Gadavist, Optimark, or Prohance). Contrast agent dose was unavailable for 5 patients and contrast agent brand was unavailable for 2.

Expanded Radiomic Descriptor Information
Laws. (1) The 25 2-D Laws filters are derived by computing the outer product of pairs of five 1-D filter vectors, each designed to capture a specific textural pattern within an image. Each filter vector spans five pixels and is denoted "P5," where P is a letter representing the textural pattern captured by the filter. To obtain a feature vector, each filter is convolved with the image and the absolute values of the filter response at all voxels within a region of interest are concatenated. Features are named by the combination of filters applied along the y and x axes, e.g. L5E5 is the product of a level detection filter along the y axis and an edge detection filter along the x axis.
Gabor. (2) 2-D Gabor filters are computed by modulating a Gaussian kernel function with one of 48 sinusoidal plane waves. Each sinusoidal plane wave corresponds to a unique combination of one of six spatial wavelengths (2 pixels, 4 pixels, 8 pixels, 12 pixels, 16 pixels, 32 pixels, 64 pixels) and one of eight orientations (0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, 157.5°). Each Gabor filter is then convolved with the original image and values corresponding to filter response within the region of interest are concatenated.
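The two filter banks described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation. The five 1-D Laws vectors are the commonly published values (they are not listed in the text above), the Gabor envelope width `sigma` defaults to half the wavelength purely as an assumption, zero padding at the image border is also an assumption, and the function names `laws_kernels`, `gabor_kernel`, and `filter_response` are hypothetical.

```python
import numpy as np

# Commonly published 1-D Laws vectors of length 5:
# level, edge, spot, wave, ripple (not listed in the source text).
LAWS_1D = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),
}


def laws_kernels():
    """Return the 25 2-D Laws kernels keyed by name, e.g. 'L5E5'."""
    # Outer product: first vector acts along y (rows), second along x (columns).
    return {
        y_name + x_name: np.outer(y_vec, x_vec)
        for y_name, y_vec in LAWS_1D.items()
        for x_name, x_vec in LAWS_1D.items()
    }


def gabor_kernel(wavelength, theta_deg, sigma=None):
    """Real-valued Gabor kernel: Gaussian envelope times a cosine plane wave."""
    sigma = 0.5 * wavelength if sigma is None else sigma  # assumed bandwidth
    half = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(theta_deg)
    # Coordinate along the direction of wave propagation.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * x_rot / wavelength)


def filter_response(image, kernel, roi_mask, absolute=False):
    """Convolve image with an odd-sized kernel and return values inside the ROI."""
    kh, kw = kernel.shape
    padded = np.pad(image.astype(float), ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    # Flipping the kernel turns the windowed correlation into a convolution.
    response = np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])
    if absolute:
        response = np.abs(response)
    return response[roi_mask]
```

For example, `filter_response(image, laws_kernels()["L5E5"], roi_mask, absolute=True)` would give the L5E5 feature vector for one region. A Gabor bank would be built by calling `gabor_kernel` once per wavelength and orientation pair.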
Haralick GLCM. (3) Intensity values within the image were discretized into 64 bins from 0 to 63. Gray level co-occurrence matrices (GLCMs) were computed within a sliding window of 5 pixels by 5 pixels. Intensity values outside the region of interest were ignored when computing GLCM statistics. The following GLCM descriptors were computed, as described in (3): entropy, energy, inertia, inverse difference moment, correlation, information measure of correlation 1, information measure of correlation 2, sum average, sum variance, sum entropy, difference average, difference variance, and difference entropy. GLCM statistics were concatenated within regions of interest, yielding 13 descriptor vectors per region.
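As a rough illustration of the windowed GLCM computation described above, the sketch below discretizes an image into 64 gray levels, builds a co-occurrence matrix in a 5 × 5 window around each region-of-interest pixel while skipping pixels outside the region, and evaluates three of the 13 statistics. Several choices are assumptions not stated in the text: the single pixel offset of one step to the right, the symmetric matrix, min-max rescaling over the whole image, and the base-2 logarithm. The function names are hypothetical.

```python
import numpy as np


def discretize(image, n_bins=64):
    """Linearly rescale intensities to integer gray levels 0..n_bins-1."""
    img = image.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros(img.shape, dtype=int)
    scaled = (img - lo) / (hi - lo) * n_bins
    return np.minimum(scaled.astype(int), n_bins - 1)


def glcm(window, mask, n_bins=64, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix for one pixel offset.

    Pairs in which either pixel lies outside the mask are skipped.
    """
    dy, dx = offset
    h, w = window.shape
    counts = np.zeros((n_bins, n_bins))
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if not (0 <= y2 < h and 0 <= x2 < w):
                continue
            if mask[y, x] and mask[y2, x2]:
                counts[window[y, x], window[y2, x2]] += 1
                counts[window[y2, x2], window[y, x]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def glcm_stats(p):
    """Three of the 13 descriptors: entropy, energy, inertia."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "entropy": float(-(nonzero * np.log2(nonzero)).sum()),
        "energy": float((p ** 2).sum()),
        "inertia": float(((i - j) ** 2 * p).sum()),
    }


def glcm_feature_vector(image, roi_mask, stat="entropy", win=5, n_bins=64):
    """One GLCM statistic per ROI pixel, from a win x win window around it."""
    levels = discretize(image, n_bins)
    half = win // 2
    values = []
    for y, x in zip(*np.nonzero(roi_mask)):
        ys = slice(max(y - half, 0), y + half + 1)
        xs = slice(max(x - half, 0), x + half + 1)
        p = glcm(levels[ys, xs], roi_mask[ys, xs], n_bins)
        values.append(glcm_stats(p)[stat])
    return np.array(values)
```

Calling `glcm_feature_vector` once per statistic gives one descriptor vector per statistic for a region, which mirrors the concatenation step in the text.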
Co-occurrence of local anisotropy gradients (CoLlAGe). (4) An image's intensity gradients in the x and y directions, Fx and Fy, are computed. Within a sliding 5 pixel square window, the dominant intensity gradient orientation (between 0° and 360°) is computed via principal component analysis. The result is a 2D array of the same size as the original image, in which each element holds the dominant gradient orientation of the window centered at the corresponding pixel. Metrics of the co-occurrence matrix are then applied to this gradient orientation image in the same manner as described above for Haralick GLCM features. The 13 CoLlAGe descriptors are therefore the same 13 co-occurrence metrics (entropy, energy, inertia, inverse difference moment, correlation, information measure of correlation 1, information measure of correlation 2, sum average, sum variance, sum entropy, difference average, difference variance, and difference entropy) computed from the gradient orientation image, and they indicate the homogeneity of intensity gradient directionality within an image.
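The dominant-orientation step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the published CoLlAGe code: the principal axis is taken from an uncentered singular value decomposition of the stacked gradient vectors in each window, and because a principal axis has no inherent sign, the sketch points it along the mean gradient of the window to obtain an angle in the 0° to 360° range. The function name `dominant_orientation_map` is hypothetical.

```python
import numpy as np


def dominant_orientation_map(image, win=5):
    """Dominant gradient orientation in degrees (0-360) for each pixel's window."""
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    fy, fx = np.gradient(image.astype(float))
    half = win // 2
    h, w = image.shape
    theta = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            ys = slice(max(y - half, 0), y + half + 1)
            xs = slice(max(x - half, 0), x + half + 1)
            # One row per pixel in the window: (Fx, Fy).
            g = np.column_stack([fx[ys, xs].ravel(), fy[ys, xs].ravel()])
            # First right singular vector = principal axis of the gradients.
            _, _, vt = np.linalg.svd(g, full_matrices=False)
            direction = vt[0]
            # Sign is arbitrary; align with the mean gradient (assumption).
            if direction @ g.mean(axis=0) < 0:
                direction = -direction
            angle = np.degrees(np.arctan2(direction[1], direction[0]))
            theta[y, x] = angle % 360
    return theta
```

The resulting orientation map would then be discretized and summarized with the 13 co-occurrence statistics, as the text describes.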

Permutation Testing for ROC Curve Significance
A permutation testing framework was chosen to determine AUC significance because it applies in both a cross-validation and a testing setting. For each individual model, permutation testing (5,6) was performed with Monte Carlo sampling to assess whether model performance offered significant improvement over a random model. Each simulation consisted of 50,000 iterations.
Testing: The locked-down HER2-E classifier was applied to the validation cohort to obtain the predicted probability of response. A test statistic for the ROC curve was computed using the posterior probabilities from the classifier and the ground truth response labels. Next, across 50,000 iterations, the ground truth pCR labels were randomly permuted, and a second test statistic was computed from the original posterior probabilities and the randomly permuted labels. The proportion of runs in which the test statistic for the randomly permuted ROC curve was greater than the true test statistic yields the p-value (5). 95% confidence intervals span twice the standard deviation in both directions from the mean AUC using non-permuted labels, assuming a normal distribution.

Cross-Validation: For each iteration, DLDA classifiers were trained on both the original data and a data set with permuted class labels in a three-fold cross-validation setting. Posterior probabilities from the classifiers trained using the original and the permuted labels were compiled for all patients across the three validation folds. Test statistics were computed for the ROC curve of each classifier. The p-value is the proportion of runs in which the test statistic of the permuted classifier was greater than that of a classifier trained and tested using the ground truth (5). 95% confidence intervals are computed using the empirical probability distribution of the test statistic across all permutations and its variance (6).
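The label-permutation procedure for the testing setting can be sketched as below. The sketch assumes the ROC test statistic is the AUC itself, which the text does not state explicitly, and the function names are hypothetical. Because the classifier scores stay fixed while only the labels are shuffled, the ranks are computed once and reused in every iteration.

```python
import numpy as np


def average_ranks(scores):
    """1-based ranks of the scores, with tied values sharing their average rank."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    start = 0
    while start < len(scores):
        stop = start
        while stop + 1 < len(scores) and sorted_scores[stop + 1] == sorted_scores[start]:
            stop += 1
        ranks[order[start:stop + 1]] = 0.5 * (start + stop) + 1
        start = stop + 1
    return ranks


def auc_from_ranks(ranks, labels):
    """AUC from the Mann-Whitney U statistic of the positive class."""
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def permutation_p_value(labels, scores, n_iter=50_000, seed=0):
    """Return (observed AUC, p-value) from random permutations of the labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    ranks = average_ranks(scores)
    observed = auc_from_ranks(ranks, labels)
    exceed = 0
    for _ in range(n_iter):
        # Shuffle the labels while keeping the classifier scores fixed.
        if auc_from_ranks(ranks, rng.permutation(labels)) > observed:
            exceed += 1
    return observed, exceed / n_iter
```

The p-value here is the fraction of permutations whose AUC strictly exceeds the observed AUC, following the wording of the text. The cross-validation variant differs in that a classifier is retrained on the permuted labels in each iteration.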

Automated Lymphocyte Detection Model
A previously developed automated deep learning-based nuclei and lymphocyte detection model (7)

AUC, area under the receiver operating characteristic curve; GLCM, gray level co-occurrence matrix features; CoLlAGe, co-occurrence of local anisotropic gradient orientation features; px, pixels; w, width; θ, orientation.

eTable 3. Repeated feature selection experiments for HER2+ molecular subtyping across all intra- and peri-tumoral regions using alternative, non-parametric feature pruning approaches. Feature selection was repeated across the combined intra-tumoral and peri-tumoral feature pool using the following feature pruning approaches: (a) eliminating features by Spearman correlation > .6 and (b) eliminating features using an Elastic Net penalized regression with equal contributions by L1 and L2 regularization (11,12). Features listed in red were not identified in a given location in any feature discovery experiment. Features listed in orange were not included in the original combined intra- + peri-tumoral top feature set, but were identified as top features within individual annular regions. Repeating classification within the discovery cohort resulted in similar AUCs to feature pruning by Pearson correlation. Regardless of feature pruning approach, AUC remained strongly significant (p<.001).
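Pruning approach (a) can be illustrated with a short sketch. The caption does not say how a correlated pair is resolved, so the greedy rule used here (scan features in column order and keep a feature only if its absolute Spearman correlation with every feature already kept is at most the threshold) is an assumption, as is the use of the absolute value. The function names are hypothetical.

```python
import numpy as np


def rank_columns(x):
    """Rank each column (1-based), giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty_like(x)
    for j in range(x.shape[1]):
        col = x[:, j]
        order = np.argsort(col, kind="mergesort")
        sorted_col = col[order]
        start = 0
        while start < len(col):
            stop = start
            while stop + 1 < len(col) and sorted_col[stop + 1] == sorted_col[start]:
                stop += 1
            ranks[order[start:stop + 1], j] = 0.5 * (start + stop) + 1
            start = stop + 1
    return ranks


def prune_by_spearman(x, threshold=0.6):
    """Return indices of the columns kept after greedy correlation pruning."""
    # Spearman correlation is the Pearson correlation of the ranks.
    rho = np.corrcoef(rank_columns(x), rowvar=False)
    kept = []
    for j in range(rho.shape[0]):
        if all(abs(rho[j, k]) <= threshold for k in kept):
            kept.append(j)
    return kept
```

With a patients-by-features matrix `x`, `prune_by_spearman(x)` returns the column indices that would be passed on to feature selection under these assumptions.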

Abbreviations: ER, estrogen receptor; PR, progesterone receptor.
a Assessed by unpaired two-sided Student's t-test.
b Assessed by Pearson's chi-squared test.

eFigure 5. Correlation of HER2-E-associated feature sets with lymphocyte density within tumor and peripheral tissue on pre-treatment biopsy by peri-tumoral distance. a) HER2-E-associated features 0-3 mm from the tumor are significantly associated with TIL density (n=27). b) For patients whose biopsy contained peri-tumoral tissue (n=13), correlation between radiomic features and peripheral lymphocytic density was stronger with distance from the tumor, but was not significant.