Development and Validation of a Deep Learning Model for Brain Tumor Diagnosis and Classification Using Magnetic Resonance Imaging

This diagnostic study investigates tumor diagnosis and classification outcomes with and without use of a deep learning system based on magnetic resonance imaging data.


eMethods 1. Annotation of Tumors in Training Data
For precise localization of tumor regions, the tumors were labeled on the MRI scans under the supervision of neuroradiologists. Tumor and cyst areas were annotated together with a single binary label. Labeling was performed on the axial T2-weighted volume and the axial contrast-enhanced T1-weighted volume.

Design and training of the Deep Learning System
To detect and classify 18 different types of tumors, a two-stage deep learning system (DLS) was designed. The first stage of the DLS consisted of a segmentation network that separated tumor regions from healthy tissue, and the second stage classified the identified tumor into one of the 18 classes. The complete architecture of the DLS is presented in supplemental figure S2.
Before processing with the DLS, all available 3D MRI scans were preprocessed with intensity normalization to zero mean and unit standard deviation. The scans were then spatially normalized with bilinear interpolation to uniform axial slice dimensions of 256x256 pixels. Because the large pixel spacing along the axial direction would result in poor interpolation quality, the images were not interpolated along that direction and the original number of axial slices was retained. In addition, when data from multiple MRI sequences were used, no head alignment, skull stripping, or registration to a standard brain template was performed, both to avoid introducing potential registration errors and to reduce processing time.
The first stage of the DLS was designed as a modified 2D U-Net1 architecture that performed binary segmentation of tumor regions from the 2D axial slices of the preprocessed MRI sequences. The complete 3D tumor segmentation was obtained by concatenating the 2D predictions from all axial slices. As presented in supplemental figure S2, the modified U-Net architecture used in this work consisted of 4 downsampling and 4 corresponding upsampling convolution blocks. This stage of the DLS was trained with the 30% of the training data (N = 11,716) for which the tumor regions were manually annotated. Of these data, 80% were used for segmentation network training and the remaining 20% were reserved for internal testing of the segmentation network. The segmentation network was trained with the Adam optimizer for 100 epochs using the Dice loss function and an initial learning rate of 10⁻³. The learning rate was halved if the Dice loss on the internal validation set (20% of the segmentation training set) did not decrease for 10 consecutive epochs. The model with the best loss on the internal segmentation validation set was selected as the final segmentation model.
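The preprocessing described above (per-scan intensity normalization followed by in-plane bilinear resampling to 256x256 while keeping the original slice count) can be sketched as follows. This is an illustrative reimplementation, not the authors' code; the function name and use of scipy are assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_volume(volume: np.ndarray) -> np.ndarray:
    """Normalize a 3D MRI volume of shape (slices, H, W) to zero mean and
    unit standard deviation, then resample each axial slice to 256x256 with
    bilinear interpolation. The axial direction is not interpolated, so the
    number of slices is left unchanged."""
    volume = volume.astype(np.float32)
    volume = (volume - volume.mean()) / (volume.std() + 1e-8)
    n_slices, h, w = volume.shape
    # zoom factor 1.0 along the axial axis; order=1 gives bilinear in-plane
    return zoom(volume, (1.0, 256.0 / h, 256.0 / w), order=1)
```

Normalizing before resampling keeps the interpolation in a well-scaled intensity range; the small epsilon guards against division by zero on constant volumes.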
Stage two of the DLS, the classification network, was designed with 5 modified DenseNet2 blocks and classified tumors into 18 classes using the 3D MRI scans together with the segmentation network's output. The classification network accepted 3D MRI scans of size 24x256x256 and produced the class probability of each tumor class as output. When the number of axial slices in the MRI data was less than 24, the input scans were zero-padded to 24 slices; for patients with more than 24 slices, a 24-slice window with the tumor region at its center was analyzed. The classification network was trained on the entire training set with the Adam optimizer for 100 epochs using the cross-entropy loss function and an initial learning rate of 10⁻³. As in the segmentation network training, the learning rate was halved if the loss on the internal validation set (20% of the complete training set) did not decrease for 10 consecutive epochs.
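The fixed 24-slice input described above (zero-padding short volumes, and cropping longer ones to a window centered on the tumor) can be sketched as below. The paper does not state how the tumor center is computed; taking the mean axial index of the segmentation-mask voxels is an assumption of this sketch.

```python
import numpy as np

def fix_slice_count(volume: np.ndarray, mask: np.ndarray,
                    n_target: int = 24) -> np.ndarray:
    """Pad or crop a (slices, H, W) volume to exactly n_target axial slices.
    Shorter volumes are zero-padded; longer ones are cropped to a window
    centered (assumption) on the mean tumor slice from the binary mask."""
    n = volume.shape[0]
    if n < n_target:
        return np.pad(volume, ((0, n_target - n), (0, 0), (0, 0)))
    if n > n_target:
        tumor_slices = np.where(mask.any(axis=(1, 2)))[0]
        center = int(tumor_slices.mean()) if tumor_slices.size else n // 2
        # clamp the window so it stays inside the volume
        start = min(max(center - n_target // 2, 0), n - n_target)
        return volume[start:start + n_target]
    return volume
```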
The model with the best loss on the internal validation set was selected as the final classification model. Moreover, in our dataset, as is typical of real-world clinical data, not all MRI series were available for every patient. To accommodate this real-world scenario, we used the available series from all patients to train a total of 6 architecturally identical models that accepted either a single MRI sequence (T1WI, T2WI, or T1C) or a combination of 2 MRI sequences stacked as input (T1WI & T2WI, T1WI & T1C, T2WI & T1C). At test time, for patients with only one available series, the class predicted by the corresponding single-series model was taken as the final prediction. For patients with two available series, the majority class predicted by the corresponding 3 models (2 single-series models and 1 combined-series model) was taken as the final prediction. When all three series were available, majority voting was performed over the predictions of all 3 combined-series models to determine the final predicted class.
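The majority-vote combination of per-model predictions described above reduces to a simple mode over class labels, as in the sketch below. The dictionary keys and tie-breaking behavior (first-seen label wins on a tie, via `Counter` ordering) are assumptions, since the paper does not specify a tie-breaking rule.

```python
from collections import Counter

def final_prediction(preds_by_model: dict) -> str:
    """Return the majority class across model predictions.
    Keys are model identifiers (e.g. 'T1WI', 'T2WI', 'T1WI+T2WI' -- names
    assumed for illustration); values are predicted class labels."""
    votes = Counter(preds_by_model.values())
    # most_common(1) returns the label with the highest vote count;
    # ties fall back to first-encountered order
    return votes.most_common(1)[0][0]
```

For a two-series patient this would be called with the 2 single-series predictions plus the 1 combined-series prediction; for a three-series patient, with the 3 combined-series predictions.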

eFigure 2. Structure of Deep Learning System Used for Brain Tumor Detection and Diagnosis
The structure of the DLS, including the segmentation and classification networks. Each encoder block contains one or more convolution steps followed by max-pooling and downsampling operations. Each time the feature maps are downsampled, the number of output channels is increased. Each decoder block comprises one deconvolution (transpose convolution) operation that upsamples the feature maps and correspondingly reduces the number of output channels.