Machine learning algorithms have made strides in classifying metastases in tissue histopathology images. However, a major roadblock for these algorithms is the variation in staining between tissue samples prepared and scanned at different laboratories. Stain normalization standardizes the color and intensity of stained image patches to those of a reference image, bringing stainings from different laboratories into a similar domain. We compare stain normalization methods in conjunction with color augmentation to evaluate how combinations of these techniques perform on binary classification of metastatic tissue in lymph node slides. We report the accuracy, precision, recall, F1 score, and AUROC of a convolutional neural network (CNN) trained and tested on images normalized with the Macenko and Vahadane methods. Six configurations combining color augmentation and stain normalization were analyzed on the PatchCamelyon (PCAM) dataset, which consists of over 300K images. Our analysis showed that the Macenko method of stain normalization improves model performance; data augmentation likewise shows a general improvement, as the increased diversity in the data counters overfitting. Relative to the baseline, accuracy with Macenko and color augmentation improved by 1.59%, and the F1 score for Macenko improved by 2.50%. The best-performing combination was color augmentation with the Macenko method.
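
For readers who want the evaluation metrics spelled out, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and AUROC can be computed for a binary classifier from ground-truth labels and predicted probabilities. It is not the evaluation code used in this work: the names `BinaryMetrics` and `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUROC is computed here with the pairwise (Mann-Whitney) definition, which is quadratic in the number of samples and so only suitable for small examples.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    """Container for the five metrics (hypothetical helper type)."""
    accuracy: float
    precision: float
    recall: float
    f1: float
    auroc: float


def binary_metrics(y_true: Sequence[int],
                   y_score: Sequence[float],
                   threshold: float = 0.5) -> BinaryMetrics:
    """Compute metrics from 0/1 labels and predicted probabilities.

    The 0.5 default threshold is an assumption, not taken from the paper.
    """
    if len(y_true) != len(y_score):
        raise ValueError("y_true and y_score must have the same length")

    # Threshold the scores to get hard predictions.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUROC as the probability that a random positive is scored above
    # a random negative; ties count as one half.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auroc = wins / (len(pos) * len(neg))

    return BinaryMetrics(accuracy, precision, recall, f1, auroc)


# Toy example with made-up labels and scores (not results from this study).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1, 0.7, 0.3]
print(binary_metrics(toy_labels, toy_scores))
```

On the toy data, one positive is missed and one negative is flagged, so accuracy, precision, recall, and F1 all equal 0.75, while 15 of the 16 positive-negative pairs are ranked correctly, giving an AUROC of 0.9375. For a dataset of the size of PCAM, a sort-based AUROC from an established library would be the practical choice.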