Research Brief

Machine Learning Algorithm Outperforms Pathologists in Breast Cancer Metastases Detection Challenge

Dutch researchers pit human against computer in an evaluation of algorithms for detecting breast cancer metastases in tissue sections

Meeri Kim, Contributor
Wednesday, December 27, 2017


A crucial part of breast cancer staging is determining the extent of lymph node metastases in tissue sections, which then informs clinical management of the patient. Although the task typically falls into the hands of pathologists, a recent study pitted human against computer to test whether a machine learning method could do a better job.

Dutch researchers organized a competition, the Cancer Metastases in Lymph Nodes Challenge 2016 (CAMELYON16), inviting research groups around the world to create an automated solution for breast cancer metastases detection. In the end, the best algorithm outperformed a panel of 11 pathologists who worked under time constraints in a simulated diagnostic setting. The results were published online by JAMA in December 2017.

For CAMELYON16, participants first trained their algorithms using 270 whole-slide images of digitally scanned tissue sections, fewer than half of which had metastases. Then, the researchers evaluated the performance of the algorithms with a second set of 129 whole-slide images.

The final submissions included 32 algorithms, most of which were based on deep learning, a branch of machine learning inspired by the neural networks of the brain. These deep learning-based algorithms performed better than other submissions as ranked by two criteria: the ability to identify specific metastatic foci in an image and the ability to discriminate between images with metastases and those without.
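
The second criterion, telling slides with metastases apart from those without, is commonly summarized as the area under the ROC curve (AUC). The article does not spell out the metric, so the Python sketch below is an illustration under that assumption, not the challenge's own evaluation code. The function name `roc_auc` and the labels and scores are made up for the example.

```python
def roc_auc(labels, scores):
    """Return the probability that a randomly chosen positive slide
    receives a higher score than a randomly chosen negative slide.
    Tied scores count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = slide contains metastases, 0 = it does not.
labels = [1, 0, 1, 0, 0, 1]
# Hypothetical per-slide scores from an algorithm (higher = more suspicious).
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 would mean every slide with metastases was scored above every slide without, while 0.5 would be no better than chance.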

The top five algorithms performed comparably to a pathologist interpreting the slides without time constraints. The best algorithm, however, did slightly better than the panel of pathologists, who were limited to a two-hour session that mimicked routine workflow. Although the results show promise, the next step will be a rigorous evaluation of the algorithms in a true clinical setting.