Samuel G. Armato, Karen Drukker, Feng Li, Lubomir Hadjiiski, Georgia D. Tourassi, Justin S. Kirby, Laurence P. Clarke, Roger M. Engelmann, Maryellen L. Giger, George Redmond, Keyvan Farahani
Journal of Medical Imaging, Vol. 3, Issue 04, 044506 (December 2016) https://doi.org/10.1117/1.JMI.3.4.044506
TOPICS: Lung, Computed tomography, Calibration, Cancer, Medical imaging, Medical research, Algorithm development, Diagnostics, Computer aided diagnosis and therapy, Computer aided design
The purpose of this work is to describe the LUNGx Challenge for the computerized classification of lung nodules on diagnostic computed tomography (CT) scans as benign or malignant and to report the performance of participants’ computerized methods along with that of six radiologists who participated in an observer study performing the same Challenge task on the same dataset. The Challenge provided sets of calibration and testing scans, established a performance assessment process, and created an infrastructure for case dissemination and result submission. Ten groups applied their own methods to 73 lung nodules (37 benign and 36 malignant) that were selected to achieve approximate size matching between the two cohorts. Area under the receiver operating characteristic curve (AUC) values for these methods ranged from 0.50 to 0.68; only three methods performed statistically significantly better than random guessing. The radiologists’ AUC values ranged from 0.70 to 0.85; three radiologists performed statistically significantly better than the best-performing computer method. The LUNGx Challenge thus compared the performance of computerized methods in the task of differentiating benign from malignant lung nodules on CT scans and placed that performance in the context of radiologists performing the same task. The continued public availability of the Challenge cases will provide a valuable resource for the medical imaging research community.
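To make the AUC figure of merit concrete, the following is a minimal sketch of how an AUC value can be computed from per-nodule malignancy scores. It uses the Mann-Whitney formulation, in which AUC is the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one, so that 0.50 corresponds to random guessing. The function name `auc_mann_whitney` and the scores are hypothetical and chosen for illustration only; they are not Challenge data, and this is not necessarily the estimation procedure used in the paper.

```python
def auc_mann_whitney(benign_scores, malignant_scores):
    """Empirical AUC: fraction of (malignant, benign) pairs ranked correctly.

    A pair counts as 1 when the malignant score is higher, and as 0.5
    when the two scores are tied.
    """
    if not benign_scores or not malignant_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical malignancy scores (higher = more suspicious); NOT Challenge data.
benign = [0.1, 0.4, 0.35, 0.8]
malignant = [0.9, 0.65, 0.4, 0.7]

toy_auc = auc_mann_whitney(benign, malignant)
print(f"toy AUC = {toy_auc:.3f}")
```

With 37 benign and 36 malignant nodules, as in the Challenge test set, the same function would compare 37 × 36 = 1332 pairs. Testing whether an AUC differs from 0.50, or from another reader's AUC, requires an additional statistical procedure that this sketch does not cover.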