Journal article
Performance comparison of four published lung cancer prediction models applied to a cohort from the National Lung Screening Trial
Translational lung cancer research, Vol.14(9), pp.3577-3588
09/2025
DOI: 10.21037/tlcr-2025-439
PMCID: PMC12541660
PMID: 41132971
Abstract
Background: Mathematical prediction models (MPMs) based on clinical and radiologist-assessed features have been developed to assist with lung cancer risk assessment for imaging-detected lung nodules. However, these MPMs were developed using different datasets, thresholds, and feature sets, making it difficult to compare their published performance metrics or to judge how stable their performance would be prospectively. This study uses a large lung cancer screening cohort with identified pulmonary nodules to compare the performance of four MPMs at a standardized sensitivity value, with the goal of reducing the false positive rate of lung cancer screening exams.
Methods: This retrospective study used lung nodules identified on low-dose computed tomography (LDCT) in the National Lung Screening Trial (NLST) to evaluate four MPMs [Mayo Clinic (MC), Veterans Affairs (VA), Peking University (PU), and Brock University (BU)]. For cross-comparison, a small NLST sub-cohort (n=270) was used to determine a calibrated decision threshold for each model, targeting a sensitivity of 95% for detecting lung cancer. Performance was evaluated using area under the receiver-operating-characteristic curve (AUC-ROC), area under the precision-recall curve (AUC-PR), sensitivity, and specificity. The calibrated thresholds were then applied to the remaining NLST cohort (n=1,083) to assess the stability of the performance metrics.
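As an illustration of the threshold calibration described in the Methods, the following Python sketch picks, for a single model, the highest score cutoff that still reaches a target sensitivity on a calibration set, then measures sensitivity and specificity at that cutoff on held-out data. The function names, the rule that a score at or above the cutoff counts as a positive call, and the toy numbers are assumptions made for illustration; the abstract does not state how the authors derived their thresholds.

```python
import math


def calibrate_threshold(scores, labels, target_sensitivity=0.95):
    """Highest cutoff whose calibration-set sensitivity meets the target.

    Assumes a nodule is called malignant when score >= cutoff (hypothetical rule).
    """
    malignant = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not malignant:
        raise ValueError("calibration set needs at least one malignant nodule")
    # Smallest number of malignant nodules that must be flagged to hit the target;
    # the small epsilon guards against floating-point round-up.
    k = max(1, math.ceil(target_sensitivity * len(malignant) - 1e-9))
    return malignant[k - 1]


def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is a positive call."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        flagged = s >= cutoff
        if y == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only (not NLST values).
cal_scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.75, 0.3]
cal_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_scores = [0.85, 0.65, 0.5, 0.4, 0.2, 0.7]
test_labels = [1, 1, 0, 0, 0, 0]

cutoff = calibrate_threshold(cal_scores, cal_labels, target_sensitivity=0.95)
sens, spec = sensitivity_specificity(test_scores, test_labels, cutoff)
print(f"cutoff={cutoff}, test sensitivity={sens:.2f}, test specificity={spec:.2f}")
```

Because the cutoff is fixed on the calibration set, the sensitivity observed on the held-out set can drift above or below the target; comparing the two is one way to read the "stability" the Methods refer to.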
Results: A total of 1,353 patients [mean ± standard deviation (SD) age, 62.3±5.2 years; 746 male] were included, of whom 122 (9.0%) had a malignant nodule. At the target sensitivity of 95%, the highest testing specificity (the proportion of benign nodules correctly identified) was seen in the BU and MC models (55% and 52%, respectively), compared to the VA (45%) and the PU (16%) models. The AUC-ROCs for BU (83%), MC (83%), PU (76%), and VA (77%) suggest moderate-to-high performance, while AUC-PR more accurately reflects that all the models have sub-optimal precision (27–33%).
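To see why precision stays low even when AUC-ROC looks reasonable, the short Python sketch below applies the standard relation between sensitivity, specificity, prevalence, and positive predictive value (precision) at a single operating point. It is illustrative only: it assumes the overall cohort prevalence of 9.0% also holds in the testing cohort, which the abstract does not state, and it computes precision at one threshold rather than the AUC-PR reported above. The function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Precision at one operating point, from Bayes' rule."""
    true_pos = sensitivity * prevalence                # flagged and malignant
    false_pos = (1 - specificity) * (1 - prevalence)   # flagged but benign
    return true_pos / (true_pos + false_pos)


# Best reported specificity (BU, 55%) at the 95% target sensitivity,
# with the assumed 9.0% prevalence.
ppv = positive_predictive_value(sensitivity=0.95, specificity=0.55, prevalence=0.09)
print(f"precision at this operating point: {ppv:.1%}")
```

Under these assumptions, roughly one in six flagged nodules would be malignant, because benign nodules outnumber malignant ones about ten to one.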
Conclusions: Tuning the calibration thresholds of existing MPMs aids in performance comparison and stability for application in the lung cancer screening setting. However, when targeting high sensitivity (95%), the achievable specificity of the MPMs is low (16–55%), which may limit clinical utility.
Details
- Title: Subtitle
- Performance comparison of four published lung cancer prediction models applied to a cohort from the National Lung Screening Trial
- Creators
- Kimberly E. Schroeder; Kevin Knoernschild; Sarah L. Averill; Richard M. Hoffman; Jessica C. Sieren
- Resource Type
- Journal article
- Publication Details
- Translational lung cancer research, Vol.14(9), pp.3577-3588
- DOI
- 10.21037/tlcr-2025-439
- PMID
- 41132971
- PMCID
- PMC12541660
- NLM abbreviation
- Transl Lung Cancer Res
- ISSN
- 2218-6751
- eISSN
- 2226-4477
- Publisher
- AME PUBLISHING COMPANY
- Number of pages
- 12
- Grant note
- National Institutes of Health: R01CA267820, T32HL144461, P30CA086862
Funding: This research was supported by the National Institutes of Health [No. R01CA267820 (to K.E.S., K.K., R.M.H., and J.C.S.), No. T32HL144461 (to K.K.), and No. P30CA086862 (to R.M.H. and J.C.S.)].
- Language
- English
- Date published
- 09/2025
- Academic Unit
- Roy J. Carver Department of Biomedical Engineering; Radiology; General Internal Medicine; Internal Medicine
- Record Identifier
- 9984969244502771