Radiology Lung Cancer Risk Model Using AI - EMJ

Machine Learning Supports Radiology Lung Cancer Screening

Radiology-based lung cancer screening may be improved by a machine learning-driven risk prediction model that shows slightly better discriminative performance than a traditional statistical approach, according to data from a large prospective cohort in China.

Rationale for Radiology Focused Risk Prediction 

Lung cancer screening relies heavily on radiological imaging, particularly low-dose CT, so accurate pre-screening risk stratification is essential for optimising radiology resources and identifying the individuals most likely to benefit. Risk prediction models applied before imaging can help refine eligibility criteria and improve screening efficiency. Despite this, little research has explored machine learning-based risk models in China, where population-specific risk factors may influence radiology screening outcomes. This study evaluated whether a machine learning algorithm could improve lung cancer risk prediction compared with conventional logistic regression.

Study Design and Model Construction 

Investigators analysed data from 11,708 participants enrolled in a prospective cohort within the Guangzhou Lung-Care Project Program. Using stratified random sampling, the dataset was divided into a training set comprising 70% of participants and a validation set comprising 30%. Key predictive variables were selected using least absolute shrinkage and selection operator (LASSO) regression. Two lung cancer risk prediction models were then developed in the training set, one using logistic regression and the other using the extreme gradient boosting algorithm XGBoost. Model performance was assessed in the validation set using the area under the curve (AUC) as a measure of discriminative ability.
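The stratified 70/30 split described above can be illustrated with a minimal sketch. It assumes stratification on the outcome label alone, which the article does not specify. The function name `stratified_split`, the seed, and the toy cohort are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into training and validation sets, keeping the
    proportion of each label roughly equal in both sets."""
    rng = random.Random(seed)

    # Group participant indices by outcome label (the stratum).
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, valid_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)  # random sampling within each stratum
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        valid_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(valid_idx)


# Hypothetical toy cohort: 1,000 participants, 30 of them with lung cancer (label 1).
toy_labels = [1] * 30 + [0] * 970
train_idx, valid_idx = stratified_split(toy_labels)

print(len(train_idx), len(valid_idx))
print(sum(toy_labels[i] for i in train_idx), sum(toy_labels[i] for i in valid_idx))
```

Splitting within each stratum keeps the rare outcome represented in both sets, which matters when cases are a small fraction of the cohort.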

Performance Relevant to Radiology Screening 

In the validation set, the logistic regression-based lung cancer risk prediction model achieved an area under the curve (AUC) of 0.647 (95% CI: 0.574–0.720). The XGBoost model showed slightly higher discrimination, with an AUC of 0.658 (95% CI: 0.589–0.727). The absolute difference was modest (0.011) and the two confidence intervals overlap substantially, so the gain should be interpreted cautiously. The investigators nonetheless reported that the machine learning model showed greater robustness and predictive accuracy, suggesting potential value when integrated into radiology screening pathways to guide referral for imaging.
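To make the metric concrete, the AUC can be read as the probability that a randomly chosen participant with lung cancer receives a higher predicted risk than a randomly chosen participant without it. The sketch below computes it that way and checks whether the two reported confidence intervals overlap. The helper names and the six toy scores are hypothetical; only the interval bounds come from the article.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs count as one half."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def intervals_overlap(a, b):
    """True if closed intervals a=(low, high) and b=(low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical toy predictions, not study data.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)

# 95% confidence intervals as reported in the article.
logistic_ci = (0.574, 0.720)
xgboost_ci = (0.589, 0.727)
overlap = intervals_overlap(logistic_ci, xgboost_ci)

print(round(toy_auc, 3), overlap)
```

The pairwise loop is quadratic in sample size and is meant only to show the definition; a rank-based computation would be the usual choice for a cohort of this size. Overlapping intervals are also not a formal test of the difference between two AUCs measured on the same participants.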

An additional finding of clinical relevance was the identification of childhood exposure to cooking fuels as an important risk factor for lung cancer. This variable has rarely been included in previous models and may be particularly pertinent in populations where early-life exposure to solid fuels is common, with implications for how long-term lung cancer risk is assessed before radiological screening.

Implications for Radiology Practice 

The findings indicate that a lung cancer risk prediction model based on the XGBoost algorithm may better support risk assessment at the screening stage than logistic regression alone. Incorporating such models into radiology screening programmes could improve selection of high-risk individuals, increase the efficiency of imaging services, and support more targeted use of low-dose CT. Further validation is needed before routine clinical adoption, but the study highlights the growing role of machine learning in radiology-driven cancer prevention strategies.

Reference 

Zhang T et al. Construction of a lung cancer screening risk prediction model based on machine learning algorithms. J Evid Based Med. 2026:e70104. 

Each article is made available under the terms of the Creative Commons Attribution-Non Commercial 4.0 License.
