Breast cancer remains one of the most common cancers worldwide, driving continued interest in artificial intelligence tools that can support earlier and more accurate diagnosis. A new study has found that the way medical images are prepared before analysis can have a significant impact on the performance of deep learning models used in breast cancer imaging.
The research focuses on breast image segmentation, a critical task that helps identify and outline areas of concern in medical scans. While advances in deep learning architectures have received considerable attention, the researchers argue that pre-processing steps have been largely understudied despite their direct influence on model performance.
Comparing pre-processing pipelines
The study examined two widely used public datasets, CBIS-DDSM for mammography and the Duke Breast Cancer MRI dataset, allowing the team to assess performance across different imaging modalities. A U-Net deep learning model was used to evaluate how various pre-processing techniques affected segmentation accuracy.
Researchers systematically tested commonly applied methods such as pixel intensity normalisation, resizing and padding, spacing harmonisation, and orientation standardisation. These techniques were organised into two distinct pipelines. The Domain Non-Specific pipeline applied general image processing methods commonly used in natural and medical image analysis. In contrast, the Domain Specific pipeline focused on preserving anatomical information by drawing on the metadata that accompanies breast images.
A detailed comparative analysis showed that these different approaches led to noticeable variations in segmentation outcomes, underlining the importance of tailoring pre-processing to the imaging domain.
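To make the resizing and padding step concrete, here is a minimal Python sketch of one generic way to bring scans of different shapes to a fixed square size. It is not the study's code: the function names, the zero fill value, the nearest-neighbour sampling and the 256-pixel target are all assumptions chosen for illustration.

```python
import numpy as np


def pad_to_square(image: np.ndarray, fill: float = 0.0) -> np.ndarray:
    """Pad a 2D image with `fill` so that height equals width.

    The original pixels are centred, so the aspect ratio of the
    anatomy is preserved instead of being stretched.
    """
    h, w = image.shape
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    out = np.full((side, side), fill, dtype=image.dtype)
    out[top:top + h, left:left + w] = image
    return out


def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize a 2D image to (size, size) with nearest-neighbour sampling."""
    h, w = image.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return image[np.ix_(rows, cols)]


def pad_and_resize(image: np.ndarray, size: int = 256) -> np.ndarray:
    """Hypothetical helper: pad to a square, then resize to a fixed size."""
    return resize_nearest(pad_to_square(image), size)


# Example: a tall 100 x 50 image becomes a 64 x 64 image with padded borders.
example = pad_and_resize(np.ones((100, 50), dtype=np.float32), size=64)
```

Nearest-neighbour sampling is used here because it can be applied unchanged to segmentation masks without inventing new label values; for the images themselves, a smoother interpolation would often be preferred in practice.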
Statistical insights and future impact
One of the most significant findings involved pixel intensity normalisation. Statistical testing using a three-way ANOVA F-test revealed clear differences in U-Net segmentation performance depending on how pixel intensities were handled. This suggests that even subtle choices made early in the deep learning pipeline can influence final results.
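As an illustration of how such choices can differ, the Python sketch below contrasts two common ways of normalising pixel intensities. The article does not say which variants the study compared, so min-max scaling and z-score standardisation are assumptions here, as are the function names and the small `eps` guard against division by zero.

```python
import numpy as np


def min_max_normalise(image: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale intensities linearly so they fall in the range [0, 1]."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo + eps)


def z_score_normalise(image: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Shift and scale intensities to zero mean and unit standard deviation."""
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + eps)


# A toy 2 x 2 "scan" with one very bright pixel.
scan = np.array([[0, 100], [200, 4000]])
scaled = min_max_normalise(scan)
standardised = z_score_normalise(scan)
```

With min-max scaling the single bright pixel compresses the three darker values into a narrow band near zero, while z-score standardisation expresses every pixel relative to the image mean. A segmentation model therefore sees quite different inputs depending on which variant is used.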
The authors acknowledge limitations related to dataset size and scope but emphasise that their findings provide valuable guidance for future research. By identifying pre-processing strategies better suited to breast imaging, the study offers a foundation for improving the accuracy and reliability of AI-driven medical image analysis.
Reference
Catarino J et al. The impact of pre-processing techniques on deep learning breast image segmentation. Scientific Reports. 2025. https://doi.org/10.1038/s41598-025-30724-9




