
AI Tools Can Improve Accuracy in Chest Radiograph Reporting


AI-PREFILLED structured reporting (AI-SR) improves diagnostic accuracy in bedside chest radiograph reporting, a 2026 prospective study has found.

Compared with free-text reporting, structured reporting (SR) enhanced efficiency by directing visual attention to the image, while AI-SR additionally improved diagnostic accuracy.

Emergence of AI Tools in Radiology

Radiology has recently seen the introduction of structured reporting (SR) and AI tools.

SR can improve standardisation, reporting completeness, clinical decision-making, and information extraction. However, it relies on rigid templates, can direct the radiologist’s focus to the reporting system over the diagnostic task, and might fail to capture the nuances of complex findings.

AI tools can highlight overlooked findings and reduce workload but face practical challenges and raise concerns surrounding automation bias and overreliance among less experienced radiologists.

Impact of AI Tools on Diagnostic Accuracy and Efficiency

In the prospective, comparative reader study, eight readers analysed 35 bedside chest radiographs: four novice readers (resident radiologists in training and pregraduate medical students) and four non-novice readers (resident radiologists in training).

There was no evidence of a difference in diagnostic accuracy between free-text reporting and SR, but AI-SR improved diagnostic accuracy for all readers. Novice readers benefited most: AI assistance aligned their diagnostic accuracy with that of non-novice readers.

AI-SR also improved diagnostic efficiency. The mean reporting time per chest radiograph decreased from 88.1 ± 38.4 seconds with free-text reporting to 37.3 ± 18.2 seconds with SR and 25.0 ± 8.8 seconds with AI-SR.

Whilst novice readers showed large, incremental efficiency gains with SR and again with AI-SR compared with free-text reporting, there was no evidence of additional efficiency gains with AI-SR over SR for non-novice readers.

Methodological Considerations

A key limitation of the study is its small sample size: only 35 radiographs were analysed by eight readers.

All participants completed the three reading sessions in a fixed order using the same radiographs. Whilst residual learning and order effects cannot be excluded, the two-week interval between sessions and the absence of feedback were designed to mitigate them. Notably, diagnostic accuracy improved only with the introduction of AI tools in the final session, which argues against a pure learning effect.

The focus on chest radiographs and methodological differences across emerging studies (such as the use of dictation in reporting) limit generalisability.

The study also highlighted algorithmic aversion: radiologists may override even correct AI suggestions when these are framed as experimental. Six of the eight readers reported low trust in the AI model.

Clinical Implications

SR and AI-SR are promising developments but remain contentious.

Researchers have called for the assessment of several factors beyond algorithmic accuracy, such as AI output timing, differences between experimental and approved AI tools, and user interface design. The study recommended systematic assessment of these factors to optimise human-AI collaboration in radiology.

References

Khoobi M et al. Effect of reporting mode and clinical experience on radiologists’ gaze and image analysis behaviour at chest radiography. Radiology. 2026;318(2). DOI: 10.1148/radiol.251348.

Gaube S et al. Do as AI say: susceptibility in deployment of clinical decision-aids. NPJ Digit Med. 2021;4(1):31.

