BACKGROUND AND AIMS
Inflammatory bowel disease requires close monitoring to detect flare-ups early and guide treatment.1,2 While endoscopy is the gold standard, it is invasive, costly, and uncomfortable.1,2 Intestinal ultrasound (IUS) is a non-invasive alternative with a strong correlation to endoscopic findings, especially bowel wall thickness (BWT), which is a marker of disease activity.3 Nevertheless, IUS remains limited due to operator dependence and a steep learning curve.4
AI has improved consistency in other domains (e.g., endoscopic evaluation), but its application to IUS in inflammatory bowel disease remains underexplored. Prior models lack interpretability, require manual image cropping, or are trained on ideal, selected data, limiting clinical use.5,6
The authors aimed to develop a deep-learning model that automatically identifies and outlines the bowel wall and measures BWT directly from raw clinical IUS images.7
MATERIALS AND METHODS
The authors created a training dataset of 570 images from 144 IUS videos and a testing dataset of 55 images from 55 separate exams. All images were extracted from previously performed IUS examinations and reflect real-world variation, including, for example, unclear boundaries and artefacts.
All images were annotated by International Bowel Ultrasound Group (IBUS) certified experts, who outlined the inner and outer bowel wall and made two BWT measurements.
The model combined convolutional neural networks with other image-processing algorithms.
Evaluation included BWT error against the expert mean, classification accuracy using the standard IBUS 3 mm threshold,3 and a leave-one-out comparison with individual doctors.
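The leave-one-out comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the reference for each held-out expert is taken to be the mean of the remaining experts, a BWT strictly above 3 mm is taken to count as thickened, and the function names and toy measurements are hypothetical.

```python
THRESHOLD_MM = 3.0  # IBUS cut-off from the text; "> 3 mm = thickened" is an assumption


def is_thickened(bwt_mm):
    """Binary label for one BWT measurement (assumed rule: strictly above threshold)."""
    return bwt_mm > THRESHOLD_MM


def agreement(values_mm, reference_labels):
    """Fraction of images whose thresholded label matches the reference label."""
    matches = [is_thickened(v) == ref for v, ref in zip(values_mm, reference_labels)]
    return sum(matches) / len(matches)


def leave_one_out(expert_bwt, model_bwt):
    """expert_bwt: dict mapping expert name -> per-image BWT in mm.
    model_bwt: per-image model BWT in mm.
    Returns a dict: held-out expert -> (expert accuracy, model accuracy)."""
    results = {}
    n_images = len(model_bwt)
    for held_out in expert_bwt:
        others = [vals for name, vals in expert_bwt.items() if name != held_out]
        # Reference per image = mean of the remaining experts (assumption)
        reference = [sum(vals[i] for vals in others) / len(others) for i in range(n_images)]
        reference_labels = [is_thickened(r) for r in reference]
        results[held_out] = (
            agreement(expert_bwt[held_out], reference_labels),
            agreement(model_bwt, reference_labels),
        )
    return results


# Hypothetical toy data: three experts, four images
experts = {
    "A": [2.0, 4.0, 3.5, 2.5],
    "B": [2.2, 4.2, 3.3, 2.7],
    "C": [2.4, 3.8, 2.9, 3.2],
}
model = [2.1, 3.9, 2.8, 2.6]
results = leave_one_out(experts, model)
```

Scoring the held-out expert and the model against the same reference keeps the two figures directly comparable for each exclusion.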
RESULTS
The model produced predictions on 54/55 test images. On the regression task, its predictions deviated from the gold-standard mean by an average of 0.98 mm (SD: 1.10 mm) per image. The average distance to the expert-defined bounds was 0.44 mm (SD: 0.89 mm), with 59% of predictions falling within those bounds (Figure 1).

Figure 1: A box plot representing the measurements of the experts (green boxes) against the predicted bowel wall thickness (purple).
Mean is shown in gold. Background represents classification outcome.
BWT: bowel wall thickness.
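One plausible reading of the distance to expert-defined bounds is sketched below. It assumes the bounds for an image are the smallest and largest expert measurements, and that a prediction inside them has a distance of zero. The abstract does not define the metric, so this definition, the function names, and the toy values are all assumptions.

```python
def distance_to_bounds(prediction_mm, expert_mm):
    """Distance from a prediction to the [min, max] range of expert measurements.
    Returns 0.0 when the prediction lies inside the range (assumed definition)."""
    lower, upper = min(expert_mm), max(expert_mm)
    if prediction_mm < lower:
        return lower - prediction_mm
    if prediction_mm > upper:
        return prediction_mm - upper
    return 0.0


def summarise(predictions_mm, expert_measurements_mm):
    """Return (mean distance to bounds, fraction of predictions inside the bounds)."""
    distances = [
        distance_to_bounds(p, e) for p, e in zip(predictions_mm, expert_measurements_mm)
    ]
    mean_distance = sum(distances) / len(distances)
    inside_fraction = sum(d == 0.0 for d in distances) / len(distances)
    return mean_distance, inside_fraction


# Hypothetical toy data: three images, two expert measurements each
mean_distance, inside_fraction = summarise(
    [3.0, 5.0, 1.5],
    [[2.5, 3.5], [3.0, 4.0], [2.0, 2.5]],
)
```

Under this reading, the mean distance is pulled down by every prediction that falls inside the expert range, which is consistent with it being smaller than the mean error against the expert mean.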
For classification (using a 3 mm threshold), the model reached an accuracy of 0.77, a sensitivity of 0.69, a specificity of 0.94, and a Cohen’s Kappa of 0.56.
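The four classification figures follow from a 2×2 confusion matrix of thresholded reference and predicted labels. The sketch below uses the standard definitions of accuracy, sensitivity, specificity, and Cohen's Kappa; the function name and the toy labels are hypothetical and do not reproduce the study data.

```python
def classification_metrics(reference, predicted):
    """Compute accuracy, sensitivity, specificity and Cohen's kappa
    from paired boolean labels (True = thickened bowel wall).
    Assumes both classes occur in the reference labels."""
    tp = sum(r and p for r, p in zip(reference, predicted))
    tn = sum((not r) and (not p) for r, p in zip(reference, predicted))
    fp = sum((not r) and p for r, p in zip(reference, predicted))
    fn = sum(r and (not p) for r, p in zip(reference, predicted))
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical toy labels: 8 images, 4 thickened in the reference
reference = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, True]
metrics = classification_metrics(reference, predicted)
```

Kappa is lower than accuracy because it discounts the agreement expected by chance, which is why the reported Kappa of 0.56 sits well below the accuracy of 0.77.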
In the leave-one-out analysis, expert accuracy ranged from 0.89 to 0.93 and expert Kappa from 0.79 to 0.83. Depending on which expert was excluded from the gold standard, the model achieved an accuracy of 0.74–0.80 and a Kappa of 0.49–0.60. Experts stayed within expert-defined bounds in 72–81% of cases, while the model did so in 46–55%.
CONCLUSION
Clinically, the model performed well. An error of 0.5–1.0 mm is negligible in practice and matches typical variation in manual measurements. Many real-world test images had BWT values near the 3 mm threshold, so small deviations led to misclassifications. In practice, a 2.5 mm reading could still raise concerns based on symptoms.3
While the system alone cannot provide expert-level BWT measurements, it could provide assistance to experts as well as non-expert and junior doctors, especially in locating the bowel, which is an essential part of IUS.
The model works on unprocessed clinical data. The only selection criterion was that each image contain an identifiable bowel segment measurable by an IBUS-certified doctor, so the images reflect realistic conditions.
In summary, the authors developed an AI model that identifies the bowel and measures BWT with acceptable accuracy. The authors have already started the process of collecting video data to extend the AI’s functionality to a fully clinical setting.






