P156 A multicentric study on the development and application of a deep learning algorithm for automatic detection of ulcers and erosions in the novel PillCam™ Crohn’s capsule

Ribeiro, T.(1);Mascarenhas, M.(1);Afonso, J.(1);Cardoso, H.(1);Andrade, A.P.(1);Lopes, S.(1);Mascarenhas Saraiva, M.(2);Ferreira, J.(3);Macedo, G.(1);

(1)Centro Hospitalar de São João, Department of Gastroenterology, Porto, Portugal;(2)ManopH Gastroenterology Clinic, Department of Gastroenterology, Porto, Portugal;(3)Faculdade de Engenharia da Universidade do Porto, Department of Mechanical Engineering, Porto, Portugal;

Background

Crohn’s disease has the potential to involve all segments of the gastrointestinal tract, requiring frequent and judicious clinical and endoscopic surveillance. Capsule endoscopy (CE) has revolutionized the management of these patients, playing a role in diagnosis, classification, and assessment of response to treatment. In 2017, the PillCam™ Crohn’s Capsule (PCC) was introduced to provide a panenteric assessment of this panenteric disease. However, reviewing each PCC exam is time-consuming and error-prone, as it requires examining thousands of images, while significant lesions may appear in only a small number of frames. Recently, the development of artificial intelligence (AI) algorithms for automatic interpretation of endoscopic images has generated considerable interest. These tools, particularly convolutional neural networks, have shown potential to overcome some of the main drawbacks of CE. This multicentric study aims to assess the performance of a deep learning algorithm for the automatic detection of small intestinal and colonic ulcers and erosions in PCC images.

Methods

We included a total of 59 PCC exams from two centres (ManopH Gastroenterology Clinic and Centro Hospitalar Universitário de São João) performed between 2017 and 2021. A total of 78415 frames were extracted: 14124 images depicting ulcers and erosions and 64291 showing normal enteric and colonic mucosa. For the automatic identification of these findings, the image pool was split into training and validation datasets in a patient-based manner, so that images from the same patient were restricted to a single dataset. A convolutional neural network (CNN) model with transfer learning was developed using TensorFlow and Keras.
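A patient-based split of this kind can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function name, the tuple layout of each frame record, the validation fraction and the random seed are all assumptions, since the abstract does not report them.

```python
import random
from collections import defaultdict


def patient_based_split(frames, validation_fraction=0.2, seed=0):
    """Split frame records so that each patient appears in only one dataset.

    frames: list of (patient_id, frame_id, label) tuples (hypothetical layout).
    validation_fraction: share of patients (not frames) sent to validation;
        the value used in the study is not reported, 0.2 is an assumption.
    """
    # Group all frames by the patient they came from.
    by_patient = defaultdict(list)
    for frame in frames:
        by_patient[frame[0]].append(frame)

    # Shuffle patients reproducibly, then assign whole patients to a dataset.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)
    n_val = max(1, round(len(patient_ids) * validation_fraction))
    val_ids = set(patient_ids[:n_val])

    train = [f for pid in patient_ids if pid not in val_ids for f in by_patient[pid]]
    val = [f for pid in patient_ids if pid in val_ids for f in by_patient[pid]]
    return train, val
```

Splitting by patient rather than by frame avoids near-identical consecutive frames from one exam landing in both datasets, which would otherwise inflate the validation metrics.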

Results

The neural architecture of our CNN was optimized for the detection of enteric and colonic ulcers and erosions. An example of the output provided by the CNN is shown in Figure 1.
The deep learning model had a sensitivity of 86.7%, a specificity of 98.6%, a positive predictive value of 92.3% and a negative predictive value of 97.4%. The overall accuracy of our network was 96.6% (Figure 2).
Figure 2 - Confusion matrix between the predictions of the algorithm (predicted label) and experts' consensus (true label). Abbreviations: N - normal; PUE - ulcers and erosions.
At a threshold value of 50%, our CNN had an area under the curve of 0.98 for the detection of ulcers and erosions (Figure 3).
Figure 3 - Receiver operating characteristic analysis with respective area under the curve (AUC). Abbreviations: PUE - ulcers and erosions.
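For reference, the sensitivity, specificity, predictive values and accuracy reported above follow the standard definitions based on the four cells of a binary confusion matrix. The sketch below shows those definitions. The function name is hypothetical and the counts in the usage example are illustrative only; they are not the study's confusion matrix, which is not reproduced in this abstract.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp: lesion frames predicted as lesion (true positives)
    fp: normal frames predicted as lesion (false positives)
    tn: normal frames predicted as normal (true negatives)
    fn: lesion frames predicted as normal (false negatives)
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of lesion frames detected
        "specificity": tn / (tn + fp),   # share of normal frames cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only (assumed, not taken from the study).
example = binary_metrics(tp=90, fp=20, tn=880, fn=10)
```

With a normal-to-lesion ratio as unbalanced as in this image pool, accuracy is dominated by the normal class, which is why sensitivity and the predictive values are reported alongside it.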

Conclusion

In this multicentric study, our group developed a CNN and assessed its performance in a patient-based design, which adds significant robustness to our results. Patient-split analyses are the first step towards real-life application of AI solutions, which, combined with the high performance of our model, may significantly improve the diagnostic yield of this panendoscopic tool.