P699 Machine Learning-based Prediction for Early Progression in Korean Crohn’s disease: Results from the IMPACT Study

Chun, J.(1);Suji, K.(2);Kwang Sung, A.(3);Soo Kyung, P.(4);Sangsoo, K.(2);Dong Il, P.(4);

(1)Gangnam Severance Hospital, Yonsei University College of Medicine, Department of Internal Medicine, Seoul, Republic of Korea;(2)Soongsil University, Bioinformatics, Seoul, Republic of Korea;(3)Functional Genome Institute, PDXen Biosystems Inc., Genetics, Seoul, Republic of Korea;(4)Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, Internal Medicine, Seoul, Republic of Korea;


Transcriptome-wide association studies (TWAS) improve the detection of functionally relevant loci by leveraging expression quantitative trait loci (eQTLs) from reference panels in relevant tissues. Herein, we used a novel Korean TWAS model to develop machine learning-based models that predict early progression to stricturing or penetrating phenotypes in Korean patients with Crohn’s disease (CD).


A total of 431 patients diagnosed with CD were retrospectively enrolled from 15 referral hospitals in Korea. Single-nucleotide polymorphism genotypes were analyzed using the Korea Biobank Array. Using the GTEx genotype and Korean chip genotype database from the terminal ileum, a novel Korean TWAS model was developed and compared with a European TWAS model. Predictive models for early progression were trained using logistic regression and evaluated by leave-one-out cross-validation. Early progression was defined as transition to a stricturing or penetrating phenotype within 2 years of CD diagnosis.
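
The logistic regression and leave-one-out cross-validation step can be illustrated with a minimal sketch. This is not the study's pipeline: the two features, the simulated cohort of 40 patients, the gradient-descent fitting routine, and all names (`fit_logistic`, `loocv_predictions`, and so on) are assumptions made for illustration only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=300, l2=0.01):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    """Predicted probability of the positive class for one patient."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def loocv_predictions(X, y, **fit_kwargs):
    """Hold out each patient once, train on the rest, predict the held-out one."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit_logistic(X_train, y_train, **fit_kwargs)
        preds.append(predict_proba(model, X[i]))
    return preds

# Simulated data (NOT study data): two hypothetical standardised features
# per patient, e.g. one clinical score and one predicted expression level.
rng = random.Random(0)
X, y = [], []
for _ in range(40):
    label = 1 if rng.random() < 0.3 else 0
    X.append([rng.gauss(1.0 * label, 1.0), rng.gauss(0.5 * label, 1.0)])
    y.append(label)

held_out = loocv_predictions(X, y)
print(len(held_out), "held-out probabilities")
```

Because every prediction comes from a model that never saw the held-out patient, the resulting probabilities can be pooled across all patients to compute a single performance metric.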


Among the study population, early progression to stricturing and penetrating phenotypes was detected in 60 (13.9%) and 73 (16.9%) patients, respectively. Combined clinical-gene expression prediction models predicted early progression to stricturing (AUROC, 0.816) and penetrating phenotypes (AUROC, 0.801) more accurately than models using clinical parameters alone. Expression of the following genes was significantly associated with early progression: CCDC154, FAM189A2, TAS2R19, FCSK, SP1, and KCNIP1 for the stricturing phenotype and PUS7, CCDC146, MLXIP, LRGUK, UROS, and TAFA1 for the penetrating phenotype.
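
For readers less familiar with the metric, AUROC is the probability that a randomly chosen patient with early progression receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it by pairwise comparison; the labels and scores are made-up toy values, and the variable names `clinical_only` and `combined` are hypothetical stand-ins rather than outputs of the study's models.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example (NOT study data): 1 = early progression, 0 = no early progression.
labels = [1, 0, 1, 0, 0, 1]
clinical_only = [0.6, 0.5, 0.4, 0.3, 0.7, 0.8]   # hypothetical scores
combined = [0.7, 0.4, 0.45, 0.2, 0.5, 0.9]       # hypothetical scores

print("clinical only:", round(auroc(labels, clinical_only), 3))
print("combined:     ", round(auroc(labels, combined), 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.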


The clinical-genetic machine learning models predicted early progression to stricturing or penetrating phenotypes in patients with CD. Use of these combined prediction models might enable clinicians to select patients who require an early aggressive therapeutic strategy to prevent disease-related complications.