OP15 Multi-omic data integration with network analysis reveals underlying molecular mechanisms driving Crohn’s disease heterogeneity

P. Sudhakar1,2,3, B. Verstockt1,4, J. Cremer5, S. Verstockt1, T. Korcsmaros2,3, M. Ferrante1,4, S. Vermeire1,4

1Department of Chronic Diseases, Metabolism and Ageing - TARGID, KU Leuven, Leuven, Belgium, 2Gut Microbes and Health, Quadram Institute, Norwich, UK, 3Korcsmaros Group, Organisms and Ecosystems, Earlham Institute, Norwich, UK, 4Department of Gastroenterology and Hepatology, KU Leuven, University Hospitals Leuven, Leuven, Belgium, 5Laboratory of Clinical Immunology, Department of Microbiology and Immunology, KU Leuven, Leuven, Belgium

Background

Crohn’s disease (CD) is a heterogeneous disease whose clinical phenotypes differ in disease behaviour, disease location and extraintestinal manifestations. However, the molecular mechanisms that orchestrate CD heterogeneity are relatively unexplored. We sought to infer such mechanisms by integrating two -omic datasets (genomics and blood proteomics) generated from CD patients.

Methods

We measured 576 unique proteins in blood from CD patients (n = 98) using seven different Olink® panels. All patients were also genotyped using Immunochip. We integrated the two datasets using Multi-Omics Factor Analysis (MOFA), an unsupervised data integration algorithm. MOFA identifies latent factors (LFs), hidden variables that capture the sources of variation in the provided -omic datasets. LFs capturing less than 2% of the variance were discarded. Using a regression model, we identified explanatory LFs, that is, LFs associated with clinical phenotypes. Proteins and mutations were ranked according to the scores assigned by the corresponding explanatory LF. Potential effects of mutations were inferred by analysing their impacts on coding and non-coding functions. Local network motifs that capture the direct and indirect effects of mutations on protein expression were identified using the Cytoscape tool ISMAGS. Protein–protein and transcriptional regulatory relationships, retrieved from the OmniPath and DoRothEA databases respectively, were combined to compile the interaction networks used by ISMAGS.
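
The factor filtering and phenotype association steps can be illustrated with a minimal sketch. It is not the pipeline used in this study: the abstract does not specify the regression model or whether the 2% cut-off applies per dataset or overall, so the sketch assumes a per-dataset cut-off and a single-predictor least-squares fit. All names (`retained_factors`, `simple_regression`) and all numbers are hypothetical.

```python
from statistics import mean


def retained_factors(var_explained, threshold=0.02):
    """Keep factors reaching the threshold in at least one dataset (assumed rule).

    var_explained maps factor name -> {dataset name: fraction of variance explained}.
    """
    return [
        factor
        for factor, by_dataset in var_explained.items()
        if max(by_dataset.values()) >= threshold
    ]


def simple_regression(x, y):
    """Least-squares fit of y on one predictor x; returns (slope, intercept, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical variance fractions, for illustration only (not study data).
var_explained = {
    "LF1": {"proteomics": 0.12, "genomics": 0.01},
    "LF2": {"proteomics": 0.005, "genomics": 0.015},
    "LF3": {"proteomics": 0.00, "genomics": 0.04},
}
kept = retained_factors(var_explained)

# Hypothetical factor values per patient and a binary phenotype (1 = present).
lf1_values = [0.1, 0.4, 0.35, 0.8, 0.9, 1.2]
phenotype = [0, 0, 0, 1, 1, 1]
slope, intercept, r_squared = simple_regression(lf1_values, phenotype)
```

In this toy input, LF2 falls below the cut-off in both datasets and is dropped, while the retained LF1 shows a positive slope against the phenotype. A real analysis would more likely use logistic regression for binary phenotypes and adjust for covariates.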

Results

From the MOFA analysis, we identified five LFs associated with at least one clinical phenotype. Clustering patients along the explanatory LFs achieved meaningful separation of clinical phenotypes such as perianal penetrating disease. The top-ranking proteins associated with perianal disease included proteins involved in inflammatory pathways or autophagy, or already known to be involved in CD, such as IL-8, Rho-GTPase activators, MIF, Caspase 8, TRIM5 and SNAP29. The networks corresponding to the top-ranking proteins associated with the perianal phenotype could be broken down into 102 local network motifs. These local motifs pointed to control mechanisms by which a total of 7 mutations mapped to transcription factors (SMAD3, BACH2) and post-translational regulators (such as IFNGR2, IL10, IL2RA, SLC2A4RG and ZMIZ1) could potentially regulate the pathophysiology of perianal disease and could, therefore, be considered novel drug targets.
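
To make the notion of a local motif concrete, the sketch below enumerates direct (regulator → protein) and indirect (regulator → mediator → protein) paths in a small directed network. It is a plain path search, not the ISMAGS subgraph-matching algorithm used in the study. The function name `find_motifs` and the node names are placeholders; they are not taken from the OmniPath or DoRothEA networks.

```python
def find_motifs(edges, sources, targets):
    """List direct (source, target) and indirect (source, mediator, target) motifs."""
    successors = {}
    for upstream, downstream in edges:
        successors.setdefault(upstream, set()).add(downstream)

    motifs = []
    for source in sorted(sources):
        for mediator in sorted(successors.get(source, ())):
            # Direct effect: the mutated regulator acts on a measured protein.
            if mediator in targets:
                motifs.append((source, mediator))
            # Indirect effect: the regulator acts through one intermediate node.
            for target in sorted(successors.get(mediator, ())):
                if target in targets and target != source:
                    motifs.append((source, mediator, target))
    return motifs


# Placeholder network, for illustration only.
edges = [
    ("REG_A", "TF_B"),
    ("TF_B", "PROT_X"),
    ("REG_A", "PROT_Y"),
    ("TF_C", "PROT_X"),
]
sources = {"REG_A", "TF_C"}    # nodes carrying a mapped mutation
targets = {"PROT_X", "PROT_Y"}  # top-ranking measured proteins
motifs = find_motifs(edges, sources, targets)
```

In this toy network the search returns two direct motifs and one indirect motif (REG_A acting on PROT_X through TF_B).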

Conclusion

Using integrated signature profiles generated from multiple -omic datasets, we identified molecular mechanisms that could potentially explain CD phenotypes such as the occurrence of perianal disease.