P375 A simulation study to evaluate the performance of a multiple imputation method to address missing data in an analysis of clinical effectiveness using the ImproveCareNow Registry of pediatric patients with Crohn’s disease

Zhang, N.(1,2);Liu, C.(1);King, E.(1,2);Chen, S.(1);Olano, K.(1);Steiner, S.(3);Colletti, R.(4);Saeed, S.(5);Strauss, R.(6);Volger, S.(6);Lo, K.H.(6);Wang, Y.(6)

(1)Cincinnati Children’s Hospital Medical Center, Department of Pediatrics, Cincinnati, United States;(2)University of Cincinnati College of Medicine, Division of Biostatistics and Epidemiology, Cincinnati, United States;(3)Riley Hospital for Children/Indiana University School of Medicine, Pediatric Gastroenterology, Indianapolis, United States;(4)University of Vermont Children’s Hospital, Pediatrics, Burlington, United States;(5)Dayton Children’s Hospital/Wright State University, Pediatrics, Dayton, United States;(6)Janssen Research and Development, LLC, Immunology, Spring House, United States


Patient registries are a source of real-world data (RWD) that are increasingly used to generate real-world evidence (RWE) for the safety and effectiveness of medical therapies. However, outcome data that are typically collected in clinical trials may be missing for some registry patients, which is a major challenge to evaluating RWD reliably. Clinical remission in pediatric Crohn’s disease (PCD), defined as a Short Pediatric Crohn’s Disease Activity Index (sPCDAI) score <10, is routinely collected in the ImproveCareNow (ICN) Registry, the world’s largest registry of pediatric patients with inflammatory bowel disease (IBD). A feasibility analysis of ICN as a RWD source to estimate the efficacy of therapies in PCD found that about one-third of patients were missing at least 1 of the 6 components needed to calculate the sPCDAI score at Week 52. We conducted a simulation study to evaluate the performance of the multiple imputation (MI) method in addressing this missing data issue.


A Starting Dataset from the ICN registry included CD patients aged 2 to <18 years when beginning treatment with a biologic agent other than ustekinumab during 2014 to 2019, who had a baseline visit and a “Week 52” visit. A Complete Dataset included only those patients in the Starting Dataset who had all components required to calculate an sPCDAI score at the Week 52 visit. The true remission rate was calculated from the Complete Dataset. For the simulation, the missing data patterns for individual sPCDAI components that were observed in the Starting Dataset were randomly imposed on the Complete Dataset. Variables that were predictive of sPCDAI scores, of their components, and of sPCDAI missingness were included in the imputation model. Ten thousand datasets were simulated for each of three scenarios: a base case with the missingness observed in the Starting Dataset, 1.5 times the base-case missingness, and 2 times the base-case missingness. The MI method was applied to the simulated datasets to estimate the remission rate. MI performance was measured by the bias relative to, and the coverage of, the true remission rate from the Complete Dataset.
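The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is a hypothetical stand-in: the synthetic "complete" dataset, the single binary predictor `x`, the missingness probabilities, the stratified approximate Bayesian bootstrap used as the imputation step, and the normal-approximation interval. Only the overall loop follows the abstract: impose missingness on complete data, impute, pool, and measure bias and coverage against the known true rate.

```python
import math
import random


def rubin_pool(estimates, variances):
    """Pool per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    return q_bar, within + (1 + 1 / m) * between


def impute_once(rows, rng):
    """One imputation: approximate Bayesian bootstrap within strata of x."""
    completed = []
    for stratum in (0, 1):
        observed = [y for x, y in rows if x == stratum and y is not None]
        # Resample the donor pool first so imputations reflect its uncertainty.
        donors = [rng.choice(observed) for _ in observed]
        for x, y in rows:
            if x == stratum:
                completed.append(rng.choice(donors) if y is None else y)
    return completed


def simulate(n_sims=200, n_imputations=10, scale=1.0, seed=0):
    """Return bias without MI, bias with MI, and MI interval coverage."""
    rng = random.Random(seed)
    # Hypothetical complete dataset: x is a binary predictor, y is remission.
    complete = []
    for _ in range(600):
        x = 1 if rng.random() < 0.4 else 0
        y = 1 if rng.random() < (0.85 if x else 0.65) else 0
        complete.append((x, y))
    truth = sum(y for _, y in complete) / len(complete)
    # Assumed missingness depends on x; `scale` mimics the 1.5x and 2x scenarios.
    miss_prob = {0: min(0.95, 0.10 * scale), 1: min(0.95, 0.40 * scale)}

    cc_errors, mi_errors, covered = [], [], 0
    for _ in range(n_sims):
        rows = [(x, None if rng.random() < miss_prob[x] else y)
                for x, y in complete]
        observed = [y for _, y in rows if y is not None]
        cc_errors.append(sum(observed) / len(observed) - truth)

        estimates, variances = [], []
        for _ in range(n_imputations):
            filled = impute_once(rows, rng)
            p_hat = sum(filled) / len(filled)
            estimates.append(p_hat)
            variances.append(p_hat * (1 - p_hat) / len(filled))
        pooled, total_var = rubin_pool(estimates, variances)
        half_width = 1.96 * math.sqrt(total_var)  # normal approximation
        covered += pooled - half_width <= truth <= pooled + half_width
        mi_errors.append(pooled - truth)

    return {
        "truth": truth,
        "cc_bias": sum(cc_errors) / n_sims,
        "mi_bias": sum(mi_errors) / n_sims,
        "coverage": covered / n_sims,
    }


if __name__ == "__main__":
    for scale in (1.0, 1.5, 2.0):
        print(scale, simulate(scale=scale))
```

In this sketch, `cc_bias` is the average error of the analysis that simply drops patients with a missing outcome, `mi_bias` is the average error of the pooled MI estimate, and `coverage` is the fraction of simulated datasets whose pooled interval contains the true rate.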


The Starting and Complete Datasets included 1,458 and 1,212 patients, respectively. The true clinical remission rate in the Complete Dataset was 75.1% (95% CI: 72.6%-77.5%). Analysis without MI for the missing sPCDAI data underestimated the clinical remission rate, with a bias of about -2% to -5% (Table 1). Using a large number of independent predictors, the MI method eliminated the bias and achieved 100% coverage in all simulation scenarios.
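As a rough arithmetic check on the reported interval, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion of 0.751 with n = 1,212. The abstract does not state which interval method was used, so the Wald formula is an assumption, and small differences from the reported limits in the last digit are to be expected.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Rate and sample size as reported for the Complete Dataset.
low, high = wald_interval(0.751, 1212)
print(f"{low:.3f} to {high:.3f}")
```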


The MI method improved the validity, reliability, and efficiency of using RWD for estimating clinical remission in pediatric patients with CD and could be valuable in other registry studies.