Bin Tu

and 5 more

Objective: To determine whether a machine learning prediction model can achieve accuracy comparable to current state-of-the-art trisomy detection methods on extremely low-depth sequencing data, and to verify its practical feasibility for clinical auxiliary screening of fetal trisomy.

Design: A public dataset of 144 samples was divided into training, validation, and test (testA) sets. A separate dataset of 270 sequencing samples was used for independent testing.

Setting: Samples were from Hong Kong, China; London, England; Amsterdam, the Netherlands; and Beijing, China.

Population: 414 maternal blood samples were analyzed in this study.

Methods: A machine learning method was applied to low-depth short sequencing data from maternal blood.

Main Outcome Measures: Fetal karyotype was determined by interventional prenatal diagnosis or from cord blood obtained after birth.

Results: We demonstrate the predictive ability of our method by testing on data from different sources. The final best model achieved an AUC of 99.85% in predicting T21 using chr21 features, which are DNA fragment concentrations. The AUCs were 99.50% and 97.70% in predicting T18 and T13, respectively, using all features from the 24 chromosomes. The PPV was 91.67%, 93.33%, and 83.33% in predicting T21, T18, and T13, respectively. The NPV for T21, T18, and T13 was 100%, 99.33%, and 98.70%, respectively. Our approach does not need to calculate the fetal fraction (FF) and can handle samples from a wide range of gestational ages (GA), twin pregnancies, and fetal mosaicism. We achieved high PPV with low-depth sequencing and robust performance on an independent dataset.

Conclusion: Our approach can achieve accuracy comparable to the current best methods. Our pipeline can be an important aid for the detection of fetal trisomy in clinical NIPT.
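For readers less familiar with the reported metrics, the following is a minimal sketch of how PPV, NPV, and AUC are conventionally computed. It is not the authors' pipeline: the function names `ppv_npv` and `auc` are hypothetical, and the counts and scores in the usage lines are made-up illustrations, not data from this study.

```python
import math


def ppv_npv(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): fraction of positive calls that are true trisomies.
    NPV = TN / (TN + FN): fraction of negative calls that are truly euploid.
    """
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    npv = tn / (tn + fn) if (tn + fn) else math.nan
    return ppv, npv


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pairwise formulation.

    Counts the fraction of (positive, negative) sample pairs in which the
    positive sample receives the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only (not study data).
example_ppv, example_npv = ppv_npv(tp=9, fp=1, tn=88, fn=2)
example_auc = auc([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0])
```

Because PPV and NPV depend on how many affected cases a test set contains, values measured on one cohort may not carry over to a population with a different trisomy prevalence, whereas AUC is insensitive to that class balance.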