Predictive Modeling and Prognostic Assessment of Distant Metastasis in
Small Intestine Neuroendocrine Neoplasms Using Machine Learning and
Multifactorial Cox Analysis
Abstract
Background: The incidence of small intestine neuroendocrine
neoplasms (SI-NEN) has increased significantly. Non-specific symptoms
and complex tumor biology make early diagnosis and effective management
difficult, and predicting distant metastasis (DM) is especially
challenging.

Methods: This retrospective study analyzed 3,157 patients
diagnosed with SI-NEN from 2005 to 2015 using the Surveillance,
Epidemiology, and End Results (SEER) database. We used multifactorial
logistic regression to identify independent risk factors for DM and
developed nine machine learning models to predict its occurrence:
XGBoost, Logistic Regression, LightGBM, Random Forest, Complement Naive
Bayes, Multi-Layer Perceptron Classifier, Decision Tree, Gradient
Boosting Decision Tree, and Support Vector Machine. All models were
tuned by hyperparameter optimization and evaluated with k-fold
cross-validation. We also performed survival analysis to identify
prognostic factors associated with patient outcomes.

Results: Logistic
regression performed best among these models, achieving an AUC of 0.774
in the training set and 0.747 in the validation set, which indicates
useful discrimination for the early detection of DM.
Additionally, a machine learning-enhanced clinical nomogram was
constructed, incorporating individual patient characteristics for
personalized treatment planning. Survival analysis identified key
prognostic indicators, and the resulting prognostic nomogram showed
high predictive accuracy when assessed with calibration curves and
decision curve analysis.

Conclusion: The study underscores the utility
of advanced predictive models in enhancing the diagnostic and prognostic
assessment of SI-NEN, suggesting a framework for future clinical
application and continuous improvement of these models. The developed
predictive and prognostic nomograms provide crucial tools for clinical
decision-making, potentially improving overall survival and quality of
life by facilitating personalized treatment strategies based on detailed
risk profiles.

Keywords: SI-NEN, distant metastasis, machine learning, logistic
regression, prognostic nomogram, survival analysis, SEER database.
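
The evaluation scheme described in the Methods (fit a classifier, score it by AUC, repeat over k folds) can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use SEER data: the two-feature synthetic dataset, the gradient-descent logistic regression, the unstratified fold assignment, and every function name below are assumptions made purely for illustration, written in standard-library Python.

```python
import math
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict_proba(w, x):
    """Logistic model probability; w = [bias, w1, w2, ...]."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent on the logistic log-loss (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def kfold_auc(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and return the held-out AUC per fold.
    Folds are NOT stratified here; real imbalanced data would need that."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    results = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        probs = [predict_proba(w, X[i]) for i in held_out]
        results.append(auc([y[i] for i in held_out], probs))
    return results


def make_synthetic(n=200, seed=1):
    """Hypothetical two-feature dataset with a known logistic relationship."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = 1.0 / (1.0 + math.exp(-(1.5 * x1 - 1.0 * x2)))
        X.append([x1, x2])
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_synthetic()
fold_aucs = kfold_auc(X, y)
mean_auc = sum(fold_aucs) / len(fold_aucs)
print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(mean_auc, 3))
```

Reporting the mean of the held-out fold AUCs alongside the training AUC, as the abstract does for the training and validation sets, makes it easier to see how much of the apparent discrimination survives on data the model was not fitted to.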