
Sarcomatoid renal cell carcinoma prognosis prediction based on the machine learning algorithm
  • Rui Zhang,
  • Xuefei Qin,
  • Xuelin Gao,
  • Yu Zheng,
  • Guangdong Hou,
  • Yueyue Zhang,
  • Yuchen Tian,
  • Yuliang Wang,
  • Fuli Wang,
  • Shuaijun Ma
Rui Zhang
Xijing Hospital
Xuefei Qin
University of Edinburgh School of Arts Culture and Environment
Xuelin Gao
Xijing Hospital
Yu Zheng
Xijing Hospital, Fourth Military Medical University
Guangdong Hou
Xijing Hospital
Yueyue Zhang
Xijing Hospital
Yuchen Tian
Xijing Hospital
Yuliang Wang
Xijing Hospital
Fuli Wang
Xijing Hospital, The Fourth Military Medical University

Corresponding Author: wangfuli98@163.com

Shuaijun Ma
Xijing Hospital

Abstract

Background: There is currently no robust prognostic model for sarcomatoid renal cell carcinoma (sRCC) to help physicians make better decisions.

Objectives: To build an accurate predictive model for patients with sRCC by investigating the important characteristics that influence overall survival (OS).

Design and Methods: Data on patients with sRCC were obtained from the Surveillance, Epidemiology, and End Results (SEER) database of the U.S. National Cancer Institute. After preprocessing, the data were split into a training set and a test set in an 8:2 ratio. The Mann-Whitney U test and the chi-square test were used to verify that the two sets were balanced. Univariate Cox proportional hazards models, Kaplan-Meier analysis, and machine learning (ML) algorithms were employed to identify risk features for OS. Ten reliable features were selected to construct six ML models. Model performance, predictive accuracy, and clinical benefit were evaluated with receiver operating characteristic (ROC) curves, calibration plots, and decision curve analysis (DCA), respectively.

Results: After data preprocessing, 692 patients with sRCC from 1975 to 2019 were included in this study. Ten variables (stage group, T stage, M stage, age, surgery, N stage, tumor size, chemotherapy, histological grade, and radiotherapy) were selected as reliable features for ML model training. All models showed good prediction performance, and XGBoost had the best prediction accuracy and stability. The DCA showed that all models except AdaBoost could be used to support clinical decision-making with the 90-day and 1-, 2-, 3-, and 5-year OS models.

Conclusions: Six ML models were developed to predict 90-day and 1-, 2-, 3-, and 5-year OS in patients with sRCC. Model evaluations showed that the XGBoost model had the best predictive accuracy and clinical net benefit. These models can help make treatment decisions for patients with sRCC.
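
As a rough illustration of the 8:2 split and the balance check mentioned in the abstract, the following Python sketch splits a cohort and compares a continuous variable between the two sets with a Mann-Whitney U test. It is not the authors' implementation: the function names `split_80_20` and `mann_whitney_u` are hypothetical, the ages are synthetic, and the p-value uses a normal approximation with tie correction, which is an assumption made here for simplicity.

```python
import math
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and split it 80% train / 20% test."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


if __name__ == "__main__":
    # Synthetic ages for a cohort of 692 patients (illustrative only).
    rng = random.Random(42)
    ages = [rng.randint(30, 90) for _ in range(692)]
    train, test = split_80_20(ages, seed=1)
    u, p = mann_whitney_u(train, test)
    print(len(train), len(test), u, p)
```

A large p-value here would suggest no detectable difference in the variable between the training and test sets; categorical variables would need a chi-square test instead.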
08 Jul 2023 Submitted to Cancer Reports
10 Jul 2023 Submission Checks Completed
10 Jul 2023 Assigned to Editor
10 Jul 2023 Review(s) Completed, Editorial Evaluation Pending
24 Jul 2023 Reviewer(s) Assigned
25 Aug 2023 Editorial Decision: Revise Major