Comprehensive Summary
In this study, researchers developed and validated an AI model to predict the risk of surgical site infection (SSI) in patients undergoing surgery for metastatic spinal disease. A total of 667 patients were included, with 485 in the model-derivation cohort and 182 in an external validation cohort. Six machine learning algorithms were tested: support vector machine (SVM), gradient boosting machine (GBM), K-nearest neighbor (KNN), neural network (NN), decision tree (DT), and logistic regression (LR). Model performance was evaluated using several metrics, including Brier score, log loss, accuracy, precision, recall, AUC, F1 score, and the precision-recall curve. The GBM model performed best, exhibiting high predictive capability (AUC = 0.986; precision and accuracy both 0.967) and low prediction error (Brier score = 0.033, log loss = 0.137). Additionally, when compared with predictions made by board-certified surgeons on an independent set of 100 cases, the AI system outperformed clinicians in predicting SSI risk and outcomes. The final model was released as a user-friendly application that generates individualized SSI probabilities and categorizes patients into high- or low-risk groups. Overall, the model has strong potential for clinical use, and further validation across diverse healthcare settings could support broader implementation.
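For readers less familiar with the Brier score and log loss reported above, the following is a minimal sketch of how these two metrics are computed from predicted probabilities, along with a simple high/low risk split. The sample labels and probabilities, the function names, and the 0.5 threshold are all illustrative assumptions; the summary does not state the cutoff or the code used in the study.

```python
import math

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.
    Lower is better; 0 means perfect predictions."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the observed outcomes.
    Probabilities are clipped to avoid taking log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def risk_group(prob, threshold=0.5):
    """Hypothetical cutoff: the threshold used by the study's application is not given."""
    return "high" if prob >= threshold else "low"

# Made-up example data (1 = SSI occurred, 0 = no SSI); not from the study.
y_true = [1, 0, 0, 1]
y_prob = [0.9, 0.1, 0.2, 0.6]

print("Brier score:", brier_score(y_true, y_prob))
print("Log loss:", log_loss(y_true, y_prob))
print("Risk groups:", [risk_group(p) for p in y_prob])
```

Both metrics penalize confident wrong predictions, with log loss doing so more heavily, which is why they are often reported alongside discrimination measures such as AUC.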
Outcomes and Implications
The model’s ability to generate individualized SSI risk predictions supports personalized patient care by identifying which patients are most vulnerable to postoperative infection. Classifying patients into high- or low-risk groups enables clinicians to plan proactively, implement targeted prevention strategies, and anticipate potential complications before surgery. These tailored approaches can support more informed clinical decisions, optimize treatment plans, and improve overall risk assessment, which could ultimately help conserve resources, reduce healthcare costs, and improve patient outcomes. To strengthen the model’s generalizability and clinical applicability, future research should validate its performance across broader and more diverse patient populations.