Comprehensive Summary
This paper proposes a novel machine learning (ML) framework that uses SHapley Additive exPlanations (SHAP) to improve the predictive performance of current ML models. In particular, SHAP is used to interpret predictions and help detect the misclassifications that frequently occur in pre-existing ML models. The existing models achieved satisfactory accuracy and precision (F1-scores above 0.8) when classifying data sets of antiproliferative compounds against three prostate cancers. Further analysis revealed that several misclassified compounds have feature values whose ranges overlap with those of the opposite class. Building on this observation, the researchers developed a misclassification-detection framework with additional filtering rules, three of which involve SHAP, and found that the RAW OR SHAP combination performed best, retrieving upwards of 63% of the misclassified compounds in one particular data set. Overall, the paper points toward layering additional analysis components on top of current ML algorithms to classify data sets more accurately.
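To make the general idea concrete, the sketch below shows one way SHAP values from a trained classifier could be used to flag compounds whose predictions deserve a second look. This is an illustrative example only, not the authors' actual pipeline: the synthetic descriptors, the GradientBoostingClassifier baseline, the conflict-ratio rule, and the threshold are all assumptions made for demonstration.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for compound descriptors labelled active (1) / inactive (0);
# the actual study used antiproliferative data for three prostate cancers.
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.8, size=500) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

# Per-feature SHAP contributions in log-odds space; for a binary
# GradientBoostingClassifier, TreeExplainer returns a single 2D array
# of shape (n_samples, n_features).
shap_values = shap.TreeExplainer(model).shap_values(X_test)

# Hypothetical SHAP-based filter (not taken from the paper): flag compounds
# whose feature contributions are strongly split between the two classes,
# i.e. the explanation is internally conflicted, so the prediction may be a
# misclassification worth manual review.
pos = np.clip(shap_values, 0, None).sum(axis=1)          # evidence toward "active"
neg = np.abs(np.clip(shap_values, None, 0)).sum(axis=1)  # evidence toward "inactive"
conflict = np.minimum(pos, neg) / (pos + neg + 1e-12)

CONFLICT_THRESHOLD = 0.4  # assumed cut-off for illustration
suspect = conflict > CONFLICT_THRESHOLD

print(f"Flagged {suspect.sum()} of {len(pred)} test compounds for re-inspection")
print("Flagged compounds that were actually misclassified:",
      int((suspect & (pred != y_test)).sum()))
```

In the paper's framework, rules like this (based on raw feature ranges, SHAP values, or their combination, e.g. RAW OR SHAP) serve the same purpose: routing suspicious predictions to further scrutiny rather than accepting them at face value.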
Outcomes and Implications
This paper has several implications for health research. It found that using SHAP as an additional analysis layer on top of current ML algorithms helps reduce misclassifications and improves AI reliability. This could reduce the time and resources wasted on compounds that are less promising than they appear. Additionally, SHAP-augmented ML algorithms could improve the odds of discovering or designing an effective anticancer drug. In the long term, this approach could contribute to more personalized cancer treatment, improving patient care and outcomes.