Comprehensive Summary
This study by Haredasht et al. investigated whether machine learning models can predict treatment retention in patients receiving buprenorphine-naloxone (BUP-NAL) for opioid use disorder, with a focus on features extracted from unstructured clinical notes using large language models (LLMs). The researchers used electronic health record data from Stanford Health Care for model development and the NeuroBlu behavioral health database for external validation. They trained classification models (Logistic Regression, Random Forest, XGBoost) and survival models (CoxPH, Random Survival Forest, Survival XGBoost) on 206 features: 193 from structured data and 13 psychosocial features extracted from free-text notes with their CLinical Entity Augmented Retrieval (CLEAR) pipeline. XGBoost achieved the highest classification performance with an ROC-AUC of 0.65, and Random Survival Forest and Survival XGBoost reached C-index values of about 0.65 in time-to-event analysis. Incorporating LLM-derived features improved performance across all architectures, with the largest gains in simpler models such as Logistic Regression. SHAP analysis identified chronic pain, liver disease, and major depression among the key predictors. The authors acknowledge that although including unstructured-data features produced statistically significant improvements, overall discriminative power remains moderate.
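To make the two reported metrics concrete, the following minimal sketch computes ROC-AUC and a simplified Harrell's C-index in plain Python. It is an illustration only, not the authors' evaluation code: the function names and the toy data are hypothetical, and the C-index version skips pairs with tied times, which production libraries handle more carefully. Both metrics share one interpretation: 0.5 corresponds to random ordering and 1.0 to perfect ordering, so values near 0.65 indicate that the model ranks a pair of patients correctly about 65% of the time.

```python
from itertools import combinations


def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Simplified Harrell's C-index for right-censored data.

    A pair is comparable only if the patient with the shorter observed time
    actually had the event (was not censored). Pairs with tied times are
    skipped, which is a simplification.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher predicted risk, earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
# labels: 1 = discontinued treatment within the window, 0 = retained.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2]

# times: observed months in treatment; events: 1 = discontinued, 0 = censored.
toy_times = [2, 5, 7, 10, 12]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.8, 0.3, 0.6, 0.5, 0.1]

print("ROC-AUC:", roc_auc(toy_labels, toy_scores))
print("C-index:", concordance_index(toy_times, toy_events, toy_risks))
```

The pairwise formulation is quadratic in the number of patients, so it suits small examples; for real cohorts, established implementations in libraries such as scikit-learn and scikit-survival are the safer choice.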
Outcomes and Implications
This research addresses a critical challenge in opioid use disorder treatment: retention rates for buprenorphine therapy range widely, from 20% to 82.5%, many patients discontinue within the first six months, and early discontinuation is associated with increased mortality. The work points to clinically actionable uses. Identifying high-risk patients could enable targeted interventions, such as closer monitoring of individuals with liver disease, which is associated with higher attrition risk, or consideration of long-acting injectable buprenorphine formulations for those at elevated risk of early discontinuation. The researchers also developed an interactive web-based tool that lets clinicians enter patient characteristics and obtain personalized probability estimates of treatment retention. They note, however, that additional studies are needed to assess whether this risk assessment actually improves retention in real-world clinical settings before widespread implementation.