Comprehensive Summary
State-action-reward-state-action (SARSA) is an on-policy reinforcement learning algorithm that can reduce acute care use by recommending optimal interventions for patients with complex clinical and social needs. Basu et al. conducted a longitudinal cohort study of 3,175 Medicaid beneficiaries enrolled in care management programs in Washington and Virginia (2023 to 2024), comparing SARSA-guided and standard experience-based approaches to intervention prioritization. Outcomes were evaluated using clinical impact metrics, qualitative chart reviews of cases where the two approaches differed, counterfactual causal inference to estimate reductions in acute care events, and equalized odds analysis to assess fairness across demographic groups. The SARSA-guided approach reduced acute care use by an absolute 12% (relative reduction ≈ 20.7%, p = 0.02) while improving fairness across all demographic groups and showing particular benefit among high-risk patients (number needed to treat = 8.3). Patients for whom SARSA identified key interactions, such as food insecurity combined with diabetes or housing instability combined with respiratory disease, derived the greatest benefit compared with standard care. The absolute (12%) and relative (20.7%) risk reductions achieved by SARSA are comparable to those seen with standard clinical interventions. The authors concluded that SARSA's strength lies in deriving optimal intervention sequences from longitudinal patient data rather than relying on variable individual experience.
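For readers unfamiliar with the algorithm, the textbook tabular SARSA update can be sketched as below. This is a minimal illustration only, not the model used by Basu et al.: the state labels, intervention names, reward convention (1.0 standing for a follow-up window without an acute care event), and the values of `alpha`, `gamma`, and `epsilon` are all hypothetical assumptions chosen for the example.

```python
import random
from collections import defaultdict

# Hypothetical intervention labels, for illustration only (not from the study).
ACTIONS = ["food_assistance", "housing_referral", "medication_review"]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Standard on-policy TD update:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)).
    """
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Q-table mapping (state, action) pairs to estimated long-run value; starts at 0.
q = defaultdict(float)
rng = random.Random(0)

# One hypothetical transition: a patient state, the intervention chosen,
# an assumed reward, the next state, and the next intervention actually
# chosen by the same policy (which is what makes the update "on-policy").
state = "diabetes+food_insecurity"
action = "food_assistance"
reward = 1.0  # assumed reward convention, see lead-in
next_state = "stable"
next_action = epsilon_greedy(q, next_state, epsilon=0.1, rng=rng)

new_value = sarsa_update(q, state, action, reward, next_state, next_action)
print(f"Q({state}, {action}) = {new_value:.3f}")
```

The update uses the next action the policy actually takes (the second "A" in SARSA) rather than the best possible next action, which is what distinguishes it from Q-learning. Repeating the update over many recorded patient trajectories is how value estimates for intervention sequences would accumulate in a setup like this one.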
Outcomes and Implications
As multidisciplinary care teams expand to address both clinical and social determinants of health, decision-support tools like SARSA may enhance coordination across specialties. The SARSA model shows promise in supporting complex care-management decisions while advancing equitable care delivery.