Comprehensive Summary
Socioeconomic inequality has been linked to poorer clinical outcomes and to avoidable healthcare spending. As a result, nationally accredited agencies and organizations such as the Centers for Medicare and Medicaid Services (CMS) have been incorporating the Area Deprivation Index (ADI), which takes social inequities into account, when designing healthcare policies. This leads to higher-quality and more equitable healthcare policies for all, including those at a socioeconomic disadvantage. The ADI was developed from 17 socioeconomic factors measured across neighborhoods; it describes the health "level" of a neighborhood and predicts, more broadly, whether a neighborhood will experience adverse health outcomes. This study attempted to take the functionality of the ADI further by creating a machine learning model that could identify social risk and then predict specific surgical outcomes from that risk. Data were collected from hospitals across the country participating in the ACS NSQIP registry from Jan 1, 2018 to June 30, 2023; the data included patient demographics, preoperative and operative factors, and postoperative outcomes within 30 days of surgery. Patient addresses were mapped to blocks within the census's block system, and each block was assigned a percentile value, with a higher percentile reflecting greater socioeconomic disadvantage. Fourteen postoperative outcomes occurring within 30 days of surgery were defined, and a validated set of 20 clinical risk factors, as determined by the ACS NSQIP online calculator, was also used. Half of the dataset was used to train the machine learning algorithm XGBoost, which used this subset to build a predictive model for each of the 14 postoperative outcomes from the 17 socioeconomic factors in a census block.
The models were then validated on the other half of the dataset and compared against ADI-generated predictions. For the latter, ADI percentiles were first converted to log values that could predict the risk of each specific postoperative outcome in the training subset, and the predictive power of these log values was then tested on the validation subset. In all, data from 3,206,836 patients were used, half for training and half for validation. XGBoost and the ADI had equal model discrimination (i.e. they were equally able to distinguish patients who had a specific postoperative outcome from those who did not), but the odds ratios of the XGBoost models were higher than those of the ADI, indicating greater predictive power for the XGBoost models. Higher ADI values indicated greater socioeconomic disadvantage and were, interestingly, associated with lower risks of venous thromboembolism, urinary tract infection, and C. diff colitis. Across the 14 XGBoost-derived models, one for each postoperative outcome, the socioeconomic factors (among the original 17) that had the most impact varied. For all 14 ADI-derived models, housing units with incomplete plumbing was the socioeconomic factor with the least effect. In the venous thromboembolism, urinary tract infection, and C. diff colitis models, median family income had less of an impact than in the other 11 models, where it was a consistently impactful socioeconomic factor.
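To make "model discrimination" concrete, the sketch below computes a c-statistic (area under the ROC curve), a common way to measure how well predicted risks separate patients who had an outcome from those who did not. This is an illustration only: the summary does not say which software or exact metric implementation the study used, and the function name `c_statistic` and the toy numbers are hypothetical, not study data.

```python
from itertools import product


def c_statistic(risk_scores, outcomes):
    """Fraction of (event, non-event) patient pairs in which the patient
    with the event received the higher predicted risk. Ties count as half."""
    with_event = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    without_event = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one patient with and one without the outcome")

    concordant = 0.0
    for pos, neg in product(with_event, without_event):
        if pos > neg:
            concordant += 1.0
        elif pos == neg:
            concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Hypothetical toy data (NOT from the study): 1 = outcome occurred within 30 days.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
# Two hypothetical sets of predicted risks that rank the patients identically
# but spread the risks differently.
model_a = [0.10, 0.20, 0.30, 0.25, 0.60, 0.70, 0.40, 0.35]
model_b = [0.05, 0.10, 0.20, 0.15, 0.90, 0.95, 0.30, 0.25]

print(c_statistic(model_a, outcomes))  # 0.875
print(c_statistic(model_b, outcomes))  # 0.875
```

Because the c-statistic depends only on how patients are ranked, two models can share the same value while assigning very different risk estimates. This is one reason a study may report a second measure, such as an odds ratio, alongside discrimination.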
Outcomes and Implications
Current measures to address the healthcare effects of socioeconomic inequality include the ADI. While the ADI is helpful in highlighting how intertwined socioeconomic status and healthcare outcomes can be, its lack of specificity when predicting healthcare outcomes keeps it from giving a more in-depth analysis of the "health" of a neighborhood. The greater predictive power of this study's XGBoost models in determining which specific postoperative conditions could arise within a block is therefore a point of interest. Implementing XGBoost or other similarly efficacious models could allow for more specific and targeted measures and policies to combat health inequality and give residents of socioeconomically disadvantaged areas more comprehensive, accurate care. The study also highlights that socioeconomic factors weighted heavily in the more generic ADI predictions tend not to show the same influence on specific postoperative outcomes. Knowing which socioeconomic factors are risk factors for which postoperative conditions could lead to better programs and a clearer understanding of which factors to target to prevent those conditions. However, the study has limitations. Patient data were limited to ACS NSQIP hospitals, which may not be representative of the entire country, and to patients whose address could be placed within the census block system, which effectively excludes homeless patients even though homelessness is a significant socioeconomic factor.