Comprehensive Summary
This study by Lee et al. describes the development of an automated system that converts free-text eligibility criteria from ClinicalTrials.gov into SQL queries compatible with the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). It also systematically evaluates hallucination patterns across multiple large language models (LLMs) to identify optimal deployment strategies. Automating the conversion of criteria into database queries is hindered by two challenges: ensuring high accuracy and generating clear, usable outputs. The researchers fed clinical-trial criteria into different LLMs and tested a three-stage processing pipeline (segmentation, filtering, and simplification). They then asked the models to find OMOP codes and write compatible SQL queries. Next, they analyzed the query results and checked whether they matched real patient data. Finally, they compared the LLMs to determine which performed best and made the fewest mistakes. GPT-4 achieved a concept mapping accuracy of 48.5% versus 32.0% for USAGI, with domain-specific performance ranging from 72.7% (drug) to 38.3% (measurement). However, the open-source llama3:8b model achieved the highest effective SQL rate (75.8%, compared with 45.3% for GPT-4), a result attributed to its lower hallucination rate (21.1% vs 33.7%). Overall, LLMs can speed up the transformation of eligibility criteria, but substantial hallucination makes unguided deployment risky, so careful model selection and validation strategies are necessary. Surprisingly, the results showed that smaller, cost-effective models can outperform larger commercial alternatives.
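To make the three-stage pipeline more concrete, the following is a minimal Python sketch of segmentation, filtering, and simplification. It is not the authors' implementation: the summary does not say how each stage was built, so simple rule-based stand-ins are assumed here. The function names, the NON_QUERYABLE keyword list, and the sample criteria are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical markers for criteria that cannot be checked against
# structured patient data (an assumption, not taken from the study).
NON_QUERYABLE = ("informed consent", "willing to", "able to comply", "investigator")


def segment(raw):
    """Stage 1 (segmentation): split a criteria block into single criteria."""
    parts = re.split(r"\n+|;\s+", raw)
    # Strip leading bullets ("-", "*") and numbering ("1.", "2)").
    cleaned = [re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", p).strip() for p in parts]
    return [c for c in cleaned if c]


def keep_queryable(criteria):
    """Stage 2 (filtering): drop criteria that match a non-queryable marker."""
    return [c for c in criteria if not any(m in c.lower() for m in NON_QUERYABLE)]


def simplify(criterion):
    """Stage 3 (simplification): remove parentheticals and tidy whitespace."""
    text = re.sub(r"\s*\([^)]*\)", "", criterion)
    return re.sub(r"\s+", " ", text).strip().rstrip(".")


def preprocess(raw):
    """Run the three stages in order and return the cleaned criteria."""
    return [simplify(c) for c in keep_queryable(segment(raw))]


sample = (
    "1. Age >= 18 years (at screening).\n"
    "2. Willing to provide informed consent\n"
    "- Diagnosis of type 2 diabetes; HbA1c > 7.0%"
)
print(preprocess(sample))
```

In a setup like the one described, each cleaned criterion would then be passed to an LLM for concept mapping and SQL generation. Keeping the preprocessing stages separate makes it possible to inspect what the model actually receives before any query is generated.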
Outcomes and Implications
Clinical trials are essential for medical advancement and drug development, yet they face significant challenges in participant recruitment. Finding real patients who match a trial’s criteria is currently difficult and time-consuming because the free-text criteria must be manually translated into database queries. This study contributes to the development of trustworthy and cost-effective AI systems for clinical trial optimization. More broadly, integrating AI that can sort through data faster and identify matching patients could improve medical research.