Comprehensive Summary
This systematic review of 60 studies evaluates current applications of Large Language Models (LLMs) in orthopaedics, focusing on standardized exam questions and common patient questions. Studies were identified through multiple databases and screened for relevance. All studies assessed ChatGPT; fewer also evaluated Bard, PerplexityAI, and Bing. In the 31 studies on standardized exam questions, ChatGPT 4.0 consistently outperformed other models, with accuracy ranging from 47% to 74% on text-only questions and 36% to 66% on questions with images. However, orthopaedic residents achieved higher scores (74%-75%) on the same questions, highlighting the gap between LLMs and clinical training. Twenty-two studies examined LLM responses to common patient questions, which were generally satisfactory: Likert and DISCERN scores fell in the upper ranges, and readability ranged from high school to post-graduate levels. Comparative studies demonstrated that ChatGPT outperformed Bard, though findings on other LLMs remain limited. Overall, current research on LLMs in orthopaedics concentrates on patient communication and exam-style assessments rather than clinical decision-making.
Outcomes and Implications
LLMs can provide accessible and generally satisfactory answers to common patient questions, potentially improving patient education and reducing physician workload. With increasing accuracy on exam-style questions, LLMs could also serve as a study aid for orthopaedic residents and medical students. While there is potential for clinical implementation, such as documentation, triage, and decision support, this review highlights the gap between LLMs and experienced clinicians, as well as the need for further research and model improvement before a wider scope of use.