Urology

Comprehensive Summary

This study investigated the level of agreement between human urologists, ChatGPT, and the European Association of Urology (EAU) guidelines when suggesting management and treatment decisions for patients with benign prostatic hyperplasia (BPH). It was a prospective, cross-sectional study that presented 10 clinical scenarios to ChatGPT and to 140 board-certified urologists across Italy (121 of whom participated) and asked them to make management decisions for the patients in each scenario. The agreement of the decisions of both the human urologists and ChatGPT with the EAU guidelines was then measured. The study found low agreement between the human urologists and the EAU guidelines, with a Cohen's kappa of 0.125. By contrast, it found perfect agreement between ChatGPT and the EAU guidelines, with a Cohen's kappa of 1.000. The study also found very poor interrater agreement among the human urologists, with a Fleiss' kappa of -0.030, which is slightly below the level expected by chance. Urologists over the age of 45 had a higher concordance with the EAU guidelines than younger urologists. Although concordance with the EAU guidelines varied across the urologists' demographic groups, ChatGPT outperformed the urologists in adhering to the guidelines when making decisions.
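For readers less familiar with the agreement statistics quoted above, the following is a minimal sketch of how Cohen's kappa (two raters) and Fleiss' kappa (many raters) are conventionally computed. The function names and the toy labels are illustrative assumptions; they are not the study's data or its analysis code.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's own label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    # Undefined (division by zero) if both raters always use one identical label.
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds the labels every rater gave to item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = set().union(*counts)
    # Mean per-item agreement: share of rater pairs that agree on each item.
    p_bar = sum(
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # Chance agreement, from the overall proportion of each category.
    p_e = sum(
        (sum(c[k] for c in counts) / (n_items * n_raters)) ** 2
        for k in categories
    )
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical labels for 10 scenarios (made up for illustration only).
guideline = ["A", "B", "A", "C", "B", "A", "C", "B", "A", "C"]
model = list(guideline)  # a rater that matches the reference on every item
print(cohen_kappa(guideline, model))

# Two raters who disagree on every item: agreement is worse than chance.
print(fleiss_kappa([["A", "B"], ["A", "B"]]))
```

A kappa of 1 means the raters always agree, 0 means agreement is no better than chance, and negative values mean agreement is below chance, which is how a Fleiss' kappa slightly under zero should be read.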

Outcomes and Implications

This research is important because it suggests that language models could become a useful tool that helps clinicians keep their decisions in line with the guidelines of their field, especially when clinicians are less familiar with evolving guidelines. Other research has highlighted the limitations of AI in medicine, particularly its gender and racial biases, but these limitations could be mitigated through strict oversight of the language models. Additionally, the decisions of language models like ChatGPT are influenced by prompt formulation, so they are not a substitute for clinical judgment in any context.

Our mission is to

Connect medicine with AI innovation.

No spam. Only the latest AI breakthroughs, simplified and relevant to your field.

AIIM Research

Articles

© 2025 AIIM. Created by AIIM IT Team
