The evaluation of the performance of ChatGPT in the management of labor analgesia

Nada Ismaiel; Teresa Phuongtram Nguyen; Nan Guo; Brendan Carvalho; Pervez Sultan; Anthony Chau; Ronald George; Ashraf Habib; Arvind Palanisamy; Carolyn Weiniger; Cynthia Wong; ChatGPT study collaborators

doi:10.1016/j.jclinane.2024.111582

Journal article

Peer reviewed

Journal of clinical anesthesia, Vol.98, 111582

11/2024

PMID: 39167880

Abstract

ChatGPT4 is a leading large language model (LLM) chatbot released by OpenAI in 2023. ChatGPT4 can respond to free-text queries, answer questions and make suggestions regarding virtually any topic. ChatGPT4 has successfully answered anesthesia and even obstetric anesthesia knowledge-based questions with reasonable accuracy. However, ChatGPT4 has yet to be challenged in obstetric anesthesia clinical decision-making.

Study Objective: In this study, we evaluated the performance of ChatGPT4 in the management of clinical labor analgesia scenarios compared to expert obstetric anesthesiologists.

Intervention: Eight clinical questions with progressively increasing medical complexity were posed to ChatGPT4.

Measurements: The ChatGPT4 responses were rated by seven expert obstetric anesthesiologists based on safety, accuracy and completeness of each response using a five-point Likert rating scale.

Main Results: ChatGPT4 was deemed safe in 73% of responses to the presented obstetric anesthesia clinical scenarios (27% of responses were deemed unsafe). None of the ChatGPT4 responses were unanimously deemed to be safe by all seven expert obstetric anesthesiologists. Moreover, ChatGPT4 responses were overall partly accurate (score 4 out of 5) and somewhat incomplete (score 3.5 out of 5).

Conclusions: In summary, approximately one quarter of all responses by ChatGPT4 were deemed unsafe by expert obstetric anesthesiologists. These findings may suggest the need for more fine-tuning and training of LLMs such as ChatGPT4 specifically for clinical decision making in obstetric anesthesia or other specialized medical fields. These LLMs may come to play an important future role in assisting obstetric anesthesiologists in clinical decision making and enhancing overall patient care.

• ChatGPT4 can provide clinically-relevant obstetric anesthesia knowledge
• ChatGPT4 labor analgesia responses were mostly accurate but partly incomplete
• Approximately one quarter of ChatGPT4 responses were deemed unsafe by experts
• None of the ChatGPT4 responses were unanimously deemed safe by all experts
• ChatGPT4 may have a role in clinical decision making in obstetric anesthesiology

Anesthesiology

Safety

Analgesia

Chatbot

ChatGPT4

Obstetric

Details

Title: The evaluation of the performance of ChatGPT in the management of labor analgesia
Creators: Nada Ismaiel - El Camino Hospital
Teresa Phuongtram Nguyen - Stanford University School of Medicine
Nan Guo - Stanford University
Brendan Carvalho - Stanford University
Pervez Sultan - Stanford University
Anthony Chau - University of British Columbia
Ronald George - University of Toronto
Ashraf Habib - Duke Medical Center
Arvind Palanisamy - Washington University in St. Louis School of Medicine
Carolyn Weiniger - Tel Aviv Sourasky Medical Center
Cynthia Wong - University of Iowa
ChatGPT study collaborators
Resource Type: Journal article
Publication Details: Journal of clinical anesthesia, Vol.98, 111582
Publisher: Elsevier Inc
DOI: 10.1016/j.jclinane.2024.111582
PMID: 39167880
ISSN: 0952-8180
eISSN: 1873-4529
Language: English
Date published: 11/2024
Academic Unit: Anesthesia
Record Identifier: 9984699053602771

Metrics

5 Times Cited - Web of Science