FRONTIERS IN HEALTH SERVICES, vol.6, pp.1-10, 2026 (ESCI, Scopus)
Introduction:
Artificial intelligence–based chatbots are increasingly used by patients to obtain medical information before healthcare encounters. However, the reliability and safety of chatbot-generated responses remain uncertain, particularly for topics involving procedural sedation.
Methods:
This study evaluated the quality, safety, and confabulation risk of responses generated by ChatGPT, Gemini, and Copilot to common patient questions about procedural sedation. A set of standardized patient-oriented questions was submitted to each chatbot, and responses were independently evaluated by clinical experts using predefined criteria assessing informational quality, clinical safety, and the presence of confabulated or misleading content.
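The abstract does not describe the submission mechanics. Purely as an illustrative sketch, and not the authors' protocol, the snippet below shows one way a standardized question set could be submitted to each chatbot and the raw replies stored for blinded expert rating; the question texts and the `query_chatbot` stub are hypothetical placeholders for whatever web interface or API each service actually exposes.

```python
import csv
from typing import List

# Hypothetical question set: the abstract does not list the actual study questions.
QUESTIONS: List[str] = [
    "Will I be awake during procedural sedation?",
    "How long must I fast before sedation?",
    "What are the risks of procedural sedation?",
]

CHATBOTS = ["ChatGPT", "Gemini", "Copilot"]

def query_chatbot(name: str, question: str) -> str:
    # Placeholder: in a real run this would call the service's own interface
    # (web chat or API); the returned string here is only a stand-in.
    return f"[{name} response to: {question}]"

def collect_responses(out_path: str = "responses.csv") -> None:
    # Submit every question to every chatbot and store the raw replies so
    # that clinical experts can later rate them against predefined criteria.
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["chatbot", "question", "response"])
        for bot in CHATBOTS:
            for q in QUESTIONS:
                writer.writerow([bot, q, query_chatbot(bot, q)])

if __name__ == "__main__":
    collect_responses()
```

Storing responses in a flat table before rating keeps the raters blinded to the source chatbot and makes the evaluation step independent of how the responses were collected.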
Results:
Response quality varied across chatbots: several answers contained incomplete information, safety omissions, or potentially misleading statements. Although many responses offered generally understandable explanations, clinical details important for patient safety were addressed inconsistently.
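No numerical scores are reported in the abstract. Purely as a hedged illustration of how such expert ratings might be summarized per chatbot and criterion, assuming a long-format table of Likert scores (the column names and values below are invented, not study data):

```python
import pandas as pd

# Hypothetical ratings in long format: one row per (chatbot, criterion) score.
ratings = pd.DataFrame({
    "chatbot":   ["ChatGPT", "ChatGPT", "Gemini", "Gemini", "Copilot", "Copilot"],
    "criterion": ["quality", "safety",  "quality", "safety", "quality", "safety"],
    "score":     [4, 3, 5, 4, 3, 2],  # invented 1-5 expert ratings for illustration
})

# Mean, spread, and count of expert scores per chatbot and criterion.
summary = ratings.groupby(["chatbot", "criterion"])["score"].agg(["mean", "std", "count"])
print(summary)
```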
Discussion:
These findings suggest that while AI chatbots can support patient education, their responses regarding procedural sedation may contain safety gaps and confabulated content that limit their reliability as standalone sources of medical information. Clinician oversight and guided use of AI-generated health information therefore appear necessary to ensure safe and accurate patient communication.