Joint Diseases and Related Surgery, vol. 35, no. 1, pp. 169-176, 2024 (SCI-Expanded)
Objectives: This study presents the first investigation into the
potential of ChatGPT to provide medical consultation for patients
undergoing orthopedic interventions, with the primary objective
of evaluating ChatGPT’s effectiveness in supporting patient
self-management during the essential early recovery phase at
home.
Materials and methods: Seven scenarios, representative of
common situations in orthopedics and traumatology, were presented
to ChatGPT version 4.0 to obtain advice. These scenarios and
ChatGPT's responses were then evaluated by 68 expert orthopedists
(67 males, 1 female; mean age: 37.9±5.9 years; range, 30 to 59 years),
40 of whom had at least four years of orthopedic experience, while
28 were associate or full professors. Expert orthopedists used a
rubric on a scale of 1 to 5 to evaluate ChatGPT's advice based on
accuracy, applicability, comprehensiveness, and clarity. A score of
4 or higher indicated that the evaluator considered ChatGPT's
performance above average or excellent.
Results: In all scenarios, the median evaluation scores were
at least 4 across accuracy, applicability, comprehensiveness,
and communication clarity. As for mean scores, accuracy was
the highest-rated dimension at 4.2±0.8, while mean
comprehensiveness was slightly lower at 3.9±0.8. Orthopedist
characteristics, such as academic title and prior use of ChatGPT,
did not significantly influence their evaluations (all p>0.05). Across all
scenarios, ChatGPT demonstrated an accuracy of 79.8%, with
applicability at 75.2%, comprehensiveness at 70.6%, and a 75.6%
rating for communication clarity.
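The summary statistics reported above (median, mean±SD, and a percentage per dimension) can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the ratings below are hypothetical rather than study data, the SD is taken as the sample standard deviation, and the percentage is interpreted as the share of ratings at or above the rubric threshold of 4, which may differ from how the authors derived their figures. The names `ratings` and `summarize` are illustrative only.

```python
from statistics import mean, median, stdev

# Hypothetical 1-5 rubric ratings from five evaluators (NOT study data).
ratings = {
    "accuracy": [5, 4, 4, 3, 4],
    "comprehensiveness": [4, 4, 3, 3, 5],
}

def summarize(scores, threshold=4):
    """Return median, mean, sample SD, and percentage of scores >= threshold."""
    share = sum(s >= threshold for s in scores) / len(scores)
    return {
        "median": median(scores),
        "mean": round(mean(scores), 1),
        "sd": round(stdev(scores), 1),  # sample SD (assumption)
        "pct_at_or_above": round(100 * share, 1),  # assumed reading of the percentages
    }

summary = {dim: summarize(scores) for dim, scores in ratings.items()}
for dim, stats in summary.items():
    print(dim, stats)
```

With real data, each dimension would hold one rating per evaluator per scenario, and the same function could be applied per scenario or pooled across all seven.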
Conclusion: This study emphasizes ChatGPT's strengths in
accuracy and applicability for home care after orthopedic
intervention but underscores a need for improved
comprehensiveness. This focused evaluation not only sheds light
on ChatGPT's potential in specialized medical advice but also
suggests its potential to play a broader role in the advancement
of public health.
Keywords: ChatGPT, expert evaluation, large language models, medical
consultation, orthopedic interventions, rubric.