Application of ChatGPT as a content generation tool in continuing medical education: acne as a test topic
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
Authors
The large language model (LLM) ChatGPT can answer open-ended and complex questions, but its accuracy in providing reliable medical information requires a careful assessment. As part of the AICHECK (Artificial Intelligence for CME Health E-learning Contents and Knowledge) Study, aimed at evaluating the potential of ChatGPT in continuous medical education (CME), we compared ChatGPT-generated educational contents to the recommendations of the National Institute for Health and Care Excellence (NICE) guidelines on acne vulgaris. ChatGPT version 4 was exposed to a 23-item questionnaire developed by an experienced dermatologist. A panel of five dermatologists rated the answers positively in terms of “quality” (87.8%), “readability” (94.8%), “accuracy” (75.7%), “thoroughness” (85.2%), and “consistency” with guidelines (76.8%). The references provided by ChatGPT obtained positive ratings for “pertinence” (94.6%), “relevance” (91.2%), and “update” (62.3%). The internal reproducibility was adequate both for answers (93.5%) and references (67.4%). Answers related to issues of uncertainty and/or controversy in the scientific community scored the lowest. This study underscores the need to develop rigorous evaluation criteria for AI-generated medical content and for expert oversight to ensure accuracy and guideline adherence.
How to Cite

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.