Artificial intelligence (AI) is progressing rapidly, and among the many emerging technologies, AI chatbots have attracted considerable attention in both the AI and medical fields. Concerns regarding their accuracy, safety, predictive use, and social acceptance in clinical care remain pertinent. We read with interest the article “Use of artificial intelligence chatbots in clinical management of immune-related adverse events” published in the Journal for ImmunoTherapy of Cancer, which compared the performance of two major AI chatbots, ChatGPT and Bard, in managing immune-related adverse events (irAEs). We commend the authors for their contribution; however, we believe several areas merit further discussion, and we hope to initiate a dialogue on the issues associated with using AI chatbots in the clinical management of irAEs.
In this study, the AI chatbots were used to generate responses to various irAE-related queries. The authors developed 50 distinct questions, with answers available in published guidelines, covering 10 irAE categories.1 Because these queries have fixed, guideline-derived answers, they could also be handled by traditional search engines, and consulting the guidelines directly would provide more authoritative answers. Moreover, large language models (LLMs) have inherent limitations: they rely on extensive training data and timely updates, which require significant time to acquire and process.2 Consequently, LLMs may not always align with the latest clinical guidelines, resulting in less accurate responses. The advantage of LLMs lies in their ability to understand the context of natural language, generate new content based on user intent, and answer complex questions. In the medical field, their principal advantage over traditional search engines is the ability to offer more specialized recommendations for specific clinical scenarios.3 It is therefore crucial to evaluate LLM capabilities in clinical scenarios. Although this study analyzed 20 clinical scenarios, the results were presented too briefly to demonstrate the AI chatbots’ proficiency in interpreting clinical vignettes and answering related questions comprehensively.
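For readers interested in how such an evaluation could be made more transparent and reproducible, the following is a minimal illustrative sketch of submitting guideline-derived irAE questions to a chat-completion-style interface and checking responses against key guideline elements. The endpoint URL, model name, example question, and keyword rubric are our own illustrative assumptions, not the protocol used in the study under discussion.

```python
# Illustrative sketch only: endpoint, model name, question, and rubric are hypothetical.
import requests

API_URL = "https://example.com/v1/chat/completions"  # hypothetical chat-completion endpoint
API_KEY = "YOUR_API_KEY"                             # placeholder credential


def ask_chatbot(question: str, model: str = "example-model") -> str:
    """Send one guideline-derived irAE question to a chat-completion-style API."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are assisting with the management of immune-related "
                        "adverse events (irAEs). Answer according to current oncology guidelines."},
            {"role": "user", "content": question},
        ],
    }
    resp = requests.post(API_URL, json=payload,
                         headers={"Authorization": f"Bearer {API_KEY}"}, timeout=60)
    resp.raise_for_status()
    # Assumes a response shaped like common chat-completion APIs.
    return resp.json()["choices"][0]["message"]["content"]


# Score each response against key guideline elements with a simple keyword rubric.
questions = {
    "What is the first-line management of grade 3 immune-related colitis?":
        ["hold immunotherapy", "corticosteroid"],  # illustrative key elements only
}

for question, key_points in questions.items():
    answer = ask_chatbot(question)
    covered = sum(point.lower() in answer.lower() for point in key_points)
    print(f"{question}\n  key guideline elements covered: {covered}/{len(key_points)}\n")
```

A keyword rubric is, of course, a crude proxy; in practice, expert review against the guideline text would still be needed, but publishing such a pipeline alongside the questions would allow others to reproduce and extend the comparison.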
Moreover, GPT-4o, the latest-generation model at the time of writing, demonstrates enhanced reasoning and comprehension, particularly in handling complex dialogues and multimodal inputs such as images. Compared with GPT-4, GPT-4o exhibits a more precise understanding of contextual information, enabling more relevant and in-depth responses. Additionally, Claude 3.5, a leading LLM-based AI chatbot, is noted for its strong performance and broad applicability. Future research should therefore consider incorporating newer models such as GPT-4o and Claude 3.5, which are more advanced and better aligned with recent developments in AI technology.
Furthermore, the use of AI chatbots in the medical field presents significant challenges, including explainability, ethics, data security, bias, and generalizability. The American Medical Association has likewise identified explainability, transparency, confabulation, liability, privacy, and safety as important challenges and risks associated with AI chatbots. These concerns call for further research into the application of LLMs. AI chatbots can handle free-text queries without specialized training, which generates both enthusiasm and apprehension regarding their deployment in healthcare. In the management of irAEs, AI chatbots can assist with diagnosis, disease prediction, and case analysis, and although they are not yet fully capable of interpreting radiographs, ongoing research continues to advance their capabilities.4 However, their use requires human oversight and ethical safeguards: AI chatbots should support rather than replace clinicians and are not a substitute for professional medical advice.5 The widespread availability of AI chatbots on smartphones has made it easier and more convenient for clinicians to use them to support medical decision-making, which makes professional supervision all the more important. Operators must interpret and critically appraise the generated results, make decisions grounded in clinical judgment, and verify the authenticity of the information provided.
In conclusion, we commend the authors for their research and concur that AI chatbots have demonstrated excellent performance, highlighting their potential in clinical applications. In an increasingly connected digital era, AI chatbots have the potential to transform the clinical management of irAEs by helping to mitigate health risks. However, further research is necessary to validate their effectiveness, accuracy, and reliability before they are integrated into routine clinical practice.
Data availability statement
Data are available upon reasonable request. The data are not publicly available due to privacy or ethical restrictions.
Ethics statements
Patient consent for publication: Not applicable.