Large Language Models in Neurology Treatment Decision-Making: a Scoping Review

Khare R, Leaman R, Lu Z. Accessing biomedical literature in the current information landscape. Methods Mol Biol Clifton NJ. 2014;1159:11–31.

Landhuis E. Scientific literature: Information overload. Nature. 2016 July;535(7612):457–8.

Denecke K, May R, Rivera Romero O. Potential of Large Language Models in Health Care: Delphi Study. J Med Internet Res. 2024;26:e52399.

Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29(8):1930–40.

OpenAI. ChatGPT. Version 1.2025.043, GPT-4; 2025.

Romano MF, Shih LC, Paschalidis IC, Au R, Kolachalama VB. Large Language Models in Neurology Research and Future Practice. Neurology. 2023;101(23):1058–67.

Nógrádi B, Polgár TF, Meszlényi V, Kádár Z, Hertelendy P, Csáti A, et al. ChatGPT M.D.: Is there any room for generative AI in neurology? PLOS ONE. 2024;19(10):e0310028.

Dimmick AA, Su CC, Rafiuddin HS, Cicero DC. Evaluating ChatGPT for neurocognitive disorder diagnosis: a multicenter study. Clin Neuropsychol. 0(0):1–16.

Cano-Besquet S, Rice-Canetto T, Abou-El-Hassan H, Alarcon S, Zimmerman J, Issagholian L, et al. ChatGPT4’s diagnostic accuracy in inpatient neurology: A retrospective cohort study. Heliyon. 2024;10(24):e40964.

Hewitt KJ, Wiest IC, Carrero ZI, Bejan L, Millner TO, Brandner S, et al. Large language models as a diagnostic support tool in neuropathology. J Pathol Clin Res. 2024;10(6):e70009.

OpenEvidence [Internet]. 2025 [cited 2025 Mar 22]. OpenEvidence - About. Available from: https://www.openevidence.com

Google. Gemini. 2025 [cited 2025 Mar 22]. Gemini. Available from: https://gemini.google.com

Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann Intern Med. 2018;169(7):467–73.

Anderson S. Library Guides: Systematic Reviews: Define the question [Internet]. [cited 2024 Dec 9]. Available from: https://libguides.jcu.edu.au/systematic-review/define

Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan—a web and mobile app for systematic reviews. Syst Rev [Internet]. 2016 [cited 2024 Nov 17];5(1). Available from: https://doi.org/10.1186/s13643-016-0384-4

Study Quality Assessment Tools | NHLBI, NIH [Internet]. [cited 2025 June 12]. Available from: https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools

Haemmerli J, Sveikata L, Nouri A, May A, Egervari K, Freyschlag C, et al. ChatGPT in glioma adjuvant therapy decision making: ready to assume the role of a doctor in the tumour board? BMJ Health Care Inform. 2023 June;30(1):e100775.

Guo E, Gupta M, Sinha S, Rössler K, Tatagiba M, Akagami R, et al. neuroGPT-X: toward a clinic-ready large language model. J Neurosurg. 2024;140(4):1041–53.

Fonseca Â, Ferreira A, Ribeiro L, Moreira S, Duque C. Embracing the future—is artificial intelligence already better? A comparative study of artificial intelligence performance in diagnostic accuracy and decision-making. Eur J Neurol. 2024;31(4):e16195.

Chen TC, Couldwell MW, Singer J, Singer A, Koduri L, Kaminski E, et al. Assessing the clinical reasoning of ChatGPT for mechanical thrombectomy in patients with stroke. J NeuroInterventional Surg. 2024;16(3):253–60.

Sanderson K. GPT-4 is here: what scientists think. Nature. 2023;615(7954):773.

Benary M, Wang XD, Schmidt M, Soll D, Hilfenhaus G, Nassir M, et al. Leveraging Large Language Models for Decision Support in Personalized Oncology. JAMA Netw Open. 2023;6(11):e2343689.

Bivard A, Churilov L, Parsons M. Artificial intelligence for decision support in acute stroke — current roles and potential. Nat Rev Neurol. 2020;16(10):575–85.

Shamsudhin N, Jotterand F. Social Robots and Dark Patterns: Where Does Persuasion End and Deception Begin? In: Jotterand F, Ienca M, editors. Artificial Intelligence in Brain and Mental Health: Philosophical, Ethical & Policy Issues [Internet]. Cham: Springer International Publishing; 2021 [cited 2024 Dec 22]. p. 89–110. Available from: https://doi.org/10.1007/978-3-030-74188-4_7

Fogg BJ. Persuasive technologies: introduction. Commun ACM. 1999;42(5):26–9.

Norori N, Hu Q, Aellen FM, Faraci FD, Tzovara A. Addressing bias in big data and AI for health care: A call for open science. Patterns. 2021;2(10):100347.

Echefu G, Shah R, Sanchez Z, Rickards J, Brown SA. Artificial intelligence: Applications in cardio-oncology and potential impact on racial disparities. Am Heart J Plus Cardiol Res Pract. 2024;48:100479.

Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–53.

Wang DY, Ding J, Sun AL, Liu SG, Jiang D, Li N, et al. Artificial intelligence suppression as a strategy to mitigate artificial intelligence automation bias. J Am Med Inform Assoc. 2023 Sept 25(10):1684–92.

Ji Z, Lee N, Frieske R, Yu T, Su D, Xu Y, et al. Survey of Hallucination in Natural Language Generation. ACM Comput Surv. 2023;55(12):248:1–248:38.

Alber DA, Yang Z, Alyakin A, Yang E, Rai S, Valliani AA, et al. Medical large language models are vulnerable to data-poisoning attacks. Nat Med. 2025;31(2):618–26.

Rosenblatt M, Tejavibulya L, Jiang R, Noble S, Scheinost D. Data leakage inflates prediction performance in connectome-based machine learning models. Nat Commun. 2024;15(1):1829.

Cestonaro C, Delicati A, Marcante B, Caenazzo L, Tozzo P. Defining medical liability when artificial intelligence is applied on diagnostic algorithms: a systematic review. Front Med. 2023;10:1305756.

JOURNAL OF MEDICAL SYSTEMS
