Automated Resectability Classification of Pancreatic Cancer CT Reports with Privacy-Preserving Open-Weight Large Language Models: A Multicenter Study
