Classifying Cancer Stage with Open-Source Clinical Large Language Models

Chia-Hsuan Chang; Mary M Lucas; Grace Lu-Yao; Christopher C Yang

doi:10.48550/arxiv.2404.01589


Preprint

Open access


ArXiv.org

01 Apr 2024


Preprint (Author's original), arXiv.org - Non-exclusive license to distribute, Open

Abstract

Computer Science - Artificial Intelligence

Computer Science - Computation and Language

Cancer stage classification is important for making treatment and care management plans for oncology patients. Information on staging is often included in unstructured form in clinical, pathology, radiology and other free-text reports in the electronic health record system, requiring extensive work to parse and obtain. To facilitate the extraction of this information, previous NLP approaches rely on labeled training datasets, which are labor-intensive to prepare. In this study, we demonstrate that without any labeled training data, open-source clinical large language models (LLMs) can extract pathologic tumor-node-metastasis (pTNM) staging information from real-world pathology reports. Our experiments compare LLMs and a BERT-based model fine-tuned using the labeled data. Our findings suggest that while LLMs still exhibit subpar performance in Tumor (T) classification, with the appropriate adoption of prompting strategies, they can achieve comparable performance on Metastasis (M) classification and improved performance on Node (N) classification.
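As a rough illustration of the zero-shot setting the abstract describes, the sketch below builds a prompt for one pTNM category and parses a stage label from a model's free-text answer. It is only a minimal sketch under stated assumptions: the label sets, the prompt wording, and the helper names `build_prompt` and `parse_label` are hypothetical and are not taken from the paper, and no specific model or inference library is assumed.

```python
import re

# Hypothetical label sets for illustration; the abstract does not list
# the exact classes used in the study.
LABELS = {
    "T": ["T1", "T2", "T3", "T4"],
    "N": ["N0", "N1", "N2", "N3"],
    "M": ["M0", "M1"],
}


def build_prompt(report: str, category: str) -> str:
    """Build a zero-shot prompt asking for one pTNM category (T, N, or M)."""
    options = ", ".join(LABELS[category])
    return (
        "You are given a pathology report.\n"
        f"Report:\n{report.strip()}\n\n"
        f"Question: What is the pathologic {category} stage? "
        f"Answer with one of: {options}.\nAnswer:"
    )


def parse_label(output: str, category: str):
    """Return the first valid stage label found in the model output, else None."""
    # Accept an optional leading "p" (as in "pT2") and ignore letter case.
    match = re.search(rf"\bp?({category}[0-9])", output, flags=re.IGNORECASE)
    if match is None:
        return None
    label = match.group(1).upper()
    return label if label in LABELS[category] else None
```

In use, the string returned by `build_prompt` would be sent to whichever LLM is being evaluated, and `parse_label` would map its reply to a class that can be scored against the reference stage; replies with no recognizable label come back as `None`.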


Details

Title: Classifying Cancer Stage with Open-Source Clinical Large Language Models
Creators: Chia-Hsuan Chang; Mary M Lucas; Grace Lu-Yao; Christopher C Yang
Publication Details: ArXiv.org
Resource Type: Preprint
Language: English
Academic Unit: Information Science
Other Identifier: 991021865013504721
